HTTP Caching: This Article Is All You Need

Introduction

HTTP caching is an important way to improve web performance and, with it, user experience. A solid understanding of how it works has become an essential skill for front-end developers.

HTTP caching temporarily stores copies of web resources (HTML pages, images, and so on) so that later requests can be answered without going back to the origin server, which reduces latency and server load. An HTTP cache keeps copies of the responses that pass through it; if certain conditions are met, subsequent requests are served from the cache. A cache can be a dedicated device or a piece of software, such as a browser or a proxy.

1. Categories of HTTP Cache

HTTP caching can be divided into forced caching and negotiated caching.

Forced caching: While the cached copy is still fresh, the client uses it directly and does not contact the server. The status code shown is 200 (OK).

Negotiated caching: The client asks the server whether its cached copy is still valid. If it is, the server returns 304 (Not Modified) without a body. If it is not, the server returns the latest version of the resource.

There are three mainstream versions of HTTP: HTTP/1.0, HTTP/1.1, and HTTP/2. The caching headers discussed in this article were defined in HTTP/1.0 and HTTP/1.1. HTTP/2 reuses these headers and adds server push, which is introduced in the conclusion at the end of this article. The headers from HTTP/1.0 and HTTP/1.1 can be grouped by cache category as follows:

HTTP version | Forced caching | Negotiated caching
HTTP/1.0     | Expires        | Last-Modified
HTTP/1.1     | Cache-Control  | ETag

2. Mainstream HTTP Cache Parameters

2.1 Forced caching

2.1.1 HTTP/1.0 - Expires

The value of Expires is an expiration time returned by the server. It is an absolute time in GMT (Greenwich Mean Time), for example: Tue, 17 Jan 2023 03:48:45 GMT. On the next request, the client compares its current system time with the expiration time stored with the cached response. If the current time is earlier, the cached data is used directly; otherwise the client requests the resource from the server again.
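The Expires check can be sketched in Python as follows. This is only an illustration of the comparison, not how any particular browser implements it; the function name is_fresh_by_expires and the injectable now parameter are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header, now=None):
    """Return True if the client clock has not yet reached the Expires time."""
    if now is None:
        now = datetime.now(timezone.utc)  # the client's own clock
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # an unparseable date is treated as already expired
    if expires_at.tzinfo is None:
        # Assumption: a date without a zone is interpreted as GMT.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at
```

Because now comes from the client, a wrong client clock changes the result even though the header itself is unchanged.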

However, the problems with Expires are also obvious.

First, the expiration time is set by the server but compared against the client's system clock. If the user changes the system time, the client may treat a stale resource as fresh, or a fresh resource as stale.

Secondly, even without manual changes, client and server clocks are never perfectly synchronized. Clock drift and delayed time synchronization can lead to the same kind of wrong cache decision.

For these reasons, HTTP/1.1 introduced max-age in Cache-Control, which expresses the lifetime as a relative number of seconds instead.

2.1.2 HTTP/1.1 - Cache-Control

Cache-Control is the main caching header in HTTP/1.1. It can appear in both request headers and response headers, and it offers a range of directives, so it covers more complex scenarios than Expires.

Directives follow these rules:

  • Directive names are not case sensitive, but lowercase is recommended.
  • Multiple directives are separated by commas.
  • Some directives take an optional argument, which can be written as either a token or a quoted string.

Commonly used directives are as follows:

  • no-store: Do not store the response in any cache. It takes priority over all other caching directives.
  • no-cache: Do not use forced caching. The response may be stored, but the cache must validate it with the server before each reuse.
  • public: Public cache. Any node between the origin server and the client can cache the resource.
  • private: Private cache. Only the client can cache the resource; shared caches such as proxies must not.
  • max-age: The maximum time the cached resource is considered fresh, in seconds. It takes priority over Expires. The client checks whether the resource has been cached for less than max-age. If so, the cached data is used directly. Expires is consulted only when max-age is absent.
  • s-maxage: The maximum freshness time for shared caches such as proxy cache servers, in seconds. For those caches it takes priority over max-age and Expires. It does not apply to the client's private cache.
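The priority rules above can be sketched in Python. This is a simplified illustration under stated assumptions, not a complete implementation of the HTTP caching rules: parse_cache_control and freshness_lifetime are hypothetical names, headers is assumed to be a plain dict with canonical header names, the comma split does not handle commas inside quoted strings, and malformed values simply raise an exception.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Parse a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directive names are case-insensitive; arguments may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers, response_time, shared=False):
    """Return how many seconds a response may be reused without revalidation."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0  # never reuse without contacting the server
    if shared and "private" in cc:
        return 0  # shared caches must not reuse a private response
    if shared and "s-maxage" in cc:
        return int(cc["s-maxage"])  # only shared caches honor s-maxage
    if "max-age" in cc:
        return int(cc["max-age"])  # max-age wins over Expires
    if "Expires" in headers:
        expires_at = parsedate_to_datetime(headers["Expires"])
        return max(0, int((expires_at - response_time).total_seconds()))
    return 0
```

Here response_time stands in for the moment the response was generated, passed as a timezone-aware datetime; a real cache would also account for the time the response spent in transit and in intermediate caches.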

2.2 Negotiated caching

When the cached copy is no longer fresh, the client checks with the server whether it is still valid. This validation process is called negotiated caching. If the resource is still valid, the server returns 304 (Not Modified), and the client then reads the resource from its local cache. The request still makes a full round trip to the server, just as it would with no cache at all. The advantage is that a 304 response carries only the status line and headers, not the file itself, so much less data is transferred and the response arrives faster.

The request headers and response headers used in negotiated caching work in pairs, as shown in the following table:

Version  | Request header                          | Response header
HTTP/1.0 | If-Modified-Since / If-Unmodified-Since | Last-Modified
HTTP/1.1 | If-None-Match / If-Match                | ETag

Before negotiating, the cache first checks whether no-store is present. If it is, the cache is bypassed and the latest file is returned directly from the server.

2.2.1 HTTP/1.0 - Last-Modified

When the client requests a resource for the first time, the server returns the resource and adds a Last-Modified field to the response header to indicate when the resource was last modified. Once the forced cache on the client has expired, the client revalidates the cached copy with the server by adding an If-Modified-Since field to the request header. The server compares the If-Modified-Since value with the last modification time of the resource it stores. If the If-Modified-Since time is not earlier than the last modification time, the resource is still valid and 304 (Not Modified) is returned. Otherwise, the resource itself is returned together with its new Last-Modified value.
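The server-side comparison can be sketched in Python. This is a minimal illustration, assuming the server knows the modification time as a timezone-aware datetime; the function name respond_to_if_modified_since is hypothetical, and a real server would also build the response headers and body.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond_to_if_modified_since(if_modified_since, last_modified):
    """Return the status code a server would send: 304 or 200."""
    if if_modified_since is None:
        return 200  # first request: nothing to validate
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # ignore an unparseable header and send the full resource
    if since.tzinfo is None:
        # Assumption: a date without a zone is interpreted as GMT.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

For example, with a file last modified at 08:40:00 GMT, a request carrying If-Modified-Since: Sat, 14 Jan 2023 08:40:00 GMT yields 304, while one carrying 08:39:59 GMT yields 200.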

Last-Modified: A response header carrying the time when the resource was last modified. The format is Last-Modified: <GMT time>.

 For example: Last-Modified: Sat, 14 Jan 2023 08:40:00 GMT

If-Modified-Since: A request header asking whether the resource has been modified since the given time. The server compares this value with the modification time it has stored. The format is If-Modified-Since: <GMT time>. It can only be used in GET or HEAD requests.

If-Unmodified-Since: A request header asking that the request be carried out only if the resource has not been modified since the given time. The format is If-Unmodified-Since: <GMT time>. Unlike If-Modified-Since, If-Unmodified-Since is used for POST or other non-simple requests. If the resource has been modified after the given time, 412 (Precondition Failed) is returned.

Last-Modified also has serious problems.

First, Last-Modified reflects only the last modification time of the file, not its content. If the file is edited and then restored to its original content, its modification time still changes, so the client cannot use its cache even though the content is identical.

Secondly, Last-Modified has a resolution of one second. If the file is modified more than once within the same second, the Last-Modified value in the response header stays the same. The client then receives a 304 response and keeps using the outdated cached file instead of getting the updated resource.
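The one-second resolution is easy to see by formatting two timestamps from the same second as HTTP dates. The two write times below are made-up values for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Two hypothetical writes to the same file, 0.8 seconds apart.
first_write = datetime(2023, 1, 14, 8, 40, 0, 100000, tzinfo=timezone.utc)
second_write = datetime(2023, 1, 14, 8, 40, 0, 900000, tzinfo=timezone.utc)

# The HTTP date format has no sub-second field, so both collapse to one value.
header_after_first = format_datetime(first_write, usegmt=True)
header_after_second = format_datetime(second_write, usegmt=True)

print(header_after_first)   # Sat, 14 Jan 2023 08:40:00 GMT
print(header_after_second)  # Sat, 14 Jan 2023 08:40:00 GMT
```

A client that cached the file after the first write would send the same date back and be told, incorrectly, that nothing has changed.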

Therefore, HTTP/1.1 uses ETag for cache negotiation.

2.2.2 HTTP/1.1 - ETag

To solve the accuracy problems of Last-Modified described above, HTTP/1.1 introduced a new response field, ETag, for negotiated caching. ETag takes priority over Last-Modified: when a request carries both If-None-Match and If-Modified-Since, the server evaluates If-None-Match.

After receiving the browser request, the server compares the If-None-Match value with the current ETag of the resource. If they are equal, the resource is still valid and 304 (Not Modified) is returned. Otherwise, the resource itself is returned together with its new ETag.
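A content-hash ETag and the If-None-Match comparison can be sketched in Python. This is one possible scheme, not the algorithm of any specific server: the names make_etag and respond_to_if_none_match are hypothetical, SHA-1 with Base64 is just one choice of hash and encoding, and the comma split assumes ETag values contain no commas.

```python
import base64
import hashlib


def make_etag(body):
    """Build a quoted ETag from a hash of the response body (bytes)."""
    digest = hashlib.sha1(body).digest()
    return '"' + base64.b64encode(digest).decode("ascii") + '"'


def _opaque(tag):
    """Strip whitespace and the weak-validator prefix W/ from an ETag."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond_to_if_none_match(if_none_match, current_etag):
    """Return 304 if any ETag sent by the client matches, else 200."""
    if if_none_match is None:
        return 200  # first request: nothing to validate
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    current = _opaque(current_etag)
    # If-None-Match may list several ETags and uses weak comparison.
    for candidate in if_none_match.split(","):
        if _opaque(candidate) == current:
            return 304
    return 200
```

Because the ETag is derived from the content, editing a file and restoring it yields the same ETag again, and two edits within the same second yield different ETags, which addresses both problems of Last-Modified.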

ETag: A response header carrying an identifier for the current version of the resource. The format is ETag: <ETag value>. The server generates the value with an algorithm of its own choosing, usually a hash of the content or simply a version number.

 For example: ETag: "I82YRPyDtSi45r0Ps/eo8GbnDfg="

If-None-Match: A request header asking for the resource only if none of the listed ETags match. It takes priority over If-Modified-Since. If the current ETag of the resource does not match any ETag value carried in the request header, the server returns the latest resource; otherwise it returns 304.

 For example: If-None-Match: "I82YRPyDtSi45r0Ps/eo8GbnDfg="

If-Match: A request header asking that the request be carried out only if one of the listed ETags matches. For simple requests (GET and HEAD), it is used together with the Range header to make sure the requested range comes from the same version of the resource. For non-simple requests such as PUT, it is sent with an upload so that the update is applied only if the resource has not changed since the client fetched it; otherwise 412 (Precondition Failed) is returned.

 For example: If-Match: "I82YRPyDtSi45r0Ps/eo8GbnDfg="
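The If-Match check on an update request can be sketched in the same way. This is a simplified illustration; the function name respond_to_put and the choice of 204 as the success status are assumptions made for the example.

```python
def respond_to_put(if_match, current_etag):
    """Return 412 if the client's ETag is out of date, else 204 (update accepted)."""
    if if_match is None:
        return 204  # unconditional update
    if if_match.strip() == "*":
        # "*" only requires that the resource currently exists.
        return 204 if current_etag is not None else 412
    sent = [tag.strip() for tag in if_match.split(",")]
    # If-Match uses strong comparison: weak tags (W/"...") never match.
    if (
        current_etag is not None
        and not current_etag.startswith("W/")
        and current_etag in sent
    ):
        return 204
    return 412
```

If another client has changed the resource in the meantime, the stored ETag no longer matches the one sent, so the update is rejected with 412 instead of silently overwriting the newer version.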

Conclusion

As described above, HTTP caching is mainly divided into forced caching and negotiated caching. Forced caching is controlled by Expires (HTTP/1.0) and Cache-Control (HTTP/1.1). The client reads its local cache directly and does not contact the server again. The status code is 200.

Negotiated caching is validated through Last-Modified / If-Modified-Since (HTTP/1.0) and ETag / If-None-Match (HTTP/1.1). Each request asks the server whether the resource has been updated, and the server decides whether the client should use its cache. If the cached copy is still valid, the server returns 304; otherwise it returns the latest file.

HTTP/2 adds a further mechanism, Server Push. Unlike forced caching and negotiated caching, it works by pushing: the server sends resources to the client before receiving a request for them, with the aim of keeping the client cache up to date. For example, the client requests only a.html, but the server sends a.html, a.css, and a.png. In this way, a single request can refresh the cached copies of several files, which improves the timeliness of the cache.


