HTTP caching is critical to front-end performance: reading a resource from the cache is far faster than requesting it from the server. The part most of us know is the 304 status code. A 304 (Not Modified) response tells the browser that its local copy is still valid, so it can use that copy instead of downloading the resource again. But why resources are cached, how a cache hit happens, and when the cache takes effect are things we rarely examine in day-to-day development. Today, Xiaolu uses animations to explain the mechanism and principles of HTTP caching from the ground up.
Why is there a cache?

Explaining this purely in computer terms is a bit abstract, so let's look at an everyday example. We usually keep the books we are currently reading on the bookshelf, and keep the finished and unread books in a box. If every book were stored in the box, we would have to dig through it each time we wanted to read, which is tedious and time-consuming (think of the box as the server). When we start a new book, we take it out of the box once, read half of it, and then put it on the bookshelf. The next time we read it, we take it straight from the bookshelf. The bookshelf is the cache (the cache store) discussed below.

The rules of caching

The process from the browser sending a request to the data coming back is like the book retrieval described above. When loading a resource, the browser first checks whether it hits the strong cache, based on the Expires and Cache-Control fields in the response headers stored with the cached copy. If so, it reads the resource directly from the cache without sending a request to the server. If the strong cache is not hit, the browser sends a request to the server, which uses Last-Modified and ETag to check whether the resource hits the negotiated cache. On a hit, the server responds but does not return the resource data, and the browser still reads the resource from the cache. If neither cache is hit, the resource is loaded from the server.

HTTP cache classification

As described above, HTTP caching follows a set of rules and is divided into strong cache and negotiated cache, depending on whether the browser sends a request to the server.

1. Strong Cache

Strong caching means using the cache without sending a request to the server, that is, the local copy is used unconditionally. When the browser needs a particular resource, it first checks whether it exists in the local cache. If it does and has not expired, the browser takes it from the cache. If not, it requests the resource from the server. The original article illustrates this request process with an animation.

So the question is: with strong caching, how do we determine when the cached data has become invalid? When the browser first requests the data, the server returns the data together with the cache rules, in two response header fields: Expires and Cache-Control.

(1) Expires
The Expires field in the response header gives the absolute time at which the returned data expires. On later requests, the browser compares its local time with this value to decide whether the cached resource has expired. The weakness is that the check depends on the client's clock: if the computer's time is changed manually, the result is wrong. This is a known limitation of HTTP/1.0.

(2) Cache-Control

To solve this problem, HTTP/1.1 added the Cache-Control field.
For example, with Cache-Control: max-age=7200 the server tells the client that the resource may be cached for only 7200 seconds, and that during this period it can be read from the cache. If both Expires and Cache-Control are present, Cache-Control takes precedence. Cache-Control also has other directives, such as no-cache, no-store, public, and private.
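The strong-cache check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser implements it: the function name `is_fresh`, the lower-case header keys, and the `stored_at` parameter are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
import re


def is_fresh(headers, stored_at, now):
    """Return True if a cached response may be reused without asking the server.

    headers   -- response headers saved with the cached copy (lower-case keys)
    stored_at -- timezone-aware datetime at which the response was cached
    now       -- timezone-aware current time on the client
    """
    match = re.search(r"max-age=(\d+)", headers.get("cache-control", ""))
    if match:
        # Relative lifetime in seconds; takes precedence over Expires.
        return now - stored_at < timedelta(seconds=int(match.group(1)))

    expires = headers.get("expires")
    if expires:
        try:
            # Absolute timestamp, compared against the client's own clock.
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # An unparseable date is treated as already expired.

    return False  # No freshness information: do not use the strong cache.
```

With `max-age`, the result depends only on how much time has elapsed since the response was stored, whereas the Expires branch compares an absolute time with the local clock, which is where a manually changed computer time causes trouble.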
2. Negotiated Cache

When the browser requests a resource for the first time, the server returns a cache identifier along with the data, and the client stores both in its cache. When it requests the resource again, the client sends the stored identifier to the server, which uses it to decide whether the cached copy is still current. If it is, the server returns a 304 status code with no body, telling the client that the cached data can be used.
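The overall lookup order (strong cache first, then negotiation with the server, then a full download) can be sketched as a small Python function. Everything here is hypothetical scaffolding for illustration: `load_resource`, the `cache_entry` dictionary, and the three callables stand in for browser internals that the article does not describe.

```python
def load_resource(cache_entry, is_fresh, revalidate, fetch):
    """Decide where a resource comes from.

    cache_entry -- dict with a "body" key, or None if nothing is cached
    is_fresh    -- callable(entry) -> bool, the strong-cache check
    revalidate  -- callable(entry) -> (status, body), a conditional request
    fetch       -- callable() -> (status, body), an unconditional request
    """
    if cache_entry is None:
        # Nothing cached yet: load the resource from the server.
        _status, body = fetch()
        return "server", body

    if is_fresh(cache_entry):
        # Strong cache hit: no request is sent at all.
        return "strong cache", cache_entry["body"]

    # Strong cache missed: ask the server whether the copy is still valid.
    status, body = revalidate(cache_entry)
    if status == 304:
        # Negotiated cache hit: the server sent no body, reuse the local copy.
        return "negotiated cache", cache_entry["body"]
    return "server", body
```

For example, `load_resource({"body": b"cached"}, lambda e: False, lambda e: (304, b""), None)` returns `("negotiated cache", b"cached")`: the request reached the server, but the data still came from the local copy.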
How is the negotiated cache identified? Mainly through the Last-Modified, If-Modified-Since, ETag, and If-None-Match header fields.

(1) Last-Modified

The Last-Modified field gives the time the resource was last modified on the server. In its response to the first request, the server can add this header to enable the negotiated cache.
When the browser requests the resource again, it adds an If-Modified-Since header carrying the Last-Modified value it received earlier and sends it to the server.
After receiving the request, the server compares the If-Modified-Since value with the resource's last modification time. If the resource has not been modified since that time, the server returns a 304 response and the browser reads the data from its local cache. If the resource has been modified, the server returns the new content with an updated Last-Modified value in the response headers. However, Last-Modified has limitations in two situations. First, the timestamp has one-second resolution, so a change made within the same second goes undetected. Second, a file's modification time can change even though its content has not, which invalidates the cache unnecessarily.
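The server-side If-Modified-Since comparison can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name `respond_if_modified_since` and its return shape are assumptions for the example, and `last_modified` is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_if_modified_since(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET, honouring If-Modified-Since."""
    # HTTP dates carry whole seconds only, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            if last_modified <= parsedate_to_datetime(since):
                # Not modified since the client's copy: send no body.
                return 304, response_headers, b""
        except (TypeError, ValueError):
            pass  # Unparseable date: ignore the condition and send the resource.

    # Modified (or first request): send the resource with a fresh Last-Modified.
    return 200, response_headers, body
```

The truncation to whole seconds in the first line of the function is also where the one-second resolution limit of Last-Modified comes from: two versions saved within the same second produce the same header value.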
(2) ETag

An ETag (entity tag) is an identifier string for a specific version of a resource. To address the shortcomings of Last-Modified described above, HTTP/1.1 introduced ETag, which is typically generated from the resource's content: whenever the content changes, the tag changes too. The request flow is similar to the one above. On the first request, the server returns the identifier string in the ETag response header.
On the next request, the browser sends the same string back in the If-None-Match request header.
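The ETag exchange can be sketched in Python as follows: the server derives a tag from the content, and on a later request compares the If-None-Match value with the current tag. Hashing the body with SHA-256 is an assumption made for the example, since real servers use a variety of schemes. The function names are hypothetical, and weak validators (the `W/` prefix) are not handled in this sketch.

```python
import hashlib


def make_etag(body):
    """Derive a quoted entity tag from the resource content."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond_if_none_match(request_headers, body):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}

    # If-None-Match may carry several comma-separated tags, or "*".
    header = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in header.split(",")]
    if etag in candidates or "*" in candidates:
        # The client's copy matches the current content: send no body.
        return 304, response_headers, b""

    # Content changed (or first request): send the resource and its new tag.
    return 200, response_headers, body
```

Because the tag depends only on the content, saving a file again without changing it yields the same ETag, and two different versions saved within the same second yield different ones.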
When the server receives the string, it compares it with the resource's current ETag. If they match, the server returns 304 and the browser reads from its local cache. Otherwise, the server returns the new resource to the browser.

Cache location

Caches are consulted in priority order when a resource is requested. The cache locations are as follows:
(1) Memory Cache

The memory cache is the first cache the browser tries to hit and the fastest to respond. However, it is also the shortest-lived: when the process ends or the tab is closed, the cache is gone. Because memory is relatively limited, it is usually smaller resources, such as base64 images, that are placed in the memory cache.

(2) Service Worker

A Service Worker is a JavaScript thread that runs independently of the main thread. It is detached from the browser window and therefore cannot access the DOM directly. It enables offline caching, push messaging, and network proxying.

(3) Disk Cache

Limited memory capacity means that large files cannot be cached in memory, but the disk cache has no such constraint. Although it is slower than the memory cache, it has the advantage in storage capacity and storage duration.

(4) Push Cache

The push cache is the last cache to be checked and is part of HTTP/2. If you are interested, you can read up on it yourself.