How is the ETag value in the HTTP response header generated?

An ETag generation scheme should satisfy the following conditions, at least approximately:

  1. When a file changes, the etag value must change.
  2. It should be cheap to compute and not CPU intensive. Digest algorithms (MD5, SHA-1, SHA-256) should be used with care, because they have to read and hash the entire file.
  3. It must scale horizontally: in a distributed deployment, the ETag values generated on different server nodes must be identical. This rules out schemes based on the file's inode.
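
To illustrate the third condition, here is a minimal sketch of why a tag that mixes in the inode number would differ between nodes. The stat objects and the helpers `inodeBasedTag` and `portableTag` are hypothetical, invented for this illustration; the inode numbers are made-up values standing in for whatever each filesystem happens to assign.

```javascript
// Hypothetical stat results for the same file deployed to two servers.
// Size and mtime match (assuming the deployment preserved the timestamp),
// but each filesystem assigned its own inode number.
const onNodeA = { ino: 1048612, size: 612, mtimeSeconds: 1556014701 };
const onNodeB = { ino: 2231907, size: 612, mtimeSeconds: 1556014701 };

// A tag that includes the inode number (illustrative format only).
function inodeBasedTag(stat) {
  const parts = [stat.ino, stat.size, stat.mtimeSeconds];
  return `"${parts.map((n) => n.toString(16)).join("-")}"`;
}

// A tag built only from values that are identical on every node.
function portableTag(stat) {
  return `"${stat.mtimeSeconds.toString(16)}-${stat.size.toString(16)}"`;
}

console.log(inodeBasedTag(onNodeA) === inodeBasedTag(onNodeB)); // false
console.log(portableTag(onNodeA) === portableTag(onNodeB));     // true
```

Note that the second tag is only consistent if the modification time is the same on every node, which depends on how the files are deployed.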

These conditions describe what makes an ETag valid in theory. How are they handled in practice?

Let's see how nginx does it.

ETag generation in nginx

I read through the nginx source code. In pseudocode, the ETag is a concatenation of last_modified and content_length:

 etag = header.last_modified + header.content_length

The source code is in ngx_http_core_module.c:

 etag->value.len = ngx_sprintf(etag->value.data, "\"%xT-%xO\"",
                               r->headers_out.last_modified_time,
                               r->headers_out.content_length_n)
                   - etag->value.data;

Summary: in nginx, the ETag is the Last-Modified timestamp and the Content-Length of the response, each written in hexadecimal, joined by a hyphen and wrapped in double quotes.
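
The same formatting can be sketched in JavaScript. This is only an illustration of the format string above, assuming the modification time is given in whole seconds and the length in bytes; `nginxStyleEtag` is a hypothetical helper name, not part of nginx.

```javascript
// Mimics the "\"%xT-%xO\"" format: hex mtime in seconds, a hyphen,
// hex content length, all wrapped in double quotes.
function nginxStyleEtag(mtimeSeconds, contentLength) {
  const hexTime = Math.floor(mtimeSeconds).toString(16);
  const hexSize = contentLength.toString(16);
  return `"${hexTime}-${hexSize}"`;
}

// Example values: 23 Apr 2019 10:18:21 GMT and 612 bytes.
const mtime = Date.UTC(2019, 3, 23, 10, 18, 21) / 1000; // months are 0-based
console.log(nginxStyleEtag(mtime, 612)); // "5cbee66d-264"
```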

To test this, I picked an nginx service in my k8s cluster:

 $ curl --head 10.97.109.49
 HTTP/1.1 200 OK
 Server: nginx/1.16.0
 Date: Tue, 10 Dec 2019 06:45:24 GMT
 Content-Type: text/html
 Content-Length: 612
 Last-Modified: Tue, 23 Apr 2019 10:18:21 GMT
 Connection: keep-alive
 ETag: "5cbee66d-264"
 Accept-Ranges: bytes

Decoding Last-Modified and Content-Length back out of the ETag with JavaScript gives results that match the response headers:

 > new Date(parseInt('5cbee66d', 16) * 1000).toJSON()
 "2019-04-23T10:18:21.000Z"
 > parseInt('264', 16)
 612
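
The two steps above can be wrapped into one small function. This is a sketch that assumes the ETag has the nginx shape shown in the response (two hexadecimal fields separated by a hyphen); `parseNginxEtag` is a hypothetical name, and tolerating a leading `W/` weak-validator prefix is an extra assumption of this sketch.

```javascript
// Splits an nginx-style ETag such as "5cbee66d-264" into its two parts.
// Returns null if the value does not have that shape.
function parseNginxEtag(etag) {
  const match = /^(?:W\/)?"([0-9a-f]+)-([0-9a-f]+)"$/i.exec(etag);
  if (match === null) return null;
  return {
    lastModified: new Date(parseInt(match[1], 16) * 1000), // seconds to ms
    contentLength: parseInt(match[2], 16),
  };
}

const parsed = parseNginxEtag('"5cbee66d-264"');
console.log(parsed.lastModified.toUTCString()); // Tue, 23 Apr 2019 10:18:21 GMT
console.log(parsed.contentLength);              // 612
```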

ETag algorithm in Nginx and its shortcomings

Conditional requests (cache revalidation) let the server decide whether it can answer with 304 Not Modified. There are two pairs of headers for this:

  • Last-Modified/If-Modified-Since
  • ETag/If-None-Match
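
As a rough sketch of how a server might evaluate these two header pairs for a GET request: the function name `canReply304` and the shapes of `request` and `current` are made up for this example, and splitting If-None-Match on commas is a simplification that assumes the tags themselves contain no commas. When both headers are present, If-None-Match takes precedence and If-Modified-Since is ignored.

```javascript
// Decides whether a 304 Not Modified can be sent.
// `request` maps lower-cased request header names to values; `current`
// describes the resource as the server sees it now (hypothetical shapes).
function canReply304(request, current) {
  const ifNoneMatch = request["if-none-match"];
  if (ifNoneMatch !== undefined) {
    if (ifNoneMatch.trim() === "*") return true;
    // Weak comparison: ignore a leading W/ on either side.
    const strip = (tag) => tag.trim().replace(/^W\//, "");
    return ifNoneMatch
      .split(",")
      .some((tag) => strip(tag) === strip(current.etag));
  }
  const ifModifiedSince = request["if-modified-since"];
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    if (Number.isNaN(since)) return false; // unparseable date: ignore it
    return current.lastModifiedMs <= since;
  }
  return false; // no validator sent, so send the full response
}

const current = { etag: '"5cbee66d-264"', lastModifiedMs: 1556014701000 };
const date = "Tue, 23 Apr 2019 10:18:21 GMT";
console.log(canReply304({ "if-none-match": '"5cbee66d-264"' }, current)); // true
console.log(canReply304({ "if-modified-since": date }, current));         // true
console.log(canReply304(
  { "if-none-match": '"deadbeef-1"', "if-modified-since": date }, current
)); // false, because If-None-Match wins
```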

Since the ETag in nginx is built from Last-Modified and Content-Length, it can be seen as an enhanced version of Last-Modified. So what does it enhance?

Last-Modified has a resolution of one second (nginx keeps it as a Unix timestamp in seconds), so on its own it cannot distinguish two changes made within the same second. The ETag in nginx adds the file size as an additional condition.

The next question is: if the ETag value in the HTTP response header changes, does that mean the file content has definitely changed?

Answer: No. The ETag also changes when only the modification time changes, even if the content is byte-for-byte identical.

The opposite failure is possible too, which limits how reliably nginx can decide on a 304: if a file is modified again within the same second and its size stays the same, the ETag does not change and a client may keep a stale copy. The probability of this happening is extremely low, though, so under normal circumstances an imperfect but efficient algorithm can be tolerated.
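
Both cases can be reproduced with a small sketch. `timeSizeTag` is a hypothetical helper that follows the mtime-plus-length scheme described in this article, and `digestTag` is a content-hash alternative included only for contrast; neither is nginx code, and the CSS strings and timestamp are made-up sample data.

```javascript
const { createHash } = require("node:crypto");

// Tag in the style described above: hex mtime (seconds) + hex byte length.
function timeSizeTag(mtimeSeconds, body) {
  const size = Buffer.byteLength(body);
  return `"${mtimeSeconds.toString(16)}-${size.toString(16)}"`;
}

// Content-based alternative: exact, but needs a full pass over the body.
function digestTag(body) {
  return `"${createHash("sha256").update(body).digest("hex")}"`;
}

const second = 1556014701;
const v1 = "color: red;";
const v2 = "color: tan;"; // same length, written within the same second

// Content changed, tag did not: the change goes unnoticed.
console.log(timeSizeTag(second, v1) === timeSizeTag(second, v2)); // true
console.log(digestTag(v1) === digestTag(v2));                     // false

// Content unchanged, mtime moved (for example after a redeploy): tag changes.
console.log(timeSizeTag(second, v1) === timeSizeTag(second + 60, v1)); // false
console.log(digestTag(v1) === digestTag(v1));                          // true
```

The digest avoids both problems at the cost of reading every byte, which is exactly the trade-off the second condition at the start of the article warns about.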

The article comes from: Front-end Restaurant. If you wish to reprint this article, please contact the Front-end Restaurant ReTech Toutiao account.

github: https://github.com/zuopf769
