Birth

When talking about HTTP, we must first understand the World Wide Web, abbreviated as WWW. The WWW is a technical integration built on the client-server model: "using hyperlinks to jump between sites" and "transmitting Hypertext Markup Language (HTML)". On a midsummer night in 1989, Tim Berners-Lee successfully developed the world's first Web server and the first Web client; at the time, all it could serve was an electronic phone book. HTTP (HyperText Transfer Protocol) is the basic protocol of the World Wide Web, establishing the communication rules between browsers and servers. Commonly used networks (including the Internet) operate on the TCP/IP protocol suite, and HTTP is an application-layer protocol within it. HTTP has continued to gain functionality, evolving from HTTP/0.9 to HTTP/3.

HTTP/0.9

HTTP was not a standard when it first appeared; it was only formally standardized with HTTP/1.0, announced in 1996. The protocol before that is therefore retroactively called HTTP/0.9. A request consists of a single line: the GET command followed by the resource path.

GET /index.html

The response contains only the file contents itself:

<html>

HTTP/0.9 has no concept of headers or Content-Type, and can only deliver HTML files. Since there is no status code either, errors are handled by sending back an HTML file containing an error description.

HTTP/1.0

With the rapid development of Internet technology, the HTTP protocol was used more and more widely, and the limitations of the protocol itself could no longer meet the diversity of Internet functions. Therefore, in May 1996, HTTP/1.0 was born, greatly expanding the protocol's content and functionality. Compared with HTTP/0.9, the new version added the following features:
GET /index.html HTTP/1.0

Content-Type

Simple text pages naturally could not meet users' needs, so 1.0 added support for more file types:
It can also be specified in HTML:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Content-Encoding

Since any data format can be sent, the data can be compressed before sending. HTTP/1.0 introduced Content-Encoding to indicate how the data is compressed.
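As a sketch of this negotiation (the header values come from this article; the helper name is invented for illustration), a server might pick an encoding from the client's list and label the response accordingly:

```python
import gzip

def pick_encoding(accept_encoding: str) -> str:
    # toy negotiation: prefer gzip, then deflate, else send uncompressed
    offered = [e.strip() for e in accept_encoding.split(",")]
    for candidate in ("gzip", "deflate"):
        if candidate in offered:
            return candidate
    return "identity"

body = b"<html>hello world</html>"
encoding = pick_encoding("gzip, deflate")      # the client's Accept-Encoding
compressed = gzip.compress(body)               # server applies the encoding
response_headers = {"Content-Encoding": encoding}

# the client reverses the transformation named by Content-Encoding
assert response_headers["Content-Encoding"] == "gzip"
assert gzip.decompress(compressed) == body
```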
The client sends a request declaring "I can accept both gzip and deflate compression":

Accept-Encoding: gzip, deflate

The server states the compression actually used in the Content-Encoding response header:

Content-Encoding: gzip

Disadvantages of HTTP/1.0
HTTP/1.1

HTTP/1.1 was released just a few months after HTTP/1.0 was announced. It has been the mainstream version ever since, so much so that no new HTTP version was released for the following ten years. Compared with the previous version, the main updates are as follows:
Keep-Alive

Establishing a connection requires DNS resolution and the TCP three-way handshake, so repeatedly setting up and tearing down connections to fetch resources from the same server wastes significant time and resources. To improve connection efficiency, HTTP/1.1 added persistent (keep-alive) connections to the standard and made them the default behavior; the server likewise maintains the client's persistent connection in accordance with the protocol, so multiple resources on a server can be fetched over one connection with multiple requests. The following request header tells the server not to close the connection after completing a request:

Connection: keep-alive

The server replies with the same header to indicate that the connection is still valid. In the HTTP/1.0 era this was merely a custom convention among programmers and not part of the standard. The improvement nearly doubles communication efficiency and lays the foundation for pipelining.

Pipelining

HTTP/1.1 attempted to address the performance bottleneck with HTTP pipelining. Instead of issuing the next request only after each response returns, multiple HTTP requests can be sent back-to-back on one connection without waiting for responses. Unfortunately, because responses must be returned in order, a large or slow response still blocks all subsequent ones: a request cannot tell which response belongs to it, so the server can only return responses in sequence. This is head-of-line blocking, and it is why mainstream browsers disable pipelining by default. The problem is addressed in HTTP/2.

Host header field

HTTP/1.0 assumed that each server was bound to a unique IP address.
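A minimal sketch of connection reuse with Python's standard library (the local server and paths are invented for illustration): two requests travel over a single TCP connection, and the underlying socket object stays the same, so no second handshake occurs.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # HTTP/1.1: keep-alive by default

    def do_GET(self):
        body = b"ok:" + self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):     # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/a")
first = conn.getresponse().read()
sock = conn.sock                           # the underlying TCP socket
conn.request("GET", "/b")                  # reuses the live connection
second = conn.getresponse().read()
same_socket = conn.sock is sock            # True: no new handshake happened

conn.close()
server.shutdown()
```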
Therefore, the URL in the request message did not carry a hostname. The Host field added in 1.1 handles multiple virtual hosts on one IP address: a new Host field in the request header specifies the domain name of the server, so different websites can be served from the same machine. This also laid the groundwork for the later development of virtualization.

Host: www.alibaba-inc.com

Cache mechanism

Caching not only improves users' access speed, but also saves significant traffic for users on mobile devices. HTTP/1.1 therefore added many new cache-related header fields and designed a more flexible and richer cache mechanism around them. The problems the cache mechanism needs to solve include:
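A toy sketch of name-based virtual hosting (the site names are invented): the server picks which site to serve purely from the Host header, so many sites share one IP address.

```python
def pick_site(host_header: str, sites: dict) -> str:
    # Host may carry a port ("example.com:8080"); hostnames compare
    # case-insensitively
    name = host_header.split(":")[0].lower()
    return sites.get(name, "default-site")

sites = {
    "www.alibaba-inc.com": "corporate-site",
    "blog.example.com": "blog-site",       # a second vhost on the same IP
}
assert pick_site("www.alibaba-inc.com", sites) == "corporate-site"
assert pick_site("BLOG.EXAMPLE.COM:8080", sites) == "blog-site"
assert pick_site("unknown.example.org", sites) == "default-site"
```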
Chunked transfer

Once a connection is established, the client can send multiple requests over it. The client usually uses the Content-Length returned in the response header to determine the size of the data the server returns. But as dynamic resources became more common, the server often cannot know a resource's size before transmission and so cannot announce it via Content-Length; instead it generates the resource on the fly while streaming it to the user. This mechanism is called "chunked transfer encoding": it lets the server send the data to the client in multiple parts. The server adds the Transfer-Encoding: chunked header field to replace the traditional Content-Length:

Transfer-Encoding: chunked

HTTP caching mechanism

Compared with HTTP/1.0, HTTP/1.1 adds several new caching mechanisms.

Strong cache

The strong cache is checked first by the browser and is the fastest. When we see "(from memory cache)" after the status code, the browser read the cache from memory; when the process ends, i.e. the tab is closed, that in-memory data no longer exists. Only when the strong cache misses does the browser fall back to the negotiated cache.

Pragma

The Pragma header field is a product of HTTP/1.0 and is defined today only for backward compatibility with HTTP/1.0. It now appears only in request headers, telling all intermediate servers not to return cached resources; it has the same meaning as Cache-Control: no-cache.

Pragma: no-cache

Expires

Expires appears only in the response header and indicates the resource's expiry time. When a request occurs, the browser compares the Expires value with local time; if the local time is before the set time, the cache is read.
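The chunked wire format can be decoded with a short loop: each chunk is a hexadecimal size line, CRLF, the chunk bytes, CRLF, and a zero-size chunk terminates the body. A minimal sketch (it ignores trailers and chunk extensions):

```python
def decode_chunked(raw: bytes) -> bytes:
    body = b""
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        size = int(raw[pos:line_end], 16)    # chunk size is hexadecimal
        if size == 0:                        # "0\r\n\r\n" ends the body
            break
        start = line_end + 2
        body += raw[start:start + size]
        pos = start + size + 2               # skip the chunk's trailing CRLF
    return body

wire = b"4\r\nWiki\r\n6\r\npedia \r\nE\r\nin \r\n\r\nchunks.\r\n0\r\n\r\n"
assert decode_chunked(wire) == b"Wikipedia in \r\n\r\nchunks."
```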
Expires values use the standard GMT format:

Expires: Wed, 21 Oct 2015 07:28:00 GMT

Note: when both Cache-Control: max-age=xx and Expires are present in the header, Cache-Control: max-age takes precedence.

Cache-Control

Because of the limitations of Expires, Cache-Control came into play. Some commonly used directives:
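The precedence rule can be sketched as a freshness check (the function name and header dictionary are illustrative): Cache-Control: max-age is consulted first, and Expires only when max-age is absent.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(headers: dict, age_seconds: int) -> bool:
    """Toy freshness check: Cache-Control max-age beats Expires."""
    for directive in headers.get("Cache-Control", "").split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return age_seconds < int(directive[len("max-age="):])
    if "Expires" in headers:
        # Expires is an absolute GMT timestamp
        return datetime.now(timezone.utc) < parsedate_to_datetime(headers["Expires"])
    return False

headers = {
    "Cache-Control": "max-age=60",
    "Expires": "Wed, 21 Oct 2015 07:28:00 GMT",   # long past, but ignored
}
assert is_fresh(headers, age_seconds=10)           # max-age wins: still fresh
assert not is_fresh({"Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}, 10)
```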
Negotiated cache

When the browser misses the strong cache, it falls back to the negotiated cache, which is controlled by the following HTTP fields.

Last-Modified

When the server delivers a resource to the client, it returns the resource's last modification time in the entity header in the form Last-Modified: GMT.

Last-Modified: Fri, 22 Jul 2019 01:47:00 GMT

The client records this value and, the next time it requests the resource, sends the time back to the server for checking. If the value matches the one on the server, the server returns 304, meaning the file has not been modified; if the times differ, the server re-sends the resource and returns 200.

Priority

Strong cache before negotiated cache; within them: Cache-Control, then Expires, then ETag, then Last-Modified.

Five new request methods

OPTIONS: a preflight request made by the browser to determine the safety of a cross-origin resource request.
PUT: data sent from the client to the server replaces the contents of the specified document.
DELETE: requests that the server delete the specified page.
TRACE: echoes back the request received by the server, mainly for testing or diagnosis.
CONNECT: reserved in HTTP/1.1 for proxy servers that can switch the connection to tunnel mode.

A series of new status codes

You can refer to the complete list of status codes.

Defects of HTTP/1.1
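A toy server-side sketch of the Last-Modified round trip (the helper name is invented; real servers also compare ETag and parse dates properly):

```python
def conditional_get(resource_mtime, if_modified_since):
    # 304 tells the client its cached copy is still valid; no body is sent
    if if_modified_since == resource_mtime:
        return 304, b""
    return 200, b"<html>fresh content</html>"

mtime = "Fri, 22 Jul 2019 01:47:00 GMT"

status, body = conditional_get(mtime, mtime)     # cached copy still valid
assert (status, body) == (304, b"")

status, body = conditional_get(mtime, "Mon, 01 Jul 2019 00:00:00 GMT")
assert status == 200 and body.startswith(b"<html>")
```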
HTTP/2.0

Web pages have become more complex over time; some are applications in themselves. More visual media is displayed, and the number and size of scripts adding interactivity have grown. More data is transmitted through more HTTP requests, which brings more complexity and overhead to HTTP/1.1 connections. For this reason, Google implemented the experimental SPDY protocol in the early 2010s. Given SPDY's success, HTTP/2 adopted it as the blueprint for the whole solution. HTTP/2 was officially standardized in May 2015. Differences between HTTP/2 and HTTP/1.1:
Header compression

HTTP/1.x headers carry a lot of information and must be re-sent with every message. HPACK, tailor-made for HTTP/2, extends the idea of indexing: it defines commonly used HTTP headers in an index table, and both communicating parties cache a header field table, which avoids transmitting duplicate headers and reduces the transmitted size. Although the wire format looks completely different from HTTP/1.x, HTTP/2 does not change HTTP/1.x semantics: it simply re-encapsulates the original HTTP/1.x headers and body in frames.

Multiplexing

To solve the head-of-line blocking problem of HTTP/1.x, HTTP/2 introduces multiplexing: each request/response is treated as a stream, and a stream is divided into frames of several types according to the payload (header frames, data frames, and so on). Frames belonging to different streams can be interleaved on the same connection, realizing multiple simultaneous requests. With multiplexing, head-of-line blocking at the HTTP layer is no longer a problem; however, head-of-line blocking at the TCP layer still exists.

Server push

The server can proactively send messages to the client. When the browser requests just the HTML, the server can proactively push related static resources such as JS and CSS files to the client, so the client can load them locally without further network requests, saving the browser round-trips. Using server push:

Link: </css/styles.css>; rel=preload; as=style

In the browser's network panel, the Initiator column shows "Push" for such resources, indicating they were proactively pushed by the server. Proactively pushed files inevitably include redundant files or files the browser already has.
The client can use a concise cache digest to tell the server what is already in its cache, so the server also knows what the client actually needs.

Streams

Within an HTTP/2 connection, the server and client exchange frames over independent, bidirectional streams. HTTP/2 virtualizes multiple streams on a single TCP connection; the streams share that connection in order to use the transmission link rationally and optimize transmission performance within limited resources. All communication happens over a single TCP connection, which can carry a large number of bidirectional streams. Each stream has a unique identifier and a priority. Each message is a logical request or response consisting of one or more frames, and frames from different streams can be associated and reassembled using the stream identifier in the frame header. The concept of the stream exists to achieve multiplexing: transmitting data for multiple business units simultaneously over a single connection.

Binary framing layer

In HTTP/1.x, users open multiple TCP connections to improve performance, which can cause head-of-line blocking and instability of important TCP connections. The binary framing layer in HTTP/2 splits request and response data into smaller frames encoded in binary (HTTP/1.x is text-based). Frames can be sent out of order and reassembled according to the stream identifier carried in each frame header. This is naturally friendly to binary computers: there is no need to convert received plaintext messages; the binary message can be parsed directly, further improving transmission efficiency. By analogy, each frame is a student and a stream is a group (the stream identifier is an attribute of the frame): students in a class (a connection) are divided into several groups, and each group is assigned a different task.
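The frame/stream reassembly described here can be sketched with toy frames (tuples stand in for HTTP/2's binary frame format; the payloads are invented):

```python
import random

# toy frames: (stream_id, sequence_no, payload)
frames = [
    (1, 0, "GET /index"), (1, 1, ".html"),
    (3, 0, "GET /style"), (3, 1, ".css"),
]
random.shuffle(frames)      # frames from different streams interleave freely

# the receiver groups frames by the stream id carried in each frame
streams = {}
for stream_id, seq, payload in frames:
    streams.setdefault(stream_id, []).append((seq, payload))

# each stream reassembles independently, regardless of arrival order
reassembled = {sid: "".join(p for _, p in sorted(chunks))
               for sid, chunks in streams.items()}
assert reassembled == {1: "GET /index.html", 3: "GET /style.css"}
```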
Multiple group tasks can run in parallel in the class at the same time, and a time-consuming group task does not hold up the normal execution of the other groups. Finally, consider the improvements HTTP/2 brings under ideal conditions.

Shortcomings
HTTP/3.0 (HTTP-over-QUIC)

Solving head-of-line blocking within the constraints of TCP is quite difficult, yet the explosive growth of the Internet demands ever higher stability and security. In November 2016, the Internet Engineering Task Force (IETF) held the first working group meeting for QUIC (Quick UDP Internet Connections), a Google-initiated effort to develop a low-latency transport protocol based on UDP. HTTP-over-QUIC was renamed HTTP/3 in November 2018.

0-RTT handshake

In TCP, the client sends a SYN packet (SYN, seq = x) to the server; the server responds with (SYN, seq = y; ACK x+1); the client then sends an ACK (seq = x+1, ack = y+1). At this point both sides enter the ESTABLISHED state and the three-way handshake is complete.

1-RTT
(sequence diagram of the 1-RTT handshake omitted)

The key here is the ECDH algorithm. a and b are the private keys of the client and server and are never made public. Even knowing A and X, a cannot be derived from the formula A = a*X, which keeps the private key secure.

0-RTT

0-RTT means the client caches the ServerConfig (B = b*X) and, the next time a connection is established, uses the cached data to compute the communication key directly:

(sequence diagram of the 0-RTT handshake omitted)
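The commutativity that makes this work can be illustrated with plain integers standing in for elliptic-curve scalar multiplication (this only shows the algebra and is NOT secure; real QUIC/TLS key exchange uses curve points):

```python
# plain integer multiplication stands in for EC scalar multiplication
X = 5            # public base point
a = 1234         # client private key (never sent)
b = 5678         # server private key (never sent)

A = a * X        # client's public value, sent to the server
B = b * X        # server's public value (the cached ServerConfig in 0-RTT)

client_key = a * B     # client combines its secret with the server's public value
server_key = b * A     # server performs the mirror computation
assert client_key == server_key == 35033260   # both derive the same secret
```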
With the cached B the client can generate the key and send application data immediately, without going through the handshake. Consider another question: suppose an attacker records all communication data and the public parameters A1 and A2. If the server's random number b (its private key) ever leaks, all previously recorded communication can be decrypted. To prevent this, a new communication key must be created for each session, ensuring forward secrecy.

Orderly delivery

QUIC is built on UDP, an unreliable transport protocol. Each QUIC packet carries an offset field; the receiver sorts packets that arrive out of order by offset, guaranteeing ordered delivery.

(sequence diagram of offset-based reassembly omitted)

Head-of-line blocking

HTTP/2 suffers head-of-line blocking at the TCP layer because all request streams share one sliding window. QUIC gives each request stream an independent sliding window, so packet loss on request stream A does not affect data transmission on request stream B. Within a single stream, however, head-of-line blocking still exists: QUIC removes it at the connection level, not at the level of an individual stream. This is the "multiplexing without head-of-line blocking" that QUIC advertises.

Connection migration

Connection migration means that when the client switches networks, its connection to the server is not dropped and communication continues normally. This is impossible for TCP, because a TCP connection is identified by a 4-tuple (source IP, source port, destination IP, destination port): if any element changes, the connection must be re-established. A QUIC connection, by contrast, is identified by a 64-bit Connection ID. Switching networks does not change the Connection ID, so the connection logically remains intact.
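A toy sketch of why migration works (the packet format is invented): the server keys its session state on the Connection ID, so the source address can change freely mid-session.

```python
# the server indexes sessions by Connection ID, not by (IP, port)
sessions = {}

def receive(packet):
    cid = packet["cid"]                    # 64-bit Connection ID
    sessions.setdefault(cid, []).append(packet["num"])
    return sorted(sessions[cid])           # packets seen for this session

receive({"cid": 0xABCD, "src": ("IP1", 5000), "num": 1})
receive({"cid": 0xABCD, "src": ("IP1", 5000), "num": 2})
receive({"cid": 0xABCD, "src": ("IP2", 6000), "num": 3})  # network switched
state = receive({"cid": 0xABCD, "src": ("IP2", 6000), "num": 4})

assert state == [1, 2, 3, 4]   # one logical connection across both addresses
assert len(sessions) == 1      # the server never created a second session
```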
Assume the client first uses IP1 to send packets 1 and 2, then switches networks, changing its IP to IP2, and sends packets 3 and 4. The server can tell from the Connection ID field in the packet headers that all four packets come from the same client. The fundamental reason QUIC can achieve connection migration is that the underlying UDP protocol is connectionless. Finally, let's take a look at the evolution of HTTP.