This article is reprinted from the WeChat public account "Front-end Log" (author: Meng Sixing).

After the birth of the World Wide Web in 1989, HTTP quickly became the world's dominant application-layer protocol. Today, HTTP is used to some extent in almost every networked scenario. Over its 30-year history, the protocol has evolved significantly, and more major changes are in the works. These evolutions have made the protocol more expressive, better performing, and better able to meet ever-changing application needs. This article reviews the history of HTTP and looks ahead to its future.

- HTTP/0.9
- HTTP/1.0
- HTTP/1.1
- HTTP/2
- HTTP/3
HTTP/0.9

HTTP/0.9 was born in 1991. It is the first version of the HTTP protocol, and its structure is extremely simple:

- The request side only supports the GET method
- The response side can only return HTML text data

```
GET /index.html

<html>
  <body>
    Hello World
  </body>
</html>
```
The request diagram is as follows:

[Figure: HTTP/0.9 request flow]

As you can see, HTTP/0.9 could only send GET requests, each request created a separate TCP connection, the responder could only return HTML data, and the TCP connection was closed as soon as the response completed. Although this was enough for the needs of the time, it exposed some problems.

HTTP/0.9 pain points:

- There is only one request method and one response format
- TCP connections cannot be reused
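To make the exchange above concrete, here is a minimal sketch of an HTTP/0.9-style request over a raw TCP socket. It is illustrative only: the host is a placeholder, and real servers today no longer answer HTTP/0.9 requests.

```python
import socket

# Open a fresh TCP connection (HTTP/0.9 needs one per request).
sock = socket.create_connection(("example.com", 80))  # placeholder host
sock.sendall(b"GET /index.html\r\n")  # the entire request: no version, no headers

# The response has no status line and no headers either: the server
# streams raw HTML and signals completion by closing the socket.
chunks = []
while True:
    data = sock.recv(4096)
    if not data:  # connection closed by the server = end of response
        break
    chunks.append(data)
sock.close()
print(b"".join(chunks).decode("latin-1"))
```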
HTTP/1.0

HTTP/1.0 was born in 1996. It added HTTP header fields on top of HTTP/0.9, greatly expanding the scenarios where HTTP can be used. This version can transmit not only text but also images, videos, and binary files, laying a solid foundation for the rapid development of the Internet. The core features are as follows:

- The request side adds the HTTP protocol version; the response side adds a status code.
- POST and HEAD are added as request methods.
- Both the request side and the response side gain header fields:
  - Content-Type lets the response carry more than just hypertext.
  - Expires and Last-Modified support caching.
  - Authorization carries identity credentials.
  - Connection: keep-alive supports persistent connections, but is non-standard.
```
GET /mypage.html HTTP/1.0
User-Agent: NCSA_Mosaic/2.0 (Windows 3.1)

HTTP/1.0 200 OK
Date: Tue, 15 Nov 1994 08:12:31 GMT
Server: CERN/3.0 libwww/2.17
Content-Type: text/html

<html>
  <body>
    Hello World
  </body>
</html>
```
The request diagram is as follows:

[Figure: HTTP/1.0 request flow]

As you can see, HTTP/1.0 extends the request methods and response status codes and supports defining HTTP header fields. Through the Content-Type header, data of any format can be transmitted. At the same time, HTTP/1.0 still uses one TCP connection per request, so connections cannot be multiplexed.

HTTP/1.0 pain points:

- TCP connections cannot be reused.
- HTTP head-of-line blocking: the next HTTP request can only be sent after the response to the previous one has completed.
- A server can only host one HTTP service, since there is no way to distinguish virtual hosts sharing one IP address.
HTTP/1.1

HTTP/1.1 was born in 1999. It further refined the HTTP protocol and, more than 20 years later, is still the most widely used HTTP version. The core features are as follows:

- Persistent connections.
  - HTTP/1.1 enables persistent connections by default: the TCP connection is not closed immediately after a response, so multiple HTTP requests can reuse it (see the sketch after this list).
- Pipelining.
  - Multiple HTTP requests no longer need to wait for one another and can be sent in a batch. However, the responses must still come back in the order in which the requests were sent, so pipelining only solves half of the head-of-line blocking problem; the experience is still not ideal.
- Chunked responses.
  - HTTP/1.1 supports streaming: the responder does not need to return all the data at once but can split it into chunks and send each chunk as it is produced. The client can process data as it arrives, reducing response latency and shortening white-screen time.
  - This is enabled through the Transfer-Encoding header, and implementations such as BigPipe are built on this feature.
- The Host header.
  - HTTP/1.1 introduces virtual hosting: one server can be divided into several logical hosts, so multiple websites can be deployed on a single machine.
  - By configuring Host: <domain>:<port>, a single server can provide multiple HTTP services.
- Other extensions.
  - Cache-Control and ETag cache headers.
  - New request methods: PUT, DELETE, OPTIONS, TRACE, and CONNECT.
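As a simple illustration of persistent connections, the sketch below issues two sequential requests over a single reused TCP connection using Python's standard http.client module (the host and paths are placeholders):

```python
import http.client

# One HTTPConnection object = one underlying TCP connection.
conn = http.client.HTTPConnection("example.org")  # placeholder host

conn.request("GET", "/page1")
resp1 = conn.getresponse()
body1 = resp1.read()  # the body must be fully read before the socket can be reused

# The second request rides on the same TCP connection thanks to keep-alive.
conn.request("GET", "/page2")
resp2 = conn.getresponse()
body2 = resp2.read()

conn.close()
```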
```
GET /en-US/docs/Glossary/Simple_header HTTP/1.1
Host: developer.mozilla.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://developer.mozilla.org/en-US/docs/Glossary/Simple_header

HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
Date: Wed, 20 Jul 2016 10:55:30 GMT
Etag: "547fa7e369ef56031dd3bff2ace9fc0832eb251a"
Keep-Alive: timeout=5, max=1000
Last-Modified: Tue, 19 Jul 2016 00:59:33 GMT
Server: Apache
Transfer-Encoding: chunked
Vary: Cookie, Accept-Encoding

<html>
  <body>
    Hello World
  </body>
</html>
```
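Note the Transfer-Encoding: chunked header in the response above. On the wire, each chunk is prefixed with its length in hexadecimal, and a zero-length chunk terminates the body. The following sketch builds that framing by hand, purely to illustrate the format:

```python
def encode_chunked(parts):
    """Illustrative encoder for the HTTP/1.1 chunked wire format:
    '<hex length>\\r\\n<data>\\r\\n' per chunk, ended by a zero-length chunk."""
    out = b""
    for part in parts:
        out += f"{len(part):X}\r\n".encode() + part + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating zero-length chunk

print(encode_chunked([b"Hello ", b"World"]))
# b'6\r\nHello \r\n5\r\nWorld\r\n0\r\n\r\n'
```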
The request diagram is as follows:

[Figure: HTTP/1.1 request flow]

As you can see, HTTP/1.1 can send multiple requests over the same reused TCP connection, which improves transmission efficiency; however, the responses can only be returned in the order the requests were sent. For this reason, many browsers open up to 6 connections per domain name, adding queues to reduce HTTP head-of-line blocking.

HTTP/1.1 pain points:

- HTTP head-of-line blocking is not completely resolved. The responder must reply in the order in which the requests were sent, so if an early response is particularly slow, every response behind it is blocked.
HTTP/2

HTTP/2 was born in 2015. Its defining characteristic is that everything is binary ("all in binary"), and it builds on this binary framing to deeply optimize HTTP transmission efficiency. HTTP/2 splits an HTTP exchange into three concepts:

- Frame: a piece of binary data, the smallest unit of HTTP/2 transmission.
- Message: one or more frames corresponding to a request or response.
- Stream: a bidirectional flow of bytes within an established connection that can carry one or more messages.

[Figure: HTTP/2 streams, messages, and frames]

As the figure shows, one TCP connection carries multiple streams, a stream carries bidirectional messages, and a message consists of multiple frames. Each frame header carries an identifier pointing to the stream it belongs to, so frames from different streams can be sent interleaved and reassembled by stream identifier on arrival. The core features of HTTP/2 are as follows:

- Request priority.
  - When multiple HTTP requests are sent simultaneously, multiple streams are created. Each stream carries a priority mark, which the server can use to decide the order of responses.
- Multiplexing.
  - Responses no longer have to follow the order in which requests were sent; messages can be interleaved on the wire. The receiver finds the corresponding stream from the identifier in each frame header and reassembles the frames to obtain the final data.
- Server push.
  - HTTP/2 allows the server to proactively send resources to the client before they are requested and have them cached on the client, avoiding a second round trip.
  - With HTTP/1.1, the browser first requests a page, parses the returned HTML, and only when it encounters a tag like <script src="xxxx.js"> does it issue another HTTP request for the script. With HTTP/2, the server can push the required JS, CSS, and other resources along with the HTML, so the browser does not need to issue new requests while parsing.
- Header compression (HPACK).
  - HTTP/1.1 headers carry a lot of information and must be sent with every request, costing many bytes.
  - In HTTP/2, each side caches a header field table, e.g. storing Content-Type: text/html at an index; if that header is needed again later, only the index number is sent.
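As a rough illustration, the sketch below uses the third-party Python httpx library (with its optional HTTP/2 support, installed via `pip install 'httpx[h2]'`) to issue several requests over a single multiplexed HTTP/2 connection; the URL and paths are placeholders for any HTTPS site with HTTP/2 enabled:

```python
import httpx

# One Client = one connection pool; with http2=True the requests below
# share a single multiplexed HTTP/2 connection instead of 6 TCP ones.
with httpx.Client(http2=True) as client:
    for path in ("/a.css", "/b.js", "/c.png"):  # placeholder paths
        resp = client.get(f"https://example.com{path}")
        print(resp.http_version, resp.status_code)  # "HTTP/2" once negotiated
```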
In addition, although HTTP/2 does not require TLS, all web browsers that implement HTTP/2 only support sites served over TLS, to encourage use of the more secure HTTPS. The request diagram is as follows:

[Figure: HTTP/2 request flow]

As you can see, HTTP/2 requests need no queueing on either send or receive, which completely solves HTTP head-of-line blocking; header overhead and resource caching are optimized as well, so it looks like a perfect solution. HTTP/2 pushes the HTTP + TCP architecture to its limit; any further optimization must go below that architecture, starting with TCP itself, whose core concern is transport-layer reliability rather than transmission efficiency:

- TCP has its own head-of-line blocking. TCP uses sequence numbers to order data in transit; once a segment is lost, all subsequent data must wait for that segment to be retransmitted before it can be processed.
- Establishing a TCP connection takes a three-way handshake and releasing it takes a four-way handshake, which silently adds to transmission time.
- TCP congestion control, with built-in algorithms such as slow start and congestion avoidance, makes transmission efficiency unstable.
If you want to solve these problems, you have to replace TCP, which is exactly the route HTTP/3 takes. Let's continue.

HTTP/3

HTTP/3 is still at the draft stage. Its defining change is at the transport layer: it uses QUIC instead of TCP, sidestepping TCP's transmission inefficiencies entirely. QUIC is a multiplexed transport protocol built on UDP, proposed by Google. It does not need TCP's three-way handshake, and it implements at the application layer the reliability of TCP, the security of TLS, and the concurrency of HTTP/2. In terms of deployment, only the client and server applications need to support QUIC; no changes are required to operating systems or intermediate network devices. The core features of HTTP/3 are as follows:

- Faster transport-layer connection establishment.
  - HTTP/3 is built on QUIC, which supports 0-RTT connection resumption (a first-time connection still costs 1 RTT), whereas TCP plus TLS typically needs around 3 RTTs before the first request can be sent.

[Figure: HTTPS vs. QUIC connection establishment]

- Transport-layer multiplexing.

[Figure: QUIC multiplexing]

  - The streams in the figure above are independent of one another: if Stream2 loses a packet, Stream3 and Stream4 can still be read normally.
  - The HTTP/3 transport layer splits data into multiple packets; each packet can be sent independently and interleaved rather than in strict sequence, avoiding TCP head-of-line blocking.
- Improved congestion control.
  - Monotonically increasing packet numbers. In TCP, each segment carries a sequence number (seq). If the receiver times out waiting for a segment, the sender retransmits it with the same seq, so if the original, delayed segment also arrives, there is no way to tell the late copy from the retransmission. In QUIC, each packet's identifier (packet number) increases monotonically: a retransmission always carries a larger packet number than the original, so the two can be told apart.
  - No reneging. In TCP, a receiver that is short on memory or whose buffer overflows may discard packets it has already acknowledged (reneging), which severely interferes with retransmission. QUIC explicitly forbids this: once a packet is acknowledged, it is guaranteed to have been received correctly.
  - More ACK blocks. A receiver normally acknowledges data it has received with ACKs, but ACKing every single packet is inefficient, so acknowledgments usually cover several packets at once. TCP acknowledges roughly every 3 packets, while a QUIC acknowledgment can carry up to 256 ACK blocks. On networks with a high packet loss rate, more ACK blocks reduce the amount of retransmission and improve efficiency.
  - ACK delay. TCP's RTT calculation does not account for the time the receiver spends processing data before acknowledging it, shown as the ACK delay in the figure below. QUIC factors this delay in, making its RTT estimates more accurate.

[Figure: ACK delay]
- Optimized flow control.
  - TCP controls flow through a sliding window. If a packet is lost, the window cannot slide past it; it stays stuck at the loss point until the data is retransmitted.
  - The core idea of QUIC flow control is twofold: do not send more than the responder can handle overall, and do not let any single stream occupy so many resources that other streams are starved. QUIC therefore enforces flow control at two levels, the connection level and the stream level (see the sketch after this item).
  - At the stream level: receive window = maximum receive window - data received.
  - At the connection level: receive window = Stream1 receive window + Stream2 receive window + ... + StreamN receive window.
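The two formulas above are easy to express in code. This is a toy bookkeeping sketch; the class names and window sizes are invented for illustration, not drawn from any real QUIC implementation:

```python
# Toy model of QUIC's two-level flow control accounting.
class Stream:
    def __init__(self, max_window: int):
        self.max_window = max_window  # maximum receive window for this stream
        self.received = 0             # bytes received so far

    def receive_window(self) -> int:
        # stream level: receive window = maximum receive window - data received
        return self.max_window - self.received

class Connection:
    def __init__(self, streams):
        self.streams = streams

    def receive_window(self) -> int:
        # connection level: sum of every stream's remaining receive window
        return sum(s.receive_window() for s in self.streams)

conn = Connection([Stream(65536), Stream(65536)])
conn.streams[0].received = 16384
print(conn.receive_window())  # 65536 * 2 - 16384 = 114688
```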
- Encrypted and authenticated packets.
  - TCP headers are neither encrypted nor authenticated, so intermediate network devices can tamper with, inject into, or eavesdrop on them in transit.
  - QUIC packets are encrypted and authenticated, guaranteeing the security of the data in transit.
- Connection migration.
  - A TCP connection is identified by the 4-tuple (source IP, source port, destination IP, destination port); if any of the four changes, the connection becomes unusable. Switching from a 5G network to Wi-Fi changes the client's IP, so the TCP connection is naturally torn down.
  - QUIC instead identifies a connection with a client-generated 64-bit connection ID; as long as the ID stays the same, the connection survives without interruption (see the sketch after this item).
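A toy contrast of the two lookup schemes; the addresses and values are invented for illustration:

```python
import secrets

# TCP: session state is keyed by the 4-tuple, so any address change
# (e.g. 5G -> Wi-Fi changes the source IP) makes the key miss.
tcp_sessions = {}
tcp_key = ("203.0.113.5", 51034, "198.51.100.7", 443)  # example addresses
tcp_sessions[tcp_key] = "session state"

# QUIC (as described above): session state is keyed by a client-generated
# 64-bit connection ID that stays stable across network switches.
quic_sessions = {}
connection_id = secrets.randbits(64)
quic_sessions[connection_id] = "session state"
# After the client's IP changes, it keeps sending the same connection_id,
# so the server still finds the session.
```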
- Forward error correction.
  - When QUIC sends data, in addition to the data packets themselves it sends a verification packet, reducing the retransmissions caused by packet loss (see the sketch after this item).
  - For example: the sender needs to send three packets. QUIC computes the XOR of the three during transmission and sends it as a separate verification packet, for a total of four packets.
  - If one packet (other than the verification packet) is lost in transit, its content can be recomputed from the other three.
  - Of course, this only works when a single packet is lost; if multiple packets are lost, retransmission is the only option.
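The XOR trick described above is small enough to demonstrate directly; the packet contents are invented, and equal packet lengths are assumed for simplicity:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Three data packets plus one XOR verification packet.
p1, p2, p3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_bytes(xor_bytes(p1, p2), p3)  # the extra verification packet

# Suppose p2 is lost in transit: rebuild it from the survivors and the parity.
recovered = xor_bytes(xor_bytes(p1, p3), parity)
assert recovered == p2  # any single lost data packet can be recovered this way
```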
It can be seen that QUIC sheds TCP's baggage and implements a secure, efficient, and reliable HTTP transport on top of UDP. With features such as 0-RTT connection establishment, transport-layer multiplexing, connection migration, improved congestion control, and flow control, it achieves better results than HTTP/2 in most scenarios, and HTTP/3's future looks genuinely promising.

Thoughts and Conclusions

This article has walked through the history of the Internet from HTTP/0.9 to HTTP/3, introducing the core features of each version and summarizing each in one sentence:

- HTTP/0.9 implements basic request and response.
- HTTP/1.0 adds HTTP headers, enriches the types of transmitted resources, and lays the foundation for the development of the Internet.
- HTTP/1.1 adds persistent connections, pipelining, and response chunking, improving HTTP transmission efficiency.
- HTTP/2 uses a binary transmission format and maximizes transmission efficiency on the HTTP + TCP architecture through HTTP multiplexing, header compression, and server-side push.
- HTTP/3 replaces the transport layer with QUIC, and further improves HTTP transmission efficiency through improved congestion control, flow control, 0-RTT connection establishment, transport layer multiplexing, connection migration and other features.
As we can see, from HTTP/1.1 onward, HTTP's development has been driven by the pursuit of ever-better transmission efficiency, and we can expect future versions of HTTP to bring an even faster transport experience.