HTTP/3 Principles and Practices

After the HTTP/2 standard was published in 2015, most mainstream browsers supported it by the end of that year. Since then, with advantages such as multiplexing, header compression, and server push, HTTP/2 has won over more and more developers, and almost before anyone noticed, HTTP reached its third generation. Tencent has kept pace with this trend, and many of its projects are gradually adopting HTTP/3. Based on the experience of connecting QQ Interest Tribe to HTTP/3, this article walks through the principles of HTTP/3 and how a service can adopt it.

1. HTTP/3 Principles

1. HTTP History

Before introducing HTTP/3, let's take a brief look at the history of HTTP to understand the background from which HTTP/3 emerged.

With the development of network technology, HTTP/1.1, designed in 1999, could no longer meet demand, so in 2009 Google designed SPDY on top of TCP. The SPDY team later pushed to make SPDY an official standard, and although that effort failed, the team participated throughout the drafting of HTTP/2, which adopted many of SPDY's designs; SPDY is therefore generally regarded as the predecessor of HTTP/2. Both SPDY and HTTP/2 run over TCP, which is inherently at an efficiency disadvantage compared with UDP, so in 2013 Google developed QUIC, a transport-layer protocol built on UDP. QUIC stands for Quick UDP Internet Connections, reflecting the hope that it could replace TCP and make web transmission more efficient. The Internet Engineering Task Force later officially renamed HTTP based on the QUIC protocol (HTTP over QUIC) as HTTP/3.

2. QUIC Protocol Overview

TCP has always been the dominant transport-layer protocol, while UDP has remained relatively obscure; when asked about the difference between the two in an interview, candidates often have little to say about UDP. For a long time, UDP has had the reputation of being a fast but unreliable transport protocol. Yet seen from another angle, a disadvantage can become an advantage. QUIC (Quick UDP Internet Connections) is built on UDP precisely for UDP's speed and efficiency, and at the same time it integrates and improves on the strengths of TCP, TLS, and HTTP/2.

So what is the relationship between QUIC and HTTP/3? QUIC is a transport-layer protocol designed to replace TCP and SSL/TLS. Above the transport layer sits the application layer, home to familiar protocols such as HTTP, FTP, and IMAP, all of which can in theory run on top of QUIC. HTTP running over QUIC is called HTTP/3; this is the meaning of "HTTP over QUIC, i.e. HTTP/3".

Therefore, to understand HTTP/3 you cannot get around QUIC. The following sections use several of its most important features to build a deeper understanding of QUIC.

3. Zero-RTT Connection Establishment

The clearest way to see the difference between HTTP/2 and HTTP/3 is to compare how each establishes a connection.

Establishing an HTTP/2 connection requires 3 RTTs; even with session resumption, that is, caching the symmetric key computed during the first handshake, 2 RTTs are still needed. If TLS is upgraded to 1.3, an HTTP/2 connection needs 2 RTTs, or 1 RTT with session resumption. Some will object that HTTP/2 does not strictly require HTTPS, so the handshake could be simplified. That is true of the standard, but in practice all browser implementations require HTTP/2 to run over HTTPS, so an encrypted connection is unavoidable. HTTP/3, by contrast, needs only 1 RTT for the first connection and 0 RTT for subsequent ones, meaning the very first packet the client sends to the server can already carry request data, which HTTP/2 cannot match. What is the principle behind this? Let's take a closer look at QUIC's connection process.

  • Step 1: On the first connection, the client sends an Inchoate Client Hello to the server to request a connection;
  • Step 2: The server generates g, p, and a, computes A = g^a mod p, puts g, p, and A into the Server Config, and sends a Rejection message to the client;
  • Step 3: On receiving g, p, and A, the client generates its own b, computes B = g^b mod p, and computes the initial key K from A, p, and b. It then encrypts the HTTP data with K and sends it to the server together with B;
  • Step 4: On receiving B, the server derives the same key from a, p, and B and uses it to decrypt the received HTTP data. For forward security, the server then refreshes its random number a and its public value, derives a new key S, and sends the new public value to the client in a Server Hello, along with the HTTP response data;
  • Step 5: On receiving the Server Hello, the client derives the same new key S, and all subsequent transmissions are encrypted with S.

In this way, QUIC spends a total of 1 RTT from requesting the connection to actually exchanging HTTP data, and that 1 RTT exists mainly to obtain the Server Config. If the client has cached the Server Config from a previous connection, it can send HTTP data immediately, achieving 0-RTT connection establishment.

The DH (Diffie-Hellman) key exchange algorithm is at work here. Its core idea is that the server generates three numbers: a, which it keeps private, and g and p, which it sends to the client. The client generates its own random number b. Through the DH computation, client and server arrive at the same key without a or b ever crossing the network, which greatly improves security. Because the numbers involved are very large, even an attacker who captures p, g, A, and B in transit cannot recover the key with current computing power: doing so would require solving the discrete logarithm problem.
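
The following is a minimal sketch of the DH exchange described above, in Python. The parameters are toy values chosen only to keep the demo readable; a real deployment uses a much larger prime (or elliptic curves), and none of these names come from an actual QUIC implementation.

```python
import secrets

# Public parameters chosen by the server. In practice p is a very large
# prime; 4294967291 (the largest prime below 2^32) is used here only so
# the numbers stay readable.
p = 4294967291
g = 5

# Server: private a, public A = g^a mod p (shipped in the Server Config).
a = secrets.randbelow(p - 2) + 1
A = pow(g, a, p)

# Client: private b, public B = g^b mod p (sent alongside the first data).
b = secrets.randbelow(p - 2) + 1
B = pow(g, b, p)

# Both sides derive the same initial key K; a and b never cross the network.
K_client = pow(A, b, p)   # client computes K from A, p, and b
K_server = pow(B, a, p)   # server computes K from B, p, and a
assert K_client == K_server
```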

4. Connection Migration

A TCP connection is identified by a four-tuple (source IP, source port, destination IP, destination port). When you switch networks, at least one of these elements changes, so the connection changes. If the original TCP connection is still used, the connection fails, and you have to wait for it to time out before re-establishing it, which is why content can take a long time to load after switching to a new network even when the new network is in good condition. A well-implemented client establishes a new TCP connection as soon as it detects a network change, but even so, setting up the new connection still takes hundreds of milliseconds.

QUIC connections are not affected by the four-tuple: when these elements change, the original connection is maintained. How is this done? The reason is simple. QUIC connections are not identified by the four-tuple but by a 64-bit random number called the Connection ID. As long as the Connection ID stays the same, the connection survives changes of IP or port.
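
A toy sketch of the idea follows, assuming an 8-byte connection ID at the start of each datagram; this illustrates the demultiplexing principle only, not the real QUIC packet format.

```python
import socket

# Session state keyed by connection ID instead of by (IP, port).
connections = {}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 4433))

while True:
    data, addr = sock.recvfrom(2048)
    conn_id = data[:8]  # assumed: the first 8 bytes carry the connection ID
    session = connections.setdefault(conn_id, {"bytes_received": 0})
    session["bytes_received"] += len(data)
    # Even if addr changed because the client hopped from Wi-Fi to 4G,
    # conn_id still maps to the same session, so the connection survives.
    sock.sendto(b"ok", addr)
```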

5. Head-of-Line Blocking and Multiplexing

Both HTTP/1.1 and HTTP/2 suffer from head-of-line blocking. So what is head-of-line blocking?

TCP is a connection-oriented protocol: after data is sent, an ACK must come back to confirm the other side received it. If every request had to wait for the previous request's acknowledgment, efficiency would be very low. HTTP/1.1 therefore introduced pipelining, which allows a single TCP connection to carry multiple requests at once, greatly improving transmission efficiency.

Against this background, consider HTTP/1.1 head-of-line blocking. Suppose a TCP connection is carrying 10 requests at once, of which the 1st, 2nd, and 3rd have been received by the client, but the 4th is lost. The 5th through 10th are then blocked: they must wait for the 4th to be handled first, wasting bandwidth.

For this reason, browsers typically allow up to 6 TCP connections per host, which makes better use of the available bandwidth; but the head-of-line blocking problem within each connection remains.

HTTP/2's multiplexing solves the head-of-line blocking described above. Unlike HTTP/1.1, where the next request can be transmitted only after all packets of the previous one have been sent, HTTP/2 splits each request into multiple frames and interleaves them over a single TCP connection, so that even if one request is blocked, the others are unaffected.

The story is not over yet. HTTP/2 solves blocking at the granularity of a "request", but the TCP protocol underneath HTTP/2 has its own head-of-line blocking problem. Each HTTP/2 request is split into frames, and the frames of different requests are combined into streams; a stream is the logical transmission unit over TCP, and this is how HTTP/2 sends multiple requests simultaneously on one connection. That is the principle of multiplexing. Consider an example: four streams are being sent over one TCP connection. Stream 1 has been delivered correctly, but the third frame of Stream 2 is lost. TCP processes data in strict order: the bytes sent first must be processed first, so the sender is forced to retransmit the lost frame, and even though Stream 3 and Stream 4 have already arrived, they cannot be processed. The entire connection is blocked.

In addition, HTTP/2 must use HTTPS, and the TLS protocol used by HTTPS has its own head-of-line blocking problem. TLS organizes data into Records, encrypting a batch of data together as one Record and then splitting it across multiple TCP packets for transmission. A Record is typically 16 KB, spanning about 12 TCP packets; if any one of those packets is lost, the entire Record cannot be decrypted.

Head-of-line blocking can cause HTTP/2 to be slower than HTTP/1.1 in weak network environments where packet loss is more likely!

So how does QUIC solve the head-of-line blocking problem? There are two main points.

  • QUIC's unit of transmission is the Packet, and encryption is also applied per Packet. Encryption, transmission, and decryption all happen packet by packet, which avoids the head-of-line blocking of TLS.
  • QUIC is based on UDP, so packets have no mandatory processing order at the receiver. If a packet in the middle is lost, the whole connection is not blocked; the other streams continue to be processed normally (see the sketch after this list).
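
Here is a minimal sketch, with assumed names and a made-up framing, of per-stream reassembly, which is the essence of why loss on one QUIC stream does not stall another:

```python
from collections import defaultdict

streams = defaultdict(dict)   # stream_id -> {offset: data} out-of-order buffer
delivered = defaultdict(int)  # stream_id -> next byte offset the app expects

def on_stream_frame(stream_id: int, offset: int, data: bytes) -> bytes:
    """Buffer out-of-order data; deliver whatever is now contiguous."""
    streams[stream_id][offset] = data
    out = b""
    while delivered[stream_id] in streams[stream_id]:
        chunk = streams[stream_id].pop(delivered[stream_id])
        out += chunk
        delivered[stream_id] += len(chunk)
    return out

# Stream 2 lost its packet at offset 0, but stream 3 is unaffected:
print(on_stream_frame(2, 4, b"late"))   # b'' (buffered, gap at offset 0)
print(on_stream_frame(3, 0, b"hello"))  # b'hello' (delivered immediately)
```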

6. Congestion Control

The purpose of congestion control is to prevent too much data from flooding into the network at once and pushing it past its maximum load. QUIC's congestion control is similar to TCP's, with improvements built on top, so let's first briefly review TCP congestion control.

TCP congestion control consists of four core algorithms: slow start, congestion avoidance, fast retransmit and fast recovery. If you understand these four algorithms, you will have a general understanding of TCP congestion control.

  • Slow start: the sender begins by sending 1 unit of data; after each acknowledgment it doubles to 2 units, then 4, then 8, growing exponentially. This continually probes the network's congestion level, since exceeding the threshold would cause congestion (see the sketch after this list).
  • Congestion avoidance: exponential growth cannot continue forever; once the window reaches the slow start threshold, growth becomes linear.
  • Fast retransmit: rather than always waiting for a retransmission timer to expire, the sender treats three duplicate ACKs as evidence that a segment was lost and retransmits it immediately.
  • Fast recovery: after a fast retransmit, the sender halves its congestion window and continues in congestion avoidance instead of falling all the way back to slow start; if the retransmission itself times out, it does return to slow start.
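
As a quick illustration of the first two phases, here is a minimal simulation of congestion window growth; it is a simplified textbook model, not any real TCP implementation.

```python
def simulate_cwnd(rounds: int, ssthresh: int = 16) -> list[int]:
    """Congestion window per round: slow start, then congestion avoidance."""
    cwnd = 1
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd *= 2   # slow start: exponential growth
        else:
            cwnd += 1   # congestion avoidance: linear growth
    return history

print(simulate_cwnd(10))  # [1, 2, 4, 8, 16, 17, 18, 19, 20, 21]
```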

QUIC reimplements the Cubic algorithm from TCP for congestion control and makes many improvements on that basis. The following introduces some features of QUIC's improved congestion control.

(1) Hot plug

Modifying the congestion control strategy in TCP requires changes at the operating-system level. QUIC needs only an application-layer change, and it can dynamically select a congestion control algorithm based on the network environment and the user.

(2) Forward Error Correction (FEC)

QUIC uses forward error correction (FEC) to increase the protocol's fault tolerance. After a piece of data is divided into 10 packets, the packets are XORed together in turn, and the result is transmitted along with the data as an FEC packet. If one data packet is lost in transit, its contents can be reconstructed from the remaining 9 packets and the FEC packet, greatly increasing the protocol's fault tolerance.
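
A minimal sketch of the XOR scheme just described, assuming equal-size packets for simplicity:

```python
from functools import reduce

def xor_bytes(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

packets = [bytes([i]) * 4 for i in range(10)]  # 10 equal-size data packets
fec = reduce(xor_bytes, packets)               # parity packet: XOR of all 10

# Suppose packet 3 is lost in transit; XOR the survivors with the parity
# packet and the lost packet's contents fall out.
received = packets[:3] + packets[4:]
recovered = reduce(xor_bytes, received, fec)
assert recovered == packets[3]
```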

This trade-off suits today's networks. At this stage, bandwidth is no longer the bottleneck of network transmission; round-trip time is. A new transport protocol can therefore afford some data redundancy in exchange for fewer retransmissions.

(3) Monotonically increasing Packet Number

To ensure reliability, TCP uses Sequence Numbers and ACKs to confirm that messages arrive in order, but this design has a flaw.

Suppose a timeout occurs, the client retransmits, and an ACK then arrives. Because the ACK for the original request and the ACK for the retransmission look identical, the client cannot tell which one it has received. If it attributes the ACK to the original send when it actually acknowledged the retransmission, the sampled RTT comes out too large; if it attributes the ACK to the retransmission when it actually acknowledged the original, the sampled RTT comes out too small. The sampled RTT feeds the calculation of the RTO (retransmission timeout), and getting the timeout right matters: neither too long nor too short is appropriate.

QUIC resolves this ambiguity. Unlike a Sequence Number, the Packet Number is strictly monotonically increasing: if Packet N is lost, the retransmission does not reuse N but carries a larger number, such as N + M. When the sender receives an acknowledgment, it therefore knows immediately whether the ACK refers to the original packet or to the retransmission.
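
A sketch of why this removes the ambiguity in RTT sampling (illustrative names, not a real QUIC API):

```python
import time

sent_times = {}  # packet_number -> timestamp when that packet was sent

def send_packet(packet_number: int) -> None:
    sent_times[packet_number] = time.monotonic()

def on_ack(packet_number: int) -> float:
    # A retransmission always gets a NEW packet number, so this lookup can
    # never confuse the original send with the retransmission: the RTT
    # sample is always attributed to the right transmission.
    return time.monotonic() - sent_times[packet_number]

send_packet(10)   # original transmission
send_packet(11)   # retransmission of packet 10's payload
rtt = on_ack(11)  # unambiguously measures the retransmission's RTT
```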

(4) ACK Delay

When TCP calculates RTT, it does not account for the delay between the receiver getting the data and sending the acknowledgment; that delay is the ACK Delay. QUIC factors this delay in, making the RTT calculation more accurate.
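
In formula form this is simply a subtraction; the sketch below uses illustrative names:

```python
def rtt_sample(send_time: float, ack_time: float, ack_delay: float) -> float:
    # Subtract the time the receiver deliberately held the ACK, so the
    # sample reflects only the time spent on the network.
    return (ack_time - send_time) - ack_delay

print(rtt_sample(send_time=0.0, ack_time=0.120, ack_delay=0.025))  # ~0.095
```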

(5) More ACK blocks

Generally, after receiving data the receiver should send an ACK to indicate receipt. But replying to every single packet is wasteful, so acknowledgments are usually batched rather than immediate. TCP SACK can carry at most 3 ACK blocks. In some scenarios, such as downloads, the server mostly just sends data, yet TCP's design still obliges the receiver to "politely" return an ACK for every few packets received. QUIC can carry up to 256 ACK blocks; on networks with severe packet loss, more ACK blocks mean fewer retransmissions and better network efficiency.

(6) Flow Control

TCP performs flow control on each connection. Flow control means the sender must not send faster than the receiver can take in; otherwise data would overflow and be lost. TCP implements it mainly through a sliding window. Note that congestion control governs the sender's behavior toward the network but does not consider the receiver's capacity; flow control supplies that missing piece.

QUIC establishes a single connection and transmits multiple streams over it concurrently, like a road with a warehouse at each end and many trucks carrying goods along it. QUIC therefore applies flow control at two levels: the connection level and the stream level, analogous to limiting both the total traffic on the road, so the trucks do not all rush in at once and swamp the warehouse, and the load of any single truck, so no one shipment overwhelms processing.

So how does QUIC implement flow control? Look first at flow control on a single stream. Before the stream transmits any data, its receive window (flow control receive window) equals the maximum receive window (max receive window). As the receiver takes in data, the receive window shrinks. Of the received data, some has already been processed and some has not; it is the unprocessed portion that keeps the stream's receive window reduced.

As data is processed, the receiver can handle more. When (flow control receive offset − consumed bytes) < (max receive window / 2), the receiver sends a WINDOW_UPDATE frame to tell the sender it may send more data. The flow control receive offset then slides forward, the receive window grows, and the sender can send more data to the receiver.
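
A minimal sketch of this stream-level bookkeeping, using the terms above (the class and its fields are illustrative, not a real QUIC API):

```python
MAX_RECEIVE_WINDOW = 64 * 1024

class StreamFlowControl:
    def __init__(self) -> None:
        self.receive_offset = MAX_RECEIVE_WINDOW  # flow control receive offset
        self.highest_received = 0                 # highest received byte offset
        self.consumed = 0                         # bytes handed to the application

    def on_data(self, length: int) -> None:
        self.highest_received += length
        # The sender must never exceed the advertised offset.
        assert self.highest_received <= self.receive_offset, "flow control violated"

    def on_consume(self, length: int) -> bool:
        """Returns True when a WINDOW_UPDATE should be sent."""
        self.consumed += length
        if self.receive_offset - self.consumed < MAX_RECEIVE_WINDOW / 2:
            # Slide the window forward so the sender may send more.
            self.receive_offset = self.consumed + MAX_RECEIVE_WINDOW
            return True
        return False
```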

Stream-level control alone has limited effect in keeping the receiver from taking in too much data, so connection-level flow control is also needed. Once you understand stream flow control, connection flow control is easy: for a stream, receive window = max receive window − highest received byte offset; for the connection, receive window = Stream 1 receive window + Stream 2 receive window + … + Stream N receive window.
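
Expressed against the StreamFlowControl sketch above, the connection-level window is just the sum over streams:

```python
def connection_receive_window(streams: list) -> int:
    # Connection window = sum of each stream's remaining receive window,
    # where a stream's window = receive_offset - highest_received.
    return sum(s.receive_offset - s.highest_received for s in streams)
```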

2. HTTP/3 Practice

1. X5 Kernel and STGW

The X5 kernel is a browser kernel Tencent developed for the Android system, a unified kernel created to solve the high adaptation cost, insecurity, and instability of traditional Android browser kernels. STGW is short for Secure Tencent Gateway, Tencent's secure cloud gateway. Both have supported the QUIC protocol since as early as two years ago.

So how does a service running on X5 connect to QUIC? Thanks to X5 and STGW, the changes a service needs are very small; only two steps are required.

  • Step 1: Enable the whitelist on STGW to allow the business domain name to use the QUIC protocol;
  • Step 2: Add the alt-svc attribute to the Response Header of the business resource, for example: alt-svc: quic=":443"; ma=2592000; v="44,43,39".

STGW's advantage in adopting QUIC is obvious: STGW speaks QUIC with clients that support it (here, X5), while the business backend still communicates with STGW over HTTP/1.1. The cached state QUIC needs, such as the Server Config, is also maintained by STGW.

2. Negotiation and Upgrade

With the business domain name whitelisted on STGW and the alt-svc attribute added to the resource's Response Header, how does a QUIC connection actually get established? The key step is a negotiated upgrade. The client cannot know in advance whether the server supports QUIC, and rashly attempting a QUIC connection might fail, so it goes through a negotiation process to decide whether to use QUIC.

On the first request, the client uses HTTP/1.1 or HTTP/2. If the server supports QUIC, it returns an alt-svc header in the response to tell the client that subsequent requests may go over QUIC. alt-svc mainly carries the following information (parsed in the sketch after this list):

  • quic: the QUIC listening port;
  • ma: the validity period in seconds, during which the server promises to support QUIC;
  • v: the supported QUIC version numbers; QUIC iterates quickly, so all supported versions are listed.
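
For illustration, here is a quick-and-dirty parse of the example header from above (a real client would use a proper header parser):

```python
header = 'quic=":443"; ma=2592000; v="44,43,39"'

fields = dict(part.strip().split("=", 1) for part in header.split(";"))
port = fields["quic"].strip('"')              # ":443" -> QUIC listening port
max_age = int(fields["ma"])                   # validity period in seconds
versions = fields["v"].strip('"').split(",")  # supported QUIC versions
print(port, max_age, versions)                # :443 2592000 ['44', '43', '39']
```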

After confirming that the server supports QUIC, the client initiates a QUIC connection and a TCP connection to the server at the same time, compares the speed of the two, and selects the faster protocol. This process is called "racing", and QUIC usually wins.

3. QUIC Performance

The success rate of QUIC connection establishment is over 90%, the race success rate is also close to 90%, and the 0 RTT rate is around 55%.

When using the QUIC protocol, the time it takes to reach the first screen of a page is reduced by 10% compared to non-QUIC protocols.

From the perspective of different stages of resource acquisition, the time saved by the QUIC protocol in the connection stage is quite obvious.

Looking at the distribution of first-screen times, after switching to the QUIC protocol the proportion of pages reaching first screen within 1 second rose significantly, to about 12%.

3. Conclusion

QUIC sheds the baggage of TCP and TLS by building on UDP, while drawing on and improving the experience of TCP, TLS, and HTTP/2 to deliver a secure, efficient, and reliable HTTP communication protocol. With features such as 0-RTT connection establishment, smooth connection migration, the near-elimination of head-of-line blocking, and improved congestion control and flow control, QUIC achieves better results than HTTP/2 in most scenarios.

A week ago, Microsoft announced that it had open-sourced its internal QUIC library, MsQuic, and that it will broadly recommend the QUIC protocol as a replacement for TCP/IP.

HTTP/3 has a promising future.
