Understanding the new features of HTTP/2 and HTTP/3

Compared with HTTP/1.1, HTTP/2 greatly improves web page performance: simply upgrading to the protocol removes much of the optimization work that used to be necessary. Of course, compatibility issues and the question of graceful degradation are among the reasons it is still not widely deployed in China.

Although HTTP/2 improves the performance of web pages, it does not mean that it is perfect. HTTP/3 was introduced to solve some problems existing in HTTP/2.

1. What has changed since the invention of HTTP/1.1?

If you look closely at the resources downloaded when opening the homepages of the most popular websites, a clear trend emerges. In recent years, the amount of data needed to load a website's homepage has grown steadily and now exceeds 2100 KB. More importantly, the average number of resources a page must download before it finishes rendering has exceeded 100.

As shown in the figure below, the size of data transferred and the average number of resources requested have continued to grow since 2011, with no signs of slowing down. The green straight line in the chart shows the growth of the size of data transferred, and the red straight line shows the growth of the average number of resources requested.

Since the release of HTTP/1.1 in 1997, we have been using HTTP/1.x for quite a long time. However, with the explosive development of the Internet in the past decade, web pages have changed from being mainly text-based to being mainly rich media (such as pictures, sounds, and videos). In addition, there are more and more applications that require real-time page content (such as chatting and live video streaming). Therefore, some of the features specified in the protocol at that time can no longer meet the needs of modern networks.

2. HTTP/1.1 flaws

1. High latency – resulting in slower page loading speed

Although network bandwidth has grown rapidly in recent years, we have not seen a corresponding reduction in network latency. The network latency problem is mainly due to head-of-line blocking, which results in bandwidth not being fully utilized.

Head-of-line blocking means that when one request in a sequence of sequentially sent requests is blocked for some reason, all requests queued behind it are also blocked, causing the client to be unable to receive data for a long time. People have tried the following methods to solve head-of-line blocking:

  • Distribute the resources of a page across different domain names to raise the connection limit. By default, Chrome allows up to 6 persistent TCP connections per domain. With persistent connections, although a TCP pipe can be reused, only one request can be processed at a time in a pipeline; until the current request completes, the others are blocked. Moreover, if 10 requests are issued simultaneously to the same domain, 4 of them are queued until an ongoing request finishes.
  • Spriting is a technique that combines multiple small images into one large image, and then uses JavaScript or CSS to "cut" the small images out again.
  • Inlining is another technique to avoid sending many small image requests. The original image data is embedded in the URL inside the CSS file to reduce the number of network requests.
    .icon1 {
      background: url(data:image/png;base64,<data>) no-repeat;
    }
    .icon2 {
      background: url(data:image/png;base64,<data>) no-repeat;
    }
  • Concatenation uses tools such as webpack to package multiple smaller JavaScript files into a larger JavaScript file, but if one of the files is changed, a large amount of data will be downloaded again.
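Inlining, as shown in the CSS above, boils down to base64-encoding the raw image bytes into a `data:` URI. A minimal Python sketch of that encoding step (the payload and class name here are stand-ins, not taken from the original):

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data: URI suitable for inlining in CSS."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime, encoded)

# Stand-in payload; real code would read the image file's bytes.
payload = b"\x89PNG..."
css_rule = ".icon1 { background: url(%s) no-repeat; }" % to_data_uri(payload)
print(css_rule)
```

The trade-off: the request disappears, but base64 inflates the bytes by roughly a third and the inlined image can no longer be cached separately from the CSS.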

2. Stateless feature - resulting in huge HTTP header

Since the message header generally carries many fixed fields such as User-Agent, Cookie, Accept, and Server (as shown in the figure below), it can add up to hundreds or even thousands of bytes, while the body is often only tens of bytes (as with GET requests and 204/301/304 responses). The header thus dwarfs the body it accompanies. Carrying so much header data raises the cost of transmission, and worse, many field values in thousands of request and response messages are identical and repeated, which is very wasteful.
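The imbalance is easy to measure. The sketch below builds a plausible HTTP/1.1 GET request (the header values are illustrative, not from the original) and counts the bytes; the body is zero bytes, yet the headers alone run to a few hundred:

```python
# Rough illustration of HTTP/1.1 header overhead: a typical GET request
# carries hundreds of bytes of headers for a body of zero bytes.
request = (
    "GET /api/ping HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36\r\n"
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
    "Accept-Encoding: gzip, deflate, br\r\n"
    "Cookie: session=abc123; theme=dark; lang=en-US\r\n"
    "\r\n"
)
header_bytes = len(request.encode("utf-8"))
print("header block: %d bytes, body: 0 bytes" % header_bytes)
```

And since nearly every one of these fields repeats verbatim on every request to the same site, the waste multiplies with every round trip.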

3. Plain text transmission – the insecurity it brings

When HTTP/1.1 transmits data, everything is sent in plain text, and neither the client nor the server can verify the identity of the other party, so the security of the data cannot be guaranteed.

Have you ever heard of news like "free WiFi traps"? Hackers take advantage of the shortcomings of HTTP plain text transmission and set up a WiFi hotspot in a public place to "fish" and trick netizens into going online. Once you connect to this WiFi hotspot, all traffic will be intercepted and saved. If there is sensitive information such as bank card numbers and website passwords in it, it will be dangerous. Hackers can use this data to impersonate you and do whatever they want.

4. Server push messages are not supported

HTTP/1.1 follows a strict request-response model: the server can only reply to requests initiated by the client and has no way to proactively push resources or notifications to it.

3. Introduction to SPDY Protocol and HTTP/2

1. SPDY protocol

As mentioned above, because of the defects of HTTP/1.x, developers resorted to sprites, inlined small images, multiple domain names, and so on to improve performance. However, these optimizations all work around the protocol rather than fixing it. It was not until 2009 that Google publicly released its self-developed SPDY protocol, aimed squarely at the inefficiency of HTTP/1.1. Launching SPDY was considered the first formal re-engineering of the HTTP protocol itself: reducing latency, compressing headers, and so on. SPDY's deployment proved the effectiveness of these optimizations and eventually led to the birth of HTTP/2.

HTTP/1.1 has two major shortcomings: insufficient security and low performance. Due to the huge historical burden of HTTP/1.x, the modification of the protocol and compatibility are the primary considerations, otherwise it will destroy countless existing assets on the Internet. As shown in the figure above, SPDY is located below HTTP and above TCP and SSL, so it can easily be compatible with the old version of the HTTP protocol (encapsulating the content of HTTP1.x into a new frame format) and can use the existing SSL function.

After the SPDY protocol was proven to be feasible on the Chrome browser, it was used as the basis for HTTP/2, and its main features were inherited in HTTP/2.

2. Introduction to HTTP/2

In 2015, HTTP/2 was released. HTTP/2 is a replacement for HTTP/1.x, but it is not a rewrite: HTTP methods, status codes, and semantics are the same as in HTTP/1.x. HTTP/2 is based on SPDY and focuses on performance; one of its biggest goals is to use only one connection between the user and the website. Today, many top-ranked sites both in China and abroad have already deployed HTTP/2, and using it can bring a 20%~60% efficiency improvement.

HTTP/2 consists of two specifications:

  • Hypertext Transfer Protocol version 2 - RFC7540
  • HPACK - Header Compression for HTTP/2 - RFC7541

4. New Features of HTTP/2

1. Binary transfer

HTTP/2 significantly reduces the amount of data transmitted, mainly for two reasons: binary transmission and header compression. Let's first introduce binary transmission. HTTP/2 uses binary format to transmit data instead of plain text messages in HTTP/1.x. Binary protocols are more efficient to parse. HTTP/2 splits request and response data into smaller frames, and they are encoded in binary.

It moves some features of the TCP protocol into the application layer, breaking the original "Header+Body" message into several small binary "frames": "HEADERS" frames carry header data and "DATA" frames carry entity data. Once HTTP/2 data is framed, the "Header+Body" message structure disappears completely; the protocol only sees "fragments".

In HTTP/2, all communications under the same domain name are completed on a single connection, which can carry any number of bidirectional data streams. Each data stream is sent in the form of a message, which in turn consists of one or more frames. Multiple frames can be sent out of order and can be reassembled based on the stream identifier in the frame header.

2. Header Compression

HTTP/2 does not use traditional compression algorithms, but instead developed a special "HPACK" algorithm that establishes a "dictionary" on both the client and server ends, uses index numbers to represent repeated strings, and uses Huffman coding to compress integers and strings, achieving a high compression rate of 50% to 90%.

Specifically:

  • Use "header tables" on the client and server to track and store previously sent key-value pairs so that the same data is no longer sent with each request and response;
  • The header table always exists during the HTTP/2 connection and is progressively updated by the client and server;
  • Each new header key-value pair is either appended to the end of the current table or replaces the previous value in the table.

For example, in the following figure, the first request sends all the header fields, while the second request only needs to send the difference data, which can reduce redundant data and reduce overhead.

3. Multiplexing

Multiplexing technology is introduced in HTTP/2. Multiplexing solves the problem of browsers limiting the number of requests under the same domain name. It also makes it easier to achieve full-speed transmission. After all, opening a new TCP connection requires slowly increasing the transmission speed.

Online side-by-side comparison demos give a visual feel for how much faster HTTP/2 is than HTTP/1.

Thanks to binary framing, HTTP/2 no longer needs multiple TCP connections to carry streams in parallel. In HTTP/2,

  • All communications under the same domain name are completed over a single connection.
  • A single connection can carry any number of bidirectional data streams.
  • The data stream is sent in the form of a message, which consists of one or more frames. Multiple frames can be sent out of order because they can be reassembled based on the stream identifier in the frame header.
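The reassembly rule in the last bullet can be sketched directly: tag every frame with its stream identifier, and the receiver can demultiplex an interleaved arrival order back into per-stream messages (frame tuples here are a simplification of the real frame header):

```python
# Sketch: frames from different streams arrive interleaved over one
# connection; the stream identifier in each frame lets the receiver
# reassemble every message independently of the interleaving.
def reassemble(frames):
    streams = {}
    for stream_id, payload in frames:
        streams.setdefault(stream_id, []).append(payload)
    return {sid: b"".join(chunks) for sid, chunks in streams.items()}

# Two requests whose frames were sent interleaved on one connection.
frames = [(1, b"GET /a"), (3, b"GET /b"), (3, b" HTTP/2"), (1, b" HTTP/2")]
print(reassemble(frames))
```

Within a single stream, frames still arrive in order (TCP guarantees that); it is only *across* streams that ordering is free.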

This feature greatly improves performance:

  • The same domain name only needs to occupy one TCP connection, and one connection is used to send multiple requests and responses in parallel. In this way, the download process of the entire page resource only requires one slow start, and the problem caused by multiple TCP connections competing for bandwidth is also avoided.
  • Multiple requests/responses are sent in parallel and interleaved without affecting each other.
  • In HTTP/2, each request can carry a 31-bit priority value, where 0 represents the highest priority and the larger the value, the lower the priority. With this priority value, the client and server can adopt different strategies when processing different streams and send streams, messages, and frames in the best way.

As shown in the figure above, multiplexing technology can transmit all request data through only one TCP connection.
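A sender honoring the priority values described above simply drains its pending streams lowest-value-first. A minimal sketch, with hypothetical resource names:

```python
# Sketch of priority-based scheduling: each stream carries a priority
# value, 0 being most urgent, and the sender serves streams in that order.
pending = [
    {"stream": 5, "priority": 32, "resource": "banner.png"},
    {"stream": 1, "priority": 0,  "resource": "index.html"},
    {"stream": 3, "priority": 8,  "resource": "app.css"},
]
send_order = [s["resource"] for s in sorted(pending, key=lambda s: s["priority"])]
print(send_order)  # most critical resources first
```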

4. Server Push

HTTP/2 has also changed the traditional "request-response" working mode to some extent. The server is no longer a purely passive responder; it can create a new "stream" and actively send messages to the client. For example, when the browser requests only the HTML, the server can send the JS and CSS files that will likely be needed ahead of time, reducing waiting latency. This is called "Server Push" (also called cache push).

For example, as shown in the figure below, the server actively pushes JS and CSS files to the client without the client sending these requests when parsing HTML.

It should also be noted that the server can actively push, and the client also has the right to choose whether to receive. If the resource pushed by the server has been cached by the browser, the browser can reject it by sending a RST_STREAM frame. Active push also complies with the same-origin policy. In other words, the server cannot push third-party resources to the client casually, but must be confirmed by both parties.

5. Improved security

For compatibility reasons, HTTP/2 retains the "plain text" option of HTTP/1: it may transmit data unencrypted as before and does not mandate encrypted communication. The framing is still binary, but binary is not encryption, so no decryption is required to read it.

However, since HTTPS is the general trend, and mainstream browsers such as Chrome and Firefox have publicly announced that they only support encrypted HTTP/2, "in fact" HTTP/2 is encrypted. In other words, the HTTP/2 commonly seen on the Internet uses the "https" protocol name and runs on TLS. The HTTP/2 protocol defines two string identifiers: "h2" for encrypted HTTP/2 and "h2c" for plaintext HTTP/2.
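The "h2" identifier is what a client offers during the TLS handshake via ALPN to negotiate encrypted HTTP/2. In Python's standard `ssl` module that looks like the sketch below (the actual connection to a server is omitted; "http/1.1" is kept as a fallback):

```python
import ssl

# Offer "h2" (encrypted HTTP/2) via ALPN, falling back to HTTP/1.1 if
# the server does not speak HTTP/2. Only the context setup is shown.
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])
```

After the handshake, `SSLSocket.selected_alpn_protocol()` reports which protocol the server actually chose.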

5. New Features of HTTP/3

1. Disadvantages of HTTP/2

Although HTTP/2 solves many problems of previous versions, it still has a huge problem, mainly caused by the underlying TCP protocol. The main disadvantages of HTTP/2 are as follows:

(1) TCP and TCP+TLS connection establishment delay

HTTP/2 is transmitted using the TCP protocol. If HTTPS is used, the TLS protocol is also required for secure transmission. Using TLS also requires a handshake process, so there are two handshake delay processes:

  • When establishing a TCP connection, a three-way handshake is required with the server to confirm that the connection is successful, which means that data transmission can only be carried out after consuming 1.5 RTTs.
  • For TLS connection, there are two versions of TLS - TLS1.2 and TLS1.3. Each version takes a different amount of time to establish a connection, roughly 1 to 2 RTTs.

In short, we need to spend 3 to 4 RTTs before transmitting data.
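Putting numbers on it makes the cost concrete. Using the round-trip counts from the text and an assumed 50 ms RTT (both the RTT and the per-handshake counts are the article's rough figures, not measurements):

```python
# Back-of-envelope setup cost before the first HTTP byte can be sent.
RTT_MS = 50              # assumed round-trip time to the server
tcp_handshake = 1.5      # three-way handshake, in RTTs
tls12_handshake = 2      # full TLS 1.2 negotiation
tls13_handshake = 1      # TLS 1.3 needs a single round trip

print("HTTP/2 over TLS 1.2:", (tcp_handshake + tls12_handshake) * RTT_MS, "ms")
print("HTTP/2 over TLS 1.3:", (tcp_handshake + tls13_handshake) * RTT_MS, "ms")
```

On a 50 ms link that is well over a hundred milliseconds of pure waiting before any page data flows, which is exactly the cost QUIC's 0-RTT/1-RTT handshake attacks.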

(2) TCP head-of-line blocking has not been completely resolved

As mentioned above, in HTTP/2, multiple requests are run in one TCP pipeline. But when packet loss occurs, HTTP/2 performs worse than HTTP/1. Because TCP has a special "packet loss retransmission" mechanism to ensure reliable transmission, lost packets must wait for retransmission confirmation. When HTTP/2 packet loss occurs, the entire TCP must start waiting for retransmission, which will block all requests in the TCP connection (as shown below). For HTTP/1.1, multiple TCP connections can be opened. This situation will only affect one of the connections, and the remaining TCP connections can still transmit data normally.

After reading this, some people may wonder why we don’t just modify the TCP protocol? In fact, this is already an impossible task. Because TCP has existed for too long, it has been used in various devices, and this protocol is implemented by the operating system, so it is not realistic to update it.

2. Introduction to HTTP/3

Google was aware of these problems when it was promoting SPDY, so it started a new protocol called "QUIC" based on UDP, allowing HTTP to run on QUIC instead of TCP. This "HTTP over QUIC" is the next major version of the HTTP protocol, HTTP/3. It has achieved a qualitative leap on the basis of HTTP/2, and truly "perfectly" solves the "head of line blocking" problem.

Although QUIC is based on UDP, it has added many new features. Next, we will focus on introducing several new QUIC features. However, HTTP/3 is still in the draft stage and may change before its official release, so this article will try not to cover those unstable details.

3. New Features of QUIC

As mentioned above, QUIC is based on UDP, which is "connectionless" and does not require "handshakes" or "waves", so it is faster than TCP. In addition, QUIC also implements reliable transmission to ensure that data can reach its destination. It also introduces "streams" and "multiplexing" similar to HTTP/2. A single "stream" is ordered and may be blocked due to packet loss, but other "streams" will not be affected. Specifically, the QUIC protocol has the following characteristics:

(1) Implemented flow control and transmission reliability functions similar to TCP.

Although UDP does not provide reliable transmission, QUIC adds a layer on top of UDP to ensure reliable data transmission. It provides packet retransmission, congestion control, and other features that exist in TCP.

(2) Implemented the fast handshake function.

Since QUIC is based on UDP, it can establish a connection with 0-RTT or 1-RTT, meaning QUIC can begin sending and receiving data almost immediately, which greatly improves the speed of opening a page for the first time. 0-RTT connection establishment can be said to be QUIC's biggest performance advantage over HTTP/2.

(3) Integrated TLS encryption function.

Currently, QUIC uses TLS 1.3, which has several advantages over the earlier TLS 1.2, the most important being that it reduces the number of RTTs spent on the handshake.

(4) Multiplexing completely solves the TCP head-of-line blocking problem

Unlike TCP, QUIC implements multiple independent logical data streams on the same physical connection (as shown below). The separate transmission of data streams solves the problem of head-of-line blocking in TCP.
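The difference can be shown with a toy loss simulation (an illustration, not the real protocols): TCP exposes one ordered byte stream, so a gap stalls everything behind it, while QUIC's independent streams stall only the stream that owned the lost packet.

```python
# Toy comparison of head-of-line blocking. Packets carry (stream_id, data);
# the packet at index 1 is "lost" in transit.
packets = [(1, "a1"), (2, "b1"), (1, "a2"), (2, "b2")]
lost = {1}  # stream 2's first packet never arrives

def deliverable_tcp(packets, lost):
    delivered = []
    for i, (sid, data) in enumerate(packets):
        if i in lost:
            break            # one ordered stream: everything after the gap waits
        delivered.append(data)
    return delivered

def deliverable_quic(packets, lost):
    blocked = {packets[i][0] for i in lost}   # only the owning stream stalls
    return [d for i, (sid, d) in enumerate(packets)
            if i not in lost and sid not in blocked]

print(deliverable_tcp(packets, lost))   # ['a1']
print(deliverable_quic(packets, lost))  # ['a1', 'a2']
```

With the same single lost packet, the TCP model delivers nothing past the gap, while the QUIC model keeps stream 1 flowing untouched.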

6. Conclusion

  • HTTP/1.1 has two major drawbacks: insufficient security and low performance.
  • HTTP/2 is fully compatible with HTTP/1 and is "more secure HTTP and faster HTTPS". Technologies such as header compression and multiplexing can fully utilize bandwidth and reduce latency, thus greatly improving the Internet experience.
  • QUIC is implemented based on UDP and is the underlying supporting protocol in HTTP/3. This protocol is based on UDP and takes the essence of TCP to achieve a fast and reliable protocol.
