Understanding HTTP/1, HTTP/2, and HTTP/3 in one article

1. HTTP/1.1 and HTTP/2

1. HTTP/1.1 flaws

  • High Latency — Head-Of-Line Blocking
  • Stateless nature — hinders interaction
  • Plain text transmission — insecure
  • Does not support server push


2. Head-of-line blocking

Head-of-line blocking means that when one request in a sequence of sequentially sent requests is blocked for some reason, every request queued behind it is blocked as well, delaying the client's receipt of data.

Common workarounds for head-of-line blocking:

  • Distribute a page's resources across different domain names (domain sharding) to raise the per-host connection limit. Even though a TCP pipeline can be shared, a pipeline still processes only one request at a time; until the current request completes, the others can only wait.
  • Reduce the number of requests
  • Inline some resources: CSS, base64-encoded images, etc. (see the sketch after this list)
  • Merge small files to reduce the resource count
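As a concrete example of the inlining item above, here is a minimal Node.js sketch that base64-encodes a small image into a CSS data URI, removing one request from the page (the `logo.png` path and MIME type are assumptions):

```ts
// Minimal sketch: inline a small image as a base64 data URI so the page
// needs one fewer HTTP/1.1 request. The file name is hypothetical.
import { readFileSync } from "node:fs";

function toDataUri(path: string, mime: string): string {
  const base64 = readFileSync(path).toString("base64");
  return `data:${mime};base64,${base64}`;
}

// Emit a CSS rule that carries the image inline instead of by URL.
const css = `.logo { background-image: url("${toDataUri("logo.png", "image/png")}"); }`;
console.log(css);
```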

3. Stateless Features

Stateless means the protocol has no memory of connection state. Bare HTTP, without mechanisms such as cookies, treats every connection as brand new: the previous request may have verified a username and password, yet the server has no way to relate the next request to it. In other words, the login state is lost. The sketch below shows how cookies layer state on top.
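A minimal Node sketch of cookie-based sessions (the in-memory session store and the user are hypothetical):

```ts
// On /login, issue a session cookie; on later requests, the cookie is the
// only thing tying the new connection back to the earlier login.
import { createServer } from "node:http";
import { randomUUID } from "node:crypto";

const sessions = new Map<string, string>(); // sessionId -> username

createServer((req, res) => {
  if (req.url === "/login") {
    // Imagine the username/password were verified here.
    const id = randomUUID();
    sessions.set(id, "alice");
    res.setHeader("Set-Cookie", `sid=${id}; HttpOnly`);
    res.end("logged in");
    return;
  }
  const sid = /sid=([^;]+)/.exec(req.headers.cookie ?? "")?.[1];
  const user = sid ? sessions.get(sid) : undefined;
  res.end(user ? `hello ${user}` : "who are you?");
}).listen(8080);
```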

4. Insecurity

The transmitted content is not encrypted and may be tampered with or hijacked in transit.

5. SPDY Protocol

SPDY is an improved version of HTTP/1.1 promoted by Google (HTTP/2 did not yet exist at the time).

Features:

  • Multiplexing — solves head-of-line blocking
  • Header compression — shrinks bulky HTTP headers
  • Request prioritization — important data arrives first
  • Server push — the server can send resources unprompted
  • Improved security

6. Multiplexing

SPDY allows unlimited concurrent streams over a single connection. Because requests share one channel, TCP runs more efficiently (see slow start in TCP congestion control: https://zhuanlan.zhihu.com/p/37379780); fewer network connections carry denser packet flows.

7. Header Compression

SPDY compressed headers with a zlib/DEFLATE-based scheme (HTTP/2 later replaced this with the dedicated HPACK algorithm); only the headers that differ are sent on each request and response, which generally achieves a high compression rate of 50% to 90%. The sketch below shows the "send only the difference" idea.
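A toy illustration of differential header compression (this mimics the concept, not the real SPDY or HPACK wire format):

```ts
// The encoder remembers the header table it last sent and emits only the
// entries that changed, so repeated headers (cookies, user-agent, ...)
// cost nothing after the first request.
type HeaderMap = Record<string, string>;

function diffHeaders(prev: HeaderMap, next: HeaderMap): HeaderMap {
  const delta: HeaderMap = {};
  for (const [name, value] of Object.entries(next)) {
    if (prev[name] !== value) delta[name] = value; // changed or new entries only
  }
  return delta;
}

const first = { ":path": "/a", "user-agent": "demo", cookie: "sid=1" };
const second = { ":path": "/b", "user-agent": "demo", cookie: "sid=1" };
console.log(diffHeaders({}, first));     // first request: the full header set
console.log(diffHeaders(first, second)); // afterwards: only { ":path": "/b" }
```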

8. Request Priority

Unlimited concurrent streams solve head-of-line blocking, but when bandwidth is limited, the client may still hold back requests for fear of clogging the channel. With request priorities, even when the network channel is congested with non-critical resources, high-priority requests are processed first.

9. Server Push

Server push allows the server to proactively push resource files to the client; the client, of course, retains the right to decline them. A sketch of the mechanism follows.
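SPDY itself is retired, but Node's built-in `http2` module exposes HTTP/2's descendant of this feature. A minimal sketch (the certificate paths are hypothetical):

```ts
// When the client asks for "/", push the stylesheet before it is requested.
import { createSecureServer } from "node:http2";
import { readFileSync } from "node:fs";

const server = createSecureServer({
  key: readFileSync("server.key"),
  cert: readFileSync("server.crt"),
});

server.on("stream", (stream, headers) => {
  if (headers[":path"] === "/") {
    stream.pushStream({ ":path": "/style.css" }, (err, push) => {
      if (err) return; // client disabled push: skip, nothing breaks
      push.respond({ ":status": 200, "content-type": "text/css" });
      push.end("body { margin: 0 }");
    });
    stream.respond({ ":status": 200, "content-type": "text/html" });
    stream.end('<link rel="stylesheet" href="/style.css"><p>hi</p>');
  }
});

server.listen(8443);
```

If the client has disabled push, the callback receives an error and the push is simply skipped, which matches the client's right to decline.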

10. Improved security

Supports HTTPS for encrypted transmission.

HTTP/2

HTTP/2 is based on SPDY and focuses on performance. One of its biggest goals is to use only one connection between the user and the website.

New features:

  • Binary framing — the core of HTTP/2's performance enhancements
  • Multiplexing — solves serial file transfer and excessive connection counts

1. Binary framing

First, HTTP/2 does not change the semantics of HTTP/1.1; it adds a binary framing layer at the application layer for transmission. This introduces new communication units: frames, messages, and streams.

What are the benefits of framing? The server receives more requests per unit of time, which raises concurrency. Most importantly, framing provides the underlying support for multiplexing.

2. Multiplexing

One domain name corresponds to one connection, and a stream represents a complete request-response exchange. A frame is the smallest unit of data, and each frame identifies the stream it belongs to; a stream is a flow of data composed of multiple frames. Multiplexing means that multiple streams can exist within one TCP connection, as the client sketch below demonstrates.
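A minimal demonstration with Node's `http2` client (the target server is hypothetical, e.g. the push server above): three requests become three streams interleaving on one connection.

```ts
// Three concurrent streams over a single HTTP/2 connection.
import { connect } from "node:http2";

const session = connect("https://localhost:8443", { rejectUnauthorized: false });

let pending = 3;
for (const path of ["/a", "/b", "/c"]) {
  const stream = session.request({ ":path": path }); // one stream per request
  stream.setEncoding("utf8");
  stream.on("data", (chunk) => console.log(path, chunk));
  stream.on("end", () => {
    console.log(path, "done");
    if (--pending === 0) session.close(); // all frames shared one connection
  });
}
```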

3. HTTP/2 flaws

  • TCP and TCP+TLS connection-establishment delay
  • TCP head-of-line blocking is not completely solved
  • Multiplexing increases pressure on servers
  • Multiplexing is prone to timeouts

1. Connection establishment delay

Establishing a TCP connection requires a three-way handshake with the server, which means data transmission can only begin after spending 1.5 RTTs.

TLS connections come in two common versions, TLS 1.2 and TLS 1.3; each takes a different amount of time to set up, roughly 1 to 2 additional RTTs (about 2 for a full TLS 1.2 handshake, 1 for TLS 1.3).

RTT (Round-Trip Time): the total delay from when the sender starts transmitting data until it receives the receiver's acknowledgment (assuming the receiver sends the acknowledgment immediately upon receiving the data).

2. Head-of-line blocking has not been completely resolved

To guarantee reliable transmission, TCP has a "timeout retransmission" mechanism: lost packets must wait to be retransmitted and confirmed. When an HTTP/2 packet is lost, the whole TCP connection waits for the retransmission, which blocks every request in that connection.

RTO (Retransmission TimeOut): the retransmission timeout. RTO is a dynamic value that adapts to network conditions and is computed from the connection's round-trip time (RTT). The ACK returned by the receiver carries the sequence number of the next packet it expects to receive.
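A minimal sketch of the classic estimator from RFC 6298, which is what makes RTO "dynamic" (times in milliseconds; sample values illustrative):

```ts
// RTO = SRTT + 4 * RTTVAR, floored at 1 second, updated per RTT sample.
class RtoEstimator {
  private srtt = 0;    // smoothed round-trip time
  private rttvar = 0;  // round-trip time variance
  private first = true;

  sample(rtt: number): number {
    if (this.first) {
      this.srtt = rtt;
      this.rttvar = rtt / 2;
      this.first = false;
    } else {
      // Variance is updated before the smoothed mean, per the RFC.
      this.rttvar = 0.75 * this.rttvar + 0.25 * Math.abs(this.srtt - rtt);
      this.srtt = 0.875 * this.srtt + 0.125 * rtt;
    }
    return Math.max(1000, this.srtt + 4 * this.rttvar);
  }
}

const est = new RtoEstimator();
console.log(est.sample(100)); // 1000: 100 + 4*50 = 300 is below the 1 s floor
console.log(est.sample(900)); // 1150: a noisy sample pushes RTO past the floor
```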

3. Multiplexing causes increased server pressure

Multiplexing does not limit the number of simultaneous requests. The average request rate stays the same as before, but short bursts of many concurrent requests appear, causing sudden spikes in QPS.

4. Multiplexing is prone to timeout

A large number of requests are sent at the same time; because an HTTP/2 connection carries many parallel streams while network bandwidth and server resources are finite, each stream's share of resources is diluted. Every stream starts sooner, yet all of them may end up timing out.

Even with a load balancer like Nginx, throttling is tricky to do correctly. And even if you introduce or adjust queuing mechanisms in your application, only so many connections can be processed at a time. If you queue requests, also take care to drop them once the response has timed out, to avoid wasting resources. (Quote: https://www.lucidchart.com/techblog/2019/04/10/why-turning-on-http2-was-a-mistake/)

QUIC

1. Introduction

Google was already aware of these problems while promoting SPDY, so it started from scratch and built the QUIC protocol on top of UDP; this became the basis of HTTP/3. QUIC truly "perfectly" solves the head-of-line blocking problem.

2. Main features

  • Improved congestion control, reliable transmission
  • Fast handshake
  • Integrated TLS 1.3 encryption
  • Multiplexing
  • Connection Migration

3. Improved congestion control and reliable transmission

Viewed purely as congestion control and reliable transport, QUIC is just a re-implementation of what TCP does. So what does the QUIC protocol actually improve? The main points are as follows:

(1) Pluggable — different congestion control algorithms can be implemented at the application level.

Different connections of one application can use different congestion control configurations. The application can switch congestion control without downtime or upgrades, and different algorithms can be used for different services, different network conditions, and even different RTTs.

To simulate pluggable congestion control at the application layer yourself, you can experiment with streams over sockets.

(2) Monotonically increasing packet numbers — Packet Number replaces TCP's seq.

Every packet number is strictly increasing: even if Packet N is lost, its retransmission carries a number greater than N, never N again. TCP's retransmission strategy, by contrast, is ambiguous. For example, a client sends a request and retransmits it after an RTO, but the server had in fact received the first copy and its response was already in flight; when the client receives that response, the RTT it measures comes out smaller than the actual RTT. Because a QUIC packet number is unique, the correct RTT can always be computed, as the sketch below shows.
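A minimal sketch of the bookkeeping (clock values are hypothetical milliseconds):

```ts
// Retransmissions get fresh packet numbers, so an ACK always names exactly
// one transmission and the RTT sample is unambiguous.
let nextPacketNumber = 0;
const sentAt = new Map<number, number>(); // packet number -> send time (ms)

function send(now: number): number {
  const pn = nextPacketNumber++; // a retransmit gets a new number too
  sentAt.set(pn, now);
  return pn;
}

function onAck(pn: number, now: number): number | undefined {
  const t = sentAt.get(pn);
  return t === undefined ? undefined : now - t; // which transmission is certain
}

send(0);                      // original, lost in the network
const retransmit = send(300); // same payload, new packet number
console.log(onAck(retransmit, 420)); // 120: measured against the retransmit, not the original
```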

(3) No reneging — once a packet is ACKed, it is assumed to have been received correctly.

Reneging means the receiver reserves the right to discard data it has already reported to the sender in a SACK (Selective Acknowledgment), for example dropping out-of-order packets when the receive window runs short.

QUIC's ACK carries information equivalent to TCP's SACK, but QUIC forbids discarding any packet, including ones already confirmed as received. This simplifies implementation for both sender and receiver and reduces memory pressure on the sender.

(4) Forward Error Correction (FEC)

Early versions of QUIC had this packet-loss recovery mechanism, but it was later abandoned because it consumed extra bandwidth for mediocre gains. In FEC, each group of QUIC packets carries the n original data packets plus redundant data, so that even if a transmission in the group is lost, the receiver can still recover all n original packets. The essence of FEC is XOR, as the sketch below illustrates.
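A minimal sketch of the XOR parity trick (assuming, for simplicity, equal-length packets in a group):

```ts
// One parity packet is the XOR of the n data packets; any single lost
// packet can be rebuilt by XOR-ing the survivors with the parity.
function xorAll(packets: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(packets[0].length);
  for (const p of packets) {
    for (let i = 0; i < p.length; i++) out[i] ^= p[i];
  }
  return out;
}

const data = [
  new Uint8Array([1, 2, 3]),
  new Uint8Array([4, 5, 6]),
  new Uint8Array([7, 8, 9]),
];
const parity = xorAll(data); // sent alongside the data packets

// Suppose data[1] is lost in transit: recover it from the rest + parity.
const recovered = xorAll([data[0], data[2], parity]);
console.log(recovered); // Uint8Array [4, 5, 6]
```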

(5) More ACK blocks and Ack Delay accounting.

QUIC can carry up to 256 ACK blocks at once, so it is more flexible than TCP (with SACK) under reordering, and it can keep more in-flight bytes on the network when reordering or loss occurs (https://blog.csdn.net/u014023993/article/details/85299434). On networks with high packet-loss rates, this speeds up recovery and reduces retransmissions.

TCP's Timestamp option has a blind spot: the sender stamps the segment when it is sent, and the receiver copies that timestamp into the acknowledgment, but the time between the receiver getting the packet and sending the ACK is never accounted for. That interval, the Ack Delay, introduces error into the RTT calculation. QUIC reports it explicitly and includes it in the RTT computation, as below.
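A minimal sketch of the adjustment (values illustrative):

```ts
// Subtract the receiver-reported Ack Delay from the raw sample so time the
// peer spent holding the ACK does not inflate the RTT estimate.
function rttSample(sentAt: number, ackReceivedAt: number, ackDelay: number): number {
  const raw = ackReceivedAt - sentAt;
  return Math.max(0, raw - ackDelay); // network round-trip time only
}

console.log(rttSample(0, 180, 50)); // 130 instead of a misleading 180
```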

(6) Flow control based on stream and connection levels.

Why do we need two types of flow control? Mainly because QUIC supports multiplexing. A Stream can be considered as an HTTP request. A Connection can be compared to a TCP connection. Multiplexing means that multiple Streams can exist on a Connection at the same time.

A QUIC receiver advertises the absolute byte offset of the maximum data it is willing to receive on each stream. As data is sent, received, and delivered on a particular stream, the receiver sends WINDOW_UPDATE frames that increase the advertised offset limit for that stream, allowing the peer to send more data on that stream.

In addition to per-stream flow control, QUIC also implements connection-level flow control to limit the total buffer a QUIC receiver is willing to allocate for a connection. Connection flow control works just like stream flow control, but the bytes transferred and maximum receive offset are summed across all streams.

Most importantly, when memory runs short or upstream processing struggles, flow control lets us cap the transmission rate and keep the service available. A sketch of the two levels follows.
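A toy `FlowControl` class showing stream-level and connection-level limits interacting (names and numbers are illustrative, not taken from the QUIC spec):

```ts
// Each stream has its own advertised limit; the connection caps the total.
class FlowControl {
  private streamLimit = new Map<number, number>(); // streamId -> max offset
  private streamSent = new Map<number, number>();  // streamId -> bytes sent
  constructor(private connectionLimit: number) {}

  openStream(id: number, limit: number): void {
    this.streamLimit.set(id, limit);
    this.streamSent.set(id, 0);
  }

  // How many bytes may be sent on this stream right now?
  sendable(id: number): number {
    let totalSent = 0;
    for (const n of this.streamSent.values()) totalSent += n;
    const streamRoom = (this.streamLimit.get(id) ?? 0) - (this.streamSent.get(id) ?? 0);
    const connRoom = this.connectionLimit - totalSent;
    return Math.max(0, Math.min(streamRoom, connRoom));
  }

  onSend(id: number, bytes: number): void {
    this.streamSent.set(id, (this.streamSent.get(id) ?? 0) + bytes);
  }

  // A WINDOW_UPDATE-style frame raises one stream's advertised offset.
  onWindowUpdate(id: number, newLimit: number): void {
    this.streamLimit.set(id, Math.max(this.streamLimit.get(id) ?? 0, newLimit));
  }
}

const fc = new FlowControl(1000);
fc.openStream(1, 800);
fc.openStream(2, 800);
fc.onSend(1, 600);
console.log(fc.sendable(2)); // 400: stream 2 allows 800 more, the connection only 400
```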

4. Fast handshake

Because QUIC is built on UDP, it can establish a connection in 0-RTT or 1-RTT, which can greatly speed up first page loads.

5. Integrated TLS 1.3 encryption

TLS 1.3 supports three basic key exchange modes:

  1. (EC)DHE (Diffie-Hellman over finite fields or elliptic curves)
  2. PSK-only
  3. PSK with (EC)DHE

A full handshake requires 1 RTT to establish a connection. When TLS 1.3 resumes a session, encrypted application data can be sent immediately without an additional TLS handshake: that is 0-RTT.

TLS 1.3 0-RTT simple principle diagram (based on DHE):

But TLS 1.3 is not perfect: its 0-RTT mode cannot guarantee forward secrecy. Simply put, if an attacker somehow obtains the Session Ticket Key, they can decrypt previously captured encrypted data.

To mitigate this, the static DH parameters tied to the Session Ticket Key can be set to expire after a short period (typically a few hours).

6. Multiplexing

QUIC is designed from the ground up for multiplexing: when a packet carrying data for an individual stream is lost, it usually affects only that stream. Streams on a QUIC connection have no dependencies on one another and no shared underlying-protocol bottleneck; if stream 2 loses a packet, only stream 2's processing is affected.

7. Connection Migration

TCP identifies a connection by four elements (client IP, client port, server IP, server port). QUIC instead has the client generate a Connection ID (64 bits) to distinguish connections: as long as the Connection ID stays the same, the connection need not be re-established even when the client's network changes. And since the migrating client keeps using the same session key to encrypt and decrypt packets, QUIC also gets automatic cryptographic verification of the migrated client. A lookup sketch follows.
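A minimal sketch of server-side session lookup keyed by Connection ID (the `Session` shape and values are hypothetical):

```ts
// The source address is deliberately NOT part of the lookup key: a
// Wi-Fi -> 4G switch changes IP and port but finds the same session.
interface Session {
  connectionId: string;
  sessionKey: string; // same key keeps encrypting/decrypting after migration
}

const sessionsById = new Map<string, Session>();

function onDatagram(connectionId: string, _srcIp: string, _srcPort: number): Session | undefined {
  return sessionsById.get(connectionId);
}

sessionsById.set("abc123", { connectionId: "abc123", sessionKey: "k" });
console.log(onDatagram("abc123", "10.0.0.5", 50000)?.connectionId);   // "abc123" over Wi-Fi
console.log(onDatagram("abc123", "172.16.0.9", 61234)?.connectionId); // still "abc123" over 4G
```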

5. Challenges

1. NAT Concept

To cope with the shortage of IP addresses, NAT assigns only one public IP address to an entire local network. Hosts inside are given private addresses that are invisible to the outside; all of their external communication relies on the single assigned public address. Every datagram leaving the local network for the Internet has its source IP rewritten to that same NAT address, with only the port number telling the flows apart.

2. Causes

Differences between the TCP and UDP headers are what cause problems for NAT.

(1) Port memory problem of NAT device

For HTTP and HTTPS traffic over TCP, a NAT device can tell when communication starts and ends from the SYN/FIN flags in the TCP header, and can create and expire its NAT mapping accordingly.

UDP-based HTTP/3, however, has no SYN/FIN flags. If the NAT device forgets the mapping before the user session ends, the session is interrupted; if it remembers the mapping longer than the session lasts, its port resources sit occupied for nothing.

The most direct solution would be to mimic TCP's SYN/FIN state in the QUIC header so that NAT devices along the path know when a session starts and ends; but that would require upgrading the software of every NAT device in the world.

Another feasible approach is to have QUIC periodically send keepalive messages that refresh the NAT device's mapping and prevent it from being released automatically, for example:
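A minimal sketch of the idea with a plain UDP socket (the endpoint, payload, and 15-second interval are assumptions; a real QUIC stack sends PING frames inside the encrypted connection instead of a bare datagram):

```ts
// Any outbound packet on the same 4-tuple refreshes the NAT mapping.
import { createSocket } from "node:dgram";

const sock = createSocket("udp4");
const SERVER_HOST = "example.com"; // hypothetical QUIC endpoint
const SERVER_PORT = 443;

setInterval(() => {
  sock.send(Buffer.from("ping"), SERVER_PORT, SERVER_HOST);
}, 15_000);
```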

(2) NAT devices that block UDP

In some NAT environments (such as certain campus networks), intermediate devices like routers block the UDP protocol outright. In that case the client simply downgrades, choosing an alternative channel such as HTTPS to keep requests working.

(3) Nginx load-balancing problem

A QUIC client may switch networks mid-session. Even within the same mobile data center, the first request may land on server instance A while subsequent connections land on instance B, repeating the full 1-RTT handshake each time.

(4) Global handshake cache

Build a global handshake cache shared by all QUIC server instances. Then, after a user's network switches, the next request completes with 0-RTT regardless of which data center or instance it lands on.

6. HTTP speed tests over the years

7. Conclusion

Real-time data transmission (audio, video, games, and so on) has always struggled with stutter and latency. QUIC, built on UDP with improved retransmission, can markedly improve the user experience. Bilibili, for one, has already adopted QUIC.

If you want to experience QUIC yourself, you can try libquic, Caddy, and the like. There are also C++ QUIC implementations on GitHub; with Node.js's C++ addon mechanism, front-end engineers could quickly put together a node-quic as well.
