Why HTTPS needs a 7-way handshake and 9 one-way delays

HTTP (Hypertext Transfer Protocol) is the most widely used application layer protocol on the Internet. However, it is only a protocol for transferring hypertext and provides no security guarantees. Because it sends data in plain text, eavesdropping and man-in-the-middle attacks are possible. Transmitting a password over HTTP is effectively running naked on the Internet.

Figure 1 - HTTPS protocol

Netscape designed the HTTPS protocol in 1994, using the Secure Sockets Layer (SSL) to secure data in transit[^1]. As the Transport Layer Security (TLS) protocol matured, TLS replaced the obsolete SSL protocol, although the term "SSL certificate" is still in common use[^2].

HTTPS is an extension of the HTTP protocol that lets us transmit data securely over the Internet[^3]. However, the initiator of an HTTPS request has to wait 4.5 round-trip times (RTT), that is, 9 one-way delays, before it gets the first response from the receiver. This article walks through the request and response process in detail and analyzes why HTTPS needs 4.5-RTT to get a response from the server:

  • TCP protocol - the communicating parties establish a TCP connection through a three-way handshake[^4];
  • TLS protocol - the communicating parties establish a TLS connection through a four-way handshake[^5];
  • HTTP protocol - the client sends a request to the server, and the server sends back a response;

The analysis here is based on specific protocol versions and common scenarios. As network technology evolves, the number of network exchanges required can be reduced. This article mentions some common optimizations in the corresponding sections.

TCP

As an application layer protocol, HTTP relies on an underlying transport layer protocol for basic data transmission, and it generally uses TCP. To avoid mistakenly establishing a connection from stale segments left over from an earlier connection, the two parties establish a TCP connection through a three-way handshake[^6]. Here we briefly review the whole process of TCP connection establishment.

Figure 2 - TCP three-way handshake

(1) The client sends a segment with the SYN flag to the server, carrying the initial sequence number the client will use for its segments, SEQ = 100;

(2) When the server receives the segment, it sends a segment with the SYN and ACK flags to the client;

  • It acknowledges the client's initial sequence number by returning ACK = 101;
  • It tells the client the initial sequence number the server will use for its own segments by sending SEQ = 300;

(3) The client sends a segment with the ACK flag to the server, carrying ACK = 301 to confirm the server's initial sequence number;

Through the three-way handshake, the two parties agree on the initial sequence numbers, window size, and maximum segment size (MSS) of the TCP connection. The communicating parties can then use the sequence numbers to make sure that segments are neither duplicated nor lost, use the window size for flow control, and use the maximum segment size to avoid fragmentation of packets by the IP protocol[^7].
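The sequence and acknowledgment numbers in the three steps above can be traced with a minimal Python sketch. The `Segment` class and the `three_way_handshake` function are hypothetical names used only for illustration, and the model is simplified: real initial sequence numbers are randomly chosen 32-bit values, and options such as window size and MSS are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    flags: Tuple[str, ...]     # control flags set on the segment
    seq: int                   # sender's sequence number
    ack: Optional[int] = None  # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged while opening a connection."""
    # (1) Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment(("SYN",), seq=client_isn)
    # (2) Server -> client: SYN+ACK. A SYN consumes one sequence number,
    #     so the server acknowledges client_isn + 1.
    syn_ack = Segment(("SYN", "ACK"), seq=server_isn, ack=syn.seq + 1)
    # (3) Client -> server: ACK confirming the server's initial sequence number.
    ack = Segment(("ACK",), seq=syn_ack.ack, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
for segment in segments:
    print(segment)
```

With the values from the example, the second segment carries ACK = 101 and the third carries ACK = 301.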

The original version of the TCP protocol does establish a connection through three exchanges, and in most current scenarios the three-way handshake is unavoidable. However, TCP Fast Open (TFO), published in 2014, lets a client in certain scenarios start exchanging data with the first segment it sends, instead of waiting for the handshake to complete[^8].

Figure 3 - TCP Fast Open

TCP Fast Open uses a TFO cookie stored on the client to speed up connection establishment. The first time the client connects, it sends a SYN carrying the Fast Open option; the server generates a cookie and sends it back, and the client caches it. When the client later reconnects to the same server, it sends the stored cookie together with data in the SYN. After verifying the cookie, the server replies with SYN and ACK and can start transmitting data immediately, which reduces the number of exchanges needed before data flows.
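The cookie flow described above can be sketched in Python as follows. This is an illustrative model, not a real TCP stack: the function names, the cache layout, and the use of a truncated HMAC-SHA256 over the client IP address are all assumptions made for this sketch, since the way the server derives its cookie is an implementation choice.

```python
import hashlib
import hmac
import os
from typing import Dict

SERVER_SECRET = os.urandom(16)  # hypothetical secret known only to the server


def make_cookie(client_ip: str) -> bytes:
    """Server side: derive a cookie bound to the client's IP address."""
    mac = hmac.new(SERVER_SECRET, client_ip.encode("ascii"), hashlib.sha256)
    return mac.digest()[:8]  # truncated to 8 bytes for this sketch


def cookie_is_valid(client_ip: str, cookie: bytes) -> bool:
    """Server side: check the cookie presented in a later SYN."""
    return hmac.compare_digest(make_cookie(client_ip), cookie)


def open_connection(client_ip: str, cookie_cache: Dict[str, bytes], server: str) -> str:
    """Client side: use a cached cookie if there is one, else request one."""
    cookie = cookie_cache.get(server)
    if cookie is not None and cookie_is_valid(client_ip, cookie):
        return "fast open"       # data in the SYN is accepted right away
    # No cookie or an invalid one: fall back to the regular handshake,
    # during which the server issues a fresh cookie for the next connection.
    cookie_cache[server] = make_cookie(client_ip)
    return "full handshake"


cache: Dict[str, bytes] = {}
first = open_connection("192.0.2.10", cache, "example.com")
second = open_connection("192.0.2.10", cache, "example.com")
print(first, "->", second)
```

The first connection pays for the full handshake and obtains a cookie; only later connections from the same client benefit.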

TLS

The role of TLS is to build a secure channel on top of the reliable TCP protocol. TLS does not provide reliability by itself, so it still needs a reliable transport layer protocol underneath. After the two parties have established a TCP connection, they negotiate keys through the TLS handshake. Here we describe the connection establishment process of TLS 1.2[^9]:

Figure 4 - TLS 1.2 connection establishment

(1) The client sends a Client Hello message to the server, carrying the protocol versions, encryption algorithms, and compression algorithms the client supports, together with a random number generated by the client;

(2) After receiving the Client Hello, the server replies with the following messages:

  • A Server Hello message carrying the selected protocol version, encryption method, session ID, and a random number generated by the server;
  • A Certificate message containing the server's certificate chain, including information such as the domain names covered by the certificate, the issuer, and the validity period;
  • A Server Key Exchange message passing the server's key exchange public key and a signature over it;
  • An optional CertificateRequest message asking the client to present its own certificate for verification;
  • A Server Hello Done message notifying the client that the server has sent all of its handshake information;

(3) After receiving the server's protocol version, encryption method, session ID, and certificate, the client verifies the server's certificate and then sends the following messages:

  • A Client Key Exchange message, which in RSA key exchange contains the Pre Master Secret, a random string encrypted with the server's public key;
  • A Change Cipher Spec message informing the server that the following data will be transmitted encrypted;
  • A Finished message containing the encrypted handshake verification data;

(4) After receiving the client's Change Cipher Spec and Finished messages, the server verifies the client's Finished message and replies with:

  • A Change Cipher Spec message informing the client that the following data will be transmitted encrypted;
  • A Finished message, which completes the TLS handshake;

The key to the TLS handshake is that both parties derive the same key from the random numbers each side generated and a secret protected by the server's public key. The communicating parties then use this symmetric key to encrypt messages, which prevents eavesdropping and tampering by a man in the middle and keeps the communication secure.
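To make the key derivation concrete, here is a minimal sketch of how TLS 1.2 turns the Pre Master Secret and the two random numbers into a 48-byte master secret, using the pseudorandom function defined in RFC 5246 with HMAC-SHA256. The function names are chosen for this sketch, the input values are random placeholders rather than data from a real handshake, and cipher suites that specify a different PRF hash are not covered.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash data expansion from RFC 5246, section 5, with HMAC-SHA256."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """master_secret = PRF(pre_master, "master secret", client_random + server_random)."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(pre_master, seed, 48)


# Placeholder handshake values: two 32-byte randoms and a 48-byte secret.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master = os.urandom(48)

# Both sides know the same three inputs, so both compute the same key.
client_key = master_secret(pre_master, client_random, server_random)
server_key = master_secret(pre_master, client_random, server_random)
print(client_key == server_key, len(client_key))
```

Because the randoms are exchanged in the clear, the secrecy of the result rests entirely on the Pre Master Secret, which is why it must never cross the network unprotected.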

In TLS 1.2, establishing a TLS connection takes 2-RTT[^10]. TLS 1.3 optimizes the protocol and cuts the two round trips to one, which greatly reduces the time needed to establish a TLS connection and lets the client send application layer data to the server after 1-RTT.
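As a practical aside, a client can ask for the shorter handshake by refusing anything older than TLS 1.3. The sketch below uses Python's standard `ssl` module; it assumes a Python build linked against an OpenSSL version with TLS 1.3 support, and the helper `negotiated_version` is a hypothetical name that needs network access, so it is defined here but not called.

```python
import socket
import ssl

# Default client context: certificate verification and hostname checks are on.
context = ssl.create_default_context()

# Require TLS 1.3 when the underlying OpenSSL build supports it.
if ssl.HAS_TLSv1_3:
    context.minimum_version = ssl.TLSVersion.TLSv1_3


def negotiated_version(host: str, port: int = 443) -> str:
    """Connect to host and return the negotiated protocol, e.g. 'TLSv1.3'."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

A server that only speaks TLS 1.2 will fail the handshake with this context, so whether to enforce the minimum depends on which servers the client has to reach.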

I will not go into the details of the TLS 1.3 handshake here. In addition to reducing the cost of the regular handshake, TLS 1.3 introduces a 0-RTT connection establishment mode. About 60% of network connections are established when a user visits a website for the first time or returns after a long interval; the remaining 40% can be handled by the 0-RTT strategy of TLS 1.3[^11]. However, like TFO, this strategy works by resuming sessions from cached state, so it carries certain security risks, and whether to use it should be decided according to the specific business scenario.

HTTP

Transmitting data over the established TCP and TLS channels is relatively simple. The HTTP protocol can directly use the reliable and secure channel provided by the lower layers. The client writes the request to the server through the TCP socket interface, and after receiving and processing it, the server returns the response over the same channel. Because the whole process consists of the client sending a request and the server returning a response, it takes 1-RTT.

Figure 5 - HTTP request and response

The HTTP exchange itself only costs 1-RTT. When the client and server handle a single HTTP request, there is nothing left to optimize in the HTTP protocol itself. However, as the number of requests grows, HTTP/2 can reuse the established TCP connection and avoid the additional overhead of repeated TCP and TLS handshakes.
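The benefit of reusing a connection can be estimated with a little arithmetic. The sketch below assumes the handshake costs used in this article (1.5-RTT for TCP and 2-RTT for TLS 1.2) and, as a simplification, sequential requests that each cost 1-RTT; the function name is hypothetical and multiplexing is not modeled.

```python
TCP_HANDSHAKE_RTTS = 1.5   # three-way handshake
TLS12_HANDSHAKE_RTTS = 2.0  # TLS 1.2 four-way handshake
HTTP_EXCHANGE_RTTS = 1.0    # one request and one response


def total_rtts(requests: int, reuse_connection: bool) -> float:
    """RTTs spent on `requests` sequential HTTPS requests."""
    setup = TCP_HANDSHAKE_RTTS + TLS12_HANDSHAKE_RTTS
    if reuse_connection:
        # Pay for the handshakes once, then 1-RTT per request.
        return setup + requests * HTTP_EXCHANGE_RTTS
    # Pay for the handshakes again on every request.
    return requests * (setup + HTTP_EXCHANGE_RTTS)


print(total_rtts(10, reuse_connection=False))  # 45.0
print(total_rtts(10, reuse_connection=True))   # 13.5
```

For ten requests, reuse brings the total from 45-RTT down to 13.5-RTT under these assumptions.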

Summary

When a client accesses a server via HTTPS, the whole process requires 7 handshake steps (3 for TCP and 4 for TLS) and 9 one-way delays, that is, 4.5-RTT. If the client and server are physically close and the RTT is about 40ms, the first request takes ~180ms; however, if we access a server in the United States and the RTT is about 200ms, the HTTPS request takes ~900ms, which is relatively high. Let's summarize why the HTTPS protocol needs 9 one-way delays to complete the exchange:

  • The TCP protocol needs to establish a TCP connection through a three-way handshake to ensure the reliability of communication (1.5-RTT);
  • The TLS protocol will establish a TLS connection through a four-way handshake on top of the TCP protocol to ensure the security of communication (2-RTT);
  • The HTTP protocol sends requests and receives responses in one round trip (1-RTT) over TCP and TLS;
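The figures in this summary follow directly from adding up the three phases. Here is a short sketch of that arithmetic; the dictionary and function names are chosen for illustration, and the RTT values are the two examples used in this article.

```python
# RTT cost of each phase, as listed above.
PHASE_RTTS = {
    "TCP three-way handshake": 1.5,
    "TLS 1.2 four-way handshake": 2.0,
    "HTTP request and response": 1.0,
}


def first_response_ms(rtt_ms: float) -> float:
    """Time until the first HTTPS response arrives, in milliseconds."""
    return sum(PHASE_RTTS.values()) * rtt_ms


total_rtts = sum(PHASE_RTTS.values())
one_way_delays = int(total_rtts * 2)  # each RTT is two one-way delays

print(total_rtts, one_way_delays)  # 4.5 9
print(first_response_ms(40))       # 180.0
print(first_response_ms(200))      # 900.0
```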

Note that the round-trip calculations in this article are based on specific scenarios and specific protocol versions. Network protocols are constantly updated and evolving. Problems that were overlooked in the past are first addressed through patches, but in the end the protocol still has to be redesigned from the bottom up.

HTTP/3 is such an example. It uses the UDP-based QUIC protocol, which combines the TCP and TLS handshakes, reduces the 7 handshake steps to 3, and directly establishes a reliable and secure transmission channel. For the 200ms RTT example, this cuts the original ~900ms to ~500ms. We will cover HTTP/3 in a later article. Finally, here are some open-ended related questions that interested readers can think about:

  • As transport layer protocols, what are the similarities and differences between the QUIC protocol and the TCP protocol?
  • Why can 0-RTT be used to establish a client-server connection?
