Fatal question: How many HTTP requests can be sent through a TCP connection?

A classic interview question asks: what happens from the time a URL is entered into the browser to the time the page is displayed?

Most candidates who have prepared can answer this, but far fewer can handle the follow-up: if the received HTML contains dozens of image tags, how are these images downloaded, in what order, how many connections are established, and what protocol is used?

To understand this problem, we need to answer the following five questions:

  1. After a modern browser establishes a TCP connection with a server, will it disconnect after an HTTP request is completed? Under what circumstances will it disconnect?
  2. How many HTTP requests can a TCP connection correspond to?
  3. Can HTTP requests be sent together in one TCP connection (for example, three requests are sent together and three responses are received together)?
  4. Why does refreshing a page sometimes not require re-establishing an SSL connection?
  5. Is there any limit on the number of TCP connections that a browser can establish to the same host?

The first question

After a modern browser establishes a TCP connection with a server, will it disconnect after an HTTP request is completed? Under what circumstances will it disconnect?

In HTTP/1.0, a server closes the TCP connection after sending an HTTP response. Re-establishing and tearing down a TCP connection for every request is too costly, so although it was not part of the standard, some servers supported the Connection: keep-alive header. It means: after completing this HTTP request, do not close the TCP connection it used. The advantage is that the connection can be reused, so later HTTP requests do not need a new TCP connection. If the connection is kept open, the SSL overhead is also avoided. The two pictures are the time statistics of my two visits to https://www.github.com in a short period of time:


For the first access, there is initial connection and SSL overhead.


For the second access, the initial connection and SSL overhead disappear, indicating that the same TCP connection is being reused.

Persistent connection: Since maintaining a TCP connection has so many benefits, HTTP/1.1 includes the Connection header in the standard and enables persistent connections by default. Unless the request specifies Connection: close, the TCP connection between the browser and the server will be maintained for a period of time and will not be disconnected when a request is completed.

So the answer to the first question is: by default, an established TCP connection will not be disconnected. The connection will only be closed after the request is completed if Connection: close is declared in the request header.
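The reuse described above can be observed locally. The following is a minimal sketch, not a model of how a browser does it: it starts a throwaway HTTP/1.1 server from Python's standard library on 127.0.0.1 and has each response report the client-side TCP port the server saw. The names `EchoPortHandler` and `client_ports` are hypothetical helpers invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the standard-library server keep connections open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Report the client's TCP port: same port means same TCP connection.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        if self.headers.get("Connection", "").lower() == "close":
            # Tell the client that this connection ends after the response.
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports(paths, close_each_time=False):
    """Return the client-side TCP port the server saw for each request."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for path in paths:
            headers = {"Connection": "close"} if close_each_time else {}
            conn.request("GET", path, headers=headers)
            ports.append(int(conn.getresponse().read()))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports
```

Calling `client_ports(["/a", "/b", "/c"])` should report the same port three times, because all three requests travel over one TCP connection. With `close_each_time=True`, every request carries Connection: close, the client reconnects each time, and the reported ports will normally differ.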

The second question

How many HTTP requests can a TCP connection correspond to?

After understanding the first question, in fact, this question already has an answer. If the connection is maintained, one TCP connection can send multiple HTTP requests.

The third question

Can HTTP requests be sent together in one TCP connection (for example, three requests are sent together and three responses are received together)?

HTTP/1.1 has a limitation: a single TCP connection can only process one request at a time. In other words, the lifecycles of any two HTTP requests on the same TCP connection cannot overlap.

Although Pipelining is specified in the HTTP/1.1 specification to try to solve this problem, this feature is turned off by default in browsers.

Let's first take a look at what Pipelining is. RFC 2616 stipulates:

  • A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.

As for why the standard is set this way, we can roughly speculate one reason: HTTP/1.1 is a text protocol, and a response carries nothing that identifies which request it belongs to, so the order must be consistent. For example, if you send two requests to the server, GET /query?q=A and GET /query?q=B, and the server returns two results, the browser has no way to tell from the responses alone which request each one answers.

Pipelining is a good idea, but there are many problems in practice:

  • Some proxy servers do not handle HTTP Pipelining correctly.
  • Correct pipelining implementation is complex.
  • Head-of-line Blocking: After a TCP connection is established, suppose the client sends several requests to the server in succession. According to the standard, the server should return the results in the order in which the requests were received. Suppose the server spends a lot of time processing the first request, then all subsequent requests need to wait for the first request to be completed before responding.

Therefore, modern browsers do not enable HTTP Pipelining by default.
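To make head-of-line blocking concrete, here is a small simulation. It is only a sketch under simplifying assumptions that are not in the article: all requests arrive at time 0, the server works on them concurrently, transfer time is ignored, and the only constraint is the in-order rule quoted from RFC 2616. The function name `pipelined_finish_times` is made up for this example.

```python
def pipelined_finish_times(processing_times):
    """Time at which each response can be sent under in-order delivery.

    processing_times[i] is how long the server needs for request i.
    """
    finish = []
    previous = 0
    for needed in processing_times:
        # A response may be ready early, but it cannot be sent before
        # every earlier response has gone out.
        previous = max(previous, needed)
        finish.append(previous)
    return finish


# One slow request at the front delays the two fast ones behind it.
print(pipelined_finish_times([5, 1, 1]))  # [5, 5, 5]
# With the slow request last, the fast ones are not held up.
print(pipelined_finish_times([1, 1, 5]))  # [1, 1, 5]
```

The two fast requests would each be ready after 1 time unit, yet in the first case they are only delivered at time 5.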

However, HTTP/2 provides the Multiplexing feature, which can complete multiple HTTP requests simultaneously in one TCP connection. How Multiplexing is implemented is a separate question. Let's take a look at the effect of using HTTP/2.


Green is the waiting time from initiating the request to the request returning, and blue is the download time of the response. You can see that they are all completed in parallel on the same Connection.

So this question has an answer: HTTP/1.1 has Pipelining, which allows multiple requests to be sent at once, but since it is disabled by default in browsers, it can be considered infeasible. In HTTP/2, thanks to the Multiplexing feature, multiple HTTP requests can be performed in parallel in the same TCP connection.
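The article leaves the implementation of Multiplexing aside, but the core idea can be sketched in a few lines: every chunk of data on the connection is tagged with a stream identifier, so chunks from different responses can be interleaved and still be reassembled correctly. The code below is a toy model, not the real HTTP/2 binary framing. The `(stream_id, chunk, end_stream)` tuples and the function name `demultiplex` are assumptions made for illustration. The odd stream IDs do reflect how HTTP/2 numbers client-initiated streams.

```python
from collections import defaultdict


def demultiplex(frames):
    """Reassemble interleaved frames into one complete body per stream.

    frames is an iterable of (stream_id, chunk, end_stream) tuples in the
    order they arrive on the single shared connection.
    """
    buffers = defaultdict(bytes)
    completed = {}
    for stream_id, chunk, end_stream in frames:
        buffers[stream_id] += chunk
        if end_stream:
            # The stream ID says which request this response belongs to,
            # so responses may complete in any order.
            completed[stream_id] = buffers.pop(stream_id)
    return completed


frames = [
    (1, b"big-", False),   # first part of a large response
    (3, b"small", True),   # a small response overtakes it
    (1, b"image", True),   # the large response finishes later
]
print(demultiplex(frames))  # {3: b'small', 1: b'big-image'}
```

Unlike the HTTP/1.1 pipelining case, the small response on stream 3 does not have to wait for the large one on stream 1.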

So how do browsers improve page loading efficiency in the HTTP/1.1 era? There are two main techniques:

  • Maintain the established TCP connection with the server and process multiple requests sequentially on the same connection.
  • Establish multiple TCP connections with the server.

The fourth question

Why does refreshing a page sometimes not require re-establishing an SSL connection?

The answer has been given in the discussion of the first question. The browser and the server sometimes keep the TCP connection open for a period of time. Since TCP does not need to be re-established, the existing SSL session on that connection is naturally reused.

The fifth question

Is there any limit on the number of TCP connections that a browser can establish to the same host?

Assume we are still in the HTTP/1.1 era, without multiplexing. What should the browser do when it gets a web page with dozens of images? It certainly cannot open just one TCP connection and download them sequentially, as that would keep the user waiting far too long. However, if it opens a separate TCP connection for every image, the computer or the server may not be able to handle it. With 1,000 images you cannot open 1,000 TCP connections: even if your computer could cope, a NAT device along the way might not.

So the answer is: Yes. Chrome allows up to six TCP connections to the same host. There are some differences between different browsers.

https://developers.google.com/web/tools/chrome-devtools/network/issues#queued-or-stalled-requests

So back to the original question, if the received HTML contains dozens of image tags, how are these images downloaded, in what order, how many connections are established, and what protocol is used?

If all images are served over HTTPS from the same domain name, then after the SSL handshake the browser will negotiate with the server whether HTTP/2 can be used. If it can, the browser uses the Multiplexing feature to share the connection. Not every resource on that domain is necessarily fetched over a single TCP connection, but Multiplexing will very likely be used.
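The article does not name the negotiation mechanism. In TLS it is done through the ALPN extension, in which the client offers protocol identifiers such as "h2" and "http/1.1" and the server picks one. The sketch below uses Python's standard `ssl` module to show the idea. The helper names `make_alpn_context`, `negotiated_protocol` and `http_version_for` are hypothetical, and `negotiated_protocol` needs network access to a real HTTPS server, so treat it as an illustration rather than a description of what a browser does internally.

```python
import socket
import ssl


def make_alpn_context():
    """TLS client context that offers HTTP/2 first, then HTTP/1.1."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host, port=443, timeout=5.0):
    """Do a TLS handshake and return the ALPN protocol the server selected.

    Returns "h2", "http/1.1", or None if the server ignored ALPN.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with make_alpn_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def http_version_for(alpn_result):
    """Map the ALPN result to the HTTP version the client would speak."""
    return "HTTP/2" if alpn_result == "h2" else "HTTP/1.1"
```

For example, `http_version_for(negotiated_protocol("example.org"))` would tell you which path applies for that host, where the host name is just a placeholder.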

What if HTTP/2 cannot be used? Or HTTPS cannot be used (in practice, browsers only support HTTP/2 over HTTPS, so plain HTTP means HTTP/1.1)? Then the browser will establish multiple TCP connections to the same host. The maximum number of connections depends on the browser settings. Whenever one of these connections becomes idle, the browser uses it to send the next request. What if all connections are busy? Then the remaining requests can only wait.
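The waiting behaviour can be sketched with a small scheduling simulation. This is a simplified model under stated assumptions: every request is known up front, each occupies one connection for a fixed duration, and requests are dispatched in document order to whichever connection frees up first. The default of six connections comes from the Chrome figure mentioned earlier, and the function name `schedule` is invented for this example.

```python
import heapq


def schedule(durations, max_connections=6):
    """Return (start, end) times for each request on a fixed-size pool."""
    # Min-heap holding the time at which each connection becomes idle.
    free_at = [0.0] * max_connections
    timeline = []
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest idle connection
        end = start + duration
        heapq.heappush(free_at, end)     # busy until this request ends
        timeline.append((start, end))
    return timeline


# Eight images that each take 1 time unit to download.
timeline = schedule([1] * 8)
print(timeline[5])  # (0.0, 1.0): the sixth request starts immediately
print(timeline[6])  # (1.0, 2.0): the seventh has to wait for a free connection
```

With a single connection (`max_connections=1`) the same eight requests would finish at time 8.0 instead of 2.0, which is why browsers open several connections under HTTP/1.1.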
