My HTTP/1.1 is so slow! What can I do?

This article is reprinted from the WeChat public account "Xiao Lin Coding", the author is Xiao Lin Coding. Please contact Xiao Lin Coding public account to reprint this article.

Let me ask you: "Do you know how to optimize HTTP/1.1?"

I think the first thing that comes to your mind is to use KeepAlive to change HTTP/1.1 from a short connection to a long connection.

This is indeed a valid optimization. It works at the underlying transport layer: by reducing how many times TCP connections are established and torn down, it cuts network latency and thereby improves the transmission efficiency of HTTP/1.1.

But in fact, the HTTP/1.1 protocol can be optimized from other perspectives, such as the following three optimization ideas:

  • Try to avoid sending HTTP requests;
  • When you need to send HTTP requests, consider how to reduce the number of requests;
  • Reduce the data size of the server's HTTP response;

Next, let’s take a look at the specific optimization methods for these three ideas.

1 How to avoid sending HTTP requests?

Does this idea sound strange? If the client never sends an HTTP request, how can it interact with the server at all? Xiaolin, are you pulling our leg?

Calm down, you are right, of course the client has to send a request to the server.

However, some HTTP requests are repetitive: every time they are sent, the data they fetch is exactly the same. For these, we can cache the request-response pair locally, and next time read the data straight from local storage instead of going over the network for the server's response. This gives HTTP/1.1 a visible performance boost.

Therefore, the way to avoid sending HTTP requests is caching. The designers of HTTP anticipated this long ago, which is why the HTTP header defines many fields dedicated to caching.

So how does caching work?

The client will save the first request and response data on the local disk, with the requested URL as the key and the response as the value, forming a mapping relationship between the two.

In this way, when the same request is made later, the client can look up the corresponding value, that is, the response, on the local disk by key. If it is found, the response is read directly from disk, which is without doubt much faster than a network round trip, as shown in the following figure:
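
The URL-to-response mapping described above can be sketched as follows. This is a hypothetical illustration, not a real HTTP client: `CACHE_DIR`, `cache_path`, and the `fetch` callback are all names invented for the example.

```python
# A minimal sketch of caching responses on local disk, keyed by request URL.
# All names here are hypothetical, for illustration only.
import hashlib
import os

CACHE_DIR = "http_cache"

def cache_path(url: str) -> str:
    # Hash the URL so it can be used safely as a file name.
    key = hashlib.sha256(url.encode()).hexdigest()
    return os.path.join(CACHE_DIR, key)

def get(url: str, fetch) -> bytes:
    """Return the response for `url`, going to the network only on a cache miss."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = cache_path(url)
    if os.path.exists(path):          # cache hit: read from local disk
        with open(path, "rb") as f:
            return f.read()
    body = fetch(url)                 # cache miss: fetch over the network
    with open(path, "wb") as f:
        f.write(body)
    return body
```

A real cache would of course also store headers and an expiration time, which is exactly the problem the next paragraphs address.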

You may already be wondering: what if the cached response is no longer the latest, and the client has no way of knowing?

Don't worry, HTTP designers have already considered this issue.

Therefore, when the server sends an HTTP response, it estimates an expiration time and puts it in the response header. When the client later checks the response header and finds that the cached response has expired, it resends the request over the network. HTTP has many cache-related header fields; I will leave those for the next article rather than cover them in detail here.

If the client finds that the cached response from the first request has expired, it resends the request. Suppose the resource on the server has not changed at all — in that case, do you think the server still needs to include the resource in its response?

Obviously not, and leaving it out improves HTTP performance. But how is this done in practice?

When the server first returns the resource, it includes a digest that uniquely identifies it in the ETag field of the response header. When the client revalidates, it sends that digest back in the If-None-Match field of the request header. On receiving the request, the server compares the digest of its local copy of the resource with the digest in the request.

If they are different, it means that the client's cache is no longer valuable, and the server includes the latest resource in the response.

If they are the same, the client's cache is still usable, so the server returns only a 304 Not Modified response with no body, telling the client the cached copy is still valid. This avoids transmitting the resource over the network again, as shown in the following figure:
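
The revalidation flow can be sketched with the server side modeled as a plain function. This is a toy illustration, not a real framework: `handle_request`, `etag_of`, and the sample resource are invented names, and real servers often compute ETags differently.

```python
# Sketch of ETag revalidation: 200 with a body on a miss, 304 with no body
# when the client's digest still matches. Names are hypothetical.
import hashlib

resource = b"<html>hello</html>"

def etag_of(body):
    # One common style of ETag: a quoted digest of the body.
    return '"%s"' % hashlib.md5(body).hexdigest()

def handle_request(if_none_match):
    """Return (status, headers, body) the way an origin server would."""
    current = etag_of(resource)
    if if_none_match == current:
        # Digests match: the client's cache is still valid, answer 304, no body.
        return 304, {"ETag": current}, b""
    # First request, or the resource changed: send the full resource.
    return 200, {"ETag": current}, resource

# First request: client has no cache yet, gets 200 plus an ETag.
status, headers, body = handle_request(None)
# Revalidation: client echoes the ETag back via If-None-Match.
status2, _, body2 = handle_request(headers["ETag"])
```

The 304 response carries only headers, which is exactly where the bandwidth saving comes from.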

Cache is really a universal key to performance optimization, ranging from CPU Cache, Page Cache, Redis Cache to HTTP protocol cache.

2 How to reduce the number of HTTP requests?

Reducing the number of HTTP requests will naturally improve HTTP performance. You can start from the following three aspects:

  • Reduce the number of redirect requests;
  • Merge requests;
  • Delayed sending of requests;

2.1 Reduce the number of redirect requests

Let's first look at what a redirect request is.

A resource on the server may be moved from url1 to url2 because of migration or maintenance, but the client, unaware of this, keeps requesting url1. Instead of bluntly returning an error, the server responds with a 302 status code and a Location header telling the client that the resource has moved to url2. The client then has to send a second request, to url2, to obtain the resource.

So the more redirects there are, the more HTTP requests the client has to make, and since each one is a network round trip, performance inevitably suffers.

In addition, the server side often consists of more than one machine. For example, the origin server may sit behind a proxy server, and it is the proxy that communicates with the client. In that case, a client-side redirect results in two round trips between the client and the proxy, as shown in the following figure:

If the redirection work is done by the proxy server, the number of HTTP requests can be reduced, as shown below:

And when the proxy server knows the redirection rules, the number of message transmissions can be further reduced, as shown in the following figure:

In addition to the 302 redirect response code, there are some other redirect response codes, as you can see from the figure below:

Among them, the 301 and 308 response codes tell the client that the redirect may be cached to local disk, after which the client automatically substitutes url2 for url1 when accessing the resource.
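
Caching a permanent redirect on the client side can be sketched like this. This is a hypothetical interface, not a real HTTP library: `transport` stands in for the actual network call, and `redirect_cache` is an invented in-memory stand-in for the disk cache.

```python
# Sketch: a client that remembers permanent (301/308) redirects so that
# later requests go straight to the new URL. Names are hypothetical.
redirect_cache = {}

def request(url, transport):
    url = redirect_cache.get(url, url)      # apply a remembered redirect, if any
    status, headers, body = transport(url)  # transport returns (status, headers, body)
    if status in (301, 308):
        # Permanent redirect: remember the mapping, then follow it.
        redirect_cache[url] = headers["Location"]
        return request(headers["Location"], transport)
    return body
```

After the first request for the old URL, every later request skips the redirect hop entirely, which is the saving the 301/308 cacheability buys.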

2.2 Merge Requests

If multiple requests for small files are combined into one large request, the total amount of data transferred stays roughly the same, but fewer requests means fewer repeatedly sent HTTP headers.

In addition, HTTP/1.1 uses a request-response model: on one connection, a later request cannot be sent until the response to the previous one arrives. To keep a single request from blocking the rest, browsers generally open 5-6 requests in parallel, each on its own TCP connection. Merging requests therefore also reduces the number of TCP connections, saving the time spent on TCP handshakes and slow start.

Next, let’s look at the different ways to merge requests.

Some web pages contain many small images and icons. The client needs to make as many requests as there are small images. For these small images, we can consider using CSS Image Sprites technology to combine them into a large image, so that the browser can obtain a large image with one request, and then cut the large image into multiple small images according to CSS data.

Image source: CSDN of Mo Ran Feng Lin

This method combines multiple small images into one large image to reduce the number of HTTP requests, thereby reducing network overhead.

In addition to merging small images into large ones, bundling tools such as webpack can combine js, css and other resources into larger files, achieving a similar effect.

In addition, you can also encode the binary data of the image with base64, embed it into the HTML file in the form of a URL, and send it along with the HTML file.

  <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAPoAAAFKCAIAAAC7M9WrAAAACXBIWXMAA ..." />

In this way, after the client receives the HTML, it can directly decode the data and then display the image directly without initiating image-related requests, thus reducing the number of requests.
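
Producing such a data URI is straightforward; here is a sketch using the standard library. The image bytes are a fake placeholder, not a real PNG.

```python
# Embedding image bytes as a base64 data URI, as in the HTML snippet above.
# The "PNG" here is a fake placeholder purely for illustration.
import base64

png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8   # not a valid image, just sample bytes
encoded = base64.b64encode(png_bytes).decode("ascii")
data_uri = f"data:image/png;base64,{encoded}"
img_tag = f'<img src="{data_uri}" />'
```

One trade-off to keep in mind: base64 inflates the payload by roughly a third, and an inlined image can no longer be cached separately from the HTML.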

Image source: Chen Jianping's CSDN

As you can see, the way to merge requests is to merge resources, replacing multiple requests for small resources with one request for a large resource.

However, such merge requests will bring new problems. When a small resource in a large resource changes, the client must re-download the entire large resource file, which obviously brings additional network consumption.

2.3 Delayed Request

Don't try to grab everything at once. An HTML page usually references many URLs, and resources that are not needed right away do not have to be fetched immediately. By fetching on demand, we reduce the number of HTTP requests up front.

When requesting a web page, it is not necessary to obtain all resources. Instead, only the page resources currently viewed by the user are obtained. When the user scrolls down the page, the next resource is obtained from the server, thus achieving the effect of delaying the sending of requests.

3 How to reduce the data size of HTTP response?

Between an HTTP request and its response, the response is usually the larger of the two, since it carries the resource returned by the server.

Therefore, we can consider compressing the response resources, which can reduce the response data size and improve the efficiency of network transmission.

There are generally two types of compression methods:

  • Lossless compression;
  • Lossy compression;

3.1 Lossless Compression

Lossless compression means that after the resources are compressed, the information is not destroyed and can be completely restored to its original state before compression. It is suitable for text files, program executable files, and program source code.

First, text such as code can be reduced using its syntax rules: source files usually contain many line breaks and spaces that exist only to help programmers read the code. The machine does not need them when executing, so these redundant characters can be stripped out.

Next comes lossless compression proper, which builds a statistical model of the original data: frequently occurring symbols are represented by shorter bit sequences and rare symbols by longer ones. These bit sequences are typically generated with the Huffman coding algorithm.
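
A toy Huffman coder makes the idea concrete. This is a teaching sketch, not a production codec (it ignores edge cases such as single-symbol inputs), and all names are invented for the example.

```python
# Toy Huffman coding sketch: frequent symbols end up with shorter bit strings.
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    freq = Counter(data)
    # Heap entries: (frequency, tiebreak, {symbol: code_so_far}).
    # The tiebreak integer keeps the heap from ever comparing dicts.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)    # lightest subtree
        n2, _, c2 = heapq.heappop(heap)    # next lightest
        # Prefix the lighter subtree's codes with 0, the heavier's with 1.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# 'a' dominates this input, so it should receive the shortest code.
codes = huffman_codes(b"aaaaaaaabbbc")
```

Real gzip/DEFLATE combines Huffman coding with LZ77 back-references, but the "short codes for common symbols" principle is the same.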

Gzip is a common lossless compression format. The client tells the server which compression algorithms it supports via the Accept-Encoding field in the HTTP request header:

  Accept-Encoding: gzip, deflate, br

After receiving the request, the server picks one of those algorithms that it supports, compresses the response resource with it, and finally tells the client which algorithm was used via the Content-Encoding field in the response header.

  Content-Encoding: gzip

Gzip's compression ratio is still inferior to that of Google's Brotli algorithm (the br mentioned above), so when possible the server should prefer the more efficient br.

3.2 Lossy Compression

The opposite of lossless compression is lossy compression. After compression in this way, the decompressed data will be different from the original data but very close.

Lossy compression mainly discards minor data and sacrifices some quality to reduce the amount of data and improve the compression ratio. This method is often used to compress multimedia data, such as audio, video, and pictures.

You can tell the server the expected resource quality through the "q quality factor" in the Accept field in the HTTP request header.

  1. Accept: audio/*; q=0.2, audio/basic

Regarding image compression, the WebP format launched by Google currently has a relatively high compression ratio. The following figure shows the compression ratio comparison between it and the common PNG format image:

Source: https://isparta.github.io/compare-webp/index.html

It can be seen that at the same image quality, WebP images are smaller than PNG images, so a site with many images can consider switching to WebP, which can noticeably improve network transmission performance.

As for audio and video compression: audio and video are dynamic streams of frames ordered in time, and consecutive frames usually change very little.

For example, in a video of someone reading a book, only the person's hands and the book on the desk move; the rest of the scene is mostly static. Therefore, starting from a key frame of the static scene, subsequent frames only need to carry the incremental changes, which removes a great deal of data and improves network transmission performance. Common video encoding formats include H.264 and H.265; common audio encoding formats include AAC and AC-3.

Summary

This time, we mainly introduced the ideas for optimizing the HTTP/1.1 protocol from three aspects.

The first idea is to avoid sending HTTP requests at all, via caching. After the client receives the response to the first request, it caches it on local disk. On later requests, if the cache has not expired, it reads the response directly from the local cache. If the cache has expired, the client revalidates by sending the resource's digest along with the request; if the server finds the resource unchanged, it returns a 304 response with no body, telling the client its cached copy is still valid.

The second idea is to reduce the number of HTTP requests. There are the following methods:

The redirection request originally handled by the client is handed over to the proxy server for processing, which can reduce the number of redirection requests;

Merging multiple small resources into one large resource reduces the number of HTTP requests and the repeated transmission of headers, and also reduces the number of TCP connections, saving the cost of TCP handshakes and slow start;

Fetch resources on demand: only request what the user can currently see or use, and fetch the next batch when the user scrolls down. This delays requests and reduces their number at the same time.

The third idea is to reduce the size of transmission resources by compressing response resources, thereby improving transmission efficiency, so a better compression algorithm should be selected.

No matter how much you optimize the HTTP/1.1 protocol, there are limits; otherwise, there would be no HTTP/2 or HTTP/3. We will introduce the HTTP/2 and HTTP/3 protocols later.

