1. Introduction

Today, let's study the HTTP protocol. Through this article, you will learn how HTTP evolved across its versions and take a closer look at the key features of Http2.0.
Let's ride the wind and waves into the ocean of knowledge. Captain Dabai is about to set sail!

2. Comparison of HTTP protocol versions

HTTP, the Hypertext Transfer Protocol, is like air: you can't feel its presence, yet it is everywhere. The author extracted some basic information about the development of the HTTP protocol from Wikipedia. Let's take a look:

The Hypertext Transfer Protocol is an application protocol for distributed, collaborative hypermedia information systems. It is the basis for data communication on the World Wide Web, where hypertext documents include hyperlinks to other resources that users can easily access. Tim Berners-Lee initiated the development of the Hypertext Transfer Protocol at CERN in 1989. The early HTTP Requests for Comments (RFCs) were a joint effort of the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C), with the work later transferred to the IETF.

Introduction to Tim Berners-Lee, the Father of the World Wide Web

Tim Berners-Lee is a British engineer and computer scientist, best known as the inventor of the World Wide Web. He is a professor of computer science at Oxford University and a professor at MIT. He proposed an information management system on March 12, 1989, and in mid-November of the same year achieved the first successful communication between an HTTP client and a server over the Internet. He is the director of the World Wide Web Consortium (W3C), which oversees the continued development of the Web, and the founder of the World Wide Web Foundation. He holds the 3Com Founders Chair at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), directs the Web Science Research Initiative (WSRI), and sits on the advisory board of the MIT Center for Collective Intelligence. He is also the founder and president of the Open Data Institute and is currently an advisor to the social network MeWe. In 2004, Berners-Lee was knighted by Queen Elizabeth II for his pioneering work. In April 2009, he was elected a foreign associate of the United States National Academy of Sciences. He was named in Time magazine's list of the 100 most important people of the 20th century, has been hailed as the "inventor of the World Wide Web", and received the 2016 Turing Award.

Basic information about each version of HTTP

After more than 20 years of evolution, the HTTP protocol has five major versions: 0.9, 1.0, 1.1, 2.0, and 3.0. The author drew a picture for you to see:

A. Http0.9 version

Version 0.9 is the original version, and its main features include:

- Only the GET method is supported, with a single request line such as GET /index.html.
- There are no request or response headers and no status codes; the server can only return HTML documents.
- The TCP connection is closed as soon as the response is sent, so each connection serves exactly one request.
B. Http1.0 version

Version 1.0 is mainly an enhancement of version 0.9, and the improvement is quite obvious. The main features and shortcomings include:

- The POST and HEAD methods were added alongside GET.
- Request and response headers were introduced (Content-Type, Content-Length, and so on), so the protocol can carry any format of data, not just HTML.
- Status codes, simple caching (Expires, Last-Modified), and authorization were added.
- Shortcoming: by default each TCP connection still serves only one request/response pair, so every request pays the cost of connection setup and teardown (the non-standard Connection: keep-alive header was a partial workaround).
C. Http1.1 version

Version 1.1 was released about a year after version 1.0. It is an optimization and improvement of version 1.0. The main features of version 1.1 include:

- Persistent connections are the default, so multiple requests can reuse one TCP connection.
- Pipelining allows a client to send several requests without waiting for each response, although responses must still come back in order, which causes head-of-line blocking.
- The Host header is mandatory, making virtual hosting (multiple sites on one IP address) possible.
- New methods (PUT, DELETE, OPTIONS, TRACE, CONNECT), chunked transfer encoding, and richer cache control (Cache-Control, ETag) were added.
D. Http2.0 version

Version 2.0 is a milestone version. Compared with version 1.x, it has many optimizations to adapt to current network scenarios. Some important features include:

- A binary framing layer replaces the plain-text format.
- Multiplexing allows many requests and responses to share a single TCP connection.
- Header compression (HPACK) removes redundant header transmission.
- Server push lets the server send resources before the client asks for them.
- Streams can be prioritized and flow-controlled.
3. Http2.0 Detailed Explanation

We have compared the evolution and optimization process of several versions. Next, we will take a deep look at some of the features of version 2.0 and their basic implementation principles. Version 2.0 is not so much an optimization of version 1.1 as an innovation, because 2.0 carries more ambitious performance goals. Although 1.1 added persistent connections and pipelining, it did not fundamentally achieve true high performance. The design goal of 2.0 is to provide users with a faster, simpler, and safer experience while remaining compatible with 1.x semantics and operations, and to use current network bandwidth efficiently. To this end, 2.0 made many adjustments, mainly including binary framing, multiplexing, and header compression. Akamai provides a demo comparing the loading behavior of http2.0 and http1.1 (in my test, loading 379 small image fragments took 0.99 s versus 5.80 s): https://http2.akamai.com/demo

3.1 SPDY Protocol

To talk about the 2.0 standard and its new features, we must mention Google's SPDY protocol. Take a look at Baidu Encyclopedia:

SPDY is a TCP-based session-layer protocol developed by Google to minimize network latency, increase network speed, and optimize the user's network experience. SPDY is not a replacement for HTTP but an enhancement of it. The new protocol's features include data stream multiplexing, request prioritization, and HTTP header compression. Google reported that after introducing the SPDY protocol, page loading in lab tests was 64% faster than before.

SPDY was subsequently supported by major browsers such as Chrome and Firefox and deployed on websites large and small. This efficient protocol attracted the attention of the HTTP working group, and the official Http2.0 standard was formulated on its basis. In the following years, SPDY and Http2.0 continued to evolve and promote each other.
Http2.0 gave servers, browsers, and website developers a better experience with the new protocol and was quickly embraced.

3.2 Binary Framing Layer

The binary framing layer redesigns the encoding mechanism without changing the request methods and semantics. The figure shows the http2.0 layered structure (picture from reference 4):

The binary encoding mechanism enables communication over a single TCP connection that remains active for the duration of the conversation. The binary protocol breaks the communication data down into smaller frames, and the frames fill the bidirectional data streams between the client and the server, like a two-way multi-lane highway with a constant flow of traffic.

To understand the binary framing layer, you need to know four concepts:

- Connection: one TCP connection that carries the whole conversation.
- Stream: a bidirectional flow of bytes within the connection, identified by a stream ID; it can carry one or more messages.
- Message: a complete request or response, made up of one or more frames.
- Frame: the smallest unit of communication; each frame header carries the identifier of the stream it belongs to.

The four form one-to-many inclusion relationships. The author drew a picture:

Let's take a look at the structure of the HeadersFrame header frame: from the fields you can see the length, type, flags, stream identifier, data payload, and so on. If you are interested, you can read the relevant RFC 7540 documents.
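To make the frame layout concrete, here is a minimal Python sketch (my own illustration, not from the article) of the 9-byte frame header defined in RFC 7540 and of reassembling interleaved frames by stream identifier. It assumes the DATA frame type (0x0) and ignores padding, priority, and the other frame types:

```python
# HTTP/2 frame header layout (RFC 7540 section 4.1), 9 octets:
#   Length (24 bits) | Type (8 bits) | Flags (8 bits) | R (1 bit) + Stream ID (31 bits)

def parse_frame_header(data: bytes):
    """Parse one 9-byte HTTP/2 frame header into its four fields."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF  # clear reserved bit
    return length, frame_type, flags, stream_id

def make_frame(ftype: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build a frame for the demo."""
    header = (len(payload).to_bytes(3, "big")
              + bytes([ftype, flags])
              + stream_id.to_bytes(4, "big"))
    return header + payload

def demultiplex(raw: bytes) -> dict:
    """Walk a byte stream of frames and group payloads by stream ID."""
    streams = {}
    offset = 0
    while offset < len(raw):
        length, ftype, flags, sid = parse_frame_header(raw[offset:offset + 9])
        streams[sid] = streams.get(sid, b"") + raw[offset + 9:offset + 9 + length]
        offset += 9 + length
    return streams

# Two streams interleaved on one connection, reassembled by stream ID:
wire = (make_frame(0x0, 0, 1, b"Hel") + make_frame(0x0, 0, 3, b"Wor")
        + make_frame(0x0, 0, 1, b"lo") + make_frame(0x0, 1, 3, b"ld"))
print(demultiplex(wire))  # {1: b'Hello', 3: b'World'}
```

The frames for streams 1 and 3 arrive interleaved, yet each side can rebuild both messages independently, which is exactly why responses no longer have to come back in order.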
In short, version 2.0 breaks communication data down into binary-coded frames for exchange. Each frame belongs to a particular message in a particular data stream, and all frames and streams are multiplexed within one TCP connection. The binary framing protocol is an important foundation for the other features and performance optimizations of 2.0.

3.3 Multiplexing

Version 1.1 suffers from head-of-line blocking. Therefore, if a client wants to issue multiple parallel requests to improve performance, it must open multiple TCP connections, which incurs greater latency and connection setup and teardown costs and cannot use the TCP connections effectively. The new binary framing protocol in version 2.0 breaks through many limitations of version 1.x and fundamentally achieves true request and response multiplexing. The client and server break their interactive data down into independent frames, transmit them interleaved without affecting each other, and reassemble them at the other end based on the stream identifier in the frame header, thereby multiplexing the TCP connection. The figure shows the frame-based message communication process of version 2.0 (picture from reference 4):

3.4 Header Compression

A. Header redundant transmission

We all know that HTTP requests have a header part. Every request carries one, and on the same connection most of those headers are identical, so transmitting the same content every time is a real waste. On the modern web, each page triggers an average of more than 100 HTTP requests, and each request header averages 300-500 bytes, for a total of tens of kilobytes. This can cause noticeable delay, especially on congested WiFi or cellular networks, where all you can do is watch the phone's spinner turn, even though these request headers usually barely change between requests.
Repeatedly transmitting the same data over an already crowded link is simply not efficient. TCP congestion control follows AIMD (additive increase, multiplicative decrease): when packet loss occurs, the transmission rate drops sharply. In a crowded network, large packet headers therefore aggravate the slow transmission caused by congestion control.

B. HTTP compression and CRIME attacks

Before the HPACK algorithm of version 2.0, HTTP compression used gzip, and the later SPDY protocol, although it designed a special scheme for headers, still used the DEFLATE algorithm. In practice it was found that DEFLATE-based compression, including SPDY's, is vulnerable to attacks such as CRIME. Because the DEFLATE algorithm uses backward string matching and dynamic Huffman coding, an attacker who controls part of the request can inject guessed text and then observe how much the compressed size changes. If it becomes smaller, the attacker knows the injected text repeats some content of the request, a bit like clearing lines in Tetris. After enough attempts, the secret content may be fully recovered. Because of this risk, safer compression algorithms were developed.

C. HPACK algorithm

In version 2.0, the HPACK algorithm maintains a header table on both the client and the server to store previously sent header key-value pairs. Common key-value pairs that hardly change during the same connection only need to be sent once. In the extreme case where the request headers do not change at all, no header data is transmitted, i.e., the header overhead is zero bytes. If a header key-value pair changes, only the changed data needs to be sent, and the new or modified header fields are appended to the header table. The header table exists for the life of the connection and is updated and maintained by both the client and the server.
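The size-oracle idea behind CRIME can be demonstrated in a few lines of Python using zlib's DEFLATE (my own sketch; the secret cookie value is hypothetical):

```python
import zlib

SECRET = "Cookie: session=s3cr3t"  # hypothetical secret header in the request

def oracle(injected: str) -> int:
    # The attacker cannot read the request, but can observe the size of
    # the compressed stream that contains both the secret and their input.
    return len(zlib.compress((SECRET + "\r\n" + injected).encode(), 9))

wrong = oracle("Cookie: session=qwzxvy")   # guess shares only the common prefix
right = oracle("Cookie: session=s3cr3t")   # guess repeats the whole secret line
print(wrong, right)
```

Because DEFLATE encodes the fully repeated header line as one short back-reference, while the wrong guess leaves six literal characters to transmit, the correct guess compresses to fewer bytes. By watching which injections shrink the output, the attacker recovers the secret piece by piece.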
Simply put, the client and server jointly maintain a key-value structure; when changes occur, they are transmitted as updates, otherwise nothing is sent. This amounts to a full transmission the first time followed by incremental updates, an idea that is also very common in day-to-day development. The figure shows the update process of the header table (picture from reference 4):

Related documents on the HPACK algorithm:

- RFC 7541: HPACK - Header Compression for HTTP/2
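The full-then-incremental idea can be sketched in a few lines of Python (a toy illustration of the concept only; real HPACK uses indexed static and dynamic tables plus Huffman coding, and also handles header removal, which this sketch does not):

```python
class HeaderCodec:
    """Toy sketch of the HPACK idea: both sides keep a table of
    previously sent header key-value pairs and transmit only changes."""

    def __init__(self):
        self.table = {}  # shared table, lives for the whole connection

    def encode(self, headers: dict) -> dict:
        # Send only pairs that are new or whose value has changed.
        delta = {k: v for k, v in headers.items() if self.table.get(k) != v}
        self.table.update(delta)
        return delta

    def decode(self, delta: dict) -> dict:
        # Apply the delta to reconstruct the full header set.
        self.table.update(delta)
        return dict(self.table)

client, server = HeaderCodec(), HeaderCodec()

first = {":method": "GET", ":path": "/", "user-agent": "demo/1.0"}
wire1 = client.encode(first)       # first request: the full header set goes out
second = {":method": "GET", ":path": "/style.css", "user-agent": "demo/1.0"}
wire2 = client.encode(second)      # later request: only the changed :path

print(len(wire1), len(wire2))      # 3 1
print(server.decode(wire1))
print(server.decode(wire2))
```

After the first request, only one key-value pair crosses the wire for the second request, yet the server still reconstructs the complete header set from its copy of the table.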
3.5 Server Push

Server push is a powerful new feature added in version 2.0. Unlike the usual one-question-one-answer client/server interaction, with push the server can send multiple responses for a single client request: in addition to the response to the initial request, it pushes additional resources to the client without the client explicitly requesting them.

For example, imagine going to a restaurant for a meal. A fast-food restaurant with good service will bring you napkins, chopsticks, a spoon, and even condiments as soon as you order a bowl of beef noodles. Such proactive service saves the guests' time and improves the dining experience.

This kind of proactive resource pushing is very effective in real client/server interactions, because almost every web application involves multiple resources that the client would otherwise have to fetch one by one. Since the server knows which resources the client will request next, pushing them in advance can effectively reduce the extra round-trip latency. The following figure shows the simple process of server push (picture from reference 4):

4. Conclusion

This article introduced the historical evolution of the HTTP protocol and the main features, advantages, and disadvantages of each version, and focused on some features of the Http2.0 protocol, including the SPDY protocol, the binary framing protocol, multiplexing, header compression, and server push. Due to limited space, I cannot expand on everything here. Although Http2.0 has many excellent features and was officially released in 2015, and major companies at home and abroad now use it to handle many requests, it is still not universally deployed; meanwhile, work on the http3.0 version began in 2018. The promotion and popularization of http2.0 and http3.0 will take time, but we firmly believe that our networks can become safer, faster, and more economical.