Prerequisites

OSI architecture and related TCP/IP protocols:

Application layer: HTTP, Telnet, FTP, etc.
Presentation layer
Session layer
Transport layer: TCP, UDP
Network layer: IP
Data link layer
Physical layer

HTTP is built on top of a TCP connection. It is the protocol that lets browsers obtain resources from servers, and it is the foundation of the Web. Requests are usually initiated by the browser to fetch different types of files, such as HTML files, CSS files, JavaScript files, pictures, and videos. HTTP is also the protocol that browsers use most widely.

Without a good understanding of HTTP, some browser behavior can seem puzzling. Why is visiting the same site a second time faster than the first? Why is a website still in the logged-in state when visited again after logging in once? Analyzing the HTTP request process answers both questions.

The browser initiates the HTTP request process

Suppose you enter the URL http://time.geekbang.org/index.html in the browser. What steps follow?

1. Build a request

First, the browser builds the request line. Once that is done, the browser is ready to initiate the network request.
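As a rough illustration of step 1, the sketch below derives a request line from the example URL. The function name `build_request_line` and the defaults of `GET` and `HTTP/1.1` are assumptions made for this example; a real browser builds the request internally and handles many more cases.

```python
from urllib.parse import urlsplit


def build_request_line(url: str, method: str = "GET", version: str = "HTTP/1.1") -> str:
    """Build a request line: method, request target, protocol version.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # An empty path is requested as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method} {target} {version}"


print(build_request_line("http://time.geekbang.org/index.html"))
# GET /index.html HTTP/1.1
```

Note that the host name is not part of the request line; it is used later to find the server's IP address and is sent to the server in a request header.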

2. Find the cache

Before actually initiating a network request, the browser first checks the browser cache for the requested file. The browser cache is a technique that saves a local copy of a resource so that it can be used directly the next time the resource is requested.

If the browser finds the requested resource in its cache, it intercepts the request and returns the cached copy, ending the request there. If the cache lookup fails, the request proceeds to the network. Caching is therefore beneficial: it reduces the load on the server and shortens the time needed to obtain the resource.

3. Prepare IP address and port

The prerequisites at the beginning of this article already outlined the relationship between HTTP and TCP. The browser uses HTTP as the application layer protocol to encapsulate the text of the request, and uses TCP (over IP) as the transport layer protocol to send it across the network. Before HTTP can start working, the browser must therefore establish a TCP connection with the server. In other words, HTTP content is carried in the data transmission phase of TCP.

[Figure: schematic diagram of the relationship between TCP and HTTP]

The first step of an HTTP network request is therefore to obtain the IP address and port from the URL and then establish a TCP connection with the server. As mentioned in the previous article, "TCP Protocol", data packets are delivered to the recipient by IP address. A website is usually addressed by its domain name, so the domain name must be mapped to an IP address. The system that performs this mapping is the Domain Name System (DNS). Resolving the IP address and determining the port number are the prerequisites for establishing the connection.

In practice, the browser asks DNS for the IP address that corresponds to the domain name. Before doing so, it checks the DNS cache to see whether the domain name has already been resolved; if it has, the cached result is used directly. After obtaining the IP address, the browser checks whether the URL specifies a port number. If not, HTTP defaults to port 80.

4. Waiting for TCP queue

Chrome allows at most 6 TCP connections to the same domain name at the same time. If 10 requests are issued for the same domain name simultaneously, 4 of them wait in a queue until requests already in progress have completed.
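Steps 3 and 4 can be sketched together. This is a minimal illustration, not Chrome's actual implementation: `host_and_port` and `run_requests` are hypothetical names, the 443 default for https is not taken from the text above, DNS resolution itself is left out, and `asyncio.sleep` stands in for a real transfer.

```python
import asyncio
from urllib.parse import urlsplit

MAX_CONNECTIONS_PER_HOST = 6  # the per-domain limit the text attributes to Chrome


def host_and_port(url: str) -> tuple[str, int]:
    """Return (host, port), using the scheme's default port if the URL has none."""
    parts = urlsplit(url)
    # Only http and https are handled in this sketch.
    default_port = {"http": 80, "https": 443}[parts.scheme]
    return parts.hostname, parts.port or default_port


async def run_requests(count: int, limit: int = MAX_CONNECTIONS_PER_HOST) -> int:
    """Simulate `count` simultaneous requests to one host; return the peak concurrency."""
    slots = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def one_request() -> None:
        nonlocal active, peak
        async with slots:  # requests beyond the limit wait here, as in the queue
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # placeholder for the real transfer
            active -= 1

    await asyncio.gather(*(one_request() for _ in range(count)))
    return peak


print(host_and_port("http://time.geekbang.org/index.html"))  # ('time.geekbang.org', 80)
print(asyncio.run(run_requests(10)))  # 6
```

With 10 simulated requests, at most 6 are ever in progress at once; the other 4 start only as earlier ones finish.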

If fewer than 6 requests are in progress for that domain name, the request proceeds directly to the next step and establishes a TCP connection.

5. Establish a TCP connection

Once the request leaves the queue, the browser performs the TCP "three-way handshake" with the server (described in the previous article on the TCP protocol). The client and the server exchange three packets to confirm the connection, which establishes the connection between the browser and the server.

6. Send HTTP request

Once the TCP connection is established, the browser can communicate with the server, and the HTTP data is transmitted during this communication.

[Figure: HTTP request data format]

First, the browser sends the request line to the server. It contains the request method, the request URI (Uniform Resource Identifier), and the HTTP protocol version. Request methods include GET, POST, PUT, and DELETE. POST is commonly used to send data to the server, for example sending user information when logging in to a website. This data is generally sent in the request body.

After the request line, the browser sends further information in the form of request headers. These tell the server basic information about the browser, such as the operating system it runs on, the browser kernel, the domain name of the current request, and cookies.

Server-side HTTP request processing process

1. Return request

Using the curl tool (or the browser's network panel), we can inspect the format of the data returned by the server. First, the server returns a response line, which contains the protocol version and a status code. If something goes wrong, the server reports the outcome through the status code in the response line. For example, 200 means the request was processed successfully, and 404 means the requested page was not found.
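As a rough illustration of reading the response line, the sketch below splits it into protocol version, status code, and reason phrase. The function name `parse_status_line` and the two sample lines are assumptions made for this example, not output captured from a real server.

```python
def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a response line into (version, status code, reason phrase)."""
    parts = line.strip().split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase is optional, so tolerate its absence.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


# Illustrative response lines only.
for sample in ("HTTP/1.1 200 OK", "HTTP/1.1 404 Not Found"):
    version, code, reason = parse_status_line(sample)
    outcome = "success" if 200 <= code < 300 else "error" if code >= 400 else "other"
    print(version, code, reason, "->", outcome)
```

The numeric status code is what a client acts on; the reason phrase is only a human-readable hint.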

Just as the browser sends request headers along with the request, the server sends response headers along with the response. The response headers contain information about the server and the response, such as the time at which the server generated the data, the type of data returned (JSON, HTML, streaming media, etc.), and the cookies that the server wants the client to save. After the response headers, the server sends the response body, which usually contains the actual HTML content. That completes the server's response to the browser.

2. Disconnect

Normally, once the server has returned the requested data to the client, it closes the TCP connection. However, the browser or the server can add the following to its header information:

Connection: Keep-Alive

The TCP connection then remains open after the response is sent, so the browser can keep sending requests over the same connection. Keeping a TCP connection open saves the time needed to establish a connection for the next request and speeds up resource loading. For example, if the images embedded in a page all come from the same website, a persistent connection can be reused for them, which reduces the number of TCP connections that must be established.

3. Redirection

[Figure: response line and response headers returned for a redirect]

Status code 301 tells the browser that it must redirect to another URL, which is given in the Location field of the response headers. The browser then reads the address from the Location field and navigates again using that address. This is the complete redirection process.

Summary

Walking through the complete HTTP request process shows that the browser caches both DNS results and page resources, which reduces what has to be requested from the server. That is why requesting a site again is faster.

[Figure: browser resource cache processing]

On the first request, when the server returns the HTTP response headers, the browser uses the Cache-Control field in the response headers to decide whether to cache the resource. A cache expiration time is usually set for the resource as well, through the max-age parameter of Cache-Control. If the resource is requested again before the cached copy has expired, the cached copy is returned directly. If the cache has expired, the browser issues a network request and includes If-None-Match in the request headers. The server uses the value of If-None-Match to determine whether the requested resource has been updated. If it has not, the server returns 304 Not Modified and the browser keeps using its cached copy; if it has, the server returns the latest version of the resource.
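The cache decision described above can be sketched as follows. This is a simplified model under stated assumptions, not how any browser actually implements its cache: `ResourceCache`, `CacheEntry`, `max_age_from`, and `plan_request` are hypothetical names, only max-age is honored (other Cache-Control directives are ignored), header names are matched case-sensitively, and the value sent in If-None-Match is assumed to come from the ETag response header.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    body: bytes
    stored_at: float  # time the response was stored, in seconds
    max_age: int      # lifetime from Cache-Control: max-age=N
    etag: str         # validator to send back in If-None-Match


def max_age_from(cache_control: str) -> Optional[int]:
    """Extract the max-age value from a Cache-Control header, if present."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


class ResourceCache:
    def __init__(self) -> None:
        self._entries: dict[str, CacheEntry] = {}

    def store(self, url: str, body: bytes, headers: dict, now: float) -> None:
        max_age = max_age_from(headers.get("Cache-Control", ""))
        if max_age is None:
            return  # simplification: cache only when max-age is given
        self._entries[url] = CacheEntry(body, now, max_age, headers.get("ETag", ""))

    def plan_request(self, url: str, now: float):
        """Return ('cache', body), ('revalidate', headers) or ('network', {})."""
        entry = self._entries.get(url)
        if entry is None:
            return "network", {}
        if now - entry.stored_at < entry.max_age:
            return "cache", entry.body  # still fresh: no network request
        if entry.etag:
            return "revalidate", {"If-None-Match": entry.etag}
        return "network", {}


cache = ResourceCache()
url = "http://time.geekbang.org/index.html"
# Illustrative header values only.
cache.store(url, b"<html>...</html>", {"Cache-Control": "max-age=2000", "ETag": '"abc123"'}, now=0)
print(cache.plan_request(url, now=100))   # served from the cache
print(cache.plan_request(url, now=3000))  # expired: revalidate with If-None-Match
```

Passing `now` explicitly instead of reading the system clock keeps the sketch deterministic and easy to test.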

The login state is preserved through cookies. When a user logs in to a website, the login information is submitted to the server via POST. The server checks the submitted information and, if it is correct, generates a string that identifies the user, writes it into the Set-Cookie field of the response headers, and returns it to the browser. The browser parses the response headers and, if there is a Set-Cookie field, saves its value locally. When the user visits again, the browser reads the cookie data and writes it into the request headers before sending the HTTP request. The server checks the information again and, if it is valid, shows the user as logged in along with the user's information.

In conclusion, an HTTP request in the browser goes through eight stages from initiation to completion: building a request, searching the cache, preparing the IP and port, waiting in the TCP queue, establishing a TCP connection, initiating the HTTP request, the server processing the request, and the server returning the response and disconnecting.

[Figure: detailed HTTP request process]
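As a closing illustration of the login flow described above, the sketch below turns Set-Cookie values from a response into the Cookie header for the next request. The helper names and the sample cookie are assumptions made for this example; real browsers also apply rules for domain, path, and expiry that are omitted here.

```python
from http.cookies import SimpleCookie


def cookies_from_response(set_cookie_values: list[str]) -> dict[str, str]:
    """Collect name/value pairs from Set-Cookie header values (attributes ignored)."""
    jar: dict[str, str] = {}
    for value in set_cookie_values:
        parsed = SimpleCookie()
        parsed.load(value)
        for name, morsel in parsed.items():
            jar[name] = morsel.value
    return jar


def cookie_header(jar: dict[str, str]) -> str:
    """Format stored cookies as the value of a Cookie request header."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


# Hypothetical cookie a server might set after a successful login.
jar = cookies_from_response(["session_id=abc123; Path=/; HttpOnly"])
print("Cookie:", cookie_header(jar))  # Cookie: session_id=abc123
```

Only the name and value travel back to the server; attributes such as Path and HttpOnly are instructions to the browser about how to store and send the cookie.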