An easy-to-understand introduction to the HTTP protocol

1. What is HTTP?

HTTP stands for Hypertext Transfer Protocol.

It defines how a browser (that is, a World Wide Web client process) requests a document from a web server and how the server transmits that document back to the browser. In terms of layering, HTTP is a transaction-oriented application-layer protocol, and it is the foundation for reliably exchanging files (text, sound, images, and other multimedia) on the World Wide Web. It also specifies in detail the rules for communication between browsers and servers.

2. Packet capture

The following are all the data packets captured by Yikoujun when accessing the web server he built. The following is the information displayed by the browser:

The following is the actual index.html content:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>A bite of Linux</title>
</head>
<body>
<div align="center">
<table width="900" border="0">
<tr><td>
<form onsubmit="return isValidate(myform)" action="cgi-bin/login.cgi" method="post">
Username: <input type="text" name="username" id="username">
<td>&nbsp;</td>
<tr><td>
Password: <input type="password" name="userpass" id="userpass">
<td>&nbsp;</td>
<tr><td>
<input type="submit" value="Login" id="button">
</form>

</td></tr>
</table>
</div>

<div align="center">
<table width="900" height="467" border="0" background="./image/yikou.png">
<tr>
<td width="126" height="948">&nbsp;</td>
<td width="351"></td>
<td width="101">&nbsp;</td>
</tr>
</div>
</body>
</html>

The following are all HTTP packets captured using the packet capture tool:

GET request packet sent by the browser:

The data packet corresponding to the page replied by the server:

The complete browsing access server data packet interaction process is as follows:

The data packet interaction process is briefly summarized as follows:

  1. The browser uses DNS to look up the IP address for the domain name in the URL entered in the address bar (this step is skipped if the URL gives the IP address directly).
  2. The browser initiates a TCP three-way handshake with the web server, since HTTP runs over TCP (see packets 1-3 in the figure above).
  3. The browser sends an HTTP GET request and the web server responds with the corresponding page (if the URL does not name a file, the server returns a default file, usually index.html, as set in its configuration; see packets 4-6).
  4. Because the page references an image, the browser then requests the image file (see packets 7-24).
  5. Finally, the TCP connection is closed with a four-way handshake (see packets 25-28).
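
The interaction summarized above can be sketched with Python's standard socket module. This is only a rough illustration, not what a real browser does: the helper names build_get_request and fetch are hypothetical, and the "Connection: close" header is an assumption added so the server ends the connection after one reply.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",     # assumption: ask the server to close afterwards
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Resolve the host, connect over TCP, send a GET, and read the reply."""
    ip = socket.gethostbyname(host)  # DNS lookup (returns the input if it is already an IP)
    # create_connection performs the TCP three-way handshake
    with socket.create_connection((ip, port), timeout=5) as sock:
        sock.sendall(build_get_request(host, path))  # the HTTP GET request
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed its side
                break
            chunks.append(data)
    # leaving the with-block closes the socket, which triggers the TCP teardown
    return b"".join(chunks)
```

Calling fetch with the address of a reachable web server returns the raw response bytes, status line and headers included, which is the same data a packet capture tool would show in the HTTP packets.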

3. Page Interaction Process

Let's look at what HTTP does between entering a URL and the page finishing loading.

The browser initiates the request and handles the final response; the server processes the request once it receives it.

1. Enter the URL.

Clicking a link and typing into the address bar work the same way. HTTP specifies the format of the URL, and the server is located through the domain name or IP address contained in the URL.
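
As a small illustration of the URL format, the sketch below splits a URL into its parts with Python's urllib.parse. The helper name describe_url and the example URL in the usage note are assumptions for illustration; the default ports 80 and 443 are the standard ones for http and https.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts a browser needs to locate the server."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,              # domain name or IP address
        "port": parts.port or default_port,  # explicit port, else the default
        "path": parts.path or "/",           # resource requested on the server
        "query": parts.query,                # parameters after the "?"
    }
```

For example, describe_url("http://www.example.com/cgi-bin/login.cgi?lang=en") reports the host www.example.com, port 80, path /cgi-bin/login.cgi, and query lang=en.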

2. Once the server is found, the browser sends an HTTP request telling the server what it wants done. HTTP specifies the request format, which has three parts: the request line, the request headers, and the request body.

The request line contains the request method (GET, POST, or another), the path of the requested resource together with any parameters after the ? in the URL, and the HTTP version. The request headers carry information such as the host name, details about the local machine and browser, and cookies.

The request body carries the data submitted by POST. When values are passed with GET, the request body is normally empty.
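
To make the three parts concrete, here is a minimal sketch that splits a raw request into request line, headers, and body. The function name parse_request and the sample values (host name, user name, password) are made up for illustration; only the form field names and the login.cgi path come from the page shown earlier.

```python
def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP request into request line, headers, and body."""
    # An empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: method, request target, HTTP version.
    method, target, version = lines[0].split(" ", 2)

    # Each header line has the form "Name: value".
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


# Hypothetical POST produced by submitting the login form.
SAMPLE_REQUEST = (
    b"POST /cgi-bin/login.cgi HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 29\r\n"
    b"\r\n"
    b"username=yikou&userpass=12345"
)
```

Parsing SAMPLE_REQUEST yields the method POST, the target /cgi-bin/login.cgi, the version HTTP/1.1, three headers, and the form data as the body. A GET request parsed the same way has an empty body.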

3. After the request reaches the server, the server passes the relevant information to the backend program for processing. From the request, the server can read the values passed in the URL and the form values passed by POST, along with everything else the request carries, such as browser information, cookies, and operating system information. With that data in hand, the program processes the request.
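
A backend program might extract those values roughly as follows. This is a sketch using Python's standard library, not the login.cgi program from the article; the helper names and the sample values in the usage note are assumptions.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def url_values(target: str) -> dict:
    """Values passed in the URL, i.e. the parameters after the '?'."""
    return parse_qs(urlsplit(target).query)


def form_values(body: bytes) -> dict:
    """Values passed by a POST form (application/x-www-form-urlencoded)."""
    return parse_qs(body.decode("utf-8"))


def cookie_values(cookie_header: str) -> dict:
    """Cookies sent by the browser in the Cookie request header."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}
```

parse_qs returns a list for each name because a parameter may appear more than once, so form_values(b"username=yikou&userpass=12345") gives {"username": ["yikou"], "userpass": ["12345"]}.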

4. After processing completes, the server responds and sends the result to the browser. HTTP also specifies the format of the response, which consists mainly of the response code, the response headers, and the response body.

The response code identifies the result of the server's response, such as 200 or 404. The general classification is as follows:

Codes starting with 1 are informational.

Codes starting with 2 indicate success.

Codes starting with 3 indicate redirection.

Codes starting with 4 indicate a client error.

Codes starting with 5 indicate a server error.
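
The classification depends only on the first digit, which a short sketch makes explicit. The function name status_class is hypothetical.

```python
def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]  # integer division keeps only the first digit
```

For example, status_class(200) returns "success" and status_class(404) returns "client error".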

The response headers record server-related information, such as whether compression is enabled, whether the server is IIS or Nginx, and the server-side language the program uses. Caching is also controlled here: by changing the response headers you can control how the browser caches the HTML, for example by setting the cache expiration time.

The response body is mainly the HTML content we saw above.
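
The response can be taken apart in the same way as the request. The sketch below is illustrative only: parse_response is a hypothetical helper, and the sample response, including its Server and Cache-Control values, is invented rather than taken from the capture.

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP version, status code, optional reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "version": version,
        "code": code,
        "reason": reason,
        "headers": headers,
        "body": body,
    }


# Invented example: a server that lets the browser cache the page for an hour.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: nginx\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Cache-Control: max-age=3600\r\n"
    b"\r\n"
    b"<html><body>Hello</body></html>"
)
```

Here the Cache-Control header tells the browser it may reuse its cached copy for 3600 seconds, which is one way a server sets the cache expiration time.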

After these four steps, the browser disconnects from the server and no more data can be transferred over that connection. If more data is needed, the whole process must start again from the URL.

This is the complete life cycle of a web page request. HTTP's role is to define this whole process, including the steps and the data format of each step. Only by understanding HTTP and how web pages are produced can we control them well: controlling the browser cache, sending HTTP requests from non-browser clients, choosing between GET and POST for passing values, and even establishing long-lived connections all build on HTTP.
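
Two of those points, sending requests from a non-browser client and keeping a connection open for several requests, can be sketched with Python's standard library alone. Everything here is an illustrative assumption: HelloHandler is a toy server standing in for a real one, and two_requests_one_connection is a hypothetical helper. The local port number is recorded only to show that both requests travel over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps the connection open by default

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the body ends on a kept-open connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def two_requests_one_connection() -> list:
    """Send two GET requests over a single TCP connection to a local test server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        results = []
        for path in ("/", "/again"):
            conn.request("GET", path)
            resp = conn.getresponse()
            body = resp.read()  # read fully before sending the next request
            local_port = conn.sock.getsockname()[1]  # same port means same connection
            results.append((resp.status, body, local_port))
        conn.close()
        return results
    finally:
        server.shutdown()
        server.server_close()
```

Both entries in the returned list carry status 200 and the same local port, which shows that no second TCP handshake was needed for the second request.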

4. Supplement

1. Main HTTP methods

The descriptions of versions 1.0 and 1.1 are based on RFC 1945 and RFC 2616 respectively. In addition to the content in the figure above, an HTTP message contains header fields that carry additional information. When the client sends data to the web server, it sends the header fields first and then the data.

2. Status code

After receiving the request message, the web server parses it, uses the URI and the method to determine which resource is targeted and which operation to perform, does the work accordingly, and stores the result in the response message. The response message begins with a status code that indicates whether the operation succeeded or an error occurred.

When we access a web server and the requested file cannot be found, the error message 404 Not Found is displayed; this is the status code. The status code is followed by the header fields and the web page data. The response message is sent back to the client, and the browser reads the data it needs from the message and displays it on the screen. At that point, HTTP's work is complete.
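
How a server might turn the method and URI into a status code can be sketched as follows. This is a toy model, not how any particular web server is implemented: the function handle and the in-memory documents dictionary are hypothetical, and serving index.html for "/" is an assumed default.

```python
def handle(method: str, uri: str, documents: dict) -> tuple:
    """Decide on a (status code, reason phrase, body) for a request.

    `documents` maps URI paths to file contents and stands in for the
    files on the server's disk.
    """
    if method != "GET":
        # This toy server supports only one operation.
        return 405, "Method Not Allowed", b""
    if uri == "/":
        uri = "/index.html"  # assumed default document
    body = documents.get(uri)
    if body is None:
        return 404, "Not Found", b""  # the file cannot be found
    return 200, "OK", body
```

With documents = {"/index.html": b"<html></html>"}, a GET for "/" returns 200 with the page, while a GET for "/missing.html" returns 404 Not Found with an empty body.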

An HTTP status code consists of three decimal digits; the first digit defines the class of the status code.

Responses fall into five classes: informational responses (100–199), successful responses (200–299), redirections (300–399), client errors (400–499), and server errors (500–599).

HTTP status code list:

This article is reprinted from the WeChat public account "Yikou Linux", which can be followed through the following QR code. To reprint this article, please contact Yikou Linux public account.
