HTTP knowledge points, a must-know in the exam

Detailed introduction to HTTP

HTTP stands for HyperText Transfer Protocol, the protocol used to transfer hypertext from a World Wide Web (WWW) server to a local browser. In the OSI seven-layer model, HTTP sits at the top, in the application layer, and browsers use it directly when loading web pages. By default, the client first establishes a TCP connection to port 80 on the server, then sends requests, receives responses, and exchanges data over that connection.

HTTP has two commonly used versions, HTTP/1.0 and HTTP/1.1. The main difference is that in HTTP/1.0 each request and response uses a new TCP connection by default, while from HTTP/1.1 onward multiple requests and responses can be sent over one TCP connection. This greatly reduces the number of TCP connections that must be set up and torn down, which improves efficiency.

Features

  • Simple and fast: When a client requests a service from a server, it only needs to send the request method and path. Common request methods are GET, HEAD, and POST. Each method specifies a different type of communication between the client and the server. Since the HTTP protocol is simple, the program size of the HTTP server is small, so the communication speed is very fast.
  • Flexible: HTTP allows the transmission of any type of data object. The type being transmitted is marked by Content-Type.
  • Connectionless: Connectionless means that each connection handles only one request. After the server has processed the client's request and the client has received the response, the connection is closed. This approach can save transmission time. (This is the HTTP/1.0 behavior; HTTP/1.1 keeps the connection open by default.)
  • Stateless: HTTP is a stateless protocol. Stateless means that the protocol has no memory for transaction processing. The lack of state means that if the subsequent processing requires previous information, it must be retransmitted, which may increase the amount of data transmitted per connection. On the other hand, when the server does not need previous information, its response is faster.
  • Supports both B/S (browser/server) and C/S (client/server) modes.

Request message

  • The request line specifies the request type, the resource to be accessed, and the HTTP version to be used.
  • The request headers follow the request line (the first line) and carry additional information for the server. For example, the Host header indicates the destination of the request, and User-Agent, which both server-side and client-side scripts can read, is an important basis for browser detection logic. These values are set by the browser and sent automatically with each request.
  • Blank line. The blank line after the request header is required.
  • The request data, also called the body, can carry any additional data.
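The four parts listed above can be made concrete with a minimal sketch that assembles a raw request by hand. The function name `build_request`, the `/login` path, the `www.example.com` host, and the `demo-client/0.1` User-Agent are all assumptions chosen for illustration.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # The empty line produced by the double CRLF ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request(
    "POST", "/login", "www.example.com",          # hypothetical path and host
    headers={"User-Agent": "demo-client/0.1"},    # hypothetical client name
    body=b"user=alice",
)
print(raw.decode("ascii"))
```

Real clients add more headers than this, but the layout (one request line, any number of header lines, a blank line, then the optional body) stays the same.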

Response

  • The status line consists of three parts: HTTP protocol version number, status code, and status message.
  • Message headers, which carry additional information for the client to use.
  • Blank line. The blank line after the message header is required.
  • Response body, the text information returned by the server to the client.
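As a rough illustration of the response layout above, the following sketch splits a hand-written response into its parts. The `parse_response` helper and the sample bytes are assumptions, and the parser is deliberately naive: it expects a reason phrase to be present and does not handle repeated headers.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")          # blank line separates headers from body
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)    # the status line has three parts
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)
version, status, reason, headers, body = parse_response(sample)
print(version, status, reason, headers["content-type"], len(body))
```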

Status Code

  • 200 OK //Client request successful
  • 301 Moved Permanently //Permanent redirection, for example when a site moves to a new domain name
  • 302 Found //Temporary redirection, for example sending users who are not logged in to the login page when they open the user center
  • 400 Bad Request //The client request has a syntax error and cannot be understood by the server
  • 401 Unauthorized //The request is unauthorized. This status code must be used with the WWW-Authenticate header field.
  • 403 Forbidden //The server received the request but refused to provide service
  • 404 Not Found //The requested resource does not exist, e.g., an incorrect URL was entered
  • 500 Internal Server Error //An unexpected error occurred on the server
  • 503 Service Unavailable //The server cannot process the client's request at the moment and may return to normal after a period of time.
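Python's standard library carries the registered reason phrases, so the codes above can be looked up rather than memorized. This is a small sketch; the `describe` helper and the class labels are assumptions added for illustration.

```python
from http import HTTPStatus


def describe(code):
    """Return 'code phrase (class)' for a status code known to http.HTTPStatus."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({classes[code // 100]})"


for code in (200, 301, 302, 400, 401, 403, 404, 500, 503):
    print(describe(code))
```

The first digit of the code gives its class, which is often all a client needs in order to decide whether to retry, follow a redirect, or report an error.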

HTTP methods

  • GET: Requests the resource identified by the URL.
  • POST: Submits data to the resource at the URL, typically adding new data under it.
  • HEAD: Requests only the response headers for the URL resource, without the body.
  • PATCH: Requests a partial modification of the resource at the URL.
  • PUT: Requests that the resource at the URL be replaced with the data in the request.
  • DELETE: Requests deletion of the resource at the URL.

How does HTTPS ensure the security of data transmission?

HTTPS adds an SSL/TLS layer between TCP and HTTP to protect the layer above it. It mainly uses symmetric encryption, asymmetric encryption, and certificates to encrypt the data exchanged between the client and the server, which secures the communication as a whole.

SSL/TLS protocol functions:

  • Authenticate users and servers to ensure data is sent to the correct client and server;
  • Encrypt data to prevent data from being stolen midway;
  • Maintain data integrity and ensure that data is not altered during transmission.

What does the HTTP protocol consist of?

The request message consists of three parts:

  • Request line: contains the request method, URI, and HTTP protocol version
  • Request header fields
  • Request content entity

The response message consists of three parts:

  • Status line: contains HTTP version, status code, and status code reason phrase
  • Response header fields
  • Response content entity

Idempotence

An operation is idempotent if executing it any number of times has the same effect as executing it once. An idempotent function or method can be called repeatedly with the same parameters and produce the same result, so there is no need to worry that repeated calls will change the system any further. For example, getUsername() and setTrue() are idempotent: the first only reads state, and the second leaves the system in the same state no matter how many times it runs.
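A short sketch of the distinction, using a hypothetical `Flag` class. The `get_username` and `set_true` methods mirror the two functions named above, and `increment` is an added counter-example that is not idempotent.

```python
class Flag:
    """Hypothetical object used only to contrast idempotent and non-idempotent calls."""

    def __init__(self, username):
        self._username = username
        self.value = False
        self.counter = 0

    def get_username(self):
        return self._username      # idempotent: reads state, changes nothing

    def set_true(self):
        self.value = True          # idempotent: repeating it changes nothing further

    def increment(self):
        self.counter += 1          # NOT idempotent: every call changes the state again


flag = Flag("alice")
for _ in range(3):
    flag.get_username()
    flag.set_true()
    flag.increment()

print(flag.value, flag.counter)
```

After three rounds the flag is simply true, exactly as after one call, while the counter has moved three times. This is why retrying an idempotent request after a timeout is safe and retrying a non-idempotent one may not be.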

Long Connection

1. Long connection based on the HTTP protocol

Both HTTP1.0 and HTTP1.1 protocols support long connections. HTTP1.0 requires adding the "Connection: keep-alive" header to the request to support it, while HTTP1.1 supports it by default.

The interaction between an HTTP/1.0 client and the server:

  • The client sends a request with the header "Connection: keep-alive".
  • On receiving the request, the server determines from the HTTP/1.0 version and the "Connection: keep-alive" header that this is a long connection. It adds "Connection: keep-alive" to the response headers and does not close the established TCP connection.
  • When the client receives the response and finds that it contains "Connection: keep-alive", it treats the connection as a long connection and does not close it. It then reuses the connection to send the next request, and the process repeats from the first step.
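The decision described in these steps can be sketched as a small function. The name `is_persistent` and the convention of passing headers as a dictionary with lower-cased names are assumptions made for illustration.

```python
def is_persistent(version, headers):
    """Decide whether the connection stays open after this message.

    `version` is a string such as "HTTP/1.0"; `headers` maps lower-cased
    header names to their values.
    """
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False                 # either side can ask for the connection to be closed
    if version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP/1.0 needs the explicit header


print(is_persistent("HTTP/1.0", {}))
print(is_persistent("HTTP/1.0", {"connection": "keep-alive"}))
print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
```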

2. Heartbeat packets: a small data packet is sent every few seconds to keep the connection alive.

What is the difference between HTTP/1.0 and HTTP/1.1?

In HTTP/1.0, after a connection is established and the client sends a request, the server closes the connection as soon as it has returned the response. The next time the browser makes a request, it has to establish a new connection. Constantly re-establishing connections is clearly wasteful. In HTTP/1.1, the connection is kept open by default, so multiple requests and responses can share one TCP connection.

How the HTTP protocol works in practice:

  • Resolve the domain name
  • Perform the TCP three-way handshake
  • Send the HTTP request
  • The server responds to the HTTP request and the browser receives the HTML code
  • The browser parses the HTML code
  • The browser renders the page and presents it to the user

Can cookies be overwritten? Can localStorage be overwritten?

Cookies can be overwritten. If you write a cookie with the same name repeatedly, the previous cookie will be overwritten.

To delete a cookie, create a new cookie with the same name, set maxAge to 0, and add it to the response so that it overwrites the original. Note that the value must be 0, not a negative number. A negative value has a different meaning: the cookie is not stored persistently and is discarded when the browser closes.
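The maxAge wording above matches the Java servlet API. As a language-neutral illustration, here is a sketch at the header level using Python's standard `http.cookies` module. The cookie name `username` and its values are assumptions.

```python
from http.cookies import SimpleCookie

# First response: issue the cookie.
issued = SimpleCookie()
issued["username"] = "alice"
issued["username"]["path"] = "/"

# Later response: same name and path, with Max-Age=0 so the browser discards it.
expired = SimpleCookie()
expired["username"] = "deleted"
expired["username"]["path"] = "/"
expired["username"]["max-age"] = 0

print(issued.output())
print(expired.output())
```

The second header overwrites the first because the name and path match, and a Max-Age of 0 makes it expire immediately.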

localStorage stores data as key-value pairs, so writing to an existing key likewise overwrites the previous value.

What is localStorage? HTML5 added the localStorage feature, which is mainly used for local storage and addresses the limited storage space of cookies (each cookie can hold about 4 KB). Browsers generally allow about 5 MB of localStorage, although the exact limit may differ between browsers.

Advantages of localStorage

  • localStorage goes beyond the 4 KB limit of cookies
  • localStorage can store the data from the first request directly on the client, which is like giving the front-end page a 5 MB database. Unlike cookies, this data is not sent with every request, so it saves bandwidth, but it is only supported in newer browser versions.

Limitations of localStorage

  • Size limits are not uniform across browsers, and in IE the localStorage property is only supported from IE8 onward.
  • Currently, all browsers restrict localStorage values to strings, so objects must be converted (for example to and from JSON) when they are stored and read.
  • localStorage may not be readable in the browser's private browsing mode
  • Reading localStorage is essentially reading strings. Storing a lot of data consumes memory and can make the page sluggish.
  • Content in localStorage cannot be captured by crawlers

The main difference between localStorage and sessionStorage is lifetime: localStorage is permanent storage, while the key-value pairs in sessionStorage are cleared when the session ends.

The difference between Cookie and Session

HTTP is a stateless protocol. Every time a client requests a web page, the server treats it as a new session. But sometimes we need to keep some information across requests, such as the username and password used to log in or details of the user's last visit. This information is kept using cookies and sessions.

1. Cookie

A cookie is a small piece of text. When a client sends a request and the server needs to record the user's state, the server issues a cookie to the client's browser in the response, and the browser saves it. When the browser requests the website again, it submits the requested URL together with the cookie, and the server checks the cookie to identify the user's state.

In simple terms, the working principle of cookies can be summarized as follows:

  • The client connects to the server
  • The server generates a cookie (with a validity period) and sends it to the client, which stores it and carries it on later visits
  • The server identifies the user based on the cookie information

2. Session

A session is a mechanism the server uses to record client state, and it is simpler to use than cookies. When the same client interacts with the server, it does not need to send back all the cookie values each time. Instead, it only needs to send back an ID. This ID is generated when the client first accesses the server and is unique to each client. The ID is usually carried in a cookie named JSESSIONID. The session mechanism uses this ID to decide whether requests come from the same user (it recognizes only the ID, not the person).

Cookies are a technology that lets a web server store a small amount of data in the client's memory or on its hard disk, and read that data back later. A cookie is a very small text file that the web server places on your hard disk when you browse a website. It can record your user ID, password, the pages you have browsed, how long you stayed, and other information.

Session: When a user requests a web page from an application and does not yet have a session, the web server automatically creates a Session object. When the session expires or is abandoned, the server terminates it.

The cookie mechanism maintains state on the client, while the session mechanism maintains state on the server. However, because the server-side approach still needs to store an identifier on the client, the session mechanism may rely on the cookie mechanism to store that identifier.

A session is a method the server uses to track users. Each session has a unique identifier, the session ID. When the server creates a session, the response sent to the client contains a Set-Cookie field with a key-value pair called sid, which is the session ID. The client saves this cookie in the browser, and all subsequent request messages carry the session ID. HTTP uses sessions and cookies together to track user state. Sessions are used on the server side, and cookies on the client side:

  • Cookie data is stored on the client's browser, and session data is stored on the server.
  • Cookies are not very secure. Others can inspect the cookies stored locally and forge them. Where security matters, sessions should be used.
  • Sessions are kept on the server for a certain period of time. As the number of visitors grows, they consume more server resources. To reduce the load on the server, use cookies.
  • The data stored in a single cookie cannot exceed 4K, and many browsers limit a site to storing a maximum of 20 cookies.
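The division of labor above, where the server keeps the data and the cookie carries only the ID, can be sketched as follows. The in-memory `SESSIONS` dictionary, the `handle_request` function, and the `visits` counter are assumptions for illustration, not a production session store.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}   # server-side store: session ID -> per-user data


def handle_request(cookie_header):
    """Return (session_data, set_cookie_value_or_None) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    sid = cookie["sid"].value if "sid" in cookie else None
    if sid in SESSIONS:
        return SESSIONS[sid], None          # known client: no new cookie needed
    sid = secrets.token_hex(16)             # unguessable ID for a new client
    SESSIONS[sid] = {"visits": 0}
    return SESSIONS[sid], f"sid={sid}; HttpOnly; Path=/"


# First visit: no cookie, so the server creates a session and issues the ID.
session, set_cookie = handle_request(None)
session["visits"] += 1

# Second visit: the browser sends the ID back and the server finds the same data.
sid_pair = set_cookie.split(";")[0]         # "sid=<hex>"
session_again, set_cookie_again = handle_request(sid_pair)
session_again["visits"] += 1
print(session_again["visits"], set_cookie_again)
```

Only the random ID crosses the network. The visit counter, or in a real application the login state, never leaves the server.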

The difference between HTTP and HTTPS:

  • HTTP URLs begin with http://, while HTTPS URLs begin with https://
  • HTTP is insecure, while HTTPS is secure
  • The standard port for HTTP is 80, and the standard port for HTTPS is 443
  • In the OSI network model, HTTP works at the application layer, while the SSL/TLS layer that secures HTTPS sits between the transport layer and the application layer.
  • HTTP transmits data in plaintext, while HTTPS encrypts the transmitted data.
  • HTTP does not require a certificate, but HTTPS requires an SSL certificate issued by a certificate authority (CA)

What does it mean that HTTP is a stateless protocol? How can the statelessness be worked around?

A stateless protocol has no memory of earlier transactions. The lack of state means that if subsequent processing requires previous information, that information must be retransmitted.

In other words, when a client completes one HTTP request and then sends another, HTTP does not know that the current client is an "old user".

Cookies can be used to solve this problem. A cookie works like a pass. The server sends a cookie to the client on the first visit. When the client comes back, it presents the cookie (the pass), and the server then knows that this is an "old user".

Difference between URI and URL

1. URI

A URI (Uniform Resource Identifier) is used to uniquely identify a resource.

Every resource available on the Web, such as HTML documents, images, video clips, and programs, is identified by a URI.

A URI generally consists of three parts:

  • Naming mechanism for accessing resources
  • The host name where the resource is stored
  • The name of the resource itself, represented by the path, with emphasis on the resource.

2. URL

URL is a uniform resource locator, which is a specific URI. That is, URL can be used to identify a resource and also specifies how to locate the resource.

URL is a string used to describe information resources on the Internet. It is mainly used in various WWW client programs and server programs, especially the famous Mosaic.

URL can be used to describe various information resources in a unified format, including files, server addresses and directories.

A URL generally consists of three parts:

  • Protocol (or service method)
  • The host name or IP address of the host where the resource is stored (and sometimes the port number)
  • The specific address of the host resource, such as directory and file name, etc.

A URN (Uniform Resource Name) identifies a resource by name, such as mailto:[email protected].

URI is an abstract, high-level concept that defines a uniform resource identifier, while URL and URN are specific ways of identifying resources. Both URL and URN are types of URI. Generally speaking, every URL is a URI, but not every URI is a URL, because URI also includes a subclass, the Uniform Resource Name (URN), which names a resource but does not specify how to locate it. mailto, news, and isbn identifiers are commonly cited as examples of URNs, although strictly speaking a URN uses the urn: scheme, as in urn:isbn:0451450523.

In Java, a URI instance can be absolute or relative, as long as it conforms to the URI syntax rules. The URL class not only conforms to the syntax but also contains the information needed to locate the resource, so it cannot be relative.

In the Java class library, the URI class does not contain any methods for accessing resources; its only function is parsing.

In contrast, the URL class opens a stream to a resource.

HTTP URL

HTTP uses Uniform Resource Identifiers (URIs) to transfer data and establish connections. A URL is a special type of URI that contains enough information to find a resource.

URL stands for Uniform Resource Locator. It is the address used to identify a resource on the Internet. Taking the following URL as an example, let's go through the components of a typical URL:

http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name

As can be seen from the URL above, a complete URL consists of the following parts:

  • Protocol part: The protocol part of the URL is "http:", which means that the web page uses the HTTP protocol. Many protocols can be used on the Internet, such as HTTP and FTP. In this example, the HTTP protocol is used. The "//" after "http:" is a separator.
  • Domain name part: The domain name part of the URL is "www.aspxfans.com". In a URL, you can also use an IP address as the domain name.
  • Port part: Following the domain name is the port, and ":" is used as a separator between the domain name and the port. The port is not a required part of a URL. If the port part is omitted, the default port will be used.
  • Virtual directory part: From the first "/" after the domain name to the last "/", it is the virtual directory part. The virtual directory is not a required part of a URL. In this example, the virtual directory is "/news/"
  • File name part: From the last "/" after the domain name to the "?", it is the file name part. If there is no "?", then from the last "/" after the domain name to the "#", it is the file part. If there is no "?" and "#", then from the last "/" after the domain name to the end, it is the file name part. The file name in this example is "index.asp". The file name part is not a required part of a URL. If it is omitted, the default file name is used.
  • Anchor part: From the "#" to the end, it is the anchor part. In this example, the anchor part is "name". The anchor part is not a required part of the URL.
  • Parameter part: The part from "?" to "#" is the parameter part, also known as the search part or query part. In this example, the parameter part is "boardID=5&ID=24618&page=1". Multiple parameters can be allowed, and "&" is used as a separator between parameters.
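The same breakdown can be reproduced with Python's standard `urllib.parse` module. This is a small sketch applied to the example URL. Splitting the path into a virtual directory and a file name is done by hand here, since the library treats the path as a single component.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name"
parts = urlsplit(url)

print(parts.scheme)      # protocol part
print(parts.hostname)    # domain name part
print(parts.port)        # port part
print(parts.path)        # virtual directory plus file name
print(parts.query)       # parameter part
print(parts.fragment)    # anchor part
print(parse_qs(parts.query))

# Split the path at the last "/" to separate directory and file name.
directory, _, filename = parts.path.rpartition("/")
print(directory + "/", filename)
```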

(Original text: http://blog.csdn.net/ergouge/article/details/8185219 )

How HTTPS works

  • First, the client sends an HTTPS request and the server returns its certificate. The client verifies the certificate's validity period, its legitimacy, and whether its domain name matches the requested domain name, and obtains the certificate's public key (RSA).
  • If verification passes, the client generates a random number and encrypts it with the certificate's public key (RSA encryption).
  • After the message body is generated, a digest of it is computed with the MD5 (or SHA1) algorithm, and the RSA signature is obtained from the digest.
  • The result is sent to the server. At this point only the server, which holds the RSA private key, can decrypt it.
  • The random number obtained by decryption is then used as the key for AES encryption (at this point the key is known only to the client and the server).
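In practice a TLS library performs these steps, and the application mostly decides how strictly the certificate is checked. The sketch below uses Python's standard `ssl` module to show that a default client context already enforces the certificate and domain name checks from the first step. The commented-out connection to `www.example.com` is a hypothetical usage that needs network access.

```python
import ssl

# A client-side context with the library's recommended defaults.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)   # the certificate chain must validate
print(context.check_hostname)                     # the certificate must match the requested host

# Hypothetical usage (requires network access, so it is left commented out):
# import socket
# with socket.create_connection(("www.example.com", 443)) as sock:
#     with context.wrap_socket(sock, server_hostname="www.example.com") as tls:
#         print(tls.version(), tls.cipher())
```

Modern TLS versions usually negotiate the session key with a Diffie-Hellman exchange rather than by RSA-encrypting a random number, so the list above is best read as a description of the classic RSA key exchange.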

The 7 steps of a complete HTTP request

The HTTP communication mechanism is that in a complete HTTP communication process, the web browser and the web server will complete the following 7 steps:

1. Establish a TCP connection

Before HTTP starts working, the web browser must first establish a connection with the web server through the network. This connection is completed through TCP. This protocol and the IP protocol together build the Internet, that is, the famous TCP/IP protocol suite, so the Internet is also called the TCP/IP network. HTTP is a higher-level application layer protocol than TCP. According to the rules, only after the lower-level protocol is established can the higher-level protocol be connected. Therefore, a TCP connection must be established first. Generally, the port number of the TCP connection is 80.

2. Web browser sends the request line

Once the TCP connection is established, the web browser sends a request command to the web server. For example: GET /sample/hello.jsp HTTP/1.1.

3. Web browser sends request headers

After sending the request line, the browser sends additional information to the web server in the form of headers. It then sends a blank line to tell the server that the headers are complete.

4. Web server responds

After the client sends its request, the server answers with a response whose first line is the status line, for example HTTP/1.1 200 OK. The status line starts with the protocol version number, followed by the response status code.

5. Web server sends response headers

Just as the client sends information about itself along with a request, the server sends data about itself and the requested document along with the response.

6. Web server sends data to the browser

After sending the headers, the web server sends a blank line to mark the end of the header section. It then sends the actual data requested by the user, in the format described by the Content-Type response header.

7. Web server closes the TCP connection

Normally, once the web server has sent the requested data to the browser, it closes the TCP connection. However, if the browser or server adds this line to its headers:

Connection: keep-alive

the TCP connection remains open after the response is sent, so the browser can continue to send requests over the same connection. Keeping the connection alive saves the time needed to establish a new connection for each request and also saves network bandwidth.

Establish TCP connection -> send request line -> send request header -> (reach server) send status line -> send response header -> send response data -> disconnect TCP connection
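The whole sequence can be exercised on one machine with the standard library. The sketch below starts a throwaway local server and speaks HTTP to it over a raw socket. `HelloHandler`, `fetch_once`, and the `hello` body are assumptions for illustration. Python's built-in server answers as HTTP/1.0 by default, so the status line it returns starts with HTTP/1.0.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)                          # status line
        self.send_header("Content-Type", "text/plain")   # response headers
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                               # blank line
        self.wfile.write(body)                           # response data

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(path="/sample/hello.jsp"):
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        with socket.create_connection((host, port), timeout=5) as sock:  # TCP connection
            request = (
                f"GET {path} HTTP/1.1\r\n"      # request line
                f"Host: {host}:{port}\r\n"      # request headers
                "Connection: close\r\n"
                "\r\n"                          # blank line ends the headers
            )
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:                         # read until the server closes the connection
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    finally:
        server.shutdown()
        server.server_close()
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body


status_line, body = fetch_once()
print(status_line, body)
```

Because the request asks for `Connection: close`, the server closes the socket after the body, which is what ends the read loop.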

Common HTTP response status codes

  • 200: The request was processed normally
  • 204: The request was processed successfully, but the response has no body
  • 206: The client requested only part of the resource, and the server performed the GET on just that part. The response message specifies the returned range through Content-Range.
  • 301: Permanent Redirect
  • 302: Temporary Redirect
  • 303: Similar to 302, except that the client is expected to fetch the redirect target with the GET method.
  • 304: Returned for a conditional request when the resource has not been modified, so the client can use its cached copy. It has nothing to do with redirection.
  • 307: Temporary redirect, similar to 302, but the client must not change the request method (a POST stays a POST)
  • 400: The request message syntax is incorrect and the server cannot recognize it.
  • 401: The request requires authentication
  • 403: The requested resource is not allowed to be accessed
  • 404: The server cannot find the corresponding resource
  • 500: Internal server error
  • 503: The service is temporarily unavailable, for example because the server is overloaded or down for maintenance
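The practical difference between the redirect codes is what happens to the request method. The following sketch captures the usual client behavior. The function name is an assumption, and the handling of 301 and 302 reflects common browser behavior rather than a strict requirement.

```python
def method_after_redirect(status, method):
    """Return the method a typical client uses when following a redirect.

    Returns None when `status` is not one of the redirects handled here.
    """
    if status == 303:
        return "HEAD" if method == "HEAD" else "GET"   # 303 switches to GET
    if status == 307:
        return method                  # 307 keeps the original method and body
    if status in (301, 302):
        # Historical browser behavior: POST is commonly rewritten to GET.
        return "GET" if method == "POST" else method
    return None


for status in (301, 302, 303, 307):
    print(status, method_after_redirect(status, "POST"))
```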

How HTTP works

The HTTP protocol defines how a web client requests a web page from a web server and how the server transmits the page to the client. HTTP uses a request/response model. The client sends the server a request message containing the request method, URL, protocol version, request headers, and request data. The server responds with a status line, containing the protocol version and a success or error code, followed by server information, response headers, and response data.

Following are the steps of HTTP request/response:

  • Client connects to the Web server: An HTTP client, usually a browser, establishes a TCP socket connection to the Web server's HTTP port (default is 80). For example, http://www.oakcms.cn.
  • Sending HTTP request: Through the TCP socket, the client sends a text request message to the Web server. A request message consists of four parts: request line, request header, blank line and request data.
  • The server accepts the request and returns an HTTP response: The Web server parses the request and locates the requested resource. The server writes a copy of the resource to the TCP socket, which is read by the client. A response consists of a status line, a response header, a blank line, and the response data.
  • Release TCP connection
