HTTP Origin

HTTP was initiated by Tim Berners-Lee at the European Organization for Nuclear Research (CERN) in 1989. Of the RFCs that specify it, the most famous is RFC 2616 [1], published in June 1999, which defines HTTP 1.1, the version of the protocol that is widely used today.

What is HTTP

Full name: HyperText Transfer Protocol.

Concept: HTTP is a communication protocol for fetching network resources such as HTML documents and images. It is the basis of data exchange on the web and is a client-server protocol.

"HTTP: the Internet's multimedia messenger" ("HTTP: The Definitive Guide").

The role of HTTP on the Internet: it acts as a messenger that runs errands, carrying information between the client and the server, and the web cannot do without it.

HTTP is an application layer protocol and the protocol most closely related to front-end development. The HTTP requests, HTTP caching, cookies, cross-domain access, and so on that we deal with in daily work are all closely tied to HTTP.

Basic features of HTTP
That is, HTTP does not require the underlying transport protocol to be connection-oriented; it only requires it to be reliable, meaning that it does not lose messages (or at least reports an error when it does). In practice HTTP relies on TCP, which is connection-oriented, for message delivery. By default, HTTP/1.0 opens a separate TCP connection for each HTTP request/response pair. When several requests are issued in succession, this is less efficient than having them share one TCP connection. For this reason, HTTP 1.1 introduced persistent connections, in which the underlying TCP connection can be partly controlled through the Connection header. However, HTTP 1.1 is still not perfect in its use of connections, as we will see later.

HTTP-based component system

The components of an HTTP-based system are clients, web servers, and proxies.

Client: user-agent

The user-agent is any tool that acts on behalf of the user. This role is mainly played by the browser; other user agents include the programs that engineers and web developers use to debug applications.

Web Server

The web server serves the documents requested by the client. Every request sent to the server is processed by the server, which returns a message: the response.

Proxies

Between the browser and the server, many computers and other devices forward HTTP messages. Most of them operate at the transport, network, or physical layer and are therefore transparent to the HTTP application layer. Those that operate at the application layer are called proxies. Proxies can perform functions such as the following:
HTTP message composition

HTTP has two types of messages: requests, sent by the client, and responses, sent by the server.
HTTP messages consist of multiple lines of text encoded in ASCII. In HTTP/1.1 and earlier, these messages are sent openly over the connection. In HTTP/2, messages are divided into multiple HTTP frames. Web developers rarely write these messages by hand; software produces them, driven by configuration files (for proxies or servers), APIs (for browsers), or other interfaces.

Typical HTTP Session
HTTP Requests and Responses

HTTP requests and responses both consist of a start line, HTTP headers, an empty line, and a body, as shown in the following figure:

Start line. The start line of a request contains the request method, the request path, and the HTTP version. The start line of a response contains the HTTP version, the status code, and a short text description of the status.

The request path deserves a closer look. It can take one of four forms:

1) An absolute path, optionally followed by a '?' and a query string. This is the most common form, called the origin form, and is used with the GET, POST, HEAD, and OPTIONS methods.

POST / HTTP/1.1

2) A complete URL, called the absolute form. It is mainly used with the GET method when connecting to a proxy.

GET http://developer.mozilla.org/en-US/docs/Web/HTTP/Messages HTTP/1.1

3) The authority component of a URL, consisting of a domain name and an optional port number (prefixed with ':'), called the authority form. It is only used with CONNECT when establishing an HTTP tunnel.

CONNECT developer.mozilla.org:80 HTTP/1.1

4) The asterisk form: a single asterisk ('*'), used with the OPTIONS method to represent the server as a whole.

OPTIONS * HTTP/1.1
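The four request-path forms can be told apart mechanically. The following is a minimal Python sketch, not a full HTTP parser: the function name `classify_request_target` is hypothetical, and it assumes a well-formed start line with exactly two single spaces.

```python
def classify_request_target(request_line: str):
    """Split a request start line and name the form of its request path.

    Illustrative only: assumes a well-formed line such as "POST / HTTP/1.1".
    """
    method, target, version = request_line.split(" ")
    if target == "*":
        form = "asterisk"      # OPTIONS * HTTP/1.1
    elif target.startswith("/"):
        form = "origin"        # absolute path, optional ?query
    elif "://" in target:
        form = "absolute"      # complete URL, used towards proxies
    else:
        form = "authority"     # host[:port], used with CONNECT
    return method, target, version, form


if __name__ == "__main__":
    for line in (
        "POST / HTTP/1.1",
        "GET http://developer.mozilla.org/en-US/docs/Web/HTTP/Messages HTTP/1.1",
        "CONNECT developer.mozilla.org:80 HTTP/1.1",
        "OPTIONS * HTTP/1.1",
    ):
        print(classify_request_target(line))
```

The order of the checks matters: the asterisk and the leading slash are unambiguous, so they are tested before the looser "contains ://" rule.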
Request Body: Some requests send data to the server in order to update it; a common case is a POST request carrying HTML form data. There are generally two types of request body. One is a single-resource body, defined by Content-Type and Content-Length. The other is a body made of multiple parts, usually associated with HTML forms. The difference between the two shows up in the value of Content-Type.

1) Content-Type: application/x-www-form-urlencoded. Form content in this format has the following characteristics: I. The data is encoded as key-value pairs separated by &. II. Characters are URL-encoded.

// Conversion process: {a: 1, b: 2} -> a=1&b=2 -> as follows (final form)

2) Content-Type: multipart/form-data. The Content-Type field in the request header contains a boundary, whose value is chosen by the browser by default. Example: Content-Type: multipart/form-data;boundary=----WebkitFormBoundaryRRJKeWfHPGrS4LKe. The data is divided into multiple parts separated by the boundary. Each part is described by its own HTTP headers, such as Content-Type. The final boundary has -- appended to mark the end.

Content-Disposition: form-data; name="data1";

Response Body: 1) A single resource of known length. This type of body is defined by two headers: Content-Type and Content-Length. 2) A single resource of unknown length, sent with chunked encoding by setting Transfer-Encoding to chunked. Content-Length comes up again in the HTTP 1.0 section below; it is a very important header added in HTTP 1.0.

Methods

Safe methods: HTTP defines a set of methods called safe methods. GET and HEAD are both considered safe, which means that a GET or HEAD request is not expected to produce any action on the server, that is, it should not change anything there. This does not mean that nothing happens at all: whether a request actually has an effect is in practice up to the web developer.
Difference between GET and POST

First, we need to understand the concepts of side effects and idempotence. A side effect is a modification of a resource on the server. Idempotence means that sending the same request M times or N times (M and N being different and both at least 1) leaves the resources on the server in the same state. In typical usage, GET has no side effects and is idempotent, while POST mainly has side effects and is not idempotent. Technically there are the following distinctions:
Status Code
101 Switching Protocols. When HTTP is upgraded to WebSocket, if the server agrees to the change, it will send status code 101.
200 OK, indicating that the request sent by the client was processed correctly by the server. 204 No Content, indicating that the request succeeded but the response message contains no entity body. 205 Reset Content, indicating that the request succeeded but the response message contains no entity body; unlike 204, it asks the requester to reset the content. 206 Partial Content, returned for range requests.
301 Moved Permanently, a permanent redirect, indicating that the resource has been assigned a new URL. 302 Found, a temporary redirect, indicating that the resource has temporarily been assigned a new URL. 303 See Other, indicating that the resource exists at another URL and should be fetched with the GET method. 304 Not Modified, returned for a conditional request when the resource has not changed, so the client can keep using its cached copy. 307 Temporary Redirect, similar in meaning to 302, but the client is expected to keep the request method unchanged when sending the request to the new address.
400 Bad Request, indicating that the request message contains a syntax error. 401 Unauthorized, indicating that the request requires authentication information obtained through HTTP authentication. 403 Forbidden, indicating that the server denies access to the requested resource. 404 Not Found, indicating that the requested resource was not found on the server.
500 Internal Server Error, indicating that an error occurred on the server while executing the request. 501 Not Implemented, indicating that the server does not support a function required by the request. 503 Service Unavailable, indicating that the server is temporarily overloaded or down for maintenance and cannot process the request.

HTTP Headers

1. General headers apply to both request and response messages but are unrelated to the data transmitted in the message body, for example Date.
2. Request headers carry more information about the resource to be fetched or about the client itself, for example User-Agent.
3. Response headers carry additional information about the response.
4. Entity headers carry more information about the entity body, such as its length (Content-Length) or its MIME type (Content-Type).

For a detailed list of headers, see the HTTP Headers collection [2].

HTTP: Past and Present

HTTP (HyperText Transfer Protocol) is the basic protocol of the World Wide Web. Dr. Tim Berners-Lee and his team created it, together with the web browser and the server, between 1989 and 1991. HTTP version 0.9 was released in 1991, version 1.0 in 1996, and version 1.1 in 1997. Version 1.1 is also the most widely used version to date. Version 2.0 was released in 2015 and greatly improved on the performance and security of HTTP/1.1. Version 3.0, proposed in 2018, continues to optimize HTTP/2 and, more radically, replaces TCP with QUIC, a transport protocol built on UDP. As of September 26, 2019, HTTP/3 was supported by Chrome, Firefox, and Cloudflare.

HTTP 0.9

A single-line protocol: the request consists of a single line. It starts with the only available method, GET, followed by the path of the target resource.

GET /mypage.html

Response: it contains only the response document itself.

<HTML>
HTTP 1.0

RFC 1945 [3] described HTTP 1.0, which was designed to make the protocol more extensible.
A media type is a standard way to indicate the nature and format of a document, file, or byte stream. Browsers usually use the MIME (Multipurpose Internet Mail Extensions) type to decide how to handle a URL, so it is very important that the web server sets the correct MIME type in the response headers. If it is set incorrectly, the website may not work properly. The structure of a MIME type is very simple: two strings, type and subtype, separated by '/'.

HTTP borrows part of the MIME type system to mark the data type of the message body. The type appears in the Content-Type field. That is the sender's side; a receiver that wants a specific type of data can use the Accept field. The values of these two fields fall into categories such as:

- text: text/html, text/plain, text/css, etc.

In addition, so that both sides can agree on the compression method, language, and character set of request and response data, the following headers were introduced.

1. Compression method. Sender: Content-Encoding (the server tells the client how the entity body is encoded). Receiver: Accept-Encoding (the encodings supported by the user agent). Possible values are gzip, the most popular compression format today; deflate, another well-known compression format; and br, a compression algorithm invented specifically for HTTP.
2. Language. Sender: Content-Language. Receiver: Accept-Language (the set of natural languages supported by the user agent).
3. Character set. Sender: specified by the charset attribute in Content-Type. Receiver: Accept-Charset (the character sets supported by the user agent).

Although HTTP 1.0 improved on HTTP 0.9 in many ways, it still has many shortcomings. The main disadvantage of HTTP/1.0 is that only one request can be sent per TCP connection. Once the data has been sent, the connection is closed, and requesting another resource requires a new connection. Creating a new TCP connection is expensive because it requires a three-way handshake between the client and the server, and the sending rate is low at first (slow start).

The earliest model of HTTP, which is also the default model of HTTP/1.0, is the short-lived connection. Each HTTP request is completed on its own connection, which means that a TCP handshake takes place before every HTTP request, and these handshakes happen one after another.

HTTP 1.1

HTTP/1.1 was released in January 1997 as RFC 2068 [4]. HTTP 1.1 removed a lot of ambiguity and introduced several techniques:
Virtual hosting is also known as shared web hosting. It uses virtualization to divide one physical server into several hosts, so that multiple websites or services can run on a single machine. Suppose, for example, that a server with the IP address 61.135.169.125 hosts the websites of Google, Baidu, and Taobao. Why do we see the Google homepage rather than the Baidu or Taobao homepage when we visit https://www.google.com? Because the Host request header determines which virtual host is visited.

HTTP 2.0

HTTP 2.0 was released in 2015 as RFC 7540 [5].
Frame: the client and server communicate by exchanging frames, the smallest unit of communication in the new protocol.
Message: a logical HTTP message, such as a request or a response, consisting of one or more frames.
Stream: a virtual channel within a connection that can carry messages in both directions; each stream has a unique integer identifier.

HTTP 2.0 breaks HTTP/1.x messages into frames and embeds them in streams. Data frames and header frames are separate, which makes header compression possible. Multiple streams are combined on one connection, a process called multiplexing, which makes more efficient use of the underlying TCP connection. In other words, streams carry messages, and messages are made up of one or more frames. Binary transmission further improves performance. Each stream delivers its data as messages, a message is made up of one or more frames, and a frame is the unit of data within a stream.

HTTP framing is transparent to web developers. In HTTP/2 it is an additional step between HTTP/1.1 and the underlying transport protocol. Web developers do not need to change the APIs they use in order to benefit from framing; HTTP/2 is switched on and used whenever both the browser and the server support it.
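To make the notion of a frame more concrete, here is a small Python sketch that packs and parses the fixed 9-byte frame header defined in RFC 7540, section 4.1 (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit, and a 31-bit stream identifier). The names `FrameHeader`, `build_frame_header`, and `parse_frame_header` are illustrative choices, not part of any library, and the sketch ignores payloads entirely.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # payload length in bytes (24 bits)
    type: int        # frame type, e.g. 0x0 DATA, 0x1 HEADERS
    flags: int       # type-specific flags (8 bits)
    stream_id: int   # stream identifier (31 bits)


def build_frame_header(length: int, ftype: int, flags: int, stream_id: int) -> bytes:
    """Pack the 9-byte header; the 24-bit length is split into 8 + 16 bits."""
    return struct.pack(
        ">BHBBI", length >> 16, length & 0xFFFF, ftype, flags, stream_id & 0x7FFFFFFF
    )


def parse_frame_header(data: bytes) -> FrameHeader:
    """Parse the first 9 bytes of a frame, masking out the reserved bit."""
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    length_hi, length_lo, ftype, flags, stream = struct.unpack(">BHBBI", data[:9])
    return FrameHeader((length_hi << 16) | length_lo, ftype, flags, stream & 0x7FFFFFFF)


if __name__ == "__main__":
    raw = build_frame_header(length=5, ftype=0x0, flags=0x1, stream_id=3)
    print(raw.hex(), parse_frame_header(raw))
```

Because every frame carries its stream identifier in this header, the receiver can sort interleaved frames back into their streams.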
As mentioned earlier, although HTTP 1.1 has persistent connections and pipelining, it still suffers from head-of-line blocking. HTTP 2.0 solves this problem. The new binary framing layer in HTTP/2 removes these limitations and implements full request and response multiplexing: clients and servers can break HTTP messages down into independent frames, send them interleaved, and reassemble them at the other end.

As shown in the figure above, the snapshot captures multiple streams running in parallel on the same connection. The client is transmitting a DATA frame (stream 5) to the server while the server is sending the client an interleaved series of frames from stream 1 and stream 3. There are therefore three parallel streams on one connection at the same time.

Breaking HTTP messages down into independent frames, sending them interleaved, and reassembling them at the other end is the most important enhancement in HTTP 2. This mechanism sets off a chain reaction across the whole web technology stack and brings large performance gains, allowing us to:

1. Send multiple requests in parallel, interleaved, without them affecting each other.
2. Send multiple responses in parallel, interleaved, without them interfering with each other.
3. Use a single connection to send multiple requests and responses in parallel.
4. Eliminate unnecessary latency and make better use of existing network capacity, thereby reducing page load time.
5. Stop working around HTTP/1.x limitations (with sprites, for example)...

Connection sharing: every request shares the same connection. Each request corresponds to an id, so one connection can carry many requests, and the frames of those requests can be mixed together in any order. The receiver uses the request id to assign each frame back to the request it belongs to.
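The interleave-and-reassemble idea can be illustrated with a toy model in Python. This is only a sketch of the principle: frames are modelled as `(stream_id, chunk)` tuples rather than real HTTP/2 frames, the round-robin scheduling is an arbitrary choice, and the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id: int, message: bytes, frame_size: int):
    """Cut one message into (stream_id, chunk) frames of at most frame_size bytes."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(*frame_lists):
    """Put frames from several streams onto one 'connection' in round-robin order."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Group frames by stream id and join the chunks back into whole messages."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


if __name__ == "__main__":
    wire = interleave(
        split_into_frames(1, b"hello world", 4),
        split_into_frames(3, b"abcdef", 4),
    )
    print(wire)
    print(reassemble(wire))
```

Frames of the same stream keep their relative order on the wire, which is why simple concatenation per stream id is enough to rebuild each message.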
For a comparison between HTTP 1.1 and HTTP 2.0, see this demo website [6]. The HTTP 1.1 demonstration is shown below, followed by the HTTP 2.0 demonstration:
Server push. HTTP/2 lets the server populate the client cache ahead of the client's requests through a mechanism called server push: the server pushes resources to the client without the client explicitly requesting them. Pushing the necessary resources in advance reduces request latency. For example, the server can proactively push JS and CSS files to the client instead of waiting for the client to parse the HTML and request them. The general process is shown in the following figure:

How to upgrade your HTTP version

The use of HTTP/1.1 or HTTP/2 is transparent to sites and applications. It is enough to have an up-to-date server that talks to recent browsers. Only a small group of people need to make changes, and as older browsers and servers are updated, adoption increases naturally without any effort from web developers.

HTTPS

HTTPS also transmits information using the HTTP protocol, but encrypts it with the TLS protocol.

Symmetric and asymmetric encryption

Symmetric encryption means that both parties hold the same secret key and both know how to encrypt and decrypt the ciphertext. However, the data travels over the network, so if the secret key itself is sent over the network and is intercepted, the encryption becomes pointless.

Asymmetric encryption

With asymmetric encryption, the public key is used to encrypt data, but decrypting it requires the private key, which stays in the hands of the party that issued the public key. First, the server publishes its public key, so the client knows it. The client then creates a secret key, encrypts it with the public key, and sends it to the server. On receiving the ciphertext, the server decrypts it with the private key to recover the secret key.

TLS handshake process

The TLS handshake process uses asymmetric encryption
HTTP Cache

Strong Cache

Strong caching is mainly controlled by two headers, Cache-Control and Expires.

The value of Expires is compared with the value of the Date header to decide whether the cache is still valid. Expires is a header field in the web server's response. It tells the browser that, until the expiration time, it can take the data directly from the browser cache without sending another request. One disadvantage of Expires is that the expiration time is given in server time, as an absolute time. If the client clock differs greatly from the server clock (for example, the clocks are not synchronized, or time zones are handled incorrectly), the error will be large.

Cache-Control specifies how long the current resource stays valid and controls whether the browser takes the data directly from its cache or requests it from the server again. Unlike Expires, it sets a relative time.

Expiration time: max-age is a number of seconds counted from the time the request was initiated. For example, the following means that the strong cache can be hit within 31536000 seconds of the request.

Cache-Control: max-age=31536000

No caching at all.

Cache-Control: no-store

The response may be cached but must be revalidated before use.

Cache-Control: no-cache

Private and public caches. public means that the response can be cached by any intermediary (such as an intermediate proxy or a CDN), while private means that the response is intended for a single user: intermediaries must not cache it, and it can only be stored in the browser's private cache.

Cache-Control: private

Revalidation: the following means that once a resource has expired (for example, it is older than max-age), the cache must not use it to answer subsequent requests until it has been successfully revalidated with the origin server.

Cache-Control: must-revalidate

Cache-Control has a higher priority than Expires. The following is a Cache-Control strong cache process:
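The decision a browser makes for a strong cache hit can be approximated in a few lines of Python. This is a simplified sketch under stated assumptions: the function `is_fresh` and its `age_seconds` parameter (how long ago the response was stored) are hypothetical, only a few directives are handled, and real caches follow more detailed rules.

```python
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, age_seconds: float) -> bool:
    """Return True if a stored response may be reused without asking the server.

    headers holds the response headers that were stored with the cached copy.
    """
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]

    # no-store forbids caching; no-cache always requires revalidation first.
    if "no-store" in directives or "no-cache" in directives:
        return False

    # Cache-Control: max-age takes priority over Expires.
    for directive in directives:
        if directive.startswith("max-age="):
            return age_seconds < int(directive.split("=", 1)[1])

    # Fall back to Expires, measured against the response's own Date header.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return age_seconds < (expires - date).total_seconds()

    return False


if __name__ == "__main__":
    print(is_fresh({"Cache-Control": "max-age=31536000"}, age_seconds=60))
    print(is_fresh({"Cache-Control": "no-cache"}, age_seconds=0))
```

When `is_fresh` returns False, the browser falls through to the negotiation cache described next.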
Negotiation Cache
Last-Modified indicates the date on which the file was last modified. On later requests the browser adds If-Modified-Since (with the Last-Modified value returned last time) to the request headers to ask the server whether the resource has been updated since that date. If it has, the new resource is sent back. However, a file's modification time can change even though its content has not (for example, when the file is opened and saved again), so ETag was introduced in HTTP/1.1.
ETag is like a fingerprint. Any change to a resource changes its ETag, regardless of the last modification time, so an ETag uniquely identifies a version of a resource. The If-None-Match header sends the previously returned ETag to the server to ask whether the resource's ETag has changed. If it has, the new resource is sent back. If-None-Match and ETag have a higher priority than If-Modified-Since and Last-Modified. First request: Second request to the same page: With the negotiation cache, the server returns 304 if nothing has changed and 200 with the resource if it has.
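The server side of this negotiation can be sketched as follows. It is an illustration rather than a production handler: `make_etag` and `respond` are hypothetical names, the ETag is assumed to be a truncated SHA-256 of the body (servers are free to compute it differently), and only a single ETag value in If-None-Match is handled.

```python
import hashlib
from email.utils import parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive a fingerprint of the content; any change to body changes the tag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, last_modified: str):
    """Return (status, headers, body) for a possibly conditional GET."""
    etag = make_etag(body)
    validators = {"ETag": etag, "Last-Modified": last_modified}

    if "If-None-Match" in request_headers:
        # ETag comparison takes priority over the date comparison.
        unchanged = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        unchanged = parsedate_to_datetime(last_modified) <= since
    else:
        unchanged = False  # first request: nothing to validate against

    if unchanged:
        return 304, validators, b""   # client reuses its cached copy
    return 200, validators, body      # send the (new) resource


if __name__ == "__main__":
    modified = "Wed, 21 Oct 2015 07:28:00 GMT"
    status, headers, _ = respond({}, b"<html>v1</html>", modified)
    print(status, headers)
    print(respond({"If-None-Match": headers["ETag"]}, b"<html>v1</html>", modified)[0])
```

The first call plays the role of the first request, and the second call replays the stored ETag as the browser would on a later visit to the same page.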
What browsers used to show as 200 (from cache) is now shown as disk cache or memory cache.

Revving Technology

The HTTP caching described above keeps resources from being refetched, but sometimes we need to update resources that are already online. Web developers came up with a technique that Steve Souders calls revving. Files that are updated infrequently are named in a special way: a version number is added to the URL (usually to the file name). Disadvantage: when the version number changes, every place that references the resource must be updated to the new version number. In practice, web developers usually leave this chore to automated build tools. When an infrequently updated resource (js/css) changes, only its reference needs to be changed in the frequently updated entry file (html).

Cookies

An HTTP cookie (also called a web cookie or browser cookie) is a small piece of data that the server sends to the user's browser, which stores it locally. The browser sends it back the next time it makes a request to the same server.

Creating cookies

Cookies are created with the Set-Cookie response header and returned with the Cookie request header.

Set-Cookie: <cookie name>=<cookie value>

Session Cookies

Session cookies are the simplest cookies: they are deleted automatically when the browser is closed, which means they are only valid for the session. Session cookies do not specify an expiration time (Expires) or a validity period (Max-Age). Note that some browsers offer a session restore feature; in that case the session cookies are kept even after the browser is closed, as if it had never been closed.

Persistent Cookies

Unlike session cookies, which expire when the browser is closed, persistent cookies specify an expiration time (Expires) or a validity period (Max-Age).
Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT;

Secure and HttpOnly Cookie Flags

Cookies marked Secure should only be sent to the server in requests encrypted over HTTPS. Even with the Secure flag set, sensitive information should not be transmitted in cookies, because cookies are inherently insecure and the Secure flag offers no real protection. Cookies with the HttpOnly flag cannot be accessed through the JavaScript Document.cookie API, which helps against cross-site scripting (XSS) attacks.

Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Secure; HttpOnly

Scope of Cookies

The Domain and Path attributes define the scope of a cookie, that is, which URLs the cookie should be sent to. Domain specifies which hosts can receive the cookie. If it is not specified, it defaults to the current host (not including subdomains). If Domain is specified, subdomains are generally included. For example, with Domain=mozilla.org, the cookie is also sent to subdomains such as developer.mozilla.org. Path specifies which paths on the host can receive the cookie (the path must be present in the request URL). Subpaths match as well, with the character %x2F ("/") acting as the path separator. For example, if Path=/docs is set, the following addresses will match: /docs

SameSite Cookies

The SameSite attribute lets the server require that a cookie not be sent with cross-site requests, which protects against cross-site request forgery (CSRF) attacks. Its values are case-insensitive.

None: the browser sends the cookie with both same-site and cross-site requests. [This was the default in Chrome versions before Chrome 80.]
Strict: the browser only sends the cookie when visiting the same site.
Lax: the cookie is withheld on cross-site subrequests, such as loading images or frames, but is sent when the user navigates to the URL from an external site, for example by following a link.

Set-Cookie: key=value; SameSite=Strict

In newer browsers (Chrome 80 and later), the default is SameSite=Lax. In other words, a cookie that does not set the SameSite attribute is treated as if it were set to Lax, which means it is no longer sent automatically with cross-site subrequests. If you want a cookie to be sent with both same-site and cross-site requests, you must explicitly set SameSite to None. For this reason, check whether existing systems set SameSite explicitly, and set it explicitly in new systems so that they behave the same in old and new versions of Chrome.

For more on cookies, see an article I wrote earlier: Cookie knowledge summary for front-end instructions [7].

HTTP Access Control (CORS)

Cross-origin resource sharing (CORS) is a mechanism that uses additional HTTP headers to tell the browser that a web application running on one origin (domain) is allowed to access specified resources on a server at a different origin. The CORS standard adds a set of HTTP header fields that let the server declare which origins are allowed to access which resources through a browser.

Simple request

A simple request (one that does not trigger a CORS preflight request) must meet the following three conditions at the same time:
Consider the request and response messages of a simple request. The request header field Origin indicates that the request comes from http://foo.example. In this example, the Access-Control-Allow-Origin: * returned by the server indicates that the resource can be accessed from any external domain. If the server only allows access from http://foo.example, the header field is as follows:

Access-Control-Allow-Origin: http://foo.example

The value of Access-Control-Allow-Origin should be * or the domain given in the Origin header field.

Preflight request

The specification requires that, for HTTP request methods that may have side effects on server data, the browser first send a preflight request with the OPTIONS method to find out whether the server allows the cross-origin request. The actual HTTP request is sent only after the server has confirmed that it is allowed. In its reply to the preflight request, the server can also tell the client whether credentials (including cookies and HTTP authentication data) need to be sent.

The preflight request carries the following two header fields:

Access-Control-Request-Method: POST
Access-Control-Request-Headers: X-PINGOTHER, Content-Type

The header field Access-Control-Request-Method tells the server that the actual request will use the POST method. The header field Access-Control-Request-Headers tells the server that the actual request will carry two custom request header fields: X-PINGOTHER and Content-Type. The server then decides whether the actual request is allowed.

The response to the preflight request includes the following fields:

Access-Control-Allow-Origin: http://foo.example

HTTP requests and responses

Generally speaking, the browser does not send credentials with cross-origin XMLHttpRequest or Fetch requests. To send credentials, a special flag must be set on the XMLHttpRequest. For example, if the withCredentials flag of the XMLHttpRequest is set to true, cookies are sent to the server.

For requests with credentials, the server must not set Access-Control-Allow-Origin to "*", because the request carries cookie information. If the value of Access-Control-Allow-Origin is "*", the request fails. If the value of Access-Control-Allow-Origin is http://foo.example, the request succeeds.

The request and response headers involved in CORS are as follows:

HTTP response header field
HTTP request header field:
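To tie the CORS header fields together, the following Python sketch shows how a server might answer a preflight request. Everything in it is an assumption made for illustration: the allow-lists, the function name `preflight_response`, the 204 status for an accepted preflight, and the 403 status for a rejected one are one possible policy, not something the CORS standard prescribes.

```python
# Hypothetical server policy, chosen to match the examples in the text.
ALLOWED_ORIGINS = {"http://foo.example"}
ALLOWED_METHODS = {"GET", "POST", "OPTIONS"}
ALLOWED_HEADERS = {"x-pingother", "content-type"}


def preflight_response(request_headers: dict, allow_credentials: bool = False):
    """Return (status, headers) for an OPTIONS preflight request."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method")
    requested = {
        name.strip().lower()
        for name in request_headers.get("Access-Control-Request-Headers", "").split(",")
        if name.strip()
    }

    allowed = (
        origin in ALLOWED_ORIGINS
        and method in ALLOWED_METHODS
        and requested <= ALLOWED_HEADERS
    )
    if not allowed:
        return 403, {}  # no CORS headers, so the browser blocks the actual request

    headers = {
        # Echo the concrete origin: "*" is not accepted for credentialed requests.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": ", ".join(sorted(requested)),
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return 204, headers


if __name__ == "__main__":
    print(preflight_response({
        "Origin": "http://foo.example",
        "Access-Control-Request-Method": "POST",
        "Access-Control-Request-Headers": "X-PINGOTHER, Content-Type",
    }, allow_credentials=True))
```

Echoing the request's Origin instead of returning "*" is what allows the same response to work for requests that carry cookies.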
References

[1] RFC 2616: https://tools.ietf.org/html/rfc2616
[2] HTTP Headers collection: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers
[3] RFC 1945: https://tools.ietf.org/html/rfc1945
[4] RFC 2068: https://tools.ietf.org/html/rfc2068
[5] RFC 7540: https://httpwg.org/specs/rfc7540.html
[6] Website demo: https://http2.akamai.com/demo
[7] Cookie knowledge summary for front-end instructions: https://juejin.im/post/6844903841909964813
[8] MDN: https://developer.mozilla.org/zh-CN/docs/Web/HTTP
[9] HTTP development: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Basics_of_HTTP/Evolution_of_HTTP
[10] HTTP Overview: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Overview
[11] HTTP/2 Introduction: https://developers.google.com/web/fundamentals/performance/http2?hl=zh-cn
[12] Cache (II) - Browser caching mechanism: strong cache, negotiated cache: https://github.com/amandakelake/blog/issues/41
[13] (Suggested to read carefully) HTTP soul question to consolidate your HTTP knowledge system: https://juejin.im/post/6844904100035821575#heading-62