Getting to the bottom of HTTP and WebSocket protocols

My boss and I were chatting one day when Meteor happened to come up, which led to WebSocket and then to the conversation below. I have to say that looking at a problem from a different angle shows you very different things.

A: Meteor is a very new development framework, and I think it is very cleverly designed.

B: How is it clever?

A: Both its front end and back end use JS, so the two are truly unified; a copy of the back-end database is kept in the browser on the front end, which makes it fast; and it uses the WebSocket protocol as its data transport to synchronize the front-end and back-end databases, so the synchronization really is real-time.

B: Oh? What is WebSocket? How real-time is it? Is it still polling underneath? How is it different from an HTTP persistent connection?

A: (starting to feel unsure) It is a new application-layer protocol built on TCP. It needs only one connection; later data can be sent directly without re-establishing it. It sits on top of TCP at the same level as HTTP (uh, I'm starting to make things up). It doesn't poll underneath, and as for the difference from persistent connections... I'm not sure.

B: What is the transmission process like?

A: First there is a handshake (more guessing). I think the connection can be established over HTTP (I have used Socket.io before, so I'm just going by that). Once the connection is up, data can be transmitted, and there are also mechanisms such as reconnecting after a disconnect.

B: That sounds similar to what an HTTP persistent connection does. It looks like a protocol built on HTTP and sockets.

A: Uh... (I'd better go back and read a book)

Sometimes we look at things too superficially. We know the general outline of each thing but never try to understand it in depth, and when we talk with friends, few people press us on the details, so our fundamentals end up weak. So I went back and skimmed the RFCs for the HTTP and WebSocket protocols (RFC2616 and RFC6455). I was also a little hazy on how HTTP transmits messages, so here I summarize the similarities and differences between the two protocols.

Protocol Basics

Looked at closely, both protocols are actually quite simple, but anything done well gradually becomes complicated once all the details are included. Here I only sketch the structure of the two protocols without going deep into the details, which is enough for understanding HTTP.

HTTP

The HTTP address format is as follows:

http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

The protocol and host are not case sensitive.
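
As a small illustration of the address format, here is a sketch that splits an http URL with Python's standard urllib.parse and normalizes the case-insensitive parts. The function name normalize_http_url is made up for this example. It also assumes the RFC2616 comparison rules that an omitted port means 80 and an empty abs_path means "/".

```python
from urllib.parse import urlsplit


def normalize_http_url(url):
    """Split an http URL into (scheme, host, port, abs_path, query)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()           # scheme is case-insensitive
    host = (parts.hostname or "").lower()   # host name is case-insensitive
    port = parts.port or 80                 # an omitted port means 80
    abs_path = parts.path or "/"            # an empty abs_path means "/"
    return scheme, host, port, abs_path, parts.query


a = normalize_http_url("HTTP://Example.COM/Docs/Index.html?lang=en")
b = normalize_http_url("http://example.com:80/Docs/Index.html?lang=en")
print(a == b)  # both spellings name the same resource
```

Note that the path and query are left untouched, since only the scheme and host are case-insensitive.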

HTTP Messages

An HTTP message is either a request or a response. Both types consist of a start-line, zero or more header fields, an empty line marking the end of the header fields (that is, a line with nothing before the CRLF), and a possibly empty message body. A conforming HTTP client must not add extra CRLFs before or after a request, and a server should ignore any empty lines it receives where it expects a start-line.

A header value does not include leading or trailing LWS (linear whitespace), meaning whitespace before the first non-whitespace character of the field value or after the last one. Such leading or trailing LWS may be removed without changing the semantics of the field value. Any LWS appearing between field-content may be replaced by a single SP (space). The order of header fields with different names is not significant, but it is considered good practice to send general headers first, then request or response headers, and entity headers last (so the protocol says).
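
The message layout described above can be sketched as a small parser. This is a minimal illustration rather than a complete implementation: the name parse_message is made up, the whole message is assumed to be in memory already, and no attempt is made to decode the body.

```python
def parse_message(raw):
    """Split a raw HTTP/1.1 message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # A line starting with SP or HT continues the previous header;
            # the LWS between the two pieces is replaced by a single SP.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip(" \t"))
            continue
        name, _, value = line.partition(":")
        # Leading and trailing LWS is not part of the field value.
        headers.append((name, value.strip(" \t")))
    return start_line, headers, body


raw = (b"GET /index.html HTTP/1.1\r\n"
       b"Host:   example.com  \r\n"
       b"Accept: text/html,\r\n"
       b"   text/plain\r\n"
       b"\r\n"
       b"hello")
start_line, headers, body = parse_message(raw)
print(start_line)
print(headers)
print(body)
```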

Request message

RFC2616 defines the HTTP Request message as follows:

  Request = Request-Line
            *(( general-header
             | request-header          ; headers that apply to this request
             | entity-header ) CRLF)   ; headers that describe the entity (body)
            CRLF
            [ message-body ]

An HTTP request message starts with a request line. The header fields begin on the second line, followed by an empty line that marks the end of the headers, and finally the optional message body.

The request line is defined as follows:

  // Request line definition
  Request-Line = Method SP Request-URI SP HTTP-Version CRLF

  // Method definition
  Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT" | extension-method

  // Definition of resource address
  Request-URI = "*" | absoluteURI | abs_path | authority   ( authority is used only by CONNECT )
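
To make the grammar concrete, the following sketch parses a request line (without its trailing CRLF) and reports which of the four Request-URI forms was used. The function name and the returned dictionary keys are invented for this example, and the URI form is detected with simple string checks rather than a full URI parser.

```python
KNOWN_METHODS = {"OPTIONS", "GET", "HEAD", "POST",
                 "PUT", "DELETE", "TRACE", "CONNECT"}


def parse_request_line(line):
    """Parse 'Method SP Request-URI SP HTTP-Version' into a dict."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError("expected exactly three SP-separated fields")
    method, uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("bad HTTP-Version: " + version)
    if uri == "*":
        form = "asterisk"       # applies to the server itself
    elif "://" in uri:
        form = "absoluteURI"    # typically sent to proxies
    elif uri.startswith("/"):
        form = "abs_path"       # the most common form
    else:
        form = "authority"      # only used by CONNECT
    return {
        "method": method,
        "uri": uri,
        "version": version,
        "form": form,
        # Anything outside the eight listed methods is an extension-method.
        "extension_method": method not in KNOWN_METHODS,
    }


print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_request_line("CONNECT example.com:443 HTTP/1.1"))
```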

The headers used in a Request message can be general-headers, request-headers, or entity-headers (explained later). One special case is Host, which the receiver of the Request message uses together with the Request-URI to determine which resource is being requested, according to these rules:

  1. If the Request-URI is an absolute URI, the host is the one contained in the Request-URI. Any Host header field value present in the request must be ignored.
  2. If the Request-URI is not an absolute URI and the request includes a Host header field, the host is determined by the value of the Host header field.
  3. If the host determined by rule 1 or rule 2 is not a valid host on the server, a 400 (Bad Request) error message must be returned.
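
The three rules above can be sketched as follows. The names effective_host and BadRequest are hypothetical, headers are assumed to be a dict with lower-cased field names, and IPv6 literals in the Host header are not handled. A request with no Host header at all is also rejected here, which is what RFC2616 requires of HTTP/1.1 requests.

```python
from urllib.parse import urlsplit


class BadRequest(Exception):
    """Stands in for sending a 400 (Bad Request) response."""


def effective_host(request_uri, headers, valid_hosts):
    """Return the host a request is addressed to, or raise BadRequest."""
    if "://" in request_uri:
        # Rule 1: an absolute URI wins; any Host header is ignored.
        host = urlsplit(request_uri).hostname
    else:
        # Rule 2: otherwise the Host header decides (an optional port is dropped).
        host = headers.get("host", "").split(":")[0].strip() or None
    # Rule 3: a host this server does not serve is a client error.
    if host is None or host.lower() not in valid_hosts:
        raise BadRequest("400 Bad Request")
    return host.lower()


hosts = {"a.example", "b.example"}
print(effective_host("http://a.example/x", {"host": "b.example"}, hosts))
print(effective_host("/x", {"host": "B.example:8080"}, hosts))
```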

Response message

The response message is almost identical to the request message and is defined as follows:

  Response = Status-Line
             *(( general-header
              | response-header
              | entity-header ) CRLF)
             CRLF
             [ message-body ]

As you can see, apart from using the response-header in place of the request-header, the only difference is the first line. The first line of the response message is the status line, which contains the famous return code.

The Status-Line consists of the protocol version, then the return code, and finally the reason phrase, each separated by a space, with a CRLF terminating the line. The definition is as follows:

  Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
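
A minimal sketch of reading a status line, assuming the trailing CRLF has already been removed. The function name parse_status_line is made up for this example. Because the reason phrase may itself contain spaces, the line is split at most twice.

```python
def parse_status_line(line):
    """Parse 'HTTP-Version SP Status-Code SP Reason-Phrase'."""
    # Split on the first two spaces only; the reason phrase keeps its spaces.
    version, code, reason = line.split(" ", 2)
    if not version.startswith("HTTP/"):
        raise ValueError("bad HTTP-Version: " + version)
    if not (len(code) == 3 and code.isascii() and code.isdigit()):
        raise ValueError("Status-Code must be exactly three digits")
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 404 Not Found"))
```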

Return Code

The return code is a 3-digit number. The first digit defines the category of the return code. There are 5 categories in total, which are:

  - 1xx: Informational - Request received, continuing process
  - 2xx: Success - The action was successfully received, understood, and accepted
  - 3xx: Redirection - Further action must be taken in order to complete the request
  - 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
  - 5xx: Server Error - The server failed to fulfill an apparently valid request

RFC2616 then lists a set of specific return codes, the ones we use every day, together with example reason phrases. Return codes are extensible, and HTTP1.1 does not require the communicating parties to understand every one of them. What they must understand is the class of a return code, as given by the five categories above. In other words, the first digit must be interpreted strictly as described in the document, while the remaining digits are open to extension.

A recipient that gets an unknown return code xyz treats it as equivalent to x00 of the same class. Response messages with unknown return codes must not be cached.
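
The fallback rule can be sketched like this. The set KNOWN_CODES is a hypothetical list of the codes one particular client understands, and interpret_code is an invented name. The third element of the result only says whether the code was recognized. An unrecognized response must not be cached, while a recognized one still needs the usual caching checks.

```python
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

# Hypothetical: the codes this particular client knows how to handle.
KNOWN_CODES = {200, 204, 301, 304, 400, 404, 500}


def interpret_code(code):
    """Return (effective_code, class_name, recognized)."""
    class_name = CLASSES.get(code // 100)
    if class_name is None:
        raise ValueError("first digit must be 1-5")
    if code in KNOWN_CODES:
        return code, class_name, True
    # Unknown xyz is handled as x00 of the same class and is not cacheable.
    return (code // 100) * 100, class_name, False


print(interpret_code(404))
print(interpret_code(418))
```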

Header

RFC2616 defines four types of headers. New header fields can be given request-header semantics if all parties to the communication recognize them as such, but request headers can be extended reliably only together with a change in the protocol version. A header that the receiver does not recognize is treated as an entity header. The four types of headers are as follows:

1. General Header Fields: apply to both request and response messages, but describe only the message being transmitted, not the entity being transferred.

  general-header = Cache-Control      ; Section 14.9
                 | Connection         ; Section 14.10
                 | Date               ; Section 14.18
                 | Pragma             ; Section 14.32
                 | Trailer            ; Section 14.40
                 | Transfer-Encoding  ; Section 14.41
                 | Upgrade            ; Section 14.42
                 | Via                ; Section 14.45
                 | Warning            ; Section 14.46

2. Request Header Fields: Headers used by the request initiator to change the request behavior.

  request-header = Accept               ; Section 14.1
                 | Accept-Charset       ; Section 14.2
                 | Accept-Encoding      ; Section 14.3
                 | Accept-Language      ; Section 14.4
                 | Authorization        ; Section 14.8
                 | Expect               ; Section 14.20
                 | From                 ; Section 14.22
                 | Host                 ; Section 14.23
                 | If-Match             ; Section 14.24
                 | If-Modified-Since    ; Section 14.25
                 | If-None-Match        ; Section 14.26
                 | If-Range             ; Section 14.27
                 | If-Unmodified-Since  ; Section 14.28
                 | Max-Forwards         ; Section 14.31
                 | Proxy-Authorization  ; Section 14.34
                 | Range                ; Section 14.35
                 | Referer              ; Section 14.36
                 | TE                   ; Section 14.39
                 | User-Agent           ; Section 14.43

3. Response Header Fields: Used by the server to further describe the resource.

  response-header = Accept-Ranges       ; Section 14.5
                  | Age                 ; Section 14.6
                  | ETag                ; Section 14.19
                  | Location            ; Section 14.30
                  | Proxy-Authenticate  ; Section 14.33
                  | Retry-After         ; Section 14.37
                  | Server              ; Section 14.38
                  | Vary                ; Section 14.44
                  | WWW-Authenticate    ; Section 14.47

4. Entity Header Fields: If the message has a message body, the entity header is used as meta-information; if there is no message body, it is used to describe the information of the requested resource.

  entity-header = Allow             ; Section 14.7
                | Content-Encoding  ; Section 14.11
                | Content-Language  ; Section 14.12
                | Content-Length    ; Section 14.13
                | Content-Location  ; Section 14.14
                | Content-MD5       ; Section 14.15
                | Content-Range     ; Section 14.16
                | Content-Type      ; Section 14.17
                | Expires           ; Section 14.21
                | Last-Modified     ; Section 14.29
                | extension-header
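
The four tables can be turned into a small lookup. This is only a sketch with an invented function name, classify_header. Field names are compared in lower case because HTTP header names are case-insensitive, and anything not listed falls through to the entity-header category, as described above for unrecognized headers.

```python
GENERAL = {"cache-control", "connection", "date", "pragma", "trailer",
           "transfer-encoding", "upgrade", "via", "warning"}
REQUEST = {"accept", "accept-charset", "accept-encoding", "accept-language",
           "authorization", "expect", "from", "host", "if-match",
           "if-modified-since", "if-none-match", "if-range",
           "if-unmodified-since", "max-forwards", "proxy-authorization",
           "range", "referer", "te", "user-agent"}
RESPONSE = {"accept-ranges", "age", "etag", "location", "proxy-authenticate",
            "retry-after", "server", "vary", "www-authenticate"}


def classify_header(name):
    """Return which of the four RFC2616 header types a field belongs to."""
    key = name.lower()
    if key in GENERAL:
        return "general-header"
    if key in REQUEST:
        return "request-header"
    if key in RESPONSE:
        return "response-header"
    # The listed entity headers and every unrecognized (extension) header
    # are both handled as entity headers.
    return "entity-header"


for field in ("Cache-Control", "User-Agent", "ETag", "Content-Type", "X-Custom"):
    print(field, "->", classify_header(field))
```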

Message Body and Entity Body

If there is a Transfer-Encoding header, the message body is the encoded form of the entity body, and the entity body is obtained by decoding it. If there is no Transfer-Encoding header, the message body is the entity body.

  message-body = entity-body
               | <entity-body encoded as per Transfer-Encoding>

In a request message, the presence of a Content-Length or Transfer-Encoding header indicates that a message body follows. If the request method does not allow a message body (such as TRACE), the request message must not contain one. If the client sends a body with a method that defines no meaning for it, the server should ignore the body when handling the request.

In a response message, whether there is a message body depends on the request method and the return code. For example, responses to a HEAD request and 1xx, 204, and 304 responses never have a message body.

Length of the message body

The message body length is determined by the following rules, which are executed in order:

  1. Any Response message that must not include a message body (such as 1xx, 204, and 304 responses, and any response to a HEAD request) is always terminated by the first empty line after the header fields.
  2. If the message header contains Transfer-Encoding and its value is not identity, the length of the message body is determined by chunked decoding, unless the message is terminated by closing the connection.
  3. If there is a Content-Length in the message header, its value represents both entity-length and transfer-length. Content-Length must not be sent when the two lengths differ, which is the case when Transfer-Encoding is used. If a message arrives with both headers, Content-Length must be ignored.
  4. If the media type of the message is multipart/byteranges and transfer-length is not otherwise specified, the transfer length is defined by this self-delimiting media type. The sender must not use it unless it knows the receiver can parse it. A Range header with multiple byte-range specifiers in a request from an HTTP1.1 client means that the client can parse multipart/byteranges responses.
  5. If it is a Response message, the server can also end the message body by closing the connection. A request body cannot be ended this way, because the server would then have no way to send back a response.
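
The ordered rules above can be sketched as a single decision function. Everything here is illustrative: body_framing and the returned labels are invented names, headers are assumed to be a dict with lower-cased field names, and for a response the status code must be supplied. The last branch reflects the earlier statement that a request only has a body when Content-Length or Transfer-Encoding is present.

```python
def body_framing(headers, is_response, status=None, request_method="GET"):
    """Return (how_the_body_ends, length_or_None) for one message."""
    # Rule 1: responses that never carry a body end at the first empty line.
    if is_response and (request_method == "HEAD"
                        or 100 <= status < 200
                        or status in (204, 304)):
        return ("none", 0)
    # Rule 2: a non-identity Transfer-Encoding means chunked framing,
    # and any Content-Length sent alongside it is ignored.
    coding = headers.get("transfer-encoding", "identity").strip().lower()
    if coding != "identity":
        return ("chunked", None)
    # Rule 3: Content-Length gives the length in octets.
    if "content-length" in headers:
        return ("content-length", int(headers["content-length"]))
    # Rule 4: multipart/byteranges delimits itself.
    if headers.get("content-type", "").lower().startswith("multipart/byteranges"):
        return ("self-delimiting", None)
    # Rule 5: only a response may be ended by closing the connection.
    if is_response:
        return ("until-close", None)
    return ("none", 0)


print(body_framing({"transfer-encoding": "chunked", "content-length": "5"}, True, 200))
print(body_framing({"content-length": "5"}, False))
print(body_framing({}, True, 200))
```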

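The entity body is obtained from the message body by decoding any Transfer-Encoding. Its data type is described by two headers, Content-Type and Content-Encoding (the latter is usually used for compression). A message with an entity body should include a Content-Type. If it does not, the receiver may guess the type from the content or from the URI, and if it still cannot tell, it should treat the body as application/octet-stream.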
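
One possible way to apply this fallback is sketched below. Guessing from the URI's file extension with Python's standard mimetypes module is just one strategy chosen for the example; the function name media_type is invented, headers are assumed to be a dict with lower-cased field names, and inspecting the content itself is left out.

```python
import mimetypes


def media_type(headers, uri):
    """Pick the media type of an entity body, guessing only when needed."""
    declared = headers.get("content-type")
    if declared:
        # Drop parameters such as "; charset=utf-8".
        return declared.split(";")[0].strip().lower()
    # No Content-Type: guess from the extension in the URI.
    guessed, _encoding = mimetypes.guess_type(uri)
    # Still unknown: fall back to opaque bytes.
    return guessed or "application/octet-stream"


print(media_type({"content-type": "Text/HTML; charset=utf-8"}, "/index"))
print(media_type({}, "/images/logo.png"))
print(media_type({}, "/download/blob"))
```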

HTTP Connection

HTTP1.1 uses persistent connections by default. The motivation is that a client often needs to request a large number of related resources from the server in a short period of time. Without persistent connections, a new connection must be established for each resource. HTTP runs on top of TCP, so every new connection costs a TCP three-way handshake, which wastes a great deal of resources.
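
How a peer might decide whether to keep the connection open can be sketched as follows. In HTTP1.1 the connection stays open unless one side sends "Connection: close". The HTTP/1.0 branch, where persistence has to be requested with "Connection: keep-alive", goes beyond what this article covers and is included only as a commonly seen convention. The function name is invented and headers are assumed to be a dict with lower-cased field names.

```python
def is_persistent(version, headers):
    """Decide whether the connection should stay open after this message."""
    tokens = {t.strip().lower()
              for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False            # either side may opt out explicitly
    if version == "HTTP/1.1":
        return True             # persistent by default
    # Assumption: older peers must ask for persistence explicitly.
    return "keep-alive" in tokens


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
```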

Persistent connections bring many benefits:

  1. Opening and closing fewer TCP connections saves CPU time and memory on the communicating parties and on the hosts in between.
  2. Requests can be pipelined, so the requester can send the next request without waiting for the previous response, making better use of a single TCP connection.
  3. Network congestion is reduced, because fewer packets are spent on opening TCP connections.
  4. Subsequent requests have less latency, because no time is spent on the TCP handshake.
  5. Errors can be reported without the penalty of closing the TCP connection.

HTTP1.1 servers should keep persistent connections open and rely on TCP flow control to ride out temporary overload rather than dropping connections. An HTTP1.1 client that sees an error status from the server while it is still sending a message body should immediately stop transmitting the body. There are many more details about HTTP connections, which will be discussed later.

WebSocket

Judging by the RFC publication dates, WebSocket is much newer: HTTP 1.1 was published in 1999, and WebSocket 12 years later. The opening of the WebSocket protocol states its purpose: to give browser-based programs that need two-way communication with a server a mechanism that does not rely on opening multiple HTTP connections or on long polling. That is why it was created.

To be continued

I originally planned to sort out the general details of HTTP and WebSocket in one article and then compare them. But as I was writing, I realized that the article might be too long and not very user-friendly, so I decided to write a second article. In the second article, I will describe the general situation of WebSocket and compare it with the applicable scenarios of HTTP.