Data not real-time enough? Try a long connection

Background

In some scenarios we need to obtain the latest data in real time: pushed messages and announcements, chat messages, real-time logs, academic status, and so on. All of these place high demands on how fresh the data is. For such scenarios polling is probably the most common approach, but long connections (WebSocket) and server push (SSE) are also available.

Polling

Polling obtains the latest data by sending the same HTTP request repeatedly, in a loop.

Short polling

Short polling is probably the most common way to refresh data in real time; when people talk about a polling solution, they usually mean short polling. The interface itself is implemented like any ordinary one. The only change is on the front end, which adds a timer or configures the polling parameters of useRequest. The principle is simple. As shown in the figure below, with HTTP/1.1 and above the TCP connection can be reused. HTTP/1.0 and below also work, but at a higher cost, because by default each request opens a new connection. The defining characteristic of short polling is that the server responds to every request immediately, and each request can be treated as a brand-new one.
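The timer-based approach can be sketched as follows. This is a minimal illustration, not a library API: startShortPolling, fetchLatest and onData are hypothetical names, and fetchLatest stands for whatever function performs the interface request.

```javascript
// Hypothetical helper: calls fetchLatest, passes the result to onData,
// then waits intervalMs before asking again.
function startShortPolling(fetchLatest, onData, intervalMs) {
  let stopped = false;
  let timer = null;

  async function tick() {
    if (stopped) return;
    try {
      const data = await fetchLatest();
      if (stopped) return; // stop() was called while the request was in flight
      onData(data);
    } catch (err) {
      console.error("poll failed:", err);
    }
    // Schedule the next request only after this one has finished,
    // so slow responses never overlap.
    if (!stopped) timer = setTimeout(tick, intervalMs);
  }

  tick();

  // The caller uses the returned function to stop polling.
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}

// Usage in a browser (the URL and render function are assumptions):
// const stop = startShortPolling(
//   () => fetch("/api/messages").then((r) => r.json()),
//   render,
//   5000
// );
```

Scheduling with setTimeout after each response, rather than setInterval, is a design choice of this sketch: it keeps a slow server from accumulating overlapping requests.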

Advantages and disadvantages of short polling

The biggest advantage of short polling is its simplicity. The front end sets a time interval and requests data on that schedule, while the server only needs to return the query result synchronously. The disadvantages are just as obvious:

  1. Too many useless requests: as the figure below shows, a request is sent at a fixed interval, and the interface may return the same data or an empty result every time. The server queries the database over and over, and the front end re-renders over and over.
  2. Unpredictable latency: if the data changes just after a polling request has finished, the client will not see the change until the next request, up to one full polling interval later.

Long polling

From the introduction above, we know that short polling has two main defects: too many useless requests, and unpredictable latency. Long polling was developed to address both.

In the above figure, after the client sends a request, the server finds that there is no new data. Instead of responding immediately, the server suspends the request. If there is still no update after a waiting period, it returns an empty result to the client. The waiting period is usually 30s or 60s; a timeout is needed mainly because a connection that carries no data for a long time may be closed by a gateway, some layer of middleware, or even the carrier. As soon as the client receives the reply, it sends a new request. The server again holds the request for a while. This time the data on the server is updated during the wait, so the server returns the latest data to the client. After getting the result, the client sends the next request, and so on.
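The exchange described above can be sketched in a few lines. This is a simplified in-process model, not a production implementation: LongPollHub and longPollLoop are hypothetical names, a null result stands for the empty response sent on timeout, and in a real service wait() would be called from an HTTP handler and request() would be an HTTP call.

```javascript
// Server side (sketch): suspend each request until data arrives or a timeout fires.
class LongPollHub {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.waiters = new Set();
  }

  // Called when a poll request arrives; resolves with data, or null on timeout.
  wait() {
    return new Promise((resolve) => {
      const waiter = (data) => {
        clearTimeout(timer);
        this.waiters.delete(waiter);
        resolve(data);
      };
      const timer = setTimeout(() => waiter(null), this.timeoutMs);
      this.waiters.add(waiter);
    });
  }

  // Called when the data changes; answers every suspended request.
  publish(data) {
    for (const waiter of [...this.waiters]) waiter(data);
  }
}

// Client side (sketch): send the next request as soon as the previous one returns.
async function longPollLoop(request, onData, shouldStop) {
  while (!shouldStop()) {
    try {
      const data = await request();
      if (data !== null) onData(data); // null means "timed out, nothing new"
    } catch (err) {
      // Back off briefly on errors so a failing server is not hammered.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```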

Advantages and Disadvantages of Long Polling

Long polling addresses both problems of short polling. First, the server does not respond while there is no new data, which avoids a large number of useless requests from the client. Second, the client sends the next request as soon as it receives a response, which gives better real-time performance. However, long polling has its own disadvantages:

  • High consumption of server resources: the server holds each client's request open, which ties up server resources. In some languages every HTTP connection is served by its own thread, so too many open connections consume a lot of server memory.
  • Poor fit for frequently updated data: if data changes often, requests are completed and re-created constantly, which consumes a lot of resources. HTTP can reuse the TCP connection, but the client still has to send a new request every time it receives data. Compared with WebSocket and SSE, this extra step of sending a new request still hurts both latency and performance.

From the above description, long polling appears to reduce both the number of requests and the delay. So is long polling simply better? Not really. Both polling methods have their advantages and disadvantages and suit different scenarios.

How to choose short polling or long polling?

Long polling, like other long-lived connections, is mostly used where operations are frequent, communication is point-to-point, and the number of connections is not too large. Every new TCP connection requires a three-way handshake, which takes time. If each operation had to connect first and then do its work, processing would be much slower. With a connection that stays open, the next operation can send its data packets directly without establishing a new TCP connection. Database connections are a typical example of long connections. Using short connections for frequent communication can lead to socket errors, and repeatedly creating sockets also wastes resources.

HTTP services such as websites generally use short polling, because long connections consume server resources, and for sites that serve tens of thousands or even hundreds of millions of clients, short connections are cheaper. With long connections and tens of thousands of simultaneous users, each occupying a connection, you can imagine how many resources would be tied up. Therefore, when concurrency is high but each user does not need to operate frequently, short connections are the better choice.

Long Connection

WebSocket

As mentioned above, long polling is not suitable for scenarios where data on the server is updated frequently. One solution to this problem is WebSocket. Put simply, WebSocket establishes a persistent long connection between the client and the server. The connection is full-duplex: both the client and the server can send messages to each other in real time. The following is a diagram of WebSocket:

WebSocket is very familiar to front-end developers, because both webpack and vite use it for HMR reloads. When the code changes, the project is recompiled and the browser is told to load the changed module; that notification is sent over WebSocket. As shown in the figure above, after the connection is established through a handshake (a protocol switch), the two parties keep a persistent connection. For historical reasons, WebSocket relies on HTTP to establish the connection, but the handshake request has distinctive features so that both the client and the server can recognize it and keep the connection open.

Request Features

Request header characteristics

  • The request must be a GET request over HTTP/1.1 or later
  • The Connection header must include Upgrade
  • The Upgrade header must be websocket
  • Sec-WebSocket-Key is a random 16-byte value encoded in base64
  • Sec-WebSocket-Protocol lists the subprotocols the client wants to use, such as binary or base64
  • Origin indicates the source of the request
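As an illustration, a server could check the mandatory request-side conditions like this. The function name isWebSocketUpgrade is hypothetical, and the sketch assumes a request object with method, httpVersion and lower-cased headers, which is the shape Node.js uses for incoming HTTP requests. It is not a complete validation of the handshake.

```javascript
// Sketch: returns true when a request looks like a WebSocket handshake.
function isWebSocketUpgrade(req) {
  const headers = req.headers || {};
  // Header values such as "keep-alive, Upgrade" are comma-separated,
  // case-insensitive token lists.
  const tokens = (value) =>
    String(value || "").toLowerCase().split(",").map((t) => t.trim());

  if (req.method !== "GET") return false;
  if (!(parseFloat(req.httpVersion) >= 1.1)) return false;
  if (!tokens(headers["connection"]).includes("upgrade")) return false;
  if (!tokens(headers["upgrade"]).includes("websocket")) return false;

  // 16 random bytes encode to 24 base64 characters.
  const key = String(headers["sec-websocket-key"] || "");
  return key.length === 24 && Buffer.from(key, "base64").length === 16;
}
```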

Response header characteristics

  • The status code is 101 (Switching Protocols)
  • Upgrade and Connection match the request headers, and Sec-WebSocket-Protocol is one of the subprotocols the client requested
  • Sec-WebSocket-Accept is derived from the Sec-WebSocket-Key in the request header
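For reference, RFC 6455 defines how Sec-WebSocket-Accept is derived: the server appends a fixed GUID to the received key, hashes the result with SHA-1, and base64-encodes the digest. A minimal Node.js sketch follows; the function name computeAccept is illustrative, and the sample key and expected value are the ones given in the RFC.

```javascript
const crypto = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derives the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash("sha1")
    .update(secWebSocketKey + WS_GUID)
    .digest("base64");
}

// Sample values from RFC 6455
console.log(computeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```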

Compatibility

The WebSocket protocol was born in 2008 and became an international standard (RFC 6455) in 2011. All modern browsers now support it.

Implementing a simple WebSocket

Based on native WebSocket, we implement a simple long connection.

Connect

// Only one WebSocket instance is needed for the connection
const ws = new WebSocket(`wss://${url}`);

Send Message

ws.send("This is a message: " + count);

Listening for messages

ws.onmessage = function (event) {
  console.log(event.data);
};

Close the connection

ws.close();

Using WebSocket in your project

In real projects, business requirements are rarely implemented directly on the native WebSocket API. Production use of WebSocket has to deal with the following issues:

  • Authentication: prevent malicious clients from connecting and receiving messages
  • Heartbeat: a client that disconnects unexpectedly leaves a dead connection that occupies server resources, and a connection that carries no messages for a long time may be closed by an intermediate gateway or the carrier
  • Login: when a connection is established, identify which user is connecting, whether they have permission, and which messages should be pushed to them
  • Logging: monitor connections and report errors
  • Admin console: make it easy to view the number of connected clients and the message volume
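Of the items above, the heartbeat is the easiest to illustrate. One possible server-side approach, sketched here under assumptions, is to record when each connection was last heard from and periodically close those that have been silent for too long. HeartbeatMonitor, touch and sweep are hypothetical names, and the connection IDs are whatever the application uses to identify a socket.

```javascript
// Sketch: tracks the last time each connection sent a message or heartbeat.
class HeartbeatMonitor {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // connectionId -> timestamp in ms
  }

  // Call on connect and whenever a message or heartbeat arrives.
  touch(connectionId, now = Date.now()) {
    this.lastSeen.set(connectionId, now);
  }

  // Returns the IDs that have been silent for longer than timeoutMs and
  // forgets them; the caller is expected to close those sockets.
  sweep(now = Date.now()) {
    const dead = [];
    for (const [id, seen] of this.lastSeen) {
      if (now - seen > this.timeoutMs) {
        dead.push(id);
        this.lastSeen.delete(id);
      }
    }
    return dead;
  }
}
```

On the client side, the counterpart would be a timer that sends a small message at an interval shorter than the timeout, which also keeps gateways from treating the connection as idle.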

Server-Sent Events (SSE)

SSE stands for Server-Sent Events and is part of the HTML5 specification. The specification is simple and has two main parts: the communication protocol between the server and the browser, and the EventSource object that JavaScript uses in the browser. The protocol is a simple one based on plain text. The Content-Type of the server's response is "text/event-stream", and the response body can be regarded as a stream of events. Each event consists of a type and data, and can carry an optional identifier. Events are separated by a blank line, where a line ending may be a carriage return, a line feed, or both ("\r\n"). The data of an event may span multiple lines.

Comparison with WebSocket

SSE                                         | WebSocket
One-way: only the server can send messages  | Bidirectional: client and server can both send data
Text data only                              | Both binary and text
Regular HTTP protocol                       | WebSocket protocol

Compatibility

Data Format

The SSE data sent by the server to the browser must be UTF-8 encoded text.

Response Headers

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

Data Transfer

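The stream that the server sends consists of messages separated by a blank line (\n\n). Each message consists of one or more lines of the form "field: value", each ending with \n. A long message can be split across several data lines.

Fields

  • data: the message payload
  • event: the event type; defaults to message when omitted
  • id: the event ID; when the browser reconnects, it sends the last ID it received in the Last-Event-ID header
  • retry: the reconnection delay in milliseconds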
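To make the wire format concrete, here is a small sketch of a serializer that turns one event into text/event-stream lines. The function name formatSseEvent and its parameter object are illustrative, not part of any standard API.

```javascript
// Sketch: serializes one event into the text/event-stream wire format.
function formatSseEvent({ event, id, retry, data = "" }) {
  let out = "";
  if (event !== undefined) out += `event: ${event}\n`;
  if (id !== undefined) out += `id: ${id}\n`;
  if (retry !== undefined) out += `retry: ${retry}\n`;
  // Each line of the payload needs its own "data:" line.
  for (const line of String(data).split("\n")) {
    out += `data: ${line}\n`;
  }
  return out + "\n"; // the blank line terminates the message
}

// formatSseEvent({ event: "update", id: 7, data: "a\nb" })
// produces "event: update\nid: 7\ndata: a\ndata: b\n\n"
```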

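Example

In the stream below, \n marks a line break. A line that starts with a colon is a comment and is often used as a heartbeat packet. The retry line tells the browser to wait 1000 ms before reconnecting. An event line must belong to the same message as its data lines, so it ends with a single \n rather than a blank line.

: this is a test stream\n\n
retry: 1000\n\n
event: custom message\n
data: some text\n\n
data: another message\n
data: with two lines\n\n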
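The browser's EventSource does this parsing for you, but a simplified parser helps show how the format is read. The sketch below is an approximation under stated assumptions: parseSse is a hypothetical name, it works on a complete string rather than a live stream, and it ignores the retry field.

```javascript
// Sketch: parses a complete text/event-stream string into an array of events.
function parseSse(text) {
  const events = [];
  let current = { event: "message", id: undefined, data: [] };

  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === "") {
      // A blank line ends the message; messages without data are dropped.
      if (current.data.length > 0) {
        events.push({
          event: current.event,
          id: current.id,
          data: current.data.join("\n"),
        });
      }
      // The last seen ID carries over to later events.
      current = { event: "message", id: current.id, data: [] };
      continue;
    }
    if (line.startsWith(":")) continue; // comment line, e.g. a heartbeat

    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space

    if (field === "event") current.event = value;
    else if (field === "data") current.data.push(value);
    else if (field === "id") current.id = value;
    // retry and unknown fields are ignored in this sketch
  }
  return events;
}
```

Feeding it a comment, a retry line, one named event and one two-line message yields two events: the comment and the retry line produce none because they carry no data.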

Implement a simple SSE

Web

Instantiate EventSource and listen for open, message, and error

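// withCredentials sends cookies on cross-origin requests
const source = new EventSource(url, { withCredentials: true });

// Listen for the connection being opened
source.onopen = function (event) {
  // connection established
};

// Listen for messages (events without an event field)
source.onmessage = function (event) {
  // handle message: the payload is in event.data
};
// Equivalent to onmessage; use one form or the other, not both
source.addEventListener('message', function (event) {
  // handle message
}, false);

// Listen for errors
source.onerror = function (event) {
  // handle error
};
// Equivalent to onerror
source.addEventListener('error', function (event) {
  // handle error
}, false);

// Close the connection
source.close();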

Server

Taking Node.js as an example, the server code is no different from handling an ordinary request, and no additional library is needed.

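const http = require("http");

http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    // A wildcard origin does not work for requests sent with credentials
    "Access-Control-Allow-Origin": "*",
  });
  // Ask the browser to wait 10 s before reconnecting
  res.write("retry: 10000\n\n");
  // A named event: the event line and its data lines form one message
  res.write("event: connecttime\n");
  res.write("data: " + new Date() + "\n");
  res.write("data: " + new Date() + "\n\n");

  // Simulate receiving a message and push it to the client
  const interval = setInterval(function () {
    res.write("data: " + new Date() + "\n\n");
  }, 1000);

  // Stop pushing when the client disconnects
  req.on("close", () => clearInterval(interval));
}).listen(8844); // port chosen to match the XMLHttpRequest example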

Unlike WebSocket, SSE is not a new communication protocol. In essence it is an ordinary HTTP request whose response has a dedicated Content-Type and is kept open while the server keeps writing to it. The effect of SSE can therefore also be simulated with an ordinary interface. Take XMLHttpRequest as an example.

const xhr = new XMLHttpRequest();
xhr.open("GET", "http://localhost:8844/long", true);
xhr.onload = (e) => {
  console.log("onload", xhr.responseText);
};
xhr.onprogress = (e) => {
  // Every time the server writes response data, it will be transmitted and an onprogress event will be generated
  console.log("onprogress", xhr.responseText);
};
xhr.send();

