The network protocols behind server push, online gaming, and email

Earlier articles covered network protocols in general. This one takes a closer look at a few key protocols and the roles they play in different applications, with a focus on how they shape the way we communicate and interact on the Internet. We will cover the following areas:

WebSocket

In the previous discussion, we looked at HTTP and its role in the typical request-response interaction between a client and a server. HTTP performs well in most situations, especially when the response is immediate. However, in situations where the server needs to proactively push updates to the client, especially when those updates depend on events that the client cannot predict (such as actions by other users), HTTP may not be the most efficient approach. This is because HTTP is fundamentally a pull-based protocol, where the client must initiate all requests. So, how do you get the server to push data to the client without requiring the client to predict and request each update? There are typically four ways to handle this push-type communication, as shown in the following figure.

1. Short polling

This is the most basic approach. The client, usually a web application running in the browser, repeatedly sends HTTP requests to the server. Imagine that we log into a web application and are asked to scan a QR code with our smartphone. The QR code usually stands for a specific action, such as authenticating or starting a process. The web application does not know when we will scan the code, so it sends a request to the server every 1-2 seconds to check the code's status. Once we scan the QR code, the server registers the scan and returns the updated status in its response to the web application's next check. This way, the page learns about the scan within 1-2 seconds. Because the checks are so frequent, this approach is called "short polling".

There are two problems with this approach:

  • It sends a large number of HTTP requests, occupying bandwidth and increasing server load.
  • In the worst case, we might have to wait up to 2 seconds to receive a response, causing a noticeable delay.
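The short-polling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a real HTTP client: `make_fake_server` is a hypothetical stand-in for the status endpoint, and the status strings "pending" and "scanned" are assumptions made for the example.

```python
import time
from typing import Callable, Tuple


def short_poll(check_status: Callable[[], str],
               interval_s: float = 2.0,
               max_attempts: int = 30,
               sleep: Callable[[float], None] = time.sleep) -> Tuple[str, int]:
    """Ask for the status every `interval_s` seconds until it is no longer pending.

    Returns the final status and the number of requests that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        status = check_status()  # in a real client this is one HTTP request
        if status != "pending":
            return status, attempt
        sleep(interval_s)  # wait before asking again
    return "timeout", max_attempts


def make_fake_server(scanned_on_request: int) -> Callable[[], str]:
    """Hypothetical status endpoint: reports 'scanned' from the N-th request on."""
    calls = 0

    def check_status() -> str:
        nonlocal calls
        calls += 1
        return "scanned" if calls >= scanned_on_request else "pending"

    return check_status


if __name__ == "__main__":
    status, requests_sent = short_poll(make_fake_server(4), interval_s=0.01)
    print(status, requests_sent)
```

Every loop iteration costs one request even when nothing has changed, which is exactly the bandwidth and latency trade-off listed above.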

2. Long polling

Long polling addresses the drawbacks of short polling by having the server hold each request open until it has something to report or a longer timeout expires. Suppose we set the timeout to 30 seconds. If we scan the QR code within that window, the server responds immediately; otherwise the request times out and the client sends a new one. This approach significantly reduces the number of HTTP requests.

However, long polling is not without its challenges. Even though long polling reduces the number of requests, each open request still requires a connection to the server. If there are many clients, this can put a strain on server resources.
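On the server side, long polling amounts to blocking the request until the event happens or the timeout expires. The sketch below models that with a `threading.Event`; the class `QrCodeStatus` and its method names are hypothetical, and a real server would sit behind an HTTP framework rather than being called directly.

```python
import threading


class QrCodeStatus:
    """Hypothetical server-side state for a single QR code."""

    def __init__(self) -> None:
        self._scanned = threading.Event()

    def mark_scanned(self) -> None:
        """Called when the smartphone reports the scan."""
        self._scanned.set()

    def handle_long_poll(self, timeout_s: float = 30.0) -> str:
        """Hold the request until the code is scanned or the timeout expires."""
        if self._scanned.wait(timeout=timeout_s):
            return "scanned"
        # Nothing happened in time; the client is expected to ask again.
        return "pending"


if __name__ == "__main__":
    status = QrCodeStatus()
    # Simulate a scan arriving shortly after the request was opened.
    threading.Timer(0.05, status.mark_scanned).start()
    print(status.handle_long_poll(timeout_s=2.0))
```

While `handle_long_poll` is waiting, the request occupies a connection (and here a thread), which is the resource cost mentioned above.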

3. WebSocket

Short polling and long polling are suitable for simple tasks, such as scanning QR codes. But for complex, data-intensive tasks that require real-time interaction, such as online games, a more efficient solution is needed - that is WebSocket.

TCP inherently allows for bidirectional data flow, enabling clients and servers to send data to each other simultaneously. However, HTTP/1.1, which is based on TCP, does not fully utilize this capability. In HTTP/1.1, data transmission is usually sequential - one party sends data, and then the other party responds. This design is sufficient for web page interactions, but it is insufficient for applications such as online games that require real-time interaction. WebSocket is another TCP-based protocol that allows full-duplex communication over a single connection, filling this gap. We will introduce it in detail later.

4. SSE (Server-Sent Events)

SSE, or Server-Sent Events, is suitable for specific use cases. When a client establishes an SSE connection, the server keeps the connection open and continuously sends updates over it. This setup is ideal when the server needs to push data to the client repeatedly and the client only needs to receive it, without sending anything back. A typical example is real-time stock market data: with SSE, the server pushes new data to the client every time there is an update, without the client having to send a request for each one. Note that unlike WebSocket, SSE does not support two-way communication, so it is a poor fit for use cases that require two-way interaction.
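On the wire, an SSE response is a long-lived `text/event-stream` body made of small text blocks, each ending with a blank line. The helper below is a minimal sketch of that framing; the function name `format_sse` and the stock-quote payload are assumptions made for illustration.

```python
import json
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Event as it appears in the response body."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload is sent as several consecutive "data:" fields.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical stock update pushed by the server.
    quote = json.dumps({"symbol": "ACME", "price": 101.5})
    print(format_sse(quote, event="quote"), end="")
```

The server writes one such block per update to the open connection; the client only reads, which matches the one-way nature of SSE.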

How to establish a WebSocket connection

To establish a WebSocket connection, the client includes specific fields in the HTTP request header that ask the server to switch to the WebSocket protocol. Along with them, it sends a randomly generated, Base64-encoded key (Sec-WebSocket-Key) to the server.

Request header:

 Connection: Upgrade
 Upgrade: WebSocket
 Sec-WebSocket-Key: T2a6wZlAwhgQNqruZ2YUyg==

Server response headers:

 HTTP/1.1 101 Switching Protocols
 Sec-WebSocket-Accept: iBJKv/ALIW2DobfoA4dmr3JHBCY=
 Upgrade: WebSocket
 Connection: Upgrade

Status code 101 indicates that the protocol is switching. After this additional handshake, the WebSocket connection is established, as shown in the following figure:

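The article does not spell out how the server arrives at Sec-WebSocket-Accept. Under RFC 6455, it appends a fixed GUID to the client's Sec-WebSocket-Key, hashes the result with SHA-1, and Base64-encodes the digest. The sketch below shows that computation; the function name `compute_accept` is an assumption, and the sample key in the demo is the one used in RFC 6455 itself rather than the key from the headers above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key from RFC 6455, section 1.3.
    print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

The client repeats the same computation and rejects the connection if the values differ, which proves that the server actually understood the WebSocket upgrade request.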

WebSocket Messages

Once HTTP is upgraded to WebSocket, the client and server will exchange data in frames. Let's take a look at what the data looks like:

The operation code (Opcode) is a 4-bit field that indicates the type of frame data.

  • "1" indicates a text frame.
  • "2" indicates a binary frame.
  • "8" is the signal to close the connection.

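The payload length starts as a 7-bit field. Values from 0 to 125 are the length itself. A value of 126 means the next 16 bits hold the length, and a value of 127 means the next 64 bits do. With the 64-bit extended length, a single frame can in theory describe a payload of up to 2^63 - 1 bytes (the most significant bit must be 0), which is roughly 9 exabytes.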
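The opcode and payload length fields described above can be read with a few bit operations. The parser below is a minimal sketch that assumes the whole frame is already in memory and ignores extensions and fragmentation; the names `Frame` and `parse_frame` are hypothetical. It also handles the masking key that RFC 6455 requires on client-to-server frames, which the text above does not cover.

```python
import struct
from typing import NamedTuple


class Frame(NamedTuple):
    fin: bool       # True if this is the final fragment of a message
    opcode: int     # 1 = text, 2 = binary, 8 = close
    payload: bytes


def parse_frame(buf: bytes) -> Frame:
    """Parse one complete WebSocket frame held in `buf`."""
    first, second = buf[0], buf[1]
    fin = bool(first & 0x80)
    opcode = first & 0x0F            # low 4 bits of the first byte
    masked = bool(second & 0x80)
    length = second & 0x7F           # 7-bit payload length
    offset = 2
    if length == 126:                # 16-bit extended payload length
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
    elif length == 127:              # 64-bit extended payload length
        (length,) = struct.unpack_from("!Q", buf, offset)
        offset += 8
    mask = b""
    if masked:                       # 4-byte masking key precedes the payload
        mask = buf[offset:offset + 4]
        offset += 4
    payload = buf[offset:offset + length]
    if masked:
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return Frame(fin, opcode, payload)


if __name__ == "__main__":
    # Unmasked single-frame text message "Hello" (example from RFC 6455).
    print(parse_frame(bytes.fromhex("810548656c6c6f")))
```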

WebSocket is suitable for scenarios such as online games, chat rooms, and collaborative editing applications that require frequent interactions between clients and servers.

RPC

RPC (Remote Procedure Call) lets a program execute a function that lives in a different service. From the calling program's perspective, it looks like a local function call. The following figure shows the difference between local procedure calls and remote procedure calls. Modules such as order management and payment can be deployed in the same process or on different servers. When they share a process, a call between them is a local function call. When they run on different servers, it is a remote procedure call.

Why do we need RPC? Can't we use HTTP to communicate between services? Let's compare RPC and HTTP in the following table.

The main advantages of RPC over plain HTTP APIs are a lightweight message format and better performance. gRPC is one example: it runs on HTTP/2 and encodes messages in a compact binary format, both of which contribute to its performance.

To see how RPC works in practice, let's walk through the gRPC flow step by step:

  • Step 1: The client initiates a REST call. The request body is usually expressed in JSON format.
  • Steps 2-4: After receiving the REST call, the order service (acting as a gRPC client) converts it into the appropriate format and initiates an RPC call to the payment service. The gRPC client stub encodes the request into a binary format and hands it to the underlying transport layer.
  • Step 5: gRPC sends the data packet over the network via HTTP/2. Thanks to binary encoding and network optimizations, gRPC is said to be up to five times faster than JSON.
  • Steps 6-8: The payment service (acting as a gRPC server) receives the packet, decodes it, and calls the server application.
  • Steps 9-11: The result returned by the server application is encoded and sent back to the transport layer.
  • Steps 12-14: After receiving the data packet, the order service decodes it and sends the result to the client application.
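The encode, send, decode, and call sequence in the steps above can be imitated with a toy stub. This is not gRPC and does not use Protocol Buffers: it packs a request with Python's `struct` module, and a direct function call stands in for the HTTP/2 transport. All names (`PaymentStub`, `server_handle`, `charge`, the method ID) are hypothetical.

```python
import struct
from typing import Callable, Dict

CHARGE_METHOD_ID = 1
REQUEST_FORMAT = "!BIQ"   # method id (1 byte), order id (4 bytes), amount (8 bytes)
RESPONSE_FORMAT = "!?"    # a single boolean


def charge(order_id: int, amount_cents: int) -> bool:
    """The 'server application': a plain function inside the payment service."""
    return amount_cents > 0


HANDLERS: Dict[int, Callable[[int, int], bool]] = {CHARGE_METHOD_ID: charge}


def server_handle(request: bytes) -> bytes:
    """Server side: decode the request, call the application, encode the result."""
    method_id, order_id, amount_cents = struct.unpack(REQUEST_FORMAT, request)
    ok = HANDLERS[method_id](order_id, amount_cents)
    return struct.pack(RESPONSE_FORMAT, ok)


class PaymentStub:
    """Client stub: makes the remote call look like a local method call."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport  # stands in for the network transport

    def charge(self, order_id: int, amount_cents: int) -> bool:
        request = struct.pack(REQUEST_FORMAT, CHARGE_METHOD_ID, order_id, amount_cents)
        response = self._transport(request)   # send bytes, receive bytes
        (ok,) = struct.unpack(RESPONSE_FORMAT, response)
        return ok


if __name__ == "__main__":
    stub = PaymentStub(transport=server_handle)
    print(stub.charge(order_id=42, amount_cents=1999))
```

From the order service's point of view, `stub.charge(...)` reads like a local call, while the request itself travels as 13 bytes of binary data rather than a JSON document.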
