Understand TCP Packet Sticking and Unpacking in One Minute

You may have run into this during network programming: the client sends a long series of messages, but the server receives them stuck together or split apart, which makes it hard to interpret them correctly.

For example, one day you are craving milk tea. You browse the delivery app, and the milk tea from Yidiandian looks good (Yidiandian, hurry up and pay me for the plug, doge), so you send a message to the group chat, hoping a few people will join your order:

Yidiandian milk tea, anyone?

A colleague in the group replies:

Isn't it three o'clock now?

Puzzled, you look at your colleague's phone. He has received two messages:

One o'clock

Ordering milk tea, anyone?

(In Chinese, the first two characters of the brand name Yidiandian read as "one o'clock" on their own, and the remaining character means "to order", so the message was cut in the middle of the brand name.)

Haha, that was just a joke. In technical terms, this phenomenon is called "unpacking": one message is split into several pieces. Let's continue.

TCP Packet Sticking and Unpacking Phenomenon

Packet sticking and unpacking can occur at the data link layer, the network layer, and the transport layer, but it is usually the application layer that has to deal with it. Most of our day-to-day network application development works against the transport layer, so this article focuses on packet sticking and unpacking at the transport layer.

There are two transport layer protocols that we are all familiar with: UDP and TCP. UDP preserves message boundaries (each datagram is delivered as a whole), so it does not suffer from packet sticking and unpacking. The problem therefore only arises with TCP.

The following is a simple example to explain what is sticking and unpacking.

Assume that the client sends two data packets to the server in succession, packet1 and packet2. The server may then receive the data in one of four ways:

(1) In the first case, the server receives the two packets separately and in order. There is no packet sticking or unpacking.

(2) In the second case, the server receives only one data packet. Because TCP guarantees delivery, this packet contains the data of both packets sent by the client. This phenomenon is called packet sticking. Unless the client's packets follow a clearly defined format, the server cannot tell where one packet ends and the next begins, which makes the data hard to process.

(3) In the third case, the server receives three data packets, because packet1 has been split into two: packet1.1 and packet1.2. This phenomenon is called unpacking; its cause is discussed below. Split packets are just as hard for the server to process.

(4) In the fourth case, a large data packet is split into smaller pieces, and some of those pieces are stuck together with other packets. This is a combination of the sticking and unpacking described above.

The Reason for TCP Packet Sticking and Unpacking

TCP is a stream-oriented protocol, and a stream is a long run of binary data with no boundaries. As a transport layer protocol, TCP knows nothing about the meaning of the upper-layer business data; it divides the data according to the state of its own buffers. As a result, what the business considers one complete packet may be split by TCP into several packets for transmission, and several small packets may be combined into one larger packet. This is what causes packet sticking and unpacking.

For example, suppose the TCP buffer size is 1024 bytes. If the application sends only a small amount of data at a time, less than the buffer size, TCP may merge several sends into one and transmit them together. From the business point of view, this is "packet sticking".

If the application sends a large amount of data at once, more than the buffer size, TCP splits it into multiple packets for transmission. This is "unpacking": one large packet is split into several smaller ones.
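To make the buffer effect concrete, here is a minimal in-memory sketch. It does not use real sockets: the class StreamBoundaryDemo and its methods send and receive are hypothetical names, and the "wire" is just a byte array standing in for the TCP stream. It only illustrates that once bytes are written back to back, the receiver's buffer size decides how they are cut, not the sender's message boundaries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: simulates a boundary-free byte stream in memory.
final class StreamBoundaryDemo {

    // Messages written back to back become one continuous run of bytes.
    static byte[] send(String... packets) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        for (String packet : packets) {
            byte[] bytes = packet.getBytes(StandardCharsets.US_ASCII);
            wire.write(bytes, 0, bytes.length);
        }
        return wire.toByteArray();
    }

    // Drain the stream with a receive buffer of bufferSize bytes (must be > 0).
    // Each element of the result is what one read call handed to the application.
    static List<String> receive(byte[] wire, int bufferSize) {
        ByteArrayInputStream in = new ByteArrayInputStream(wire);
        List<String> chunks = new ArrayList<>();
        byte[] buf = new byte[bufferSize];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            chunks.add(new String(buf, 0, n, StandardCharsets.US_ASCII));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] wire = send("HELLO", "WORLD");
        // Large buffer: both messages arrive stuck together.
        System.out.println(receive(wire, 1024)); // prints [HELLOWORLD]
        // Small buffer: messages are cut at arbitrary points.
        System.out.println(receive(wire, 4));    // prints [HELL, OWOR, LD]
    }
}
```

A real TCP connection is less predictable than this simulation, since the split points also depend on network conditions, but the receiver faces the same problem: the chunk boundaries carry no meaning.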

Solutions to TCP Packet Sticking and Unpacking

TCP is stream-oriented, so packet sticking and unpacking will occur. How, then, can an application recover meaningful messages from this continuous stream of data? There are a few common methods:

(1) The sender adds a header to each data packet. The header contains at least the length of the packet, so after receiving data the receiver can determine where each packet ends by reading the length field in the header.

In other words, the actual length of each packet is written in front of the packet itself.
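The length-header approach can be sketched as follows. This is an illustrative sketch, not a production decoder: the class name LengthPrefixedFramer, the 4-byte big-endian length field, and the UTF-8 payload are all assumptions chosen for the example. The decoder keeps undecoded bytes between calls, so it copes with both stuck and split packets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: frames each message as [4-byte big-endian length][payload].
final class LengthPrefixedFramer {
    // Bytes received but not yet decoded (readable from position to limit).
    private ByteBuffer pending = ByteBuffer.allocate(0);

    // Sender side: write the payload length in front of the payload.
    static byte[] encode(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Receiver side: feed whatever chunk arrived; returns the messages
    // that are complete after this chunk, in order.
    List<String> feed(byte[] chunk) {
        ByteBuffer merged = ByteBuffer.allocate(pending.remaining() + chunk.length);
        merged.put(pending).put(chunk);
        merged.flip();

        List<String> messages = new ArrayList<>();
        while (merged.remaining() >= 4) {
            merged.mark();
            int length = merged.getInt();
            if (length < 0) {
                throw new IllegalStateException("negative length field: " + length);
            }
            if (merged.remaining() < length) {
                merged.reset(); // payload incomplete: rewind to the header and wait
                break;
            }
            byte[] payload = new byte[length];
            merged.get(payload);
            messages.add(new String(payload, StandardCharsets.UTF_8));
        }
        pending = merged; // keep the undecoded remainder for the next call
        return messages;
    }
}
```

A receiver would call feed with every chunk it reads from the socket and handle whatever complete messages come back. A real implementation would also cap the accepted length so that a corrupt header cannot force a huge allocation.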

(2) The sender pads every data packet to a fixed length (filling with zeros if the data is too short). The receiver then reads a fixed number of bytes from the receive buffer each time, and each read yields exactly one packet.

For example, if every packet has a fixed length of 4, the receiver simply cuts the stream every 4 bytes.
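A minimal sketch of the fixed-length approach, using the frame size of 4 from the example above. The class name FixedLengthFramer is hypothetical, and the sketch assumes ASCII payloads that never end in a zero byte, since trailing zeros are treated as padding.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: every frame occupies exactly FRAME_SIZE bytes.
final class FixedLengthFramer {
    static final int FRAME_SIZE = 4; // assumed size, taken from the example in the text

    // Sender side: pad the payload with zero bytes up to FRAME_SIZE.
    static byte[] encode(String message) {
        byte[] payload = message.getBytes(StandardCharsets.US_ASCII);
        if (payload.length > FRAME_SIZE) {
            throw new IllegalArgumentException("message longer than " + FRAME_SIZE + " bytes");
        }
        return Arrays.copyOf(payload, FRAME_SIZE); // copyOf fills the tail with zeros
    }

    // Receiver side: cut the stream every FRAME_SIZE bytes and strip the padding.
    // An incomplete trailing frame is ignored here; a real receiver would buffer it.
    static List<String> decode(byte[] stream) {
        List<String> messages = new ArrayList<>();
        for (int start = 0; start + FRAME_SIZE <= stream.length; start += FRAME_SIZE) {
            int end = start + FRAME_SIZE;
            while (end > start && stream[end - 1] == 0) {
                end--; // drop zero padding
            }
            messages.add(new String(stream, start, end - start, StandardCharsets.US_ASCII));
        }
        return messages;
    }
}
```

The trade-off is visible in encode: short messages waste space on padding, and anything longer than the frame size cannot be sent at all without a further convention.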

(3) A boundary marker, such as a special character, is placed between data packets so that the receiver can split the stream at each marker.

For example, the special character / can be appended to each packet.
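The delimiter approach might look like the sketch below, using / as the boundary character as in the example above. The class name DelimiterFramer is hypothetical, and for brevity the sketch works on text chunks that are assumed to be already decoded from bytes. It also assumes a message never contains the delimiter itself; a real protocol would need an escaping rule for that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: messages are separated by a '/' character.
final class DelimiterFramer {
    private static final char DELIMITER = '/';
    // Text received so far that has not yet been terminated by a delimiter.
    private final StringBuilder pending = new StringBuilder();

    // Sender side: append the delimiter after each message.
    static String encode(String message) {
        if (message.indexOf(DELIMITER) >= 0) {
            throw new IllegalArgumentException("message must not contain '" + DELIMITER + "'");
        }
        return message + DELIMITER;
    }

    // Receiver side: feed whatever chunk arrived; returns the messages
    // completed by this chunk, in order.
    List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> messages = new ArrayList<>();
        int end;
        while ((end = pending.indexOf(String.valueOf(DELIMITER))) >= 0) {
            messages.add(pending.substring(0, end));
            pending.delete(0, end + 1); // drop the message and its delimiter
        }
        return messages;
    }
}
```

Feeding the chunks "pack", "et1/packet2/pa" and "cket3/" in turn yields no message, then packet1 and packet2, then packet3, regardless of where the stream happened to be cut.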

How Does the Netty Framework Solve Packet Sticking and Unpacking?

As a high-performance Java network programming framework, Netty not only wraps Java NIO in a deep layer of abstraction, but also handles data transmission between the client and the server effectively.

As mentioned earlier, TCP transmission is subject to packet sticking and unpacking. Netty has several built-in data stream codecs to solve this problem: as long as the client and server transmit data according to an agreed rule, the matching codec restores the packet boundaries.

Netty provides several codecs out of the box:

(1) FixedLengthFrameDecoder: a fixed-length decoder

(2) DelimiterBasedFrameDecoder: a decoder that splits on a specified delimiter

(3) LengthFieldBasedFrameDecoder: a decoder based on a packet length field

(4) And others, which I will not list here.

Summary

TCP is a stream-oriented protocol, and a stream is a long run of binary data with no boundaries. During transmission, TCP splits or combines data packets according to network conditions. If the business does not define clear boundary rules, the application layer will run into packet sticking and unpacking.

The common solutions to the TCP packet sticking and unpacking problem are as follows:

(1) The sender adds a packet header to each data packet.

(2) The sender encapsulates each data packet into a fixed length.

(3) Boundaries can be set between data packets.

To deal with packet sticking and unpacking, the Netty framework also provides many out-of-the-box codecs, which makes such problems much easier to solve in network programming.
