Why does the TCP protocol have a sticky packet problem?

The TCP/IP protocol suite establishes a conceptual model of communication protocols in the Internet. The two main protocols in this protocol suite are TCP and IP. The TCP protocol in the TCP/IP protocol suite can guarantee the reliability and order of data segments. With a reliable transport layer protocol, the application layer protocol can directly use the TCP protocol to transmit data, and no longer needs to worry about the loss and duplication of data segments.

Figure 1 - TCP protocol and application layer protocol

The IP protocol solves the routing and transmission of data packets. The upper-layer TCP protocol no longer focuses on routing and addressing[^2]. The TCP protocol solves the problems of transmission reliability and order. The upper layer does not need to worry about whether the data can be transmitted to the target process. As long as the data is written into the TCP protocol buffer, the protocol stack can almost guarantee the delivery of the data.

When the application layer protocol uses the TCP protocol to transmit data, the TCP protocol may divide the data sent by the application layer into multiple segments and send them sequentially, and a segment received by the data receiver may contain multiple "application layer data packets". Therefore, when the application layer reads data from the TCP buffer and finds multiple messages "stuck" together, it must split the received data itself.
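The following minimal sketch makes this concrete (assuming a POSIX system with loopback networking; error handling is omitted for brevity). The sender performs two separate send() calls, yet a single recv() on the receiving side may return the bytes of both calls together:

    /* A minimal, self-contained sketch (POSIX sockets, error handling
     * omitted). The client writes two "messages" with separate send()
     * calls, but TCP only preserves byte order, not write boundaries,
     * so a single recv() may return both of them at once. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;                 /* let the kernel pick a free port */
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        socklen_t alen = sizeof(addr);
        getsockname(srv, (struct sockaddr *)&addr, &alen);
        listen(srv, 1);

        int cli = socket(AF_INET, SOCK_STREAM, 0);
        connect(cli, (struct sockaddr *)&addr, sizeof(addr));
        int conn = accept(srv, NULL, NULL);

        send(cli, "HELLO", 5, 0);          /* first "application packet"  */
        send(cli, "WORLD", 5, 0);          /* second "application packet" */

        char buf[64];
        ssize_t n = recv(conn, buf, sizeof(buf) - 1, 0);
        if (n > 0) {
            buf[n] = '\0';
            printf("recv returned %zd bytes: %s\n", n, buf);  /* often "HELLOWORLD" */
        }
        close(cli); close(conn); close(srv);
        return 0;
    }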

Packet sticking is not caused by the TCP protocol itself. It occurs because application layer protocol designers misunderstand the TCP protocol, ignore its definition, and lack experience in designing application layer protocols. This article starts from the TCP protocol and the application layer protocol to analyze how the packet sticking we often attribute to TCP actually occurs:

  • The TCP protocol is a byte stream-oriented protocol, which may combine or split the data of the application layer protocol;
  • The application layer protocol does not define message boundaries, which makes it impossible for the data receiver to splice the data;

Many people may think that packet sticking is a relatively low-level issue that is not even worth discussing, but in the author's opinion, this issue is still very interesting. Not everyone has systematically learned the design of application layer protocols based on TCP, and not everyone has a deep understanding of the TCP protocol. I believe that many people learn programming from the bottom up, so the author thinks this is a question worth answering. We should convey correct knowledge rather than negative and condescending emotions.

Byte stream oriented

The TCP protocol is a connection-oriented, reliable, byte-stream-based transport layer communication protocol[^3]. The data handed over to the TCP protocol by the application layer is not transmitted to the destination host in the form of messages. In some cases, the data will be combined into a data segment and sent to the target host.

The Nagle algorithm improves TCP transmission performance by reducing the number of small packets on the wire[^4]. Because network bandwidth is limited, it does not send small blocks of data directly to the destination host; instead, it lets data accumulate in the local buffer before sending. Although this batching strategy hurts real-time performance and increases latency, it reduces the likelihood of network congestion and cuts per-packet overhead.

In the early days of the Internet, Telnet was a widely used application. However, Telnet generates a large number of packets carrying only 1 byte of payload. Since each packet also carries roughly 40 bytes of TCP and IP headers, the bandwidth utilization is only 1/41 ≈ 2.44%. The Nagle algorithm was designed for exactly this scenario.

When the application layer protocol transmits data through the TCP protocol, the data to be sent is first written into the TCP buffer. If the Nagle algorithm is enabled (it is on by default), the TCP protocol may not send the written data immediately; it waits until the data in the buffer exceeds the maximum segment size (MSS), or until the previously sent segment has been ACKed, before sending the buffered data.
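As a simplified, hedged sketch of the classic rule from RFC 896 (the function and parameter names below are illustrative, not a real kernel or socket API):

    #include <stdbool.h>
    #include <stddef.h>

    /* Illustrative sketch of Nagle's classic rule (RFC 896); the names
     * are hypothetical, not a real kernel or socket API. */
    bool nagle_should_send_now(size_t buffered, size_t mss, bool unacked_data) {
        if (buffered >= mss)
            return true;    /* a full segment is ready: send it immediately  */
        if (!unacked_data)
            return true;    /* nothing in flight: a small segment may go out */
        return false;       /* otherwise wait for the ACK and keep batching  */
    }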

Figure 2 - Nagle's algorithm

Network congestion was a real constraint decades ago, but today's network bandwidth is no longer as scarce as it once was, and the latency introduced by batching is often unacceptable. Latency-sensitive applications therefore usually disable the Nagle algorithm explicitly by setting the TCP_NODELAY socket option:

    int flag = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));

The Linux kernel uses the tcp_nagle_test function shown below to test whether we should send the current TCP data segment. Interested readers can use this code as an entry point to learn more about the implementation of the Nagle algorithm today:

    static inline bool tcp_nagle_test(const struct tcp_sock *tp, const struct sk_buff *skb,
                                      unsigned int cur_mss, int nonagle)
    {
        /* An explicit push (e.g. TCP_NODELAY) bypasses the Nagle check. */
        if (nonagle & TCP_NAGLE_PUSH)
            return true;

        /* Urgent data and the final FIN are always sent immediately. */
        if (tcp_urg_mode(tp) || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
            return true;

        /* Otherwise apply the Nagle rule to segments smaller than the MSS. */
        if (!tcp_nagle_check(skb->len < cur_mss, tp, nonagle))
            return true;

        return false;
    }

The Nagle algorithm can indeed improve the utilization of network bandwidth and reduce the additional overhead caused by TCP and IP protocol headers when the data packets are small. However, using this algorithm may also cause data written multiple times by the application layer protocol to be merged or split and sent. When the receiver reads the data from the TCP protocol stack, it will find that unrelated data appears in the same data segment, and the application layer protocol may not be able to split and reassemble them.

In addition to the Nagle algorithm, the TCP protocol stack provides another option for delaying the sending of data: TCP_CORK. If we turn on this option and the data to be sent is smaller than the MSS, the TCP protocol will delay sending the data for up to 200 ms, or until the data in the buffer exceeds the MSS[^5].
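As a Linux-specific, hedged sketch (the helper name send_corked is made up for illustration; fd is assumed to be an already connected TCP socket), TCP_CORK is typically set before writing several related pieces of data and cleared afterwards to flush them out as a few full segments:

    #include <netinet/in.h>
    #include <netinet/tcp.h>   /* TCP_CORK is Linux-specific */
    #include <sys/socket.h>
    #include <unistd.h>

    /* 'fd' is assumed to be an already connected TCP socket. */
    void send_corked(int fd, const char *hdr, size_t hlen,
                     const char *body, size_t blen) {
        int on = 1, off = 0;
        setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
        write(fd, hdr, hlen);    /* buffered, not sent immediately */
        write(fd, body, blen);   /* coalesced with the header      */
        setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));  /* flush */
    }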

Both the Nagle algorithm and TCP_CORK improve bandwidth utilization by delaying data transmission, and both may split or merge the data written by the application layer protocol. The fundamental reason these mechanisms and configurations can exist is that the TCP protocol is a byte stream-based protocol: it has no concept of data packets and does not send data packet by packet.

Message Boundaries

If we have systematically studied the TCP protocol and the design of TCP-based application layer protocols, then designing an application layer protocol whose data can be arbitrarily split and reassembled by the TCP protocol stack is not a problem. Since the TCP protocol is based on byte streams, this means the application layer protocol must delimit message boundaries by itself.

If we can define the message boundary in the application layer protocol, then no matter how the TCP protocol splits and reassembles the data packets of the application layer protocol, the receiver can restore the corresponding message according to the rules of the protocol. In the application layer protocol, the two most common solutions are length-based or delimiter-based.

Figure 3 - Methods for implementing message boundaries

There are two ways to implement length-based messages. One is to use a fixed length, where all application layer messages have a uniform size. The other is to use a variable length, adding a field to the application layer protocol header that indicates the payload length, so that the receiver can separate different messages from the byte stream. The message boundary of the HTTP protocol can be implemented based on length:

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=UTF-8
    Content-Length: 138
    ...
    Connection: close

    <html>
      <head>
        <title>An Example Page</title>
      </head>
      <body>
        <p>Hello World, this is a very simple HTML document.</p>
      </body>
    </html>

In the above HTTP message, the Content-Length header indicates the size of the HTTP message's payload. Once the application layer protocol has parsed enough bytes, it can separate a complete HTTP message from the stream; no matter how the sender split the corresponding data into packets, the receiver can follow this rule to reassemble the HTTP message[^6].
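The same idea applies to custom binary protocols. Below is a minimal, hedged sketch of length-prefixed framing (a hypothetical 4-byte big-endian length prefix; frame_extract is a made-up helper, not from any particular library) showing how a receiver can recover message boundaries from a byte stream:

    #include <arpa/inet.h>   /* ntohl */
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical framing: each message is a 4-byte big-endian length
     * followed by that many payload bytes. Given 'buf' holding 'len'
     * bytes read so far, return the size of the first complete frame
     * (header + payload), or 0 if more bytes are still needed. */
    size_t frame_extract(const unsigned char *buf, size_t len) {
        uint32_t payload;
        if (len < 4)
            return 0;                      /* length header not complete yet */
        memcpy(&payload, buf, 4);
        payload = ntohl(payload);
        if (len < 4 + (size_t)payload)
            return 0;                      /* payload not complete yet       */
        return 4 + (size_t)payload;        /* one whole message is available */
    }

A receiving loop would append each recv() result to a buffer, call frame_extract until it returns 0, and slide the consumed bytes out of the buffer, regardless of how TCP split the stream into segments.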

However, in addition to the length-based approach, the HTTP protocol also uses a terminator-based strategy to implement boundaries. When HTTP uses the chunked transfer mechanism, the header no longer contains Content-Length; each chunk carries its own size, and a chunk with a size of 0 serves as the terminator that marks the message boundary.
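A chunked response looks roughly like the following (the payload here is invented for illustration; each chunk is prefixed with its size in hexadecimal, and the 0-sized chunk terminates the message):

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    7\r\n
    Mozilla\r\n
    9\r\n
    Developer\r\n
    0\r\n
    \r\n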

Of course, in addition to these two methods, we can implement message boundaries based on specific rules. For example, when sending JSON data using the TCP protocol, the receiver can determine whether the message is terminated based on whether the received data can be parsed into legal JSON.

Summary

The TCP packet sticking problem is caused by incorrect design on the part of application layer protocol developers: they ignore the core characteristic of TCP data transmission, that it is based on byte streams and contains no concept of messages or data packets. All data transmission is streaming, so the application layer protocol must define its own message boundaries, that is, perform message framing. Let's review the core reasons for the packet sticking problem:

  • The TCP protocol is a transport layer protocol based on byte streams, in which the concepts of messages and packets do not exist;
  • The application layer protocol does not use length-based or terminator-based message boundaries, resulting in the concatenation of multiple messages;

Learning network protocols is a very interesting process, and continuously thinking about the problems behind them gives us a deeper understanding of their definitions. Finally, here are some more open-ended related questions that interested readers can think carefully about:

  • How should the application layer protocol based on UDP be designed? Will there be a problem of packet sticking?
  • Which application layer protocols use length-based framing? Which use terminator-based framing?
