In TCP programming, we define application-level protocols to solve the problems of sticky packets and half packets. This article explains in detail why TCP produces sticky packets and half packets, and how both problems are solved by defining a communication protocol.
1 TCP Packet Sticking and Unpacking

Because the TCP transmission protocol is stream-oriented and provides no message boundaries, multiple messages sent by one side may be combined into one larger segment for transmission, which is called packet sticking; conversely, a single message may be split across several smaller segments, which is called packet unpacking (producing half packets). The following figure demonstrates the process: the client sends two data packets, D1 and D2, to the server. Since the number of bytes the server reads at one time is uncertain, four situations are possible:

1. The server reads D1 and D2 as two independent packets in two reads; there is no sticking or unpacking.
2. The server receives both packets in a single read, with D1 and D2 glued together; this is TCP packet sticking.
3. The server reads twice: the first read returns the complete D1 plus part of D2, and the second read returns the rest of D2; this is TCP packet unpacking.
4. The server reads twice: the first read returns part of D1 (D1_1), and the second read returns the rest of D1 (D1_2) plus the complete D2; this is also unpacking.

Since the data sent may arrive stuck together or split apart, the receiver needs a reliable mechanism to tell messages apart. That is the role of the protocol. Before introducing protocols, let us first look at the causes of sticking and unpacking.

2 Causes of sticking and unpacking

The causes of the sticking and unpacking problems can be summarized into the following three types:
2.1 Socket Buffers and the Sliding Window

Each TCP socket has a send buffer (SO_SNDBUF) and a receive buffer (SO_RCVBUF) in the kernel. TCP's full-duplex operation and its sliding window both depend on the state of these two independent buffers.

SO_SNDBUF: When a process sends data, say by calling send, in the simplest (and most common) case the data is copied into the socket's kernel send buffer, and then send returns to the caller. In other words, when send returns, the data may not yet have been delivered to the peer (similar to a write to a file); send merely copies data from the application-layer buffer into the socket's kernel send buffer.

SO_RCVBUF: Received data is cached in the kernel. If the application process has not yet called read, the data remains in the corresponding socket's receive buffer. More precisely, regardless of whether the process reads from the socket, data sent by the peer is received by the kernel and cached in the socket's kernel receive buffer. All read does is copy data from the kernel buffer into the application's user-space buffer, nothing more.

Sliding window: During the three-way handshake, each end of the TCP connection advertises its window size to the other, which is essentially the value of SO_RCVBUF. When sending data afterwards, the sender must first confirm that the receiver's window is not full; if it is not, it may send. After each transmission, the sender decrements the peer window size it maintains, reflecting that the available space in the peer's SO_RCVBUF has shrunk. When the receiver starts processing the data in SO_RCVBUF, reading it out of the kernel receive buffer, the available space grows again, i.e. the window becomes larger, and the receiver reports its latest window size back to the sender in an ACK segment.
At that point the sender updates the peer window size it maintains to the value carried in the ACK. The sender can keep sending segments as long as the peer's SO_RCVBUF can still buffer data, i.e. as long as window size > 0. When the receiver's SO_RCVBUF is full, the window size drops to 0; the sender can no longer send and must wait for an ACK carrying a new, non-zero window size.

2.2 MSS/MTU Fragmentation

MTU (Maximum Transmission Unit) is the link layer's limit on the maximum amount of data that can be sent at one time. MSS (Maximum Segment Size) is the maximum length of the data portion of a TCP segment, i.e. the transport layer's limit on the maximum amount of data that can be sent at one time. To understand MSS and MTU, first recall the five-layer TCP/IP network model: as data travels down the stack, each layer adds its own header information.
With that background, let's look at MTU and MSS. MTU comes from the Ethernet transmission limit: an Ethernet frame cannot exceed 1518 bytes. Subtracting the 14-byte header (destination MAC + source MAC + Type field) and the 4-byte trailer (CRC checksum) leaves at most 1500 bytes for the data field that carries the upper-layer protocol; this 1500 bytes is what we call the MTU. MSS is the MTU minus the network layer's IP header and the transport layer's TCP header, i.e. the maximum amount of actual application data that TCP can send in one segment.
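As a quick sanity check, the arithmetic above can be written out directly (a trivial sketch; `MssCalculator` is a made-up name, and the header sizes assume no IP or TCP options):

```java
class MssCalculator {
    static final int ETHERNET_MTU = 1500; // max payload of a standard Ethernet frame
    static final int IPV4_HEADER = 20;    // IPv4 header without options
    static final int IPV6_HEADER = 40;    // IPv6 header is a fixed 40 bytes
    static final int TCP_HEADER = 20;     // TCP header without options

    // MSS = MTU minus the IP header minus the TCP header.
    static int mss(int mtu, int ipHeaderSize) {
        return mtu - ipHeaderSize - TCP_HEADER;
    }
}
```

This reproduces the well-known values of 1460 bytes for IPv4 and 1440 bytes for IPv6 over standard Ethernet.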
Because IPv4 and IPv6 headers have different lengths, the Ethernet MSS is 1460 bytes under IPv4 and 1440 bytes under IPv6. When the amount of data in SO_SNDBUF exceeds the MSS, the operating system splits the data into parts no larger than the MSS, which is one way unpacking occurs. A TCP header is then added to each part to form multiple complete TCP segments; of course, the network layer and data link layer add their respective headers as the segments pass through. Note also that the local loopback interface does not go through Ethernet, so it is not bound by the Ethernet MTU of 1500 (on Linux, the loopback MTU is typically 65536). On a Linux server, the ifconfig command shows the MTU of each network interface.
2.3 Nagle Algorithm

In the TCP/IP protocol, no matter how little data is sent, a protocol header (TCP header + IP header) is always placed in front of the data (DATA), and the receiver must send an ACK to confirm receipt. Even a single character typed at the keyboard can produce a 41-byte IP packet: 1 byte of useful payload plus 40 bytes of header data. That is 4000% overhead, which is unacceptable on a heavily loaded network; this flood of tiny packets is known as the small-packet ("tinygram") problem, closely related to silly window syndrome. To make the best use of network bandwidth, TCP prefers to send as much data as possible at once (each connection negotiates an MSS, so TCP/IP tries to send data in MSS-sized blocks). The Nagle algorithm sends data blocks that are as large as possible, to avoid filling the network with many small packets. Its basic rule is that at any time there can be at most one unconfirmed small segment, where a "small segment" is a data block smaller than the MSS, and "unconfirmed" means no ACK for it has been received yet. The rules of Nagle's algorithm are:

1. If the data waiting to be sent reaches the MSS, send it immediately.
2. Otherwise, if previously sent data has not yet been acknowledged, buffer the new small data until an ACK arrives or a full MSS worth of data accumulates.
3. Otherwise (nothing unacknowledged in flight), send the small segment immediately.
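Because Nagle's algorithm trades latency for bandwidth, latency-sensitive applications often disable it via the TCP_NODELAY socket option. A minimal Java sketch (`NagleConfig` and `disableNagle` are made-up names for this example; the underlying option is the standard java.net.Socket API):

```java
import java.io.IOException;
import java.net.Socket;

class NagleConfig {
    // Disable the Nagle algorithm on a socket (sets TCP_NODELAY).
    // Returns the resulting TCP_NODELAY value, or false if the option
    // could not be set.
    static boolean disableNagle(Socket socket) {
        try {
            socket.setTcpNoDelay(true);
            return socket.getTcpNoDelay();
        } catch (IOException e) {
            return false;
        }
    }
}
```

With TCP_NODELAY enabled, small writes go out immediately instead of being coalesced, which matters for request/response protocols such as RPC.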
3 Communication Protocol

Now that we understand why sticking and unpacking happen, let's analyze how the receiver can tell messages apart. The principle is simple: if the data received so far is incomplete (a half packet), keep waiting until it can form a complete request or response. By defining a communication protocol, we can solve both the sticking and the unpacking problem. The protocol's role is to define the format of the transmitted data, so that on receipt:

- If packets are stuck together, the format tells us where one message ends and the next begins.
- If a packet has been split, we wait until the data forms a complete message before processing it.

3.1 Fixed-Length Protocol

A fixed-length protocol, as the name implies, stipulates that every message has a fixed length. For example, suppose we stipulate that every 3 bytes form one valid message, and the following 9 bytes are sent in 4 separate writes:
According to the protocol, we can determine that there are three valid request messages, as follows:
In a fixed-length protocol:

- The receiver simply accumulates bytes and cuts off a message every N bytes, so the implementation is very simple.
- Messages shorter than the fixed length must be padded, which wastes bandwidth, and no message may exceed the fixed length, so the scheme is inflexible.
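The accumulate-and-cut logic can be sketched in a few lines of Java (a simplified illustration operating on strings rather than raw bytes; `FixedLengthDecoder` is a made-up name, not Netty's class):

```java
import java.util.ArrayList;
import java.util.List;

class FixedLengthDecoder {
    private final int frameLength;
    private final StringBuilder buffer = new StringBuilder();

    FixedLengthDecoder(int frameLength) {
        this.frameLength = frameLength;
    }

    // Feed one chunk as it arrives from the socket; return every
    // complete frame, keeping any leftover half packet buffered.
    List<String> feed(String chunk) {
        buffer.append(chunk);
        List<String> frames = new ArrayList<>();
        while (buffer.length() >= frameLength) {
            frames.add(buffer.substring(0, frameLength));
            buffer.delete(0, frameLength);
        }
        return frames;
    }
}
```

Note that the decoder is indifferent to how the bytes were chunked in transit: whether the 9 bytes arrive in one read or four, the same three messages come out.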
Tip: Netty provides FixedLengthFrameDecoder, which decodes a fixed number of bytes as one complete message.

3.2 Special-Character Delimiter Protocol

Append a special character, such as a carriage return or newline, to the end of each packet as a delimiter. For example, when parsing by line, encountering \n or \r\n marks the end of a complete data packet. Consider the following binary byte stream:
Then, according to the protocol, we can determine that the stream contains 2 valid request messages.
In the special-character delimiter protocol:

- Messages can be of variable length, and the implementation stays simple: scan for the delimiter and cut.
- The delimiter must never appear inside the message body, or the receiver will split messages in the wrong place.
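The scan-and-cut logic looks much like the fixed-length case (again a string-based sketch with a made-up class name, not Netty's implementation):

```java
import java.util.ArrayList;
import java.util.List;

class DelimiterDecoder {
    private final String delimiter;
    private final StringBuilder buffer = new StringBuilder();

    DelimiterDecoder(String delimiter) {
        this.delimiter = delimiter;
    }

    // Feed one chunk as it arrives; return every complete frame,
    // keeping the trailing half packet (no delimiter yet) buffered.
    List<String> feed(String chunk) {
        buffer.append(chunk);
        List<String> frames = new ArrayList<>();
        int idx;
        while ((idx = buffer.indexOf(delimiter)) >= 0) {
            frames.add(buffer.substring(0, idx));
            buffer.delete(0, idx + delimiter.length());
        }
        return frames;
    }
}
```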
When using a special-character delimiter protocol, the chosen delimiter must not appear in the message body, otherwise the stream will be split incorrectly. For example, if the sender intends "12\r\n34" to be one complete message but the receiver splits by line, it will be wrongly split into two messages. One solution is for the sender to base64-encode the content first: since base64 output contains only the 64 characters 0-9, a-z, A-Z, + and /, any character outside that set can safely serve as the delimiter.

Tip: Netty provides DelimiterBasedFrameDecoder for decoding based on special characters. In fact, the familiar cache server redis also uses line breaks to delimit complete messages.

3.3 Variable-Length Protocol

The message is divided into a message header and a message body. In the header, an integer, such as an int, gives the length of the message body; the body is the actual binary data to be sent. The basic format is:
Length | Content

In the variable-length protocol:

- Length is a fixed-size integer field (e.g. 4 bytes) that tells the receiver exactly how many bytes of Content follow.
- On receipt, first read the Length field; if fewer than Length bytes of Content have arrived, it is a half packet, so keep waiting. Once Length bytes are available, one complete message can be cut from the stream, which also resolves sticking.
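Encoding and decoding with a 4-byte big-endian length prefix can be sketched as follows (a minimal illustration with made-up class and method names, not Netty's codec):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthFieldCodec {
    // Prepend a 4-byte big-endian Length field to the payload.
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length)
                .putInt(body.length)
                .put(body)
                .array();
    }

    private ByteBuffer buffer = ByteBuffer.allocate(0);

    // Accumulate incoming bytes and emit every complete message.
    List<String> feed(byte[] chunk) {
        ByteBuffer merged = ByteBuffer.allocate(buffer.remaining() + chunk.length);
        merged.put(buffer).put(chunk);
        merged.flip();
        List<String> messages = new ArrayList<>();
        while (merged.remaining() >= 4) {
            merged.mark();
            int len = merged.getInt();
            if (merged.remaining() < len) {    // half packet:
                merged.reset();                // rewind and wait for more
                break;
            }
            byte[] body = new byte[len];
            merged.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        buffer = merged.slice();               // keep the leftover bytes
        return messages;
    }
}
```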
Tip: Netty provides LengthFieldPrepender to prepend the Length field when encoding, and LengthFieldBasedFrameDecoder to decode on the receiving side.

3.4 Serialization

Serialization does not, in itself, solve sticking and unpacking; rather, it makes network development more convenient. The variable-length protocol, where a length field precedes the actual data, suggests a useful idea: we can convert an object into binary bytes for communication, for example using a Request object to represent a request and a Response object to represent a response. There are many serialization frameworks; when choosing one, the main considerations are serialization/deserialization speed, the size of the serialized output, and multi-language support. The following is a list of popular serialization frameworks in the industry. Tip: XML and JSON also belong to the category of serialization formats; they are not listed in the table above. Some network communication and RPC frameworks support multiple serialization methods; dubbo, for example, supports hessian, json, kryo, fst, and others. When multiple serialization frameworks are supported, the protocol usually needs a field to indicate which one was used. We can extend the format of the variable-length protocol above to:
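A possible frame layout with a serializer field might be encoded like this (the field sizes, constant values, and class name are chosen purely for illustration, not taken from any particular framework):

```java
import java.nio.ByteBuffer;

class ProtocolEncoder {
    // Hypothetical serializer IDs, one value per supported framework.
    static final byte HESSIAN = 1;
    static final byte JSON = 2;

    // Frame layout: 4-byte Length | 1-byte Serializer | Content.
    // Length counts the Serializer byte plus the Content bytes.
    static byte[] encode(byte serializer, byte[] body) {
        return ByteBuffer.allocate(4 + 1 + body.length)
                .putInt(1 + body.length)
                .put(serializer)
                .put(body)
                .array();
    }
}
```

The receiver reads the Length, reads one Serializer byte, then hands the remaining Content to the matching deserializer.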
Here, 1 byte represents the Serializer value, with different values denoting different frameworks. After choosing a serialization framework and encoding the body, the sender sets the Serializer field accordingly; when decoding, the receiver selects the corresponding framework for deserialization based on the Serializer value.

3.5 Compression

To save network bandwidth, data can be compressed before transmission. Common compression algorithms include lz4, snappy, and gzip; when choosing one, the main considerations are compression ratio and compression/decompression speed. We can add a Compress field to the protocol to indicate which compression algorithm was used:
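Combining the compress flag with a size threshold might look like the following sketch (the class name, flag values, and 4 KB threshold are illustrative assumptions; the threshold mirrors RocketMQ's default, discussed below):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

class CompressingEncoder {
    static final byte NO_COMPRESSION = 0;
    static final byte GZIP = 1;
    static final int THRESHOLD = 4 * 1024; // only compress bodies above 4 KB

    static class Encoded {
        final byte compressFlag;
        final byte[] payload;
        Encoded(byte compressFlag, byte[] payload) {
            this.compressFlag = compressFlag;
            this.payload = payload;
        }
    }

    // Small bodies are sent as-is; larger bodies are gzipped and
    // flagged so the receiver knows to decompress.
    static Encoded encodeBody(byte[] body) {
        if (body.length <= THRESHOLD) {
            return new Encoded(NO_COMPRESSION, body);
        }
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                gzip.write(body);
            }
            return new Encoded(GZIP, out.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```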
In practice a full byte is not needed to identify the compression algorithm: one byte can distinguish 256 cases, while only a handful of algorithms are in common use, so 2 to 3 bits usually suffice. Also, because the compression ratio is poor for small payloads, there is no need to compress everything that is sent; compression is worthwhile only above a certain size. For example, by default the RocketMQ producer compresses a message only when it exceeds 4 KB. The Compress field should therefore reserve a value meaning "no compression", such as 0.

3.6 Error Check Code

Some communication protocols also include an error check code over the transmitted data. Typical algorithms are CRC32 and Adler32, both of which Java supports via java.util.zip.CRC32 and java.util.zip.Adler32.
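Using the java.util.zip.CRC32 class mentioned above, a minimal checksum-and-verify sketch (the `ChecksumUtil` wrapper is a made-up name for this example):

```java
import java.util.zip.CRC32;

class ChecksumUtil {
    // Compute the CRC32 of a message body; the sender appends this
    // value to the frame.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // The receiver recomputes the checksum over the received body
    // and compares it with the transmitted value.
    static boolean verify(byte[] data, long expected) {
        return crc32(data) == expected;
    }
}
```

A single flipped bit in the body changes the CRC32, so corruption in transit is detected with high probability.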
We will not explain CRC32 and Adler32 in detail here, but mainly consider why verification is needed at all. Some say it is for security, but that reason seems insufficient: encryption is already handled by the TLS layer, and the role of CRC32 and Adler32 is not security. I agree with a colleague's view: during the transmission of binary data, electromagnetic interference may flip a bit, turning a high level into a low one or vice versa, effectively corrupting the data. A check value such as CRC32 lets the receiver verify that the data arrived intact. The verification mechanism is usually optional in a communication protocol: it guarantees data correctness, but computing the check value costs some extra performance. For example, in MySQL master-slave replication, CRC32 verification is enabled by default in newer versions but can be disabled through configuration.

3.7 Summary

This section used basic examples to explain how protocols solve the packet sticking and unpacking problems in TCP programming. In actual development, protocols are usually more elaborate. For example, some RPC frameworks add an ID field to uniquely identify each request, and some RPC frameworks that support bidirectional communication, such as sofa-bolt, also add a direction field. This so-called complexity, however, amounts to nothing more than adding fields to the protocol for particular purposes; once you understand what each field means, it is not complicated.