Hello everyone, I am Xiaolin.

A few days ago, a reader in the group asked a very interesting question. He had a packet capture in which, after the client and server finished the four-way close, the client reused the same port as the previous connection 17 seconds later, sent a SYN to the server, and successfully established a connection.

He reasoned that the server should still have been in the TIME_WAIT state, because on Linux 2MSL, which is the duration of the TIME_WAIT state, is 60 seconds. So why could the connection be established normally after the server received the client's SYN?

The packet capture screenshot does not display well on mobile phones, so to make it easier for everyone to read, I drew a picture:

Simply put, the question is why a TCP connection in the TIME_WAIT state can establish a connection normally after receiving a SYN with the same four-tuple.

Some people may ask: didn't Xiaolin write about this problem before? Yes, I did, in the article "What happens when a connection in TIME_WAIT state receives a SYN with the same four-tuple?" That article concluded at the time:
This reader had also read that article. He believed that the sequence number of the client's SYN in the capture was illegal, so the server should have replied with RST, yet the connection was established normally.

At this point, I started to panic. Had I written something wrong in my previous article? Does a connection in the TIME_WAIT state re-establish a connection whenever it receives a SYN, regardless of whether the SYN is legal?

I did not rush to a conclusion. Instead, I analyzed the data in the capture to see whether the kernel had actually taken the path that re-establishes a connection when a SYN arrives in the TIME_WAIT state.

Analyze a wave of kernel source code

In the Linux kernel, when a connection in the TIME_WAIT state receives a SYN message, there is logic like this:

Basically, if the message is a SYN and both its timestamp and its sequence number are legal, the connection is allowed to be re-established from the TIME_WAIT state.

The client in the capture reuses the port and sends its SYN 17 seconds later, so its timestamp must be larger than the timestamps of the previous connection, which makes the timestamp legal.

Next, we focus on whether the sequence number in the SYN is legal. The kernel decides this with the after function: if after returns 1, the sequence number of the received message is legal; otherwise it is illegal.

The two arguments of the after function are seq, the sequence number of the received SYN, and tw_rcv_nxt, the next sequence number the TIME_WAIT connection expected to receive.
According to the packet capture, seq = 3145977016 and tw_rcv_nxt = 40088018880.

The implementation of the after function is very short, so I'll post it here for everyone to see:

static inline bool before(unsigned int seq1, unsigned int seq2)

Then I wrote a small program to check what the after function returns for these values:

It turns out that after returns 0, which means the sequence number of the SYN in the capture is illegal, so the kernel did not take the path that re-establishes the connection from the TIME_WAIT state.

There is another angle that proves the same thing. When the TIME_WAIT state allows the connection to be re-established, the initial sequence number of the server's second handshake is calculated as tcptw->tw_snd_nxt + 65535 + 2, where tw_snd_nxt represents the sequence number of the last message sent by the server in the TIME_WAIT state.

According to the capture, tw_snd_nxt is 1082535342. If the connection had been re-established from the TIME_WAIT state, the sequence number in the server's second handshake would be 1082535342 + 65535 + 2 = 1082600879. However, the sequence number of the server's second handshake shown in the capture is 2175872083. The two values differ, so from this angle too, the capture does not show a connection being re-established from the TIME_WAIT state.

At the time, I also shared this conclusion in the group.

Analyzed by me

After the above analysis, one thing is clear: if the server were still in the TIME_WAIT state, it would definitely reply with RST on receiving an illegal SYN. There is no doubt about this.
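For readers who want to replay the two sequence-number checks from the source-code analysis, here is a minimal C sketch. The function names (seq_before, seq_after, tw_reopen_isn, check_capture) are hypothetical and chosen only for illustration; the comparison follows the usual wraparound-safe idiom of casting the 32-bit difference to a signed integer. One caveat: the tw_rcv_nxt value quoted above, 40088018880, does not fit in 32 bits, so the sketch assumes 4008801880 was intended. That correction is a guess, although any value between the SYN's sequence number and 4294967295 leads to the same result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe comparison: seq1 is "before" seq2 when the
 * difference, read as a signed 32-bit number, is negative. */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) < 0;
}

/* seq2 is "after" seq1 exactly when seq1 is before seq2. */
bool seq_after(uint32_t seq2, uint32_t seq1)
{
    return seq_before(seq1, seq2);
}

/* ISN the article says the server would use if it re-established the
 * connection from TIME_WAIT: tw_snd_nxt + 65535 + 2, modulo 2^32. */
uint32_t tw_reopen_isn(uint32_t tw_snd_nxt)
{
    return tw_snd_nxt + 65535u + 2u;
}

/* Replays both checks with the values read from the capture. */
void check_capture(void)
{
    uint32_t syn_seq      = 3145977016u;
    uint32_t tw_rcv_nxt   = 4008801880u; /* ASSUMPTION: printed value has one digit too many */
    uint32_t tw_snd_nxt   = 1082535342u;
    uint32_t observed_isn = 2175872083u; /* server's second handshake in the capture */

    /* Check 1: the SYN's sequence number is not after tw_rcv_nxt. */
    assert(!seq_after(syn_seq, tw_rcv_nxt));

    /* Check 2: the ISN expected on the TIME_WAIT path differs from
     * the ISN the server actually sent. */
    assert(tw_reopen_isn(tw_snd_nxt) == 1082600879u);
    assert(tw_reopen_isn(tw_snd_nxt) != observed_isn);
}
```

Calling check_capture() from a main function runs without tripping any assertion, which matches the two observations above: the SYN's sequence number is illegal, and the server's ISN is not the one the TIME_WAIT path would have produced.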
So I began to wonder whether the server had enabled some TCP kernel parameter that caused the connection in the TIME_WAIT state to be recycled quickly, which would let the client's later SYN establish a connection normally.

First, here are the TCP kernel parameters that can cause the TIME_WAIT state to be recycled quickly:

Parameter one: tcp_tw_reuse
Parameter two: tcp_tw_recycle
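On a Linux server, the two parameters tcp_tw_reuse and tcp_tw_recycle are exposed as files under /proc/sys/net/ipv4/, so they can be checked with a few lines of C. The sketch below is only an illustration: the function names read_sysctl_int and print_time_wait_sysctls are hypothetical, and on kernels where tcp_tw_recycle no longer exists the second file is simply reported as unavailable.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the first integer from a sysctl file under /proc/sys.
 * Returns 0 and stores the value on success; returns -1 if the file
 * cannot be opened or does not start with an integer. */
int read_sysctl_int(const char *path, int *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int parsed = fscanf(f, "%d", value);
    fclose(f);
    return parsed == 1 ? 0 : -1;
}

/* Prints the current setting of the two TIME_WAIT-related parameters. */
void print_time_wait_sysctls(void)
{
    const char *paths[] = {
        "/proc/sys/net/ipv4/tcp_tw_reuse",
        "/proc/sys/net/ipv4/tcp_tw_recycle",
    };

    for (size_t i = 0; i < sizeof paths / sizeof paths[0]; i++) {
        int v;
        if (read_sysctl_int(paths[i], &v) == 0)
            printf("%s = %d\n", paths[i], v);
        else
            printf("%s: not available on this kernel\n", paths[i]);
    }
}
```

Calling print_time_wait_sysctls() from a main function prints one line per parameter. A non-zero value means the parameter is enabled.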
From the packet capture, we can see that the server actively sends the FIN, so it is the server that enters the TIME_WAIT state.

Therefore, the tcp_tw_reuse parameter is not the reason the TIME_WAIT state was recycled quickly. This parameter applies to the connection initiator: when the client is the side in the TIME_WAIT state, it can reuse that TIME_WAIT connection when initiating a new one. So parameter one is ruled out.

I suspected at the time that the server had enabled the tcp_tw_recycle parameter, which caused the server's TIME_WAIT state to be recycled quickly instead of lasting the full 2MSL (60 seconds).

So I asked the reader to confirm whether the tcp_tw_recycle parameter was enabled on the server.

Wow, the reader confirmed that the server really had enabled the tcp_tw_recycle parameter.

That explains the phenomenon in the capture. Because the server has tcp_tw_recycle enabled, its TIME_WAIT state is recycled quickly, and the server may enter the CLOSED state within a few seconds. When the client sends a SYN with the same four-tuple 17 seconds later, the connection is established normally, because the server is no longer in the TIME_WAIT state.

Summarize

Finally, let me summarize my line of analysis.

From the sequence numbers in the capture, I confirmed that the sequence number of the client's SYN is illegal. If the server had still been in the TIME_WAIT state, it should have replied with RST on receiving this illegal SYN, but the capture shows the connection being established normally. From this I concluded that the server's TIME_WAIT state had probably been recycled quickly.
Then I thought of the two Linux parameters that recycle the TIME_WAIT state quickly: tcp_tw_reuse and tcp_tw_recycle.

The tcp_tw_reuse parameter applies to the connection initiator, and in this case the TIME_WAIT state is on the server side, not the client side, so this parameter can be ruled out.

I therefore asked the reader to confirm whether the tcp_tw_recycle parameter was enabled. Once it is enabled, connections in the TIME_WAIT state are recycled quickly, whether they are on the server or the client, and the TCP connection then enters the CLOSED state.

Finally, the reader confirmed that the server had indeed enabled the tcp_tw_recycle parameter.

However, enabling the tcp_tw_recycle parameter is not recommended, because it is unsafe in NAT networks. Starting with Linux 4.12, this parameter was removed outright.

Through this case, have you experienced the power of the "eight-part essay"? Just from what one packet capture shows, we can work out what caused it.

How about it, I got it this time!