Two ways of TCP retransmission

No communication is free of errors: however good the external conditions are, errors remain possible. TCP is no exception. During normal TCP communication, errors can arise from packet loss, packet duplication, or packet reordering.

During TCP communication, the receiver returns acknowledgments (ACKs) that let the sender determine whether an error has occurred. When packet loss is detected, TCP retransmits the data that has not yet been acknowledged.

TCP has two retransmission mechanisms: one driven by a timer, the other driven by acknowledgment information. Retransmission based on acknowledgment information is generally more efficient than retransmission based on a timer.

Both mechanisms rest on the same question: has the segment been acknowledged?

TCP sets a timer when it sends data. If no acknowledgment arrives before the timer expires, TCP performs a timeout (timer-based) retransmission. The timer's duration is called the retransmission timeout (RTO).

The other mechanism, fast retransmit, does not wait for the timer and so avoids this delay.

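Each time TCP retransmits the same segment after a timeout, it doubles the retransmission timeout. This doubling of the interval is called binary exponential backoff. In practice the interval is capped at an upper bound rather than doubling indefinitely. When the time spent retransmitting reaches about 15.5 minutes in total (the exact figure depends on the implementation), TCP gives up and the client displays:

  Connection closed by foreign host.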
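The backoff described above can be sketched in a few lines. This is only an illustration: the function name `backoff_schedule`, the 1-second initial RTO, and the 64-second cap are assumptions chosen for the example, not values taken from any particular TCP stack.

```python
def backoff_schedule(initial_rto, max_rto, retransmissions):
    """Return the timer value (in seconds) armed before each retransmission."""
    schedule = []
    rto = initial_rto
    for _ in range(retransmissions):
        schedule.append(rto)
        rto = min(rto * 2, max_rto)  # double, but never exceed the cap
    return schedule


# Hypothetical values: 1 s initial RTO, 64 s cap, 8 retransmissions.
print(backoff_schedule(1.0, 64.0, 8))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0]
```

Once the cap is reached, the interval stops growing, which is why the total time before giving up grows linearly rather than exponentially with the last few retries.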

TCP uses two thresholds to decide how persistently to retransmit a segment. Both are defined in RFC 1122. The first threshold, R1, marks the point at which TCP should report the problem to the IP layer so that it can look for a better path. The second threshold, R2, marks the point at which TCP should give up and close the connection. Each threshold may be expressed either as a number of retransmissions or as an amount of time. R1 should correspond to at least three retransmissions, and R2 to at least 100 seconds.

Note that for the connection-establishment segment (SYN), R2 should be at least 3 minutes. How R1 and R2 are configured differs from one operating system to another.

On Linux, R1 and R2 can be set by the application, or system-wide through net.ipv4.tcp_retries1 and net.ipv4.tcp_retries2. Both values are retransmission counts.
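On Linux, each sysctl name maps to a file under /proc/sys, with the dots replaced by slashes. The sketch below reads the two values that way. The helper name `read_sysctl_int` and its `root` parameter are hypothetical, introduced only for this example, and the function returns None on hosts where the file does not exist.

```python
from pathlib import Path


def read_sysctl_int(name, root="/proc/sys"):
    """Read an integer sysctl such as 'net.ipv4.tcp_retries2'.

    Returns None when the file cannot be read (for example on a
    non-Linux host, where /proc/sys is absent).
    """
    path = Path(root, *name.split("."))
    try:
        return int(path.read_text().strip())
    except OSError:
        return None


for key in ("net.ipv4.tcp_retries1", "net.ipv4.tcp_retries2"):
    print(key, "=", read_sysctl_int(key))
```

Changing the values requires root privileges, typically through the sysctl command or by writing to the same files.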

The default value of tcp_retries1 is 3. The default value of tcp_retries2 is 15, which corresponds to roughly 13 to 30 minutes. This is only an approximation, because the actual time depends on the connection's RTO.
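To see why the elapsed time depends on the RTO, the sketch below adds up the waits for a given retry count. It assumes the RTO starts at a lower bound of 0.2 seconds and is capped at 120 seconds, which are the bounds commonly cited for Linux. The function name `total_timeout` is hypothetical, and a real connection starts from a measured RTO, so its total will usually be longer.

```python
def total_timeout(retries, rto_min=0.2, rto_max=120.0):
    """Seconds until give-up if every retransmission goes unanswered.

    Sums the wait before each of `retries` retransmissions plus the final
    wait after the last one, doubling from rto_min and capping at rto_max.
    """
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries + 1))


# Lower-bound estimate for tcp_retries2 = 15 under the assumptions above.
print(round(total_timeout(15), 1))  # 924.6 seconds, about 15.4 minutes
```

With a larger starting RTO the cap is reached after fewer doublings, which pushes the total toward the upper end of the range quoted above.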

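For connection establishment, net.ipv4.tcp_syn_retries limits the number of SYN retransmissions and net.ipv4.tcp_synack_retries limits the number of SYN+ACK retransmissions. The default cited here is 5, or about 180 seconds; the exact defaults and the resulting time vary with the kernel version.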

Windows also has R1 and R2 settings. Their values are stored under the following registry keys:

  HKLM\System\CurrentControlSet\Services\Tcpip\Parameters
  HKLM\System\CurrentControlSet\Services\Tcpip6\Parameters

One of the most important of these is TcpMaxDataRetransmissions, which corresponds to tcp_retries2 on Linux and defaults to 5. It is the number of times TCP retransmits an unacknowledged data segment on an existing connection.

Fast Retransmit

As mentioned above, fast retransmit is triggered by feedback from the receiver rather than by the retransmission timer. As a result, it can usually repair packet loss sooner than timeout retransmission. When segments arrive out of order at the receiver (for example 2, 4, 3), TCP must generate an acknowledgment immediately. This acknowledgment is called a duplicate ACK.

A duplicate ACK generated for an out-of-order segment must be sent immediately, without delay. Its purpose is to tell the sender that a segment arrived out of order and to indicate which sequence number the receiver expects next.

Duplicate ACKs are also sent when a segment is lost and the segments after it keep arriving. From the sender's point of view the two cases look the same: the receiver is missing a segment, and the duplicate ACKs alone do not say whether that segment was lost or is merely delayed. The TCP sender therefore waits until it has received a certain number of duplicate ACKs before concluding that the data is lost and triggering fast retransmit. That number is usually 3. An example makes this clearer.

As shown in the figure above, segment 1 is received and acknowledged with ACK 2, so the receiver's next expected sequence number is 2. Segment 2 is then lost, and segment 3 arrives out of order. Because it does not match the expected sequence number, the receiver sends a redundant ACK 2, and it does so again for each later segment that arrives while segment 2 is still missing.
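The receiver's side of this example can be modelled as follows. This is a simplified sketch that numbers whole segments 1, 2, 3, ... instead of bytes; the function name `cumulative_acks` is an assumption made for illustration.

```python
def cumulative_acks(arrivals):
    """Return the ACK number sent in response to each arriving segment.

    Each ACK carries the number of the next segment the receiver expects.
    """
    received = set()
    expected = 1
    acks = []
    for seg in arrivals:
        received.add(seg)
        while expected in received:  # advance past every contiguous segment
            expected += 1
        acks.append(expected)
    return acks


# Segment 2 is lost; segments 3, 4 and 5 keep arriving.
print(cumulative_acks([1, 3, 4, 5]))     # [2, 2, 2, 2]
# Once the retransmitted segment 2 arrives, the ACK jumps ahead.
print(cumulative_acks([1, 3, 4, 5, 2]))  # [2, 2, 2, 2, 6]
```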

Once the sender has received three duplicate ACKs, and before the retransmission timer expires, it infers which segment was lost and resends it immediately. Not having to wait for the timer greatly improves efficiency.
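On the sender's side, the rule amounts to counting repeated ACK numbers. The sketch below is a minimal model that assumes ACKs reach the sender in order and ignores congestion control. The name `fast_retransmits` is hypothetical.

```python
def fast_retransmits(acks, dupthresh=3):
    """Return the segment numbers a sender would fast-retransmit."""
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dupthresh:       # third duplicate: assume loss
                retransmitted.append(ack)
        else:
            last_ack = ack                   # a new ACK resets the counter
            dup_count = 0
    return retransmitted


# One original ACK 2 followed by three duplicates triggers a resend of 2.
print(fast_retransmits([2, 2, 2, 2, 6]))  # [2]
# Only two duplicates: the sender keeps waiting.
print(fast_retransmits([2, 2, 2, 6]))     # []
```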

SACK

In the standard TCP acknowledgment mechanism, suppose the sender sends data with sequence numbers 0 to 10000, but the receiver gets only 0 to 1000 and 3000 to 10000; the data between 1000 and 3000 never arrives. The sender may then retransmit everything from 1000 to 10000, which is unnecessary because the data after 3000 has already been received. With cumulative acknowledgments alone, however, the sender has no way of knowing this.

How to avoid or solve this problem?

To improve on this, the sender needs more information. TCP provides a SACK option, the selective acknowledgment mechanism. With it, the receiver can tell the sender, in plain language: "I have received everything up to 1000 in order, and I have also received 3000-10000. Please send me 1000-3000."
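Given the cumulative ACK and the SACK blocks, a sender can work out exactly which ranges are still missing. The sketch below treats ranges as half-open intervals [start, end). The function name `missing_ranges` and its parameters are assumptions for illustration, not part of any TCP API.

```python
def missing_ranges(cumulative_ack, sack_blocks, highest_sent):
    """Return the [start, end) ranges the receiver has not reported."""
    holes = []
    cursor = cumulative_ack
    for left, right in sorted(sack_blocks):
        if left > cursor:
            holes.append((cursor, left))   # gap before this SACK block
        cursor = max(cursor, right)
    if cursor < highest_sent:
        holes.append((cursor, highest_sent))
    return holes


# The example from the text: ACK 1000, SACK block 3000-10000.
print(missing_ranges(1000, [(3000, 10000)], 10000))  # [(1000, 3000)]
```

Only the reported hole needs to be retransmitted, rather than everything after sequence number 1000.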

Whether selective acknowledgment is enabled depends on another option, the SACK-Permitted option. Each side includes it in its SYN or SYN+ACK segment to tell the other host that it supports SACK. If both sides support it, SACK options can be used in segments sent after the connection is established.

Note that it is the SACK-Permitted option that can appear only in SYN segments (SYN or SYN+ACK); the SACK option itself is carried in later segments.
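For reference, the two options have different wire formats. In RFC 2018 the SACK-Permitted option (the "SACK allow" option in this article) has kind 4 and length 2, while the SACK option has kind 5 and carries one or more pairs of 32-bit sequence numbers. The sketch below builds the raw option bytes. The helper names are hypothetical and the code does not construct a full TCP header.

```python
import struct

SACK_PERMITTED_KIND = 4  # SACK-Permitted option kind (RFC 2018)
SACK_KIND = 5            # SACK option kind (RFC 2018)


def encode_sack_permitted():
    """Two-byte option, sent only in SYN or SYN+ACK segments."""
    return struct.pack("!BB", SACK_PERMITTED_KIND, 2)


def encode_sack(blocks):
    """Encode (left_edge, right_edge) pairs as a SACK option."""
    if not 1 <= len(blocks) <= 4:
        raise ValueError("a SACK option carries between 1 and 4 blocks")
    body = b"".join(struct.pack("!II", left, right) for left, right in blocks)
    return struct.pack("!BB", SACK_KIND, 2 + len(body)) + body


print(encode_sack_permitted().hex())          # 0402
print(encode_sack([(3000, 10000)]).hex())     # 050a00000bb800002710
```

The limit of four blocks follows from the 40 bytes available for TCP options; fewer fit when other options, such as timestamps, are also present.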

Spurious timeouts and retransmissions

In some cases TCP retransmits a segment even though nothing was lost. This is called a spurious retransmission. It is unnecessary and may be caused by a spurious timeout, that is, a timeout that fires too early. Other causes include out-of-order arrival of segments, duplicate segments, and ACK loss.

There are several ways to detect and handle spurious timeouts, grouped into detection algorithms and response algorithms. A detection algorithm determines whether a timeout or timer-based retransmission was spurious. If it was, a response algorithm is run to undo or mitigate its effects. The main algorithms are listed below; this article does not go into their implementation details:

  • Duplicate SACK extension - DSACK
  • Eifel Detection Algorithm
  • Forward RTO Recovery - F-RTO
  • Eifel Response Algorithm

Packet reordering and packet duplication

Above we discussed how TCP handles packet loss. Now let's discuss packet reordering and packet duplication.

Packet reordering

Out-of-order packet arrival is common on the Internet. Because the IP layer does not guarantee ordering, each packet may take whichever path is fastest at that moment. If three packets are sent in the order A -> B -> C, they may well arrive as C -> A -> B, B -> C -> A, and so on. This is packet reordering.

Reordering can occur in either direction of a connection: on the forward path, which carries data segments, or on the reverse path, which carries ACKs.

If reordering occurs on the forward path, TCP cannot reliably tell it apart from loss. Both loss and reordering cause the receiver to see out-of-order segments, leaving gaps in the data. If the reordering is mild, the impact is small; if segments are displaced far enough, it can cause spurious retransmissions.

If reordering occurs on the reverse path, an ACK that moves the sender's window forward can arrive ahead of older ACKs, which then show up as duplicates that should be discarded. This can cause unnecessary bursts of traffic from the sender and affect the available network bandwidth.

Return now to fast retransmit. It is triggered by inferring loss from duplicate ACKs, so it does not need to wait for the retransmission timer. Because the TCP receiver immediately ACKs every out-of-order segment it receives, any reordering in the network can produce duplicate ACKs. If a single duplicate ACK were enough to trigger fast retransmit, a surge of duplicate ACKs would cause many unnecessary retransmissions. Fast retransmit therefore fires only once the number of duplicate ACKs reaches the duplicate threshold (dupthresh). Severe reordering is uncommon on the Internet, so dupthresh can be kept fairly small; a value of 3 handles most situations.
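The effect of dupthresh under reordering can be illustrated with a small simulation. In the sketch below every segment eventually arrives, so any fast retransmit it counts is unnecessary. Segments are numbered 1, 2, 3, ..., ACKs are assumed to reach the sender in order, and the function name `spurious_fast_retransmits` is hypothetical.

```python
def spurious_fast_retransmits(arrivals, dupthresh):
    """Count fast retransmits triggered although no segment was lost."""
    received, expected = set(), 1
    last_ack, dup_count, triggers = None, 0, 0
    for seg in arrivals:
        received.add(seg)
        while expected in received:      # receiver: cumulative ACK
            expected += 1
        ack = expected
        if ack == last_ack:              # sender: count duplicate ACKs
            dup_count += 1
            if dup_count == dupthresh:
                triggers += 1
        else:
            last_ack, dup_count = ack, 0
    return triggers


reordered = [1, 3, 2, 4, 6, 5]  # mild reordering, nothing lost
print(spurious_fast_retransmits(reordered, dupthresh=1))  # 2
print(spurious_fast_retransmits(reordered, dupthresh=3))  # 0
```

With a threshold of 1, each swapped pair causes a needless retransmission; with a threshold of 3, the same reordering is absorbed.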

Packet duplication

Packet duplication is also rare on the Internet. It means that a packet is delivered more than once during network transmission. This can happen, for example, when a lower layer retransmits a packet and two copies end up in the network. When duplicates are created, TCP can be misled.

Duplication can cause the receiver to generate a series of duplicate ACKs. Using SACK helps the sender recognize this situation.
