Preface

The TCP protocol is a must-know topic in interviews with large companies. I have collected 15 classic TCP interview questions, and I hope everyone finds their ideal offer.

1. Talk about the TCP three-way handshake process

At first, both the client and the server are in the CLOSED state. The server then starts listening on a port and enters the LISTEN state. The handshake itself goes like this:

- The client sends a SYN segment carrying its initial sequence number and enters the SYN_SENT state.
- The server replies with a SYN+ACK segment (its own initial sequence number plus an acknowledgment of the client's SYN) and enters the SYN_RCVD state.
- The client replies with an ACK and enters the ESTABLISHED state; once the server receives this ACK it also enters ESTABLISHED, and data transfer can begin.
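As an illustration (my addition, not part of the original article), here is a minimal Python sketch in which the kernel performs the three-way handshake inside connect()/accept(); the loopback address and port 12345 are arbitrary choices.

```python
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 12345))
srv.listen(5)                      # server is now in LISTEN

def accept_one():
    conn, addr = srv.accept()      # returns once SYN / SYN+ACK / ACK completes
    print("server side ESTABLISHED with", addr)
    conn.close()

t = threading.Thread(target=accept_one)
t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", 12345))  # kernel performs the three-way handshake here
print("client side ESTABLISHED")
cli.close()
t.join()
srv.close()
```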
2. Why is the TCP handshake three times, not two? Not four?

To make it easier to understand, let's use a romantic relationship as an analogy: the most important thing for two people to be together is that they love each other — I love you, and I know that you love me too. The three-way handshake simulates exactly this exchange.

Why can't the handshake be two times? With only two handshakes, the girl would not know whether the boy received her "I love you too", and the relationship could not move forward. In TCP terms, two handshakes let the server confirm the client's sequence number, but the client's acknowledgment of the server's sequence number would never be confirmed, and a stale, delayed SYN could open a useless connection on the server.

Why can't the handshake be four times? Because three is enough: after three exchanges both sides know "you can hear me, and I can hear you" and both initial sequence numbers are synchronized. A fourth message would be redundant.

3. Talk about the TCP four-wave process

The connection is torn down in four steps (assume the client closes first):

- The client sends a FIN and enters the FIN_WAIT_1 state.
- The server acknowledges the FIN with an ACK and enters CLOSE_WAIT; on receiving this ACK the client enters FIN_WAIT_2. The server may still send any remaining data.
- When the server has nothing more to send, it sends its own FIN and enters LAST_ACK.
- The client acknowledges the server's FIN with an ACK and enters TIME_WAIT; the server closes on receiving the ACK, and the client closes after waiting 2MSL.
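To see the four-wave behaviour from application code, here is a small sketch of my own (not from the article): shutdown(SHUT_WR) sends the first FIN while still letting the peer keep sending, which is exactly why the two FIN/ACK pairs are independent. The host example.com and the HTTP request are arbitrary, and the snippet assumes outbound network access.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("example.com", 80))
sock.sendall(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")
sock.shutdown(socket.SHUT_WR)      # first FIN: "I have nothing more to say"

# The peer may still send; read until it closes its own side (its FIN shows up as EOF).
while True:
    chunk = sock.recv(4096)
    if not chunk:
        break
sock.close()                       # the final ACK is handled by the kernel
```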
4. Why does TCP need to wave four times?

Let me give you an example!

★ Xiao Ming and Xiao Hong were chatting on the phone. When the call was almost over, Xiao Hong said, "I have nothing more to say," and Xiao Ming replied, "I understand." However, Xiao Ming might still have something to say, and Xiao Hong couldn't force him to end the call at her own pace, so Xiao Ming might keep talking for a while. Finally Xiao Ming said, "I'm done," Xiao Hong replied, "I understand," and the call was over.

The reason is the same for TCP: when one side sends a FIN, the other side may still have data to send, so its ACK and its own FIN cannot be merged into one segment. That is why closing takes four steps instead of three.

5. Why does the TIME-WAIT state need to wait for 2MSL?

2MSL means two Maximum Segment Lifetimes, i.e., twice the longest time a segment can survive in the network. Waiting 2MSL serves two purposes: it ensures the last ACK can be resent if it is lost (the peer will retransmit its FIN within one MSL), and it lets any old, delayed segments of this connection die out so they cannot confuse a new connection that reuses the same four-tuple.
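A practical consequence of TIME-WAIT worth knowing: a server restarted within 2MSL can fail to bind its port because old connections are still in TIME-WAIT. The sketch below is my addition (port 8080 is arbitrary) and shows the usual SO_REUSEADDR workaround.

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Without this option, bind() may fail with "Address already in use"
# while earlier connections on this port sit out their 2MSL in TIME-WAIT.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
srv.listen(128)
```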
6. Differences between TCP and UDP

- TCP is connection-oriented (a connection must be established before data is sent); UDP is connectionless.
- TCP provides reliable delivery with acknowledgments, retransmission, ordering and duplicate removal; UDP is best-effort and guarantees neither delivery nor order.
- TCP is byte-stream oriented; UDP is datagram oriented and preserves message boundaries.
- TCP has flow control and congestion control; UDP has neither.
- A TCP connection is point-to-point; UDP supports one-to-one, one-to-many and many-to-many communication.
- The TCP header is at least 20 bytes; the UDP header is only 8 bytes.
- TCP suits scenarios that need reliability (file transfer, web pages, email); UDP suits real-time scenarios that tolerate some loss (video, voice, DNS queries).
7. What are the fields in the TCP message header? Explain their functions

- Source port and destination port: identify the sending and receiving applications.
- Sequence number: the byte-stream position of the first data byte in this segment, used for ordering and loss detection.
- Acknowledgment number: the next byte the receiver expects; valid when the ACK flag is set.
- Data offset: the length of the TCP header, including options.
- Control flags: URG, ACK, PSH, RST, SYN, FIN, which drive connection setup, teardown, reset and urgent/push behaviour.
- Window size: the receive window advertised for flow control.
- Checksum: detects corruption of the header and data.
- Urgent pointer: marks the end of urgent data when URG is set.
- Options: for example MSS, window scaling, SACK and timestamps.
8. How does TCP ensure reliability?

- Checksums detect corrupted segments.
- Sequence numbers let the receiver reorder segments and discard duplicates.
- Acknowledgments combined with retransmission (timeout retransmission and fast retransmit) recover lost segments.
- Flow control (the sliding window) keeps the sender from overwhelming the receiver.
- Congestion control keeps the sender from overwhelming the network.
9. TCP retransmission mechanism

Timeout retransmission

To achieve reliable transmission, TCP implements a retransmission mechanism. The most basic one is timeout retransmission: when a data segment is sent, a timer is started, and if no ACK for it arrives within a certain interval, the segment is resent. How should this interval be chosen? Let's first look at RTT (Round-Trip Time): the time it takes for a packet to go out and for its reply to come back, i.e., the packet's round-trip time. The timeout retransmission interval is called the Retransmission Timeout, or RTO. How long should the RTO be?
Generally, the best RTO is slightly larger than the RTT. Some readers may ask: is there a formula for the timeout? Yes, there is a standard way to compute the RTO, known as the Jacobson/Karels algorithm:

1. First compute the smoothed RTT (SRTT): SRTT = SRTT + α × (RTT − SRTT)
2. Then compute the round-trip time variation (RTTVAR): RTTVAR = (1 − β) × RTTVAR + β × |RTT − SRTT|
3. Finally compute the RTO: RTO = μ × SRTT + ∂ × RTTVAR

Here α = 0.125, β = 0.25, μ = 1 and ∂ = 4; these parameters were obtained from a large number of measurements. However, timeout retransmission has obvious drawbacks: if the RTO is set too large, a lost segment is retransmitted too slowly and efficiency drops; if it is set too small, segments that were merely delayed get retransmitted unnecessarily, adding load to the network.
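To make the formula concrete, here is a small Python sketch of the update rule above; the constants mirror the ones in the text, while the starting values and RTT samples are invented purely for demonstration.

```python
def update_rto(srtt, rttvar, rtt_sample, alpha=0.125, beta=0.25, mu=1.0, delta=4.0):
    srtt = srtt + alpha * (rtt_sample - srtt)                     # smoothed RTT
    rttvar = (1 - beta) * rttvar + beta * abs(rtt_sample - srtt)  # RTT variation
    rto = mu * srtt + delta * rttvar                              # retransmission timeout
    return srtt, rttvar, rto

srtt, rttvar = 100.0, 50.0          # milliseconds, arbitrary starting point
for sample in [110, 95, 300, 105]:  # the 300 ms sample simulates a congested round trip
    srtt, rttvar, rto = update_rto(srtt, rttvar, sample)
    print(f"sample={sample}ms  SRTT={srtt:.1f}  RTTVAR={rttvar:.1f}  RTO={rto:.1f}")
```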
In addition, TCP doubles the timeout interval each time a timeout retransmission is triggered, so recovering via timeouts can mean a long wait. For this reason, TCP also provides a fast retransmit mechanism.

Fast Retransmit

The fast retransmit mechanism is not time-driven but data-driven: it triggers retransmission based on feedback from the receiver. Consider the fast retransmit process where the sender has sent segments 1, 2, 3, 4, 5 and 6:

- Segments 1 and 2 arrive and are acknowledged normally, but segment 3 is lost in the network.
- Segments 4, 5 and 6 still arrive, but the receiver can only keep replying ACK 3 ("I am still waiting for segment 3"), producing duplicate ACKs.
- After receiving three duplicate ACK 3s, the sender retransmits segment 3 immediately, without waiting for its retransmission timer to expire.
But fast retransmit still has a problem: a cumulative ACK only tells the sender the highest in-order data received, so which segments were actually lost is not certain. How many packets should be retransmitted? Only Seq3? Or Seq3, Seq4, Seq5 and Seq6? The sender cannot tell exactly what prompted those three duplicate ACK 3s.

Retransmission with Selective Acknowledgment (SACK)

To answer the question "how many packets should be retransmitted?", TCP provides the SACK (Selective Acknowledgment) option. On top of fast retransmit, the receiver reports the ranges of sequence numbers it has recently received, so the sender knows exactly which ranges are missing and retransmits only those. The SACK information is carried in the options field of the TCP header. For example, if the sender receives the same ACK = 30 three times, fast retransmit is triggered, and the SACK blocks show that only bytes 30–39 are missing, so only the segment covering 30–39 is retransmitted.

D-SACK

D-SACK (Duplicate SACK) is an extension of SACK. It is mainly used to tell the sender which packets were received more than once, which helps the sender work out whether packets were reordered, an ACK was lost, a packet was duplicated, or a retransmission was spurious. This lets TCP control its sending behaviour more accurately.

10. Let's talk about TCP's sliding window

TCP could send one segment and wait for its acknowledgment before sending the next, but that would be very inefficient. It would be like a face-to-face chat where you say one sentence, wait for my reply, and only then say the next one; if I am busy and cannot reply right away, you are stuck waiting, which is clearly unworkable.

To solve this, TCP introduces a window, a buffer managed by the operating system. The window size is the maximum amount of data that can be sent without waiting for an acknowledgment. The TCP header has a 16-bit field, win, the window size: it tells the peer how many bytes the local receive buffer can still hold, so the peer can pace its sending — this is the basis of flow control. In other words, every time the receiver acknowledges data, it also reports how much free buffer space it has left; this free space is the receive window, i.e., win.

The TCP sliding window comes in two flavours: the send window and the receive window. The sender's sliding window consists of four parts, as follows:

- Bytes that have been sent and acknowledged.
- Bytes that have been sent but not yet acknowledged.
- Bytes that have not been sent yet but fall inside the window (the receiver is ready for them).
- Bytes that have not been sent and fall outside the window (they must wait).
The receiver's sliding window consists of three parts, as follows:

- Bytes that have been received and acknowledged.
- Bytes that have not been received yet but can be received (they fall inside the receive window).
- Bytes that cannot be received yet (they fall outside the receive window).
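The partitioning above can be illustrated with a toy sketch of my own; the byte counts are made up, and the two pointers are commonly written SND.UNA (left edge of the window) and SND.NXT (next byte to send).

```python
send_buffer_size = 100        # total bytes the application has handed to TCP
snd_una = 32                  # oldest unacknowledged byte (left edge of the window)
snd_nxt = 51                  # next byte to send
window  = 40                  # window size granted by the receiver

sent_and_acked     = range(0, snd_una)                          # part 1
sent_not_acked     = range(snd_una, snd_nxt)                    # part 2
usable_window      = range(snd_nxt, snd_una + window)           # part 3: may send now
outside_window     = range(snd_una + window, send_buffer_size)  # part 4: must wait

print(len(sent_and_acked), len(sent_not_acked),
      len(usable_window), len(outside_window))
```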
11. Let's talk about TCP flow control

After the three-way handshake, the sender and receiver enter the ESTABLISHED state and can happily transmit data. However, the sender cannot blast data at the receiver without restraint: whatever the receiver cannot process immediately has to sit in its receive buffer, and once that buffer is full any further segments can only be dropped, wasting network resources. TCP therefore provides a mechanism that lets the sender adjust how much it sends according to the receiver's actual capacity; this is flow control, and it is implemented with the sliding window. As a brief example, suppose both sides initialize their window sizes to 400 bytes during the handshake; the receiver then shrinks the window it advertises as its buffer fills, and the sender keeps at most that much unacknowledged data in flight.
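Here is a hedged sketch of that idea (my own, with invented numbers): the sender never keeps more unacknowledged data in flight than the advertised receive window rwnd.

```python
def bytes_sender_may_send(rwnd, bytes_in_flight):
    """Usable window = advertised receive window minus unacknowledged bytes."""
    return max(0, rwnd - bytes_in_flight)

rwnd = 400                           # receiver advertises 400 bytes, as in the example above
in_flight = 0
for chunk in [200, 200, 100]:        # the application wants to send these chunks
    allowed = bytes_sender_may_send(rwnd, in_flight)
    to_send = min(chunk, allowed)
    in_flight += to_send
    print(f"want {chunk}, window allows {allowed}, actually send {to_send}")
# When ACKs arrive (and the receiver drains its buffer), in_flight shrinks and the
# rwnd carried in the ACK grows again, letting the sender continue.
```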
12. TCP Congestion Control

Congestion control acts on the network: it prevents too many packets from being injected and keeps the network from being overloaded, with the goal of using the bandwidth of the bottleneck link as fully as possible. How is it different from flow control? Flow control acts on the receiver, pacing the sender according to the receiver's actual capacity so that the receiver does not have to drop packets.

We can compare a network path to a water pipe. To make the most of the network, we want to fill the pipe to its optimal level as quickly as possible. The sender maintains a variable called the congestion window, cwnd, to estimate how much data (water) the path (pipe) can carry over a period of time. Its size reflects the degree of congestion in the network and changes dynamically. How do we discover the pipe's capacity? A simple method is to keep increasing the amount of water until the pipe is about to burst, which for a network corresponds to packet loss. In TCP terms:

★ As long as there is no congestion in the network, the congestion window can be increased to send more packets; as soon as congestion appears, the congestion window must be reduced to cut the number of packets injected into the network.

In practice, congestion control consists of several classic algorithms: slow start, congestion avoidance, congestion occurrence (the reaction to loss), and fast recovery.
Slow start algorithm

Slow start means exactly what it sounds like: don't rush, take it slowly. After the connection is established, TCP does not immediately send a large amount of data; it probes the congestion level of the network first, growing the congestion window from small to large. Concretely, for every ACK received, cwnd is increased by 1 MSS, so the amount sent roughly doubles every round trip — exponential growth. This continues until cwnd reaches the slow start threshold (at which point congestion avoidance takes over) or packet loss occurs (at which point the congestion-occurrence handling described below kicks in).
To keep cwnd from growing without limit and congesting the network, a slow start threshold, ssthresh, is maintained. When cwnd reaches this threshold, it is like turning down the tap: growth must slow down. That is, when cwnd ≥ ssthresh, TCP enters the congestion avoidance algorithm.

Congestion Avoidance Algorithm

Typically the initial slow start threshold ssthresh is 65535 bytes. After cwnd reaches the slow start threshold, each received ACK increases cwnd by only 1/cwnd MSS, so cwnd grows by about 1 MSS per round trip.
Obviously, this is linear growth, which avoids the congestion that overly fast growth would cause.

Congestion occurs

When congestion in the network causes packet loss, there are two situations:

- an RTO timeout retransmission is triggered;
- fast retransmit is triggered (three duplicate ACKs are received).
If an RTO timeout retransmission occurs, the congestion-occurrence handling is drastic: ssthresh is set to cwnd / 2, cwnd is reset to its initial value (1 MSS), and slow start begins again.
This is like losing decades of hard work overnight: all the window growth is thrown away and cwnd starts over from scratch. Fortunately there is a gentler path: fast retransmission. When the sender receives three duplicate ACKs in a row, it retransmits immediately without waiting for the RTO to expire, and ssthresh and cwnd change as follows: cwnd is halved (cwnd = cwnd / 2), ssthresh is set to the new cwnd, and TCP enters the fast recovery algorithm.
Fast recovery

Fast retransmit and fast recovery are usually used together. The fast recovery algorithm reasons that, since three duplicate ACKs were received, the network cannot be in terrible shape, so there is no need to react as aggressively as to an RTO timeout. As mentioned above, before entering fast recovery, cwnd and ssthresh have already been updated: cwnd = cwnd / 2 and ssthresh = cwnd.
Then the fast recovery algorithm proceeds as follows:

- cwnd = ssthresh + 3 (the 3 accounts for the three segments that triggered the duplicate ACKs and have already left the network);
- the lost segment is retransmitted;
- each additional duplicate ACK increases cwnd by 1;
- when the ACK for new data finally arrives, cwnd is set back to ssthresh and TCP re-enters congestion avoidance.
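The interplay of these algorithms can be illustrated with a simplified simulation (my own sketch: it works in whole MSS units per RTT round, ignores the cwnd = ssthresh + 3 inflation during recovery, and uses made-up loss events).

```python
cwnd, ssthresh = 1, 8

def on_round(loss=None):
    """Advance one RTT. loss can be None, 'timeout' or '3dupack'."""
    global cwnd, ssthresh
    if loss == "timeout":
        ssthresh = max(cwnd // 2, 2)
        cwnd = 1                      # back to slow start
    elif loss == "3dupack":
        ssthresh = max(cwnd // 2, 2)
        cwnd = ssthresh               # fast recovery: no fall back to cwnd = 1
    elif cwnd < ssthresh:
        cwnd *= 2                     # slow start: exponential growth
    else:
        cwnd += 1                     # congestion avoidance: linear growth

events = [None, None, None, None, "3dupack", None, None, "timeout", None, None]
for i, ev in enumerate(events, 1):
    on_round(ev)
    print(f"round {i:2d} event={ev or '-':8s} cwnd={cwnd:3d} ssthresh={ssthresh}")
```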
13. Relationship between the half-connection queue and SYN Flood attacks

Before the three-way handshake begins, the server moves from CLOSED to LISTEN and internally creates two queues: a half-connection queue (SYN queue) and a full-connection queue (accept queue). The half-connection queue holds connections for which the server has received a SYN and replied with SYN+ACK but has not yet received the final ACK. The full-connection queue holds connections that have completed the three-way handshake and are waiting for the application to accept() them.
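At the socket API level (a sketch of my own, with an arbitrary port): the backlog argument of listen() caps the full-connection (accept) queue, while the half-connection queue is governed by kernel settings such as net.ipv4.tcp_max_syn_backlog on Linux.

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 9000))
srv.listen(128)              # backlog: caps the full-connection (accept) queue

conn, addr = srv.accept()    # removes one established connection from the accept queue
conn.close()
srv.close()
```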
SYN Flood is a classic DoS (Denial of Service) attack: the attacker forges non-existent source IP addresses and sends a large number of SYN segments to the server in a short time. The server replies with SYN+ACK but never receives the final ACK, so half-open connections pile up until the half-connection queue is full and normal TCP connection requests can no longer be served. Common mitigations include enlarging the half-connection queue, enabling SYN cookies (so the server does not need to keep per-connection state until the final ACK arrives), reducing the number of SYN+ACK retries, and deploying a SYN proxy firewall:
SYN Proxy Firewall: the firewall answers each incoming SYN on the server's behalf and keeps the half-connection itself; only after the client returns a valid ACK does it replay the handshake to the server and establish the real TCP connection.

14. Nagle Algorithm and Delayed Acknowledgment

Nagle's algorithm

What happens if the sender keeps firing tiny packets, say 1 byte each, at the receiver? In TCP/IP, no matter how little data is sent, a full set of protocol headers is always attached, and the receiver must send an ACK for it as well. To make better use of bandwidth, TCP prefers to send data in blocks that are as large as possible; Nagle's algorithm exists to avoid flooding the network with many tiny segments. Its basic rule is: at any moment there can be at most one unacknowledged small segment, where a "small segment" is a block smaller than the MSS and "unacknowledged" means no ACK for it has come back yet. The implementation rules of Nagle's algorithm are:

- if the accumulated data reaches the MSS, or the segment contains FIN, send it immediately;
- otherwise, if there is still an unacknowledged small segment in flight, buffer the new data until an ACK arrives or enough data accumulates to fill an MSS;
- the algorithm can be disabled per socket with the TCP_NODELAY option.
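In practice, latency-sensitive applications often disable Nagle's algorithm so that small writes go out immediately. A minimal sketch using the standard TCP_NODELAY socket option (the socket is otherwise unconfigured; connecting it is left out for brevity):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # turn Nagle's algorithm off
```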
Delayed acknowledgment

If the receiver has just received one packet and a second one arrives very shortly afterwards, should it acknowledge them one by one, or combine the acknowledgments?

★ After receiving data, if the receiver has nothing to send back, it can wait a little (40 ms by default on Linux) before acknowledging. If data does need to go to the other end within that time, the ACK rides along with it and no separate ACK is needed; if no data shows up before the timer expires, the ACK is sent on its own so the other end does not assume the packet was lost.

There are scenarios where the acknowledgment must not be delayed, for example when out-of-order segments are detected, when a segment larger than one frame is received, or when the window size needs to be adjusted. In general, Nagle's algorithm and delayed acknowledgment should not be combined: Nagle delays sending and delayed acknowledgment delays the ACK, and stacking the two can cause noticeable latency and performance problems.

15. TCP Packet Sticking and Unpacking

TCP is a byte stream with no message boundaries: the TCP layer does not understand the meaning of the upper-layer business data and splits it into segments purely according to the state of its buffers. As a result, one complete business message may be split across several TCP segments (unpacking), or several small messages may be bundled into one segment (sticking). Why does sticking and unpacking occur? The data written by the application may be larger than the send buffer or larger than the MSS, forcing a split; or the data may be much smaller than the buffer, so TCP merges several writes before sending; or the receiver may not read fast enough, so several messages accumulate in the receive buffer and are read out together.
Common solutions:

- The sender packs every message into a fixed length.
- A special delimiter is appended to the end of each message so the receiver can split the stream on it.
- Each message is split into a header and a body, where the header has a fixed size and contains a field declaring the length of the body.
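As an illustration of the third solution (a header carrying the body length), here is a small Python sketch; the 4-byte big-endian length prefix is my own example format, not something mandated by TCP.

```python
import struct

def encode_frame(payload: bytes) -> bytes:
    """Prefix the body with its length so the receiver knows where it ends."""
    return struct.pack("!I", len(payload)) + payload

def decode_frames(buffer: bytearray):
    """Yield complete frames from a receive buffer, keeping any partial tail."""
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])
        if len(buffer) < 4 + length:
            break                          # frame was split: wait for more data
        yield bytes(buffer[4:4 + length])
        del buffer[:4 + length]

# Two logical messages arrive "stuck together" in one TCP read:
buf = bytearray(encode_frame(b"hello") + encode_frame(b"world"))
print(list(decode_frames(buf)))            # [b'hello', b'world']
```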