Preface

Whether you are interviewing for a Java, C/C++, Python or any other development position, TCP is a topic that is almost certain to come up. No matter how many times TCP abuses me, I still treat it like my first love. When Xiaolin was going through campus recruiting, he was often rejected because of TCP interview questions. It really is a love-hate relationship... The past is the past, so let's get rid of this fear today and face TCP bravely, with a smile! To that end, Xiaolin has sorted out the common interview questions on the TCP three-way handshake and four-way wave to go through with everyone.
PS: This article does not cover TCP flow control, congestion control, reliable transmission and related topics. Those are left for the next article!

1. Basic knowledge of TCP

Look at the TCP header format

Let's first look at the format of the TCP header. The color-coded fields are the ones most relevant to this article; the other fields are not described in detail.

(1) Sequence number: its initial value is a random number generated when the connection is established and sent to the receiving host in the SYN packet. Each time data is sent, the sequence number grows by the number of data bytes sent. It is used to solve the problem of out-of-order packets.

(2) Acknowledgment number: the sequence number of the data the receiver expects to receive next. On receiving this acknowledgment, the sender can assume that all data before this sequence number has been received correctly. It is used to solve the problem of packet loss.

(3) Control bits:
2. Why do we need TCP protocol? At which layer does TCP work? The IP layer is "unreliable". It does not guarantee the delivery of network packets, the in-order delivery of network packets, or the integrity of the data in the network packets. Relationship between OSI reference model and TCP/IP If the reliability of network data packets needs to be guaranteed, the upper layer (transport layer) TCP protocol needs to be responsible for it. Because TCP is a reliable data transmission service working at the transport layer, it can ensure that the network packets received by the receiver are damage-free, gap-free, non-redundant and in-order. 3. What is TCP? TCP is a connection-oriented, reliable, byte stream-based transport layer communication protocol.
4. What is a TCP connection? Let's take a look at how RFC 793 defines "connection":
Simply put, certain status information used to ensure reliability and flow control maintenance. The combination of this information, including Socket, sequence number and window size, is called a connection. So we can know that establishing a TCP connection requires the client and the server to reach a consensus on the above three pieces of information.
5. How to uniquely identify a TCP connection? The TCP tuple can uniquely identify a connection. The tuple includes the following:
The source address and destination address fields (32 bits) are in the IP header and are used to send messages to the other host via the IP protocol. The source port and destination port fields (16 bits) are in the TCP header and their function is to tell the TCP protocol to which process the message should be sent. 6. A server with an IP address listens to a port. What is the maximum number of TCP connections? The server usually listens on a fixed local port, waiting for the client's connection request. Therefore, the client IP and port are variable, and their theoretical value calculation formula is as follows: For IPv4, the maximum number of client IP addresses is 2 to the power of 32, and the maximum number of client ports is 2 to the power of 16, which is the maximum number of TCP connections on a single server, approximately 2 to the power of 48. Of course, the maximum number of concurrent TCP connections on the server side is far from reaching the theoretical upper limit.
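As a quick check of the arithmetic above, the sketch below counts the possible (client IP, client port) pairs for one fixed server IP and port under IPv4. The variable names are illustrative only.

```python
# Theoretical upper bound on TCP connections to ONE listening (IP, port) pair.
# The server side of the 4-tuple is fixed, so only the client side varies.
CLIENT_IPS = 2 ** 32    # IPv4 addresses are 32 bits
CLIENT_PORTS = 2 ** 16  # TCP ports are 16 bits

max_connections = CLIENT_IPS * CLIENT_PORTS
print(max_connections)             # 281474976710656
print(max_connections == 2 ** 48)  # True
```

In practice the limit is set by file descriptors and memory long before this number is reached.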
7. What is the difference between UDP and TCP? What are their respective application scenarios? UDP does not provide complex control mechanisms and uses IP to provide "connectionless" communication services. The UDP protocol is really very simple, with only 8 bytes (64 bits) in the header. The UDP header format is as follows:
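The 8 bytes are four 16-bit fields: source port, destination port, length and checksum (RFC 768). A minimal sketch of packing and parsing such a header with Python's struct module follows; the helper name parse_udp_header and the sample values are made up for illustration.

```python
import struct

# The UDP header is four 16-bit big-endian fields (RFC 768):
# source port, destination port, length (header + data), checksum.
UDP_HEADER = struct.Struct("!HHHH")

def parse_udp_header(datagram: bytes) -> dict:
    """Return the four header fields of a raw UDP datagram."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}

# Build a sample datagram by hand: 8-byte header + 5-byte payload.
# The checksum is left as 0 here; this sketch does not compute it.
payload = b"hello"
sample = UDP_HEADER.pack(12345, 53, UDP_HEADER.size + len(payload), 0) + payload

header = parse_udp_header(sample)
print(UDP_HEADER.size)   # 8
print(header["length"])  # 13
```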
Differences between TCP and UDP: (1) Connection:
(2) Service Targets:
(3) Reliability:
(4) Congestion control and flow control:
(5) Header overhead:
TCP and UDP application scenarios: Since TCP is connection-oriented and can ensure reliable data delivery, it is often used for:
Since UDP is connectionless, it can send data at any time. In addition, UDP processing is simple and efficient, so it is often used for:
8. Why does the UDP header have no "header length" field, while the TCP header does?

TCP has a variable-length "options" field, whereas the UDP header has a fixed length, so UDP needs no extra field to record its header length.

9. Why does the UDP header have a "packet length" field, while the TCP header does not?

Let's first look at how TCP calculates the length of its payload:

TCP data length = IP total length - IP header length - TCP header length

The IP total length and IP header length are given in the IP header, and the TCP header length is given in the TCP header, so the length of the TCP data can be derived.

At this point you may ask: "UDP also sits on top of the IP layer, so UDP's data length could be calculated the same way. Why does it still need a 'packet length' field?"

Put that way, the UDP "packet length" field does look redundant. One possible explanation: to simplify the design and processing of network hardware, a header length should be an integer multiple of 4 bytes. Without the "packet length" field, the UDP header would be 6 bytes, which is not a multiple of 4. So Xiaolin suspects the field may have been added to pad the UDP header to a multiple of 4 bytes.

2. TCP connection establishment

1. TCP three-way handshake process and state transition

TCP is a connection-oriented protocol, so a connection must be established before TCP can be used, and it is established through a three-way handshake.

TCP three-way handshake
The first message - SYN message
The third message - ACK message
From the above process, we can find that the third handshake can carry data, while the first two handshakes cannot carry data. This is also a frequently asked question in interviews. Once the three-way handshake is completed, both parties are in the ESTABLISHED state, and the connection is established. The client and server can send data to each other. 2. How to check TCP status in Linux? To view the TCP connection status, you can use the netstat -napt command in Linux. 3. Why is it a three-way handshake? Not two or four? I believe the most common answer is: "Because the three-way handshake can ensure that both parties have the ability to receive and send." There is nothing wrong with this answer, but it is one-sided and does not state the main reason. Earlier we learned what a TCP connection is:
Therefore, it is important to understand why a three-way handshake is required to initialize the Socket, sequence number, and window size and establish a TCP connection. Next, we analyze the reasons for the three-way handshake from three aspects:
(1) Reason 1: Avoid historical connections Let's look at the primary reason why TCP connections use a three-way handshake, as stated in RFC 793:
In short, the primary reason for the three-way handshake is to prevent old, duplicate connection initiations from causing confusion. Networks are complicated: the packet sent first does not always reach the target host first. Because of congestion and other causes, an old packet may arrive at the target host ahead of a newer one. So how does the TCP three-way handshake avoid trouble in this case? Suppose the client sends several SYN packets to establish a connection. Under network congestion, the following happens:
If it is a two-way handshake connection, it is impossible to determine whether the current connection is a historical connection. A three-way handshake allows the client (sender) to determine whether the current connection is a historical connection because it has enough context when the client (sender) is ready to send the third message:
Therefore, the main reason why TCP uses three-way handshake to establish a connection is to prevent historical connections from initializing the connection. (2) Reason 2: Synchronizing the initial sequence numbers of both parties Both parties in the TCP protocol must maintain a "sequence number". The sequence number is a key factor in reliable transmission. Its functions are:
It can be seen that the sequence number plays a very important role in a TCP connection. Therefore, when the client sends a SYN message carrying its "initial sequence number", the server must reply with an ACK to indicate that the client's SYN message has been received. Likewise, when the server sends its own "initial sequence number" to the client, it needs an acknowledgment from the client. This back-and-forth ensures that the initial sequence numbers of both parties are reliably synchronized.

Four-way handshake and three-way handshake

A four-way handshake could also reliably synchronize the initial sequence numbers of both parties, but because its second and third steps can be merged into one, it becomes a "three-way handshake". A two-way handshake only guarantees that one party's initial sequence number is received by the other; it cannot guarantee that both parties' initial sequence numbers are acknowledged.

(3) Reason 3: Avoiding waste of resources

With only a "two-way handshake", if the client's SYN is held up in the network and the client receives no ACK, it resends the SYN. Since there is no third handshake, the server cannot know whether the client received its ACK, so it has to establish a connection for every SYN it receives. What does this lead to? If the client's SYN is delayed and sent several times, the server sets up multiple redundant, useless connections after receiving them, wasting resources.

Two handshakes will cause a waste of resources

In other words, with a two-way handshake, delayed SYN messages would make the server accept useless connection requests over and over and allocate resources for each of them.
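The first two reasons can be illustrated with a toy simulation. This is not real TCP: the function names and message dictionaries are invented for the sketch, and it models only the sequence-number checks. The server acknowledges the client's ISN and announces its own; on the third step the client either acknowledges the server's ISN or, if the reply answers a stale SYN, sends RST.

```python
import random

MOD = 2 ** 32  # sequence numbers are 32 bits and wrap around

def server_on_syn(client_isn: int) -> dict:
    """Server's reply to a SYN: its own ISN plus an ACK of the client's ISN."""
    return {"flags": "SYN+ACK",
            "seq": random.getrandbits(32),
            "ack": (client_isn + 1) % MOD}

def client_on_syn_ack(current_isn: int, syn_ack: dict) -> dict:
    """Third step: the client has enough context to spot a historical connection."""
    if syn_ack["ack"] != (current_isn + 1) % MOD:
        # The SYN+ACK answers an old SYN, not the one we are waiting on.
        return {"flags": "RST"}
    return {"flags": "ACK", "ack": (syn_ack["seq"] + 1) % MOD}

old_isn, new_isn = 90, 100  # example ISNs of a stale SYN and the current SYN

# The stale SYN reaches the server first; the client rejects the reply.
stale = client_on_syn_ack(new_isn, server_on_syn(old_isn))
print(stale)  # {'flags': 'RST'}

# The current SYN is answered; the client acknowledges the server's ISN.
reply = server_on_syn(new_isn)
final = client_on_syn_ack(new_isn, reply)
print(final["flags"], final["ack"] == (reply["seq"] + 1) % MOD)  # ACK True
```

With only two steps, the server would have no chance to receive that RST before treating the connection as established.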
(4) Summary When TCP establishes a connection, the three-way handshake can prevent the establishment of a historical connection, reduce unnecessary resource consumption for both parties, and help both parties synchronize and initialize the sequence number. The sequence number can ensure that data packets are not repeated, discarded, and transmitted in order. Reasons for not using "two-way handshake" and "four-way handshake":
4. Why are the initial sequence numbers (ISNs) of the client and server different?

Messages in the network may be delayed, duplicated and resent, or lost, so segments from different connections could interfere with each other. To avoid this, the client and the server each pick their own random initial sequence number.

5. How is the Initial Sequence Number (ISN) randomly generated?

The original ISN is based on a clock that increments by 1 every 4 microseconds and wraps around about every 4.55 hours. RFC 1948 proposes a better algorithm for generating a random ISN:

ISN = M + F (localhost, localport, remotehost, remoteport)
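In RFC 1948, M is the 4-microsecond timer and F is a hash of the 4-tuple combined with a secret key. The sketch below follows that shape but is only an illustration: RFC 1948 suggests MD5, while SHA-256 is used here as a stand-in, and SECRET_KEY, generate_isn and the now_us parameter are hypothetical names.

```python
import hashlib
import os
import struct
import time
from typing import Optional

SECRET_KEY = os.urandom(16)  # hypothetical secret, e.g. chosen once at boot

def generate_isn(local_host: str, local_port: int,
                 remote_host: str, remote_port: int,
                 now_us: Optional[int] = None) -> int:
    """ISN = M + F(4-tuple), truncated to 32 bits."""
    if now_us is None:
        now_us = time.time_ns() // 1000  # current time in microseconds
    m = now_us // 4                      # M: a timer that ticks every 4 microseconds
    four_tuple = f"{local_host}|{local_port}|{remote_host}|{remote_port}".encode()
    digest = hashlib.sha256(four_tuple + SECRET_KEY).digest()
    f = struct.unpack("!I", digest[:4])[0]  # F: first 32 bits of the keyed hash
    return (m + f) % 2 ** 32

first = generate_isn("10.0.0.1", 40000, "10.0.0.2", 80, now_us=4_000_000)
later = generate_isn("10.0.0.1", 40000, "10.0.0.2", 80, now_us=4_000_004)
print((later - first) % 2 ** 32)  # 1: same 4-tuple, one timer tick later
```

Because F depends on the 4-tuple and a secret, each connection gets its own sequence-number space that an outside observer cannot easily predict.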
6. Since the IP layer can fragment, why does the TCP layer still need MSS? Let's first understand MTU and MSS
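As a rough illustration of how the two relate, the sketch below assumes a typical Ethernet MTU of 1500 bytes and 20-byte IP and TCP headers without options. These values and the segment helper are assumptions for the example, not taken from a particular system.

```python
MTU = 1500        # assumed: typical Ethernet MTU in bytes
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options

# MSS: the largest TCP payload that fits in one IP packet without fragmentation.
MSS = MTU - IP_HEADER - TCP_HEADER

def segment(data: bytes, mss: int = MSS) -> list:
    """Split application data into payloads of at most `mss` bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

segments = segment(b"x" * 4000)
print(MSS)                         # 1460
print([len(s) for s in segments])  # [1460, 1460, 1080]
```

Each of these payloads plus its headers fits within the MTU, so a lost packet costs one segment's worth of retransmission.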
If the entire TCP message (header + data) is handed to the IP layer for fragmentation, what can go wrong?

When the IP layer has to send data (TCP header + TCP data) larger than the MTU, it splits the data into several fragments, each no larger than the MTU. The fragments are reassembled by the IP layer of the destination host and then handed up to TCP.

This seems orderly, but there is a hidden danger: if a single IP fragment is lost, every fragment of the IP datagram has to be retransmitted. The IP layer has no timeout and retransmission mechanism of its own, so TCP at the transport layer is responsible for it. When the receiver finds that part of a TCP message (header + data) is missing, it does not send an ACK, and after a timeout the sender's TCP resends the entire TCP message (header + data). So relying on IP fragmentation is very inefficient.

For the best efficiency, TCP therefore usually negotiates the MSS of both parties when establishing a connection. When the TCP layer finds that the data exceeds the MSS, it splits the data itself, so the resulting IP packets never exceed the MTU and IP fragmentation is not needed. If a TCP segment is then lost, only that MSS-sized segment is retransmitted rather than all the pieces, which makes retransmission far more efficient.

7. What is a SYN attack? How to avoid SYN attacks?

(1) SYN attack

We all know that establishing a TCP connection takes a three-way handshake. Suppose an attacker forges SYN messages from many different IP addresses in a short period of time. The server enters the SYN_RCVD state each time it receives a SYN message.
However, the SYN + ACK message sent by the server never gets an ACK back from those unknown IP hosts. Over time, the server's SYN queue (the queue of half-open connections) fills up, and the server can no longer serve normal users.

(2) Avoiding SYN attacks, method 1

One solution is to tune Linux kernel parameters that control the queue sizes and what happens when a queue is full.
(3) Avoiding SYN attacks, method 2

Let's first look at how the Linux kernel's SYN queue (connections still being established) and Accept queue (connections fully established) work.

Normal process:
Application is too slow:
Under a SYN attack:
The tcp_syncookies method can be used to deal with SYN attacks:
3. TCP connection disconnection 1. TCP four-wave process and state transition All good things must come to an end, and this is also true for TCP connections. TCP disconnects by waving four times. Both parties can actively disconnect, and after disconnection, the "resources" in the host will be released. The client actively closes the connection - TCP wave four times
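The state changes during a client-initiated close can be sketched as a small table. The state names are the standard TCP ones (as shown by netstat); the list layout and variable names are purely illustrative.

```python
# Each step: (segment on the wire, client state afterwards, server state afterwards)
FOUR_WAVES = [
    ("client -> server: FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("server -> client: ACK", "FIN_WAIT_2", "CLOSE_WAIT"),
    ("server -> client: FIN", "TIME_WAIT",  "LAST_ACK"),
    ("client -> server: ACK", "TIME_WAIT",  "CLOSED"),
]

client, server = "ESTABLISHED", "ESTABLISHED"
for wire, client, server in FOUR_WAVES:
    print(f"{wire:24} client={client:11} server={server}")

# The active closer leaves TIME_WAIT only after waiting 2MSL.
client = "CLOSED"
print(client, server)  # CLOSED CLOSED
```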
One thing to note here is that only the party that actively closes the connection enters the TIME_WAIT state.

2. Why does it take four waves?

If we review how the two parties exchange FIN packets, we can see why four waves are needed.
From the above process, we can see that the server usually needs to wait until it has finished sending and processing its data, so the server's ACK and FIN are generally sent separately. That is why closing takes one more step than the three-way handshake.

3. Why is the waiting time of TIME_WAIT 2MSL?

MSL stands for Maximum Segment Lifetime: the longest time any segment can exist in the network. A segment older than this is discarded. TCP segments are carried in IP packets, and the IP header has a TTL field giving the maximum number of routers an IP datagram may pass through. Each router that handles the datagram decreases this value by 1. When it reaches 0, the datagram is discarded and an ICMP message is sent to notify the source host.

The difference between MSL and TTL: MSL is measured in time, while TTL is measured in router hops. MSL should therefore be greater than or equal to the time it takes for the TTL to drop to 0, so that the segment is guaranteed to have died out naturally.

TIME_WAIT lasts 2 MSL. A reasonable explanation is that packets from the sender may still be in the network; when the receiver processes them it sends a response back, so waiting for the round trip takes twice the MSL.

For example, if the passive closer does not receive the final ACK, its timer expires and it resends the FIN. When the other party receives this FIN, it resends the ACK to the passive closer. One trip in each direction adds up to exactly 2 MSL.

The 2MSL timer starts when the client sends the ACK after receiving the FIN. If the client's ACK did not reach the server and the client receives a retransmitted FIN from the server while in TIME-WAIT, the 2MSL timer is restarted.

In Linux, 2MSL defaults to 60 seconds, so one MSL is 30 seconds.
The Linux system stays in TIME_WAIT for a fixed 60 seconds. Its name defined in the Linux kernel code is TCP_TIMEWAIT_LEN:
If you want to change the length of TIME_WAIT, you can only modify the value of TCP_TIMEWAIT_LEN in the Linux kernel code and recompile the Linux kernel. 4. Why is TIME_WAIT state needed? The TIME-WAIT state will only occur on the party that actively initiates the closing of the connection. The TIME-WAIT state is needed mainly for two reasons:
(1) Reason 1: Preventing data packets from old connections Assuming that TIME-WAIT has no waiting time or the time is too short, what will happen after the delayed data packet arrives? Exception in receiving historical data
Therefore, TCP has designed such a mechanism. After 2MSL, the data packets in both directions are discarded, so that the data packets of the original connection disappear naturally in the network, and the data packets that appear again must be generated by the newly established connection. (2) Reason 2: Ensure the connection is closed correctly RFC 793 points out that another important role of TIME-WAIT is:
That is to say, the role of TIME-WAIT is to wait for enough time to ensure that the final ACK can be received by the passive closing party, thereby helping it to close normally. Assuming that TIME-WAIT has no waiting time or the time is too short, what problems will the disconnection cause? Exceptions that do not ensure a normal disconnect
If TIME-WAIT waits long enough, two situations will occur:
Therefore, after the client waits for 2MSL time in the TIME-WAIT state, it can be guaranteed that the connection between both parties can be closed normally. 5. What are the dangers of too much TIME_WAIT? If the server has TCP in TIME-WAIT state, it means that the disconnection request was actively initiated by the server. There are two main hazards of too much TIME-WAIT state:
The second hazard will cause serious consequences. You should know that port resources are also limited. Generally, the ports that can be opened are 32768~61000, which can also be specified by the following parameter settings:
If there are too many TIME_WAIT states on the server side and all port resources are occupied, new connections cannot be created. 6. How to optimize TIME_WAIT? Here are several ways to optimize TIME-WAIT, all with advantages and disadvantages:
(1) Method 1: net.ipv4.tcp_tw_reuse and tcp_timestamps When the following Linux kernel parameters are enabled, the socket in TIME_WAIT can be reused for new connections.
To use this option, there is another prerequisite, which is to enable support for TCP timestamps, namely:
This timestamp field is in the "options" of the TCP header. It records the sender's current timestamp and the most recent timestamp received from the peer. With timestamps, the 2MSL problem mentioned earlier no longer arises, because duplicate packets are naturally discarded once their timestamps have expired.

Warm reminder: net.ipv4.tcp_tw_reuse should be used with caution, because it requires timestamp support (net.ipv4.tcp_timestamps) to be enabled. When the clocks of the client and server hosts are not synchronized, the client's messages are rejected outright. Xiaolin has run into this at work... and it took a long time to troubleshoot.

(2) Method 2: net.ipv4.tcp_max_tw_buckets

The default value is given here as 18000; the actual default varies with the kernel version and available memory. Once the number of connections in TIME_WAIT exceeds this value, the system immediately destroys any further connections entering TIME_WAIT. This method is rather brute-force and treats the symptoms rather than the root cause. It creates far more problems than it solves, so it is not recommended.

(3) Method 3: Using SO_LINGER in the program

We can change what happens when close is called on a connection by setting a socket option.
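In C this is done by filling a struct linger (the fields l_onoff and l_linger) and passing it to setsockopt. A minimal Python equivalent is sketched below, assuming a platform where struct linger is two C ints, as on Linux; set_linger and get_linger are hypothetical helper names.

```python
import socket
import struct

LINGER_FORMAT = "ii"  # struct linger { int l_onoff; int l_linger; }

def set_linger(sock: socket.socket, l_onoff: int, l_linger: int) -> None:
    """Set SO_LINGER on a socket."""
    value = struct.pack(LINGER_FORMAT, l_onoff, l_linger)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)

def get_linger(sock: socket.socket) -> tuple:
    """Read SO_LINGER back as (l_onoff, l_linger)."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FORMAT))
    return struct.unpack(LINGER_FORMAT, raw)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_linger(sock, 1, 0)          # linger on, timeout 0
linger_state = get_linger(sock)
print(linger_state)             # l_onoff is non-zero, l_linger is 0
sock.close()
```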
If l_onoff is non-zero and l_linger is 0, then as soon as close is called, a RST is sent to the peer. The connection skips the four waves, and with them the TIME_WAIT state, and is closed directly. This does offer a way to bypass TIME_WAIT, but it is very dangerous behavior and not something to encourage.

7. What if the connection is established but the client suddenly fails?

TCP has a keep-alive mechanism. It works as follows: define a period of time; if there is no activity on the connection during that period, the keep-alive mechanism kicks in and sends a probe message at a fixed interval. A probe carries very little data. If several consecutive probes go unanswered, the TCP connection is considered dead, and the kernel reports the error to the application.

The Linux kernel has parameters for the keep-alive time, the number of keep-alive probes, and the interval between probes. The following are the default values:
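On a stock Linux system these are commonly net.ipv4.tcp_keepalive_time = 7200, net.ipv4.tcp_keepalive_intvl = 75 and net.ipv4.tcp_keepalive_probes = 9. Those values are assumed in the sketch below (check sysctl on your own system); it simply works out how long it takes to declare a connection dead.

```python
import datetime

# Assumed Linux defaults; verify with `sysctl net.ipv4.tcp_keepalive_time` etc.
TCP_KEEPALIVE_TIME = 7200   # seconds of idle time before the first probe
TCP_KEEPALIVE_INTVL = 75    # seconds between probes
TCP_KEEPALIVE_PROBES = 9    # unanswered probes before the connection is declared dead

detect_seconds = TCP_KEEPALIVE_TIME + TCP_KEEPALIVE_INTVL * TCP_KEEPALIVE_PROBES
print(detect_seconds)                              # 7875
print(datetime.timedelta(seconds=detect_seconds))  # 2:11:15
```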
That is to say, in Linux system, it takes at least 2 hours, 11 minutes and 15 seconds to find a "dead" connection. This time is a bit long. We can also set the above keep-alive related parameters according to actual needs. If TCP keepalive is enabled, the following situations need to be considered:
4. Socket Programming 1. How to program socket for TCP?
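A minimal sketch of the usual call sequence, run over the loopback interface: the server calls socket, bind, listen and accept; the client calls socket and connect; then both sides read and write. The echo behaviour, the message contents and the helper name serve_once are illustrative assumptions.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Accept one connection, echo one message back, then close it."""
    conn, _addr = listener.accept()      # returns a NEW, connected socket
    with conn:
        data = conn.recv(1024)           # read from the connected socket
        conn.sendall(b"echo: " + data)   # write to the connected socket

# Server side: socket -> bind -> listen
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
listener.listen(5)                       # 5 is the backlog
port = listener.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# Client side: socket -> connect -> write -> read -> close
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect(("127.0.0.1", port))
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
listener.close()
print(reply)  # b'echo: hello'
```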
It should be noted here that when the server calls accept, if the connection is successful, a connected socket will be returned, which will be used to transmit data later. Therefore, the listening socket and the socket actually used to transmit data are "two" sockets, one is called the listening socket and the other is called the completed connection socket. After a successful connection is established, both parties begin to read and write data through the read and write functions, just like writing something to a file stream. 2. What is the meaning of the backlog parameter when listening? Two queues are maintained in the Linux kernel:
SYN Queue and Accept Queue
In early Linux kernels, backlog was the size of the SYN queue, that is, the queue of connections not yet fully established. Since Linux kernel 2.2, backlog has been the size of the accept queue, that is, the queue of fully established connections, so today backlog is generally taken to mean the accept queue.

3. At which step of the three-way handshake does accept return?

Let's first look at what is exchanged when the client connects to the server.

Client connects to server
From the above description, we can know that the client connect successfully returns in the second handshake, and the server accept successfully returns after the three-way handshake is successful. 4. The client calls close, what is the process of disconnecting the connection? Let's see what happens when the client actively calls close? The client calls the close procedure