Many people think that learning the TCP/IP protocols is useless and that day-to-day development only requires knowing how to use the socket interface. Anyone who has debugged a production problem knows this is not the case. If the application runs on a local area network and all the equipment is healthy, socket-level knowledge may be enough. But when an intermediate switch is unstable, a physical server goes down, or some other abnormal situation arises, the resulting problems cannot be solved with an understanding that stops at the socket interface. A deep understanding of TCP/IP is therefore a great help in analyzing abnormal behavior.
The following figure shows a common architecture in network communication, known as the client-server (CS) architecture. The program consists of two parts, the client and the server. The real environment is of course much more complicated: there may be many devices of different types between the client and the server. These devices increase the complexity of network communication and, with it, the complexity of building fault tolerance into the program.

Figure 1 Basic architecture

Basic process of TCP

Before analyzing the abnormal situations, let's recall the basic logic of the TCP protocol. Before the client and server can send and receive data, a connection must be established. At the protocol level the connection is established by exchanging data packets, but at the user level the client simply calls the connect function. The connection process is commonly known as the "three-way handshake", and it is shown in Figure 2.

Figure 2 TCP three-way handshake process

Closing a TCP connection is also involved and requires the so-called "four waves". The reason is that TCP is full-duplex, so the connection has to be closed separately in each direction, by both the client and the server.

Figure 3 TCP's four waves

Another important topic is the state transitions of the TCP protocol. Only by understanding them can we make sense of the packets seen in the various abnormal situations.

Figure 4 TCP state transition diagram

This article gives only a brief review of the basic process of TCP. For details, please refer to the previous article of this account, "From TCP to Socket, a thorough understanding of network programming".

Abnormal situation analysis

With the basic process of TCP in mind, let's look at the various abnormal situations. These abnormal situations are the key to solving problems later.
Once we understand these abnormal situations and the principles behind them, troubleshooting becomes much easier.

1. Trying to establish a connection to a non-existent port (the host is normal)

A non-existent port here means that no program on the server is listening on that port. Our client calls connect to try to establish a connection with it. What happens? In this case the client usually receives the following error:

[Errno 111] Connection refused

For the exact meaning, check the relevant Linux manual pages. Since no program is listening on this port, the server cannot complete the connection establishment. Referring to the three-way handshake: when the client's SYN packet arrives at the server, the TCP stack does not find a listening socket, so it sends an error message back to tell the client that something went wrong. That error message is a segment with the RST flag set.

This situation is easy to simulate. We only need to write a small program that connects to a port on the server that nothing is listening on. The following is a packet capture from Wireshark; the RST segment is highlighted in red.

Figure 5 Data packet screenshot

Let's take a deeper look. At the operating system level, the server's kernel receives the packet from the network card and parses it. TCP extracts key information such as the destination port and then checks whether a matching socket exists. What is this socket? At the user level it is a file handle, but in the kernel it is a data structure that records a great deal of information. These structures are stored in a hash table, and the function __inet_lookup_skb (net/inet_hashtables.h) is used to look one up.
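The refused connection described in this section can be reproduced locally. The following is a minimal Python sketch, not code from the original article. The helper names find_unused_port and try_connect are hypothetical, and the sketch assumes that nothing else starts listening on the chosen port between the two calls. On Linux the returned code is expected to be errno.ECONNREFUSED, which is 111; the numeric value differs on other systems.

```python
import errno
import socket


def find_unused_port(host="127.0.0.1"):
    """Ask the kernel for a free port, then release it so nothing listens there.

    Hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))          # port 0 lets the kernel pick a free port
        return probe.getsockname()[1]  # the socket is closed when the block exits


def try_connect(host, port, timeout=3.0):
    """Return 0 on success, otherwise the errno reported by connect()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        # connect_ex returns the error code instead of raising an exception
        return client.connect_ex((host, port))


if __name__ == "__main__":
    port = find_unused_port()
    code = try_connect("127.0.0.1", port)
    # Print the symbolic name of the error code, if the platform defines one
    print(port, code, errno.errorcode.get(code))
```

Running a packet capture on the loopback interface while this executes should show the SYN followed by the RST reply.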
In the connection-refused case, the lookup finds no listening socket, so the server's TCP stack handles the error by sending an RST to the client (through the function tcp_v4_send_reset).

2. Trying to establish a connection with a port but the host is down (host down)

This is also a common situation. A server host has crashed, but the client does not know it and still tries to establish a connection. The scenario splits into two cases: the server has just crashed, or the server has been down for a long time. Why distinguish them? The difference comes mainly from the ARP protocol. Once the server has been down long enough, the client's local ARP cache entry for it expires, and the TCP client can no longer send packets to the destination server at all. A cache entry looks like this:

(192.168.1.100) at 08:00:27:1a:7a:0a [ether] on eth0

With that in mind, let's analyze the case where the server has just crashed. The client can still send packets to the server, but because the server is down it never sends a reply.

Figure 6 Data packet screenshot

Since the client is unaware that the server is down, it retransmits the SYN packet repeatedly. As shown in Figure 6, the client sends a SYN packet to the server every few seconds. The exact retransmission interval is determined by the TCP implementation and may differ slightly between operating systems.

3. The server application is blocked (or dead) when establishing a connection

Another situation is that the server application is deadlocked while the client is establishing a connection. This happens often in practice (we assume that only the application is deadlocked, not the kernel). What happens then? Can the three-way handshake complete? Can the client send and receive data?

At the user level, we know that the server obtains a new socket from the accept interface and can then exchange data with the client.
That is, at the user level, a return from accept indicates that the three-way handshake has completed; otherwise accept blocks. In our hypothetical situation, the application is effectively unable to call accept.

To fully understand this scenario, we need to understand two things: what the accept function actually does, and what the three-way handshake really is.

Start with the first point. accept enters the kernel through a system call and eventually calls the TCP function inet_csk_accept, which checks whether the queue contains a socket in the ESTABLISHED state. If it does, that socket is returned; otherwise the calling process blocks. In other words, accept is only a lookup and takes no part in the logic of the three-way handshake.

What, then, is the three-way handshake? It is an exchange between the client and the server carried out through three packets, and those packets are sent and processed entirely in the kernel. When the server's TCP stack receives the SYN packet, it creates a data structure for the pending connection and replies to the client with a SYN-ACK. When it then receives the client's ACK, it changes the state to ESTABLISHED and places the socket on the ready queue. None of this involves the application.

When the socket is added to the ready queue, a process blocked in accept is woken up and can fetch and return the new socket. Looking back, we see that the three-way handshake has already completed before accept returns, which means the connection is already established.

Another question: if accept does not return, can the client send data? The answer is yes.
This is because both sending and receiving are handled in kernel mode. After the client sends data, the server's network card receives it first and raises an interrupt, and the data is then passed up through the IP layer to the TCP layer. The TCP layer stores the data in the buffer of the matching socket, selected by destination port and address. If the application issues a read operation (such as read or recv), the data is copied from the kernel buffer into a user-space buffer. Otherwise the data simply stays in the kernel buffer.

In general, whether the TCP client can send data does not depend on whether the server program is working, at least until the server's receive buffer fills up. If the entire machine is stuck, that is a different matter, and it is the same as the second situation analyzed earlier: the server's TCP stack cannot receive any packets and therefore cannot send any response to the client.

Summary

Today we covered the abnormal situations that can occur while a connection is being established. Other abnormal situations occur during data transmission, for example when the server suddenly loses power or the program crashes in the middle of a transfer. A later article will explain in detail how these situations appear at the protocol layer.
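The third situation analyzed above can also be reproduced on a single machine. The following is a minimal Python sketch, not code from the original article. The function name connect_without_accept is hypothetical, and a listener that simply delays its accept call stands in for the deadlocked application. The client's connect and send both finish before the server side ever calls accept, and the late accept and recv then pick up the data that was waiting in the kernel buffer.

```python
import socket


def connect_without_accept(payload=b"hello"):
    """Connect and send to a listener before its application calls accept().

    Hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
        listener.listen(1)                # from here on the kernel answers SYNs itself
        address = listener.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(3.0)
            client.connect(address)       # handshake completes without accept()
            sent = client.send(payload)   # bytes are queued for the unaccepted socket

            # Only now does the "application" wake up and accept the connection.
            listener.settimeout(3.0)
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(3.0)
                received = conn.recv(len(payload))  # data was held by the kernel

    return sent, received


if __name__ == "__main__":
    print(connect_without_accept())
```

A single thread is enough here precisely because the handshake and the buffering are done by the kernel rather than by the server program.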