Network transmission touches almost everything we do in daily development. This article summarizes some key points of TCP. Although plenty of infrastructure (frameworks, components) shields developers from these details, I still think it helps to understand the basic principles. When you run into a difficult problem in a distributed environment, that knowledge may help you find the answer quickly.
1. Origin
TCP is a transport layer protocol; its full name is Transmission Control Protocol. The protocol is defined in RFC 793, published in 1981. Before networking, computers were independent of each other: each machine had its own operating system and ran on its own. TCP was designed to connect these computers so that data and resources could be exchanged over a shared "channel", and its specification is maintained today by the IETF.
So, what is the IETF? It is the Internet Engineering Task Force, a respected and open technical organization that first met in 1986. The important network protocols we rely on today, such as HTTP, TCP, and IP, are standardized and maintained through it. It is fair to say that the prosperous Internet of today would not exist without this standards work. It is worth mentioning that the IETF is not an organization with formal authority. It is a self-organized and self-managed community "from the people" that highly values freedom and equality.
The underlying mechanism of the entire Internet is built from a set of standard network protocols. To make them easier to understand, people have defined the so-called "network layered model". Computer network courses usually present two such models: the seven-layer OSI reference model and the four-layer TCP/IP model.
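Many people used to confuse OSI with ISO because the names are so similar: OSI (Open Systems Interconnection) is the model, while ISO (International Organization for Standardization) is the body that published it.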
As can be seen from the figure above, the TCP/IP model is basically a simplified version of the OSI model, and it is easier to understand. Below the network layer, the technologies and concepts of the physical and data link layers are relatively obscure: optical cables, repeaters, switches, and so on take some professional background to understand fully. For most software applications, it is simpler to treat everything below the network layer as a single "network interface layer". So although the OSI model is complete and comprehensive, in practice it has been displaced by the TCP/IP model and is rarely mentioned now that Internet applications are everywhere.
Figure - TCP/IP Network Model
2. TCP Protocol
TCP is the most important transport layer protocol in the TCP/IP protocol suite. It defines a connection-oriented, reliable, stream-based transmission method. HTTP has traditionally run on top of TCP, so it is no exaggeration to say that TCP is one of the cornerstone protocols of the entire Internet. When we use HTTP for interaction between application systems, we are therefore often dealing with TCP as well, so it is worth understanding some of its basic mechanisms.
1. What are the characteristics of TCP?
This is completely different from a message-based protocol such as UDP. Stream-based transmission also guarantees that data is received in the order it was sent, so every segment carries a sequence number that belongs to the current connection.
2. How to understand full-duplex?
Full-duplex is a communications term that is not often mentioned in software development. It means that data can be transmitted in both directions at the same time, and TCP is a full-duplex, reliable transport protocol. UDP can also transmit in both directions, but unlike UDP, TCP is strictly point-to-point and does not support broadcast or multicast.
Blackboard: The difference from half-duplex is that half-duplex can transmit in only one direction at a time.
3. How are TCP packets organized?
The most direct way to see through a protocol is to look at its data packets. A TCP message header contains the following fields:
(1) Source port: the sender's port number, which the target host uses when responding.
(2) Destination port: the port number on the target host to connect to.
(3) Sequence number: the position in the byte stream of the first data byte in this segment, which is the previous segment's sequence number plus the number of bytes that segment carried. For the first packet of a TCP connection (the SYN packet), the value is a randomly generated initial sequence number.
(4) Acknowledgement number: indicates the data that the local TCP has received. Its value is the sequence number of the next byte expected from the other end, which tells the other party that every byte up to this number minus 1 has been received correctly. For the first packet of a TCP connection (the SYN packet), the acknowledgement number is generally 0.
(5) Data offset: the total length of the TCP header in units of 32 bits (4 bytes), used to locate the start of the user data area.
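Without options, the TCP header is 20 bytes, which corresponds to a data offset value of 5.
(6) Flags:
Urgent flag (URG): when set, the segment carries urgent data that should be processed first.
Acknowledgement flag (ACK): when set, the acknowledgement number field is valid.
Push flag (PSH): when set, the receiver should hand the data to the application promptly instead of buffering it.
Reset flag (RST): when set, the connection is terminated immediately.
Synchronize flag (SYN): when set, the segment requests a new connection and carries the initial sequence number.
Finish flag (FIN): when set, the sender has no more data to send.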
(7) Window size: the number of bytes the receiver is currently willing to accept, used for flow control.
(8) Checksum: covers the TCP header and the data area.
(9) Urgent pointer: when URG is set, indicates where the urgent data ends within the segment.
(10) Options (variable length): carry optional parameters such as the maximum segment size (MSS).
(11) Padding: ensures that the options end on a 32-bit boundary.
Blackboard: The TCP header is normally 20 bytes. Together with the 20-byte IPv4 header, a packet carries at least 40 bytes of headers.
3. TCP workflow
Strictly speaking, a "link" is a physical-layer concept, such as an optical cable or radio waves. The link mentioned here, however, means a network connection, a concept above the IP layer. A normal TCP communication process consists of link establishment (establishing a connection), data transmission, and link teardown (closing the connection), as shown in the following figure:
(Picture from the Internet)
As shown in the figure above, whenever TCP is used for data transmission, it inevitably goes through two stages besides the transfer itself: establishing the link and tearing it down.
Next, we focus on link establishment and link teardown.
4. Three-way handshake
Establishing a TCP connection takes three interactions, known as the three-way handshake.
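To make the three interactions concrete, here is a minimal sketch that packs the three handshake segments using the 20-byte header layout described earlier and then decodes them again. It is an illustration only: the helper names (`pack_header`, `unpack_header`, `three_way_handshake`), the port numbers, and the initial sequence numbers are assumptions made for this example, the checksum is left at 0, and nothing is sent on the network.

```python
import struct

# Flag bits, stored in the low 6 bits of the 16-bit "data offset + flags" word.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

# Ports (2 x 16 bit), seq and ack (2 x 32 bit), offset/flags, window,
# checksum, urgent pointer (4 x 16 bit): 20 bytes in network byte order.
HEADER = struct.Struct("!HHIIHHHH")


def pack_header(src_port, dst_port, seq, ack, flags, window=65535):
    """Pack a 20-byte TCP header without options (data offset = 5)."""
    offset_and_flags = (5 << 12) | flags
    # Checksum and urgent pointer are left at 0 in this sketch.
    return HEADER.pack(src_port, dst_port, seq, ack,
                       offset_and_flags, window, 0, 0)


def unpack_header(raw):
    """Decode the fixed part of a TCP header into a dict."""
    src, dst, seq, ack, off_flags, window, _checksum, _urgent = \
        HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset is in 4-byte units
        "flags": off_flags & 0x3F,
        "window": window,
    }


def three_way_handshake(client_isn, server_isn,
                        client_port=50000, server_port=80):
    """Return the decoded SYN, SYN+ACK and ACK segments of a handshake."""
    wrap = 2 ** 32  # sequence numbers wrap around at 32 bits
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = pack_header(client_port, server_port, client_isn, 0, SYN)
    # 2. Server -> client: SYN+ACK. A SYN consumes one sequence number,
    #    so the acknowledgement number is client_isn + 1.
    syn_ack = pack_header(server_port, client_port, server_isn,
                          (client_isn + 1) % wrap, SYN | ACK)
    # 3. Client -> server: ACK that acknowledges the server's SYN.
    ack = pack_header(client_port, server_port, (client_isn + 1) % wrap,
                      (server_isn + 1) % wrap, ACK)
    return [unpack_header(segment) for segment in (syn, syn_ack, ack)]
```

Calling `three_way_handshake(1000, 5000)` returns three dicts whose acknowledgement numbers are 0, 1001, and 5001, which is the "next byte expected" rule applied to each side's SYN.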
When talking about the three-way handshake, several issues need attention:
Question 1. Why is it a three-way handshake?
This question comes up all the time in technical interviews. The original form is: could we shake hands twice, or four times? The answer is that TCP is a reliable transport, so both ends must confirm each other when establishing a connection, as in the process above. Only with three handshakes have both the client and the server had their own SYN acknowledged by the other side, and only then is the connection considered trustworthy. In addition, with only two handshakes, a SYN that is retransmitted or delayed by an unstable network would directly cause duplicate connections to be established, wasting resources.
Question 2. What is a SYN flood attack?
SYN flood is a classic DDoS attack that exploits a weakness in the TCP three-way handshake. In the figure above, you can see that when the server receives a SYN, it enters the SYN-RECV state. The connection at this point is called a semi-connection (half-open connection), and the server writes it into a semi-connection queue. If an attacker sends a large number of SYN packets in a short time without ever completing the handshake, the server's semi-connection queue fills up quickly and the server can no longer accept new connections.
A SYN flood can be implemented by forging the source IP so that the server's response never reaches a real client (the handshake cannot complete). The same effect can be achieved with firewall rules on the attacking client that discard the server's responses.
SYN floods are difficult to block. Enabling SYN cookies (tcp_syncookies) mitigates them, but this is usually not the best solution. The best approach is a professional firewall, which basically all the large cloud providers offer.
Question 3. How to optimize the semi-connection queue and the full connection queue
We have mentioned the "semi-connection queue" (SYN queue); there is also a corresponding "full connection queue" (accept queue). The former temporarily stores connections whose handshake has not completed, and the latter holds connections that have been established successfully and are waiting for accept. The default size of the semi-connection queue can be adjusted through the kernel parameter net.ipv4.tcp_max_syn_backlog.
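Before changing anything, it is worth checking the current values. The sketch below assumes a Linux host, where kernel parameters are exposed as files under /proc/sys; the helper name `read_sysctl` and its `root` argument are made up for this example.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return a kernel parameter such as 'net.ipv4.tcp_max_syn_backlog'.

    The dotted name is mapped to a path below `root`. Returns None when the
    file cannot be read (for example on a non-Linux system).
    """
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in ("net.ipv4.tcp_max_syn_backlog", "net.ipv4.tcp_syncookies"):
        print(name, "=", read_sysctl(name))
```

Changing a value is normally done as root with the sysctl tool, for example `sysctl -w net.ipv4.tcp_max_syn_backlog=1024`; the number here is only a placeholder, not a recommendation.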
Blackboard: tcp_max_syn_backlog is ignored when tcp_syncookies is enabled, so the two options do not work together.
For the full connection queue, if the server does not remove connections from it in time through the accept call, the queue overflows and new connections fail. The size of the full connection queue is tuned through the kernel parameter net.core.somaxconn.
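So, is kernel tuning the only way to influence these two queues? The answer is no. When the application layer calls listen on a socket, it can also pass a backlog parameter. On Linux, the effective length of the full connection queue is the smaller of that backlog and net.core.somaxconn. The semi-connection queue is likewise bounded by the backlog together with tcp_max_syn_backlog, with details that vary by kernel version.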
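As a rough illustration of how the application-level backlog and the kernel limit interact, the sketch below assumes the commonly documented Linux behaviour that the backlog passed to listen is silently capped at net.core.somaxconn. The function names are invented for this example, and the semi-connection queue is deliberately left out because its sizing differs between kernel versions.

```python
import socket


def effective_accept_queue(backlog, somaxconn):
    """Accept-queue limit under the assumption stated above:
    the smaller of the listen() backlog and net.core.somaxconn."""
    return min(backlog, somaxconn)


def open_listener(host="127.0.0.1", port=0, backlog=128):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(backlog)  # the backlog requested by the application
    return srv
```

With somaxconn at 128, for instance, `effective_accept_queue(1024, 128)` is 128: asking for a larger backlog in the application has no effect until the kernel parameter is raised as well.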
Blackboard: Common application servers such as Netty and Tomcat support setting the backlog parameter, but in actual tuning you also need to take the kernel parameters into account.
5. Four Waves
When a connection is released, both ends must close their own direction separately because TCP is full-duplex, which normally takes four segments.
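The exchange can be sketched as a small state table. This is a simplified walk through the standard TCP state names for the case where the client closes first; the table and the `walk_close` helper are illustrative constructs for this article, not part of any library.

```python
# Each row: (sender, segment, sender's state after sending,
#            receiver's state after receiving)
FOUR_WAY_CLOSE = [
    ("client", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("server", "ACK", "CLOSE_WAIT", "FIN_WAIT_2"),
    ("server", "FIN", "LAST_ACK", "TIME_WAIT"),
    ("client", "ACK", "TIME_WAIT", "CLOSED"),
]


def walk_close(steps=FOUR_WAY_CLOSE):
    """Replay the four segments and record (segment, client, server) states."""
    state = {"client": "ESTABLISHED", "server": "ESTABLISHED"}
    history = []
    for sender, segment, sender_state, receiver_state in steps:
        receiver = "server" if sender == "client" else "client"
        state[sender] = sender_state
        state[receiver] = receiver_state
        history.append((segment, state["client"], state["server"]))
    return history


if __name__ == "__main__":
    for segment, client, server in walk_close():
        print(f"{segment:<4} client={client:<11} server={server}")
```

After the last ACK the server is CLOSED while the client is still in TIME_WAIT; the client only reaches CLOSED after waiting out the TIME_WAIT period.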
There are two ways to close a connection: active closing and passive closing. To simplify the explanation, we take the client as the active closing party and the server as the passive closing party. Issues that need attention during the four waves:
Question 1. Why four waves?
The party that sends the first FIN is the active closer (the client), and the other party is the passive closer (the server). Sending a FIN means that this side will send no more data. When the passive closer receives that FIN, it may still have data of its own to send, so it cannot always send its FIN immediately (that is, it cannot combine its FIN with the ACK). Instead, it acknowledges first, finishes sending its data, and then sends a separate FIN, so the whole process takes four interactions.
Question 2. What is half-closed?
After receiving the ACK for its first FIN, the client enters the FIN_WAIT_2 state while the server is in the CLOSE_WAIT state. This is called half-closed. Going from half-closed to fully closed requires the second FIN and its acknowledgement: the client must wait for the server's FIN before it can enter TIME_WAIT. If the other party does not send a FIN for a long time, the wait eventually times out. On Linux this is controlled by the kernel parameter tcp_fin_timeout, which defaults to 60s.
Question 3. Why does the server have a large number of CLOSE_WAIT connections?
A server connection in the half-closed state stays in CLOSE_WAIT until the server sends its FIN, and at the application layer it is the call to socket.close() that sends that FIN. If the server has a large number of connections in the CLOSE_WAIT state, the usual cause is therefore that the application is not calling close() on them, for example because of a leak in the code or because the handling threads are blocked or overloaded.
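The half-closed state, and the role close() plays in leaving CLOSE_WAIT, can be observed from application code. The sketch below is a loopback experiment under simple assumptions (one client, one server thread, a made-up function name `half_close_demo`): the client sends its FIN with shutdown(SHUT_WR) and still receives data that the server sends afterwards.

```python
import socket
import threading


def half_close_demo():
    """Client half-closes; the server replies after seeing the client's FIN."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        with conn:
            received = b""
            # recv() returns b"" once the client's FIN has arrived.
            while chunk := conn.recv(1024):
                received += chunk
            # The server side is now in CLOSE_WAIT but can still send.
            conn.sendall(b"echo:" + received)
        # Leaving the with-block calls close(), which sends the server's FIN.

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as cli:
        cli.sendall(b"hello")
        cli.shutdown(socket.SHUT_WR)  # send FIN, keep the read side open
        reply = b""
        while chunk := cli.recv(1024):  # ends when the server's FIN arrives
            reply += chunk

    worker.join()
    srv.close()
    return reply
```

If the server code never closed `conn`, its side of the connection would remain in CLOSE_WAIT and the client's final recv loop would not see the end of the stream.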
Question 4. What problems does TIME_WAIT bring?
When the client receives the FIN from the other party, it enters the TIME_WAIT state and stays there for a while before entering the CLOSED state. The main reason is to close the connection reliably. TCP's design for reliability takes many kinds of network instability into account. For example, the ACK sent to the other party may not arrive in time, in which case the other party retransmits its FIN. If the client had already entered CLOSED, it would answer with an RST instead of an ACK, which would disrupt the closing process. TIME_WAIT therefore lasts for a period of time by default, and the connection is closed safely once it is certain that no more retransmitted packets will arrive.
Blackboard: The default duration of TIME_WAIT here is 2*MSL (1 minute in total). MSL stands for Maximum Segment Lifetime, the preset maximum time a segment may survive in the network. The default MSL here is 30s, although on today's networks this value could be much smaller, which shows how poor network conditions were when TCP was first designed.
So what problems does TIME_WAIT bring? If you frequently close connections actively, a large number of TIME_WAIT connections may accumulate. Each one occupies a handle and a small amount of memory (4K), so they may affect the establishment of other connections; for example, a client that runs out of local ports cannot open new outbound connections.
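Before tuning anything, it helps to measure how many connections are actually in TIME_WAIT or CLOSE_WAIT. The following sketch assumes a Linux host, where /proc/net/tcp lists IPv4 sockets with a hexadecimal state code in the fourth column; the function name `count_tcp_states` is chosen for this example.

```python
from collections import Counter

# State codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    """Count sockets per TCP state from lines in /proc/net/tcp format."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything that does not look like a socket.
        if len(fields) < 4 or fields[3] not in TCP_STATES:
            continue
        counts[TCP_STATES[fields[3]]] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            print(dict(count_tcp_states(f)))
    except OSError:
        print("/proc/net/tcp is not available on this system")
```

IPv6 sockets are listed separately in /proc/net/tcp6, which uses the same column layout.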
How to solve it:
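One mitigation, and only one of several, is to reuse connections instead of opening and actively closing one per request. The sketch below is a local experiment built on Python's standard library HTTP server and client; the handler class and the `local_ports_used` helper are made up for this example. It sends several requests over one persistent connection and records which local port each request used.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows persistent connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # A Content-Length is needed so the client knows where the body ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def local_ports_used(requests=3):
    """Send several requests over one connection; return the local ports used."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = set()
    try:
        for _ in range(requests):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before the next request
            ports.add(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports
```

All requests share a single local port, so only one connection ends up in TIME_WAIT when it is finally closed, instead of one per request.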
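Blackboard: The TIME_WAIT problem showed up early with HTTP, which is why HTTP 1.1 defines Keep-Alive (persistent connections) to support connection reuse.
Question 5. What is RST and why does it appear?
RST is a special flag that indicates the connection should be terminated immediately. Typical situations that generate an RST include a SYN arriving at a port with no listener, an application aborting a connection instead of closing it normally, and a segment arriving for a connection that no longer exists.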
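One case that is easy to reproduce locally is connecting to a port where nothing is listening. In the sketch below (the function name `probe` and its return labels are choices made for this example), the operating system reports the peer's RST to the application as a "connection refused" error.

```python
import socket


def probe(host, port, timeout=1.0):
    """Try a full TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # the handshake completed
    except ConnectionRefusedError:
        return "closed"  # the peer answered the SYN with an RST
    except OSError:
        return "filtered"  # timeout or another network error
```

Only probe hosts you are allowed to test; against 127.0.0.1 this is a harmless way to see the difference between a listening port and a closed one.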
The RST mechanism is sometimes used to perform port scans, as follows:
-> Port is open: the SYN is accepted
-> Port is closed: the host responds with RST
Summary
The original plan for this article was only to summarize some details of TCP parameter tuning. I did not expect TCP to involve so much: even the simple handshake and wave processes hide many details and pitfalls. To guarantee reliable data transmission, the early designers really did have to consider a great many things, and that work paved the way for the upper-layer applications built on top.