A few days ago I posted to my WeChat Moments and saw that the girl I'd had a crush on for a long time had liked the post. I tossed and turned all night and couldn't sleep! Did she have feelings for me? Otherwise, why would she suddenly like my post? Should I take the chance to confess?

So the next day I rehearsed my confession many times in my head, and even practiced my breathing. That night I called her on WeChat. Before she could say a word, I couldn't hold back and poured everything out... It took five minutes, and it all felt so natural! But after I finished, there was no response for a long time... Eventually I heard her voice: "Hello! Hello! The signal is bad here. I didn't hear a word you said just now. I'm shopping with my boyfriend..."

I hung up and did an in-depth post-mortem on my failed confession. The reason: I hadn't learned TCP well! If I had understood TCP, I would at least have asked "Are you there?" before confessing! I would have established a reliable connection first and made sure it worked before starting. If I had understood TCP, I would have had her confirm continually while I spoke, so I could be sure she heard every word. Only then could I have expressed my feelings successfully! It all came down to not learning TCP well, so I went to the library...

Let's first look at the definition of TCP:

TCP stands for Transmission Control Protocol, a connection-oriented, reliable, byte-stream-based transport layer communication protocol. TCP is a transport protocol designed specifically to provide a reliable end-to-end byte stream over an unreliable internetwork.

We know every word here, but put together they are not so easy to understand!
Then let's extract the key terms, the ones I highlighted above: connection-oriented, reliable, byte-stream-based, transport layer, protocol, end-to-end! Understanding these keywords means understanding how TCP works, so let's start the analysis from them!

Transport Layer

Let's talk about the transport layer first, because it lets us look at TCP from a higher level. Let's first look at the classic OSI seven-layer network reference model:

When we exchange data on the Internet, the data passes through these layers, and each layer has its own implementations. TCP, today's topic, is one implementation of the transport layer. When we talk about the transport layer we naturally think of TCP, but it is only one of the transport layer protocols; UDP is another common one.

I know dry text is too abstract, so I'll capture a packet to make these layers concrete! All the packets in this article are sent with Postman and captured with Wireshark. If you don't know these two tools, read up on them first; I won't explain them in detail here. We enter the domain name www.17coding.info in Postman and send a request, and Wireshark captures the packets. The diagram shows the relationship between each layer and the captured packet!

Hey! Didn't we just talk about a 7-layer reference model? Why does the packet only have 5 layers? Note the word "reference". The 7-layer model is a theoretical model. In real networks, the application, presentation, and session layers are usually lumped together as the application layer!

What is a protocol?

A protocol is an agreement that both parties abide by! For example, you can understand every word of this article and what I mean by it.
That's because we both follow the grammar of Chinese, which is itself a protocol. Likewise, when we write code we must follow the prescribed syntax so that the compiler can compile it correctly.

Computer networks have many protocols too: common application layer protocols include HTTP, FTP, and DNS, and common transport layer protocols include TCP and UDP. These protocols are specifications followed by both sender and receiver. If we follow a specification we can implement the protocol ourselves, for example by writing a web server to handle user requests. We can even define our own protocols for others to use!

TCP header format

We defined a protocol above, so TCP must have its own specification too! That is how the communicating parties can recognize each other's packets and exchange data. Let's first look at the TCP segment format:

A TCP segment consists of a header and a data body. The header has 5 fixed-length rows and 1 variable-length row. The first 5 rows in the figure are fixed length, and each occupies 4 bytes (32 bits), so the fixed part of the header is 5*4=20 bytes! Let's capture a packet to reinforce this. We again send a request to www.17coding.info and look at the TCP packet:

Next, let's analyze the TCP header row by row:

First row:
1. Source port: the sender's port.
2. Destination port: the receiver's port.

We mentioned earlier that TCP is end-to-end, and it shows here! Every segment carries the sender's and receiver's ports. Each port occupies 2 bytes (16 bits).

Second and third rows:
1. Sequence number: TCP is byte-stream oriented; data is buffered and sent in blocks. The sequence number gives the position of the segment's first data byte within the whole byte stream.
2. Acknowledgment number: after receiving data, the receiver replies to the sender, telling it how much has been received and from which byte the next segment should start. The value generally equals the received sequence number + the length of the data portion of the received segment.

The sequence number and acknowledgment number are indispensable to TCP's reliability. We will analyze them in detail later through packet captures! Each occupies 4 bytes (32 bits)!

Fourth row:
1. Data offset: better thought of as the header length. As mentioned, part of the TCP header is variable length, so the segment must say where its data portion starts. This value occupies 4 bits and counts in units of 4 bytes.
2. Reserved: unused, kept for future extension. This value occupies 3 bits.
3. Flags: 9 flags in total, 1 bit each, 9 bits altogether. You can see these 9 flags in the capture screenshot above!
3.1. NS: Nonce Sum, related to ECN (Explicit Congestion Notification).
3.2. CWR: Congestion Window Reduced. CWR and the ECE flag below work together with the ECN field of the IP header. When CWR is 1, it tells the other party that the sender has reduced its congestion window.
3.3. ECE: ECN-Echo. When set, it tells the other party that the network path from the other party to this side is congested.
3.4. URG: Urgent, used to jump the queue on the sending side. For example, when downloading a file, if you want to stop halfway, an urgent request is sent to tell the other party to stop sending data; that segment does not wait in the queue.
3.5. ACK: Acknowledgment, marks the segment as carrying a confirmation.
3.6. PSH: Push, the counterpart of URG on the receiving side: the receiver should hand the data to the application right away instead of leaving it queued in the buffer.
3.7. RST: Reset, indicates a serious error; the TCP connection may need to be re-established. If we open a website, it doesn't load, and we press F5 to refresh, the earlier packets will be rejected.
3.8. SYN: Synchronize, used when establishing a connection. The handshake segments carry this flag!
3.9. FIN: Finish, used when releasing the connection. The wave segments carry this flag!
4. Window: Both the sender and the receiver have a send window and a receive window. At the start of communication the two sides tell each other their window sizes. The sender sets its send window according to the receiver's receive window. The send window is also limited by the congestion window, which is covered in the congestion control section! During transmission the window is adjusted according to the receiver's processing capacity. This value plays a big role in TCP's reliable transmission and flow control! This value occupies 16 bits.

Fifth row:
1. Checksum: used to detect whether the segment is incomplete or has been altered. This value occupies 16 bits.
2. Urgent pointer: marks the urgent data in this segment, that is, the data from the start of the data portion up to the indicated position is urgent. It only takes effect when the URG flag is set. This value occupies 16 bits.

Sixth row:
1. Options: the options carry some important data too. Let's go through a few of them.
1.1. MSS: Maximum Segment Size, the maximum data length each segment can carry (excluding the segment header), as negotiated by both parties.
1.2. WS: Window scale, also called the window factor! It is used to enlarge the window size. We've covered the window field, so what is the window factor for? In the early days, bandwidth and hardware were limited, so only 16 bits were reserved for the window size, giving a maximum value of 65535. As hardware and networks developed, 65535 was no longer enough, so the WS option was added as an extension! If WS is set, the actual window size equals the window field multiplied by the window factor, where the factor is 2 raised to the shift count carried in the option.
1.3. SACK: Selective ACK, which builds on cumulative ACK (discussed later)! SACK is only sent when out-of-order data is received. If the receiver gets later segments while an earlier one is missing, it tells the sender which ranges it has received, so the sender can work out which segments were lost and resend only those!
2. Padding: used to make the whole header a multiple of 4 bytes. There are many similar usages in Java!

We find a data packet and look at its detailed header data:
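To make the row-by-row layout concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of a header. The names `parse_tcp_header` and `scaled_window` and all the sample values (ports, sequence number, a shift count of 7) are made up for illustration. Options are not decoded, and the window scaling assumes the usual rule that the 16-bit window field is multiplied by 2 to the power of the shift count.

```python
import struct

# Flag names from the lowest bit to the highest bit of the 9-bit flags field.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Network byte order: two 16-bit ports, two 32-bit numbers, then four
    # 16-bit fields (offset + flags, window, checksum, urgent pointer).
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12   # top 4 bits, counted in 4-byte rows
    flag_bits = offset_flags & 0x01FF  # low 9 bits hold the flags
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": data_offset * 4,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


def scaled_window(window_field: int, shift_count: int) -> int:
    """Actual window size when the window scale (WS) option is in use."""
    return window_field << shift_count


# Hypothetical SYN segment: header length of 5 rows, only the SYN flag set.
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
print(scaled_window(header["window"], 7))
```

Comparing the printed fields with what Wireshark shows for a real packet is a quick way to check that you are reading each row correctly.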
How to understand connection-oriented

As my failed confession shows, I started confessing before making sure the connection worked, so she couldn't hear me over the bad signal. If I had checked the connection first, this wouldn't have happened! We said TCP is connection-oriented, so how does TCP do it?

What does the three-way handshake convey?

That's right, it all starts with a handshake! We all know TCP needs three handshakes to establish a connection. So what does each handshake tell us? Would two be enough? Let's first look at how a phone call starts:

A: Hello, can you hear me?
B: I can hear you, can you hear me?
A: I can hear you too.
……

Before the real call, these three exchanges are often needed to make sure the call is reliable. Are all three necessary? What does each contribute?

A: Hello, can you hear me? (Lets B know that A can speak)
B: I can hear you, can you hear me? (Lets A know that B can hear and speak)
A: I can hear you too. (Lets B know that A can hear)
……

Only after three exchanges can each side be sure that it is heard by the other and can hear the other. Only then can the conversation proper begin. Here we have to bring out the classic three-way handshake diagram:

We analyze the three-way handshake process and the status after each handshake as follows:
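As a rough companion to the diagram, the three segments can be written out in a small Python sketch. It assumes standard TCP behavior, where a SYN uses up one sequence number so each side acknowledges the other's initial sequence number plus 1. The `Segment` class, the function name, and the initial sequence numbers 100 and 300 are invented for the example; real stacks pick their initial sequence numbers themselves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int]   # None when the ACK flag is not set
    sender_state: str    # sender's state right after sending this segment


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments exchanged by A (client) and B (server)."""
    return [
        # 1. A asks to connect and announces its initial sequence number.
        Segment("A", "SYN", client_isn, None, "SYN_SENT"),
        # 2. B confirms A's SYN and announces its own initial sequence number.
        Segment("B", "SYN+ACK", server_isn, client_isn + 1, "SYN_RCVD"),
        # 3. A confirms B's SYN; B becomes ESTABLISHED once this arrives.
        Segment("A", "ACK", client_isn + 1, server_isn + 1, "ESTABLISHED"),
    ]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

The third segment is what two handshakes would be missing: without it, B never learns that its own SYN was heard.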
A few points we should pay attention to here are:
We again send a request to www.17coding.info. The following are the three-way handshake packets:

In the Info column we can clearly see that the packet headers carry the flags mentioned above, header fields such as Seq, Ack, and Win, and header options such as MSS! So the three-way handshake does not just establish the connection, it also negotiates some parameters!

When I select a row with the mouse, if that packet acknowledges another packet (that is, it has the ACK flag), a small check mark appears in the No. column of the acknowledged packet. For example, in the figure above I have selected the third handshake packet, and there is a small check mark in front of the second handshake packet.

Why does a handshake only take three steps but a wave takes four?

Through the three-way handshake the two parties have established a reliable connection and can transmit data! When the transfer is complete the connection must be closed, because a connection is a resource too! Closing the connection takes four waves! Why can the handshake be done in three steps while the wave needs four? Could it be done in three? Actually, there is nothing wrong with that! Consider this conversation:

A: I'm done, hanging up now!
B: OK, I'm done too, you can hang up!
A: OK, bye. Hanging up...

Three exchanges are enough to wave goodbye here. But on a real network, when I send a request, the server's response body may be large and take a long time to transmit! So when the client initiates the disconnect, the server first replies with an acknowledgment, and only sends its own disconnect request after all its data has been sent.

A: I'm done, hanging up now!
B: Okay…
B: …
B: I'm done too, you can hang up now
A: Okay, bye, hanging up…

So in most cases, four waves are required!
However, in my own packet captures I have also seen connections closed with just three waves. Here we have to bring out the classic four-wave diagram again:

We analyze the four waves and the status after each one as follows:
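The two closing patterns can be laid side by side in a short Python sketch. It assumes the standard TCP state names and that A closes first; the three-segment variant assumes B has nothing left to send and so combines its ACK and FIN into one segment, which is one common reason a capture shows only three waves. The function names and the 30-second MSL are illustrative values, not something taken from the captures.

```python
def close_sequence(b_has_pending_data: bool) -> list:
    """Return (sender, flags, A's state, B's state) once each segment has arrived."""
    if b_has_pending_data:
        return [
            ("A", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
            ("B", "ACK", "FIN_WAIT_2", "CLOSE_WAIT"),  # B keeps sending its data
            ("B", "FIN", "TIME_WAIT", "LAST_ACK"),
            ("A", "ACK", "TIME_WAIT", "CLOSED"),
        ]
    # B is already finished, so its ACK and FIN travel in a single segment.
    return [
        ("A", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
        ("B", "FIN+ACK", "TIME_WAIT", "LAST_ACK"),
        ("A", "ACK", "TIME_WAIT", "CLOSED"),
    ]


def time_wait_seconds(msl_seconds: float) -> float:
    """A stays in TIME_WAIT for twice the maximum segment lifetime."""
    return 2 * msl_seconds


for step in close_sequence(b_has_pending_data=True):
    print(step)
print("A leaves TIME_WAIT after", time_wait_seconds(30), "seconds")  # MSL of 30 s is hypothetical
```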
In the figure we can see that A's TIME_WAIT state lasts for 2MSL before becoming CLOSED. MSL (Maximum Segment Lifetime) is the maximum time any segment can exist on the network; a segment that exceeds it is discarded. So what is the purpose of keeping TIME_WAIT for 2MSL?
Let's take a look at the wave packets for a request sent to www.17coding.info:

You may not see the four-wave packets right away when you capture! That's because from HTTP 1.1 on, persistent connections are enabled by default! That is, after a request completes, the connection is not closed immediately but is reused for subsequent requests, saving the cost of re-establishing a connection each time! If you want to capture the four waves right after sending a request, set the HTTP header Connection: close. Then every request shows the complete three-way handshake and four-wave process!

How does TCP ensure reliable transmission?

We have covered connection-oriented; establishing a connection is the first step toward reliable data transmission. So how is data transmitted reliably once the connection is up? Back to our phone call. In a conversation both parties interact and respond to each other. It's not one person talking on and on while the other never responds! For example:

A: Let me tell you, I met a girl online last week.
B: Wow, awesome!
A: Then I made an appointment to meet her yesterday.
B: 666! Then what happened?
A: Then we @#¥%…&
B: Damn, I didn't hear what you just said, can you say it again?
...

This kind of confirmation and response keeps the communication complete and reliable. TCP uses the same mechanism of acknowledgment and retransmission to ensure reliable transmission over unreliable networks. As long as I haven't received a confirmation, I assume the transmission failed and resend.
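The resend-until-confirmed idea fits in a few lines of Python. This is only a toy model under stated assumptions: there is no real network or timer, and losses are scripted through a hypothetical `lost_attempts` set of (packet index, attempt number) pairs so that the run is repeatable. The packet names M1 to M3 are made up.

```python
def send_reliably(packets, lost_attempts):
    """Send packets one at a time, resending each until it is acknowledged.

    lost_attempts holds (packet_index, attempt_number) pairs for which either
    the packet or its confirmation is lost, so the sender times out.
    """
    delivered = []
    log = []
    for index, packet in enumerate(packets):
        attempt = 1
        # No confirmation means "assume it failed": keep resending.
        while (index, attempt) in lost_attempts:
            log.append(f"{packet}: attempt {attempt} timed out, resending")
            attempt += 1
        log.append(f"{packet}: attempt {attempt} acknowledged")
        delivered.append(packet)
    return delivered, log


# The second packet is lost twice before it finally gets through.
delivered, log = send_reliably(["M1", "M2", "M3"], {(1, 1), (1, 2)})
for line in log:
    print(line)
```

Every packet is delivered in order no matter how the losses are scripted, at the cost of the sender sitting idle while it waits.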
Stop-and-wait protocol

Stop-and-wait means that after sending each packet, the sender waits for the other party's response before sending the next one! The following situations can occur:

① Error-free: A sends packet M1 to B. B receives it and sends A a confirmation. When A receives the confirmation, it sends packet M2.
② Timeout retransmission: A sends packet M1 to B. If the packet is lost in transit, no confirmation arrives and A resends it after a timeout. The timeout is slightly longer than one round-trip time (RTT).
③ Lost confirmation: If B's confirmation is lost on its way to A, A resends M1. Since B has already processed M1, B discards the duplicate and resends the confirmation for M1.
④ Late confirmation: A sends M1 to B and B's confirmation is delayed, so A resends M1. B discards the duplicate and resends the confirmation for M1. A now receives more than one confirmation, and discards the late one when it arrives.

From the above, stop-and-wait only sends the next packet after the confirmation arrives. As long as I haven't received your confirmation, I assume you didn't get my packet and resend it! This is reliable, but channel utilization is low!

Pipeline transmission

Pipelining means sending several packets in a row without stopping to wait for confirmation after each one. Since data keeps flowing on the channel, utilization is much higher! How does pipelined transmission stay reliable?
The sender maintains a send window. If the send window is 5, it sends 5 packets back to back and then waits for confirmation! When the sender receives a confirmation, the window slides and the sixth packet is sent. Confirming packets one at a time can be inefficient, hence cumulative confirmation! That is, if the sender sends packets 1, 2, 3, and 4, the receiver only needs to acknowledge packet 4, which means packets 1 through 4 have all arrived and the fifth can be sent!

If packets 1, 2, 3, and 4 are sent and packet 3 is lost, how is that confirmed? TCP acknowledges only up to packet 2 and adds a selective acknowledgment (the SACK mentioned under the TCP header options) for packet 4, so the sender knows packet 4 arrived and only packet 3 needs to be resent.

Continuing the earlier capture, the receiver does not confirm every packet but confirms several cumulatively:

Here we can see that the client confirmed only once after the server sent multiple packets.

Flow Control and Congestion Control

As we saw above, establishing a connection and confirming data make TCP transmission reliable! But everyone's computer has different processing power. What if I send too fast for the other side to handle? How do the two parties coordinate the rate of sending and receiving?

Byte-based sliding window

When introducing the TCP header we mentioned the sliding window and its control parameter Win! We also mentioned the receive window and the send window! So how are they related? Suppose A needs to send data to B. B must first tell A how big its receive window is.
A sets its send window based on B's receive window! A's send window cannot be larger than B's receive window! Before data transfer starts, the initial windows are set as follows:

As the figure above shows, if B's receive window is 10 bytes, A's send window cannot exceed 10 bytes! When transmission starts, A packs the data into several packets, as shown below:

A's window does not slide until B's confirmation arrives, which means A can send at most 10 bytes of data. When B receives data and replies to A, A's window slides, as shown below:

Now A can send bytes 11 and 12! If B's processing capacity drops, B can tell A to shrink its send window! This coordinates both parties' sending and receiving capacity very well, and it is how TCP achieves reliable transmission and flow control!

Suppose the packets above keep being sent, and the packet carrying bytes 3, 4, and 5 is lost while the data after it arrives. Will A's send window move? No, it won't. When B receives the later packets, the ACK it sends to A is set to 3, with a SACK option (described under the TCP header options) telling A which data has been received, so A knows which data must be resent!

Congestion Control

The sliding window coordinates the sending and receiving capacity of the two parties well. But network conditions are complex, and there may be thousands of senders and receivers on the same network! If everyone transmits at will without proper control, the whole network becomes congested or even paralyzed. If I want to drive from Shenzhen to Guangzhou, I take the expressway.
If I'm the only one driving, the trip is smooth! But the expressway isn't mine; everyone can use it! So when holidays come everyone rushes onto it, yet the expressway's capacity does not grow just because it's a holiday! Traffic restrictions, flow control, and similar measures are then needed to keep traffic moving!
The network is like the expressway, the packets are like the vehicles, and TCP is like a traffic policeman keeping data transmission in order! So how does TCP do it?

Slow Start and Congestion Avoidance

The sender maintains a cwnd (congestion window; note that the amount of data actually sent is capped by both the congestion window and the send window mentioned earlier!). The congestion window starts at 1. If no packet is lost, it is raised to 2! Still no loss, and it is raised to 4! It doubles each round until it reaches 16! Then it grows by one each round, to 17, 18, 19, until it reaches the size of the send window. This is slow start followed by congestion avoidance, and 16 is the slow start threshold... Feels like inching forward, doesn't it? Just a little further… just a little further…

If packet loss is detected during sending, the congestion window is reset to 1 and the new slow start threshold is set to half of the congestion window at the time of the loss. That is, if loss occurs when the congestion window is 24, the new slow start threshold becomes 12! If you followed the description above, the following picture is not hard to understand:

Fast Retransmission

I mentioned cumulative and selective confirmation earlier. They are related to fast retransmission! If the receiver notices that a packet is missing, it does not wait to send a cumulative confirmation. Instead it sends a confirmation for every later packet that arrives, each repeating the same ACK, to prompt the sender to resend the lost packet. When the sender receives three duplicate confirmations, it realizes the packet is lost and retransmits! In the figure below, once the loss occurs the receiver's ACK stays at 50, while SACK selectively confirms bytes 60 to 89!
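The ACK=50 / SACK 60 to 89 pattern can be reproduced with a small Python sketch. The 10-byte segments after the gap, the function names, and the simplification that the byte stream starts at 0 are assumptions made for the example; real TCP also limits how many SACK blocks fit in the options.

```python
def ack_and_sack(received):
    """Return (next byte expected, out-of-order ranges) for inclusive byte ranges."""
    ack = 0
    sack = []
    for start, end in sorted(received):
        if start <= ack:
            ack = max(ack, end + 1)                   # extends the in-order data
        elif sack and start <= sack[-1][1] + 1:
            sack[-1] = (sack[-1][0], max(sack[-1][1], end))  # grows the last block
        else:
            sack.append((start, end))                 # new block after a gap
    return ack, sack


def fast_retransmit_range(acks):
    """Return the byte range to resend after three duplicate ACKs, else None."""
    last_ack, duplicates = None, 0
    for ack, sack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == 3 and sack:
                return (ack, sack[0][0] - 1)          # the hole before the first block
        else:
            last_ack, duplicates = ack, 0
    return None


# Bytes 0-49 arrive, 50-59 are lost, then three later segments arrive.
arrivals = [(0, 49), (60, 69), (70, 79), (80, 89)]
received, acks = [], []
for segment in arrivals:
    received.append(segment)
    acks.append(ack_and_sack(received))
print(acks)
print(fast_retransmit_range(acks))
```

The sender never waits for a timeout here: the third repeated ACK is enough to locate the hole.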
At this point, the sender knows that bytes 50 to 59 were lost and retransmits them!

Fast recovery

If the congestion window dropped to 1 every time a packet was lost, that would be too crude. A fast recovery mechanism would be great, and TCP has one! When packet loss is detected through duplicate confirmations, slow start is not performed again; TCP switches directly to congestion avoidance! That is, the congestion window continues from the new slow start threshold and grows by one each round from there.

After reading the full text, let's go back to the definition of TCP. Do you understand it better now?

TCP stands for Transmission Control Protocol, a connection-oriented, reliable, byte-stream-based transport layer communication protocol. TCP is a transport protocol designed specifically to provide a reliable end-to-end byte stream over an unreliable internetwork.

Author: sullivan06
Editor: Tao Jialong
Source: Reprinted from the public account 17coding technical blog. 17coding.info is a public account built around diligent recording and sharing, used to record what is learned day to day and to share programming experience.