TCP

This article is reprinted from the WeChat public account "sowhat1412", author SoWhat1412. Please contact the public account sowhat1412 to reprint this article.

3 Transport layer TCP/UDP

Continuing from the HTTP above, data goes to the transport layer after passing through the application layer, but before the data reaches the transport layer, it is necessary to obtain the IP address of the server, which involves DNS domain name resolution.

3.1 DNS

3.1.1 DNS Explanation

The real address of a host is its IP address. The problem is that IP addresses are hard for people to remember. It's like calling Zhang San on your mobile phone: can you recall Zhang San's number instantly? You simply store a mapping between the name and the number in your contacts, and when you want to call, you look up Zhang San to find the number. Network requests need the same kind of mapping, and the Domain Name System (DNS) provides it. Before going deeper into DNS, you should first understand domain names.

Every address we enter in the browser address bar is a domain name, such as www.baidu.com. A domain name is made up of labels at different levels separated by dots. The root domain name, the trailing dot, is usually omitted, so we write www.baidu.com rather than spelling out the root. Reading from left to right, the labels go from the most specific to the most general (www → baidu → com → root).

DNS Hierarchy

From top to bottom, domain names nest inside one another level by level. The root domain name servers are the key: their addresses must be well known, because only by reaching them can the lower-level domain name servers be found; otherwise domain name resolution cannot even start. Let's look at the DNS resolution process for requesting www.baidu.com:

  1. First, visit the root domain name server. The root domain name does not do domain name resolution, it is just for giving you directions. Now you get the address of the com top-level domain name server.
  2. Request the com top-level domain name server and return the address of the baidu.com domain name server.
  3. Then request the baidu.com domain name server and return the address of www.baidu.com.

This scheme works, but the problem is that there are billions of devices in the world. If every computer went through the full process above for every request, the core DNS resolution system would be overwhelmed instantly. The solution is caching. Many large companies and operators build their own DNS servers that query the core DNS system on behalf of users and cache the results. When the same query arrives again and the cached record has not expired, the cached result is returned directly. The well-known Google 8.8.8.8 resolver is such a non-authoritative domain name server built by Google. Besides non-authoritative servers, there are also browser caches and operating system caches, such as the /etc/hosts file.

3.1.2 DNS Example

DNS domain name resolution

  1. When the user enters a URL, check whether the browser's DNS cache has expired. If not, use it directly; if it has expired, check the local operating system cache, such as the /etc/hosts file.
  2. Otherwise, query the locally configured non-authoritative domain name server (the DNS resolver).
  3. The DNS resolver queries a root name server, which returns the address of the com top-level domain name server.
  4. The DNS resolver queries the com top-level domain name server, which returns the authoritative name server responsible for baidu.com.
  5. The DNS resolver queries that authoritative name server, which returns the IP address of the target domain name.
  6. The DNS resolver finally returns the target IP address to the user, who can then proceed with the actual request (a minimal lookup sketch follows).
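In application code this whole chain is normally hidden behind a single resolver call. A minimal sketch using Python's standard library (www.baidu.com is just an example host; the addresses printed depend on your resolver and its caches):

```python
import socket

def resolve(hostname):
    """Ask the system resolver (hosts file, caches, configured DNS servers)
    for the IP addresses of hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    print(resolve("www.baidu.com"))   # e.g. a list of IPv4/IPv6 addresses
```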

3.2 TCP

3.2.1 TCP header explanation

TCP is a connection-oriented, reliable, byte-stream-based data transmission service that works at the transport layer. Using TCP to transmit data ensures that the network packets received by the receiving end are undamaged, gap-free, non-redundant, and in order. It should be noted that TCP is a one-to-one connection.

TCP header + HTTP

  1. Source port: a 16-bit number, usually greater than 1023, chosen more or less at random by the sending process.
  2. Destination port: Indicates the port number used by the receiver, usually specified by the application.
  3. Sequence number: When establishing a connection, the client generates a random number as the initial value, which is transmitted to the receiving host through the SYN packet. Each time data is sent, it is accumulated. When the sequence number reaches the maximum value, the sequence number will wrap around and start again from 0. The core function is to remove duplicate data and receive in order at the receiving end.
  4. Confirmation number: It is used to solve the problem of packet loss and specify the sequence number of the data that you want to receive next time. After the sender receives this confirmation response, it can be considered that the data before this sequence number has been received normally.
  5. Data offset: indicates the header length of the TCP segment, i.e. how far the start of the data is from the start of the segment. The field is 4 bits, so its maximum value is 15, and its unit is 4 bytes; the TCP header can therefore be at most 15 * 4 = 60 bytes. The field is needed because the TCP header contains a variable-length options part.
  6. Reserved: 6 bits are reserved, not used, and should be set to zero.

The following 7 to 12 are control bits, which are used to indicate the nature of the message segment.

7. URG: Indicates whether the data sent in this message segment contains urgent data. The urgent pointer field is only valid when URG=1.

8. ACK: Indicates whether the previous confirmation number field is valid. The previous confirmation number field is valid only when ACK=1. TCP stipulates that ACK=1 after the connection is established. The TCP segment with the ACK flag is called an acknowledgment segment.

9. PSH: Prompts the receiving end to read data from the TCP receive buffer immediately to make room for receiving subsequent data. A value of 1 indicates that the other party should submit the data to the upper layer application immediately. If the application does not read the received data, it will remain in the TCP receive buffer.

10. RST: Receiving a message with RST=1 indicates that a serious error has occurred in the connection with the host and the connection must be released and then re-established. Or it indicates that there is a problem with the data sent to the host last time and the host refuses to respond. The TCP segment with the RST flag is called a reset segment.

11. SYN: used to synchronize sequence numbers when establishing a connection. SYN=1 indicates that this is a message requesting or agreeing to establish a connection. SYN is set to 1 only in the first two handshakes. The TCP message segment with the SYN flag is called a synchronization message segment.

  • When SYN=1 and ACK=0, it means this is a message segment requesting to establish a connection.
  • When SYN=1 and ACK=1, it means the other party agrees to establish a connection.

12. FIN: Notifies the other end that the connection is to be closed and marks that the sender has finished sending data. FIN=1 tells the other end that the connection can be released. The TCP segment with the FIN flag is called an end (finish) segment.

  13. Window size: Tells the other party how much data it may send, starting from the acknowledgment number of this segment. Once that amount of data is outstanding, an ACK must be received before further data can be sent. This is the basis of flow control.

  14. Checksum: Covers the TCP header and data (plus a pseudo-header), allowing the receiver to detect segments corrupted in transit.

  15. Urgent pointer: Valid only when URG=1; it is an offset into the data field that marks the end of the urgent data.

16. Option part: The maximum length of the option part can be calculated based on the length of the TCP header. The TCP header length is represented by 4 bits, and the maximum length of the option part is: (2^4-1)*4-20=40 bytes.

The best-known option is the maximum segment size (MSS), which is typically 1460 bytes over Ethernet. The length of the entire TCP segment = the length of the data field + the length of the TCP header.

17. Padding: It should be noted here that for the convenience of network device hardware design and processing, the header length must be an integer multiple of 4 bytes during data transmission.
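To make the field layout above concrete, here is a hedged sketch that unpacks the 20-byte fixed part of a TCP header with Python's struct module; the sample bytes at the bottom are fabricated purely for illustration:

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the 20-byte fixed part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = (offset_flags >> 12) & 0xF      # header length in 4-byte units
    flags = offset_flags & 0x3F                   # URG ACK PSH RST SYN FIN
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len_bytes": data_offset * 4,      # at most 15 * 4 = 60 bytes
        "URG": bool(flags & 0x20), "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08), "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02), "FIN": bool(flags & 0x01),
        "window": window, "checksum": checksum, "urgent_pointer": urg_ptr,
    }

# A made-up SYN segment: source port 50000 -> destination port 80, data offset 5 words.
sample = struct.pack("!HHIIHHHH", 50000, 80, 123456, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```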

3.2.2 TCP three-way handshake

TCP three-way handshake

  1. At the beginning, both the client and the server are in the CLOSED state. The server then starts listening on a port and enters the LISTEN state.
  2. The client randomly initializes its sequence number seq = client_isn and sets SYN = 1 to indicate that this is a SYN segment, then sends it to the server. Note that this segment carries no application-layer data; the client is now in the SYN-SENT state.
  3. After receiving the SYN message from the client, the server also randomly initializes the sequence number seq = server_isn, and sets the confirmation number ack = client_isn + 1, then sets SYN = 1 and ACK = 1, and then sends the message to the client. The server is in the syn-rcvd state.
  4. After receiving the message from the server, the client sets ACK = 1, confirms the response number ack = server_isn + 1, and then sends the message to the server. This message can send data, and the client is in the established state.
  5. After receiving the response message from the client, the server also enters the established state.
  6. The client and server have established a connection and can send data to each other.

Here you may notice that the client and server initialize their sequence numbers randomly and independently. The reason is that segments in the network may be retransmitted, delayed, or lost, and old segments must not be mistaken for those of the current connection, so each side uses its own fresh random value. You can also see from the process that the first two handshakes cannot carry data, but the third one can.
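In everyday socket code the three-way handshake is performed entirely by the kernel: connect() on the client and accept() on the server simply return once it has completed. A minimal sketch (127.0.0.1:12345 is an arbitrary demo address):

```python
import socket, threading

HOST, PORT = "127.0.0.1", 12345     # arbitrary demo address
ready = threading.Event()

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen()                     # server is now in the LISTEN state
        ready.set()
        conn, addr = srv.accept()        # returns after the handshake (ESTABLISHED)
        with conn:
            print("server: established with", addr, "got", conn.recv(1024))

t = threading.Thread(target=server)
t.start()
ready.wait()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))            # kernel performs SYN -> SYN+ACK -> ACK
    cli.sendall(b"hello after handshake")  # data may flow once both sides are ESTABLISHED

t.join()
```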

3.2.3 TCP data transmission process

As mentioned in the HTTP part, data is split up before being handed to the TCP layer and then the IP layer. Some people may ask: since the IP layer can fragment data, why does TCP also segment it? The reason is that if TCP did not segment and we relied only on IP-layer fragmentation, the loss of a single fragment would force the whole IP packet to be retransmitted, because the IP layer has no retransmission mechanism of its own. The TCP layer, by contrast, can retransmit individual segments on timeout or detected loss.

General process of information transmission

3.2.4 TCP Status Query

On a server, netstat is commonly used to view TCP/UDP ports, the owning processes, and other related information, for example: netstat -tunlp | grep <port number>

  1. -t (tcp) show only TCP sockets
  2. -u (udp) show only UDP sockets
  3. -n show numeric addresses and ports instead of resolving names
  4. -l list only sockets in the LISTEN state
  5. -p show the PID and name of the program that owns each socket

netstat Example
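If you want to collect the same information from a script, one hedged option is simply to drive netstat with the flags above (requires net-tools to be installed; port 80 is an arbitrary example):

```python
import subprocess

def listening_sockets(port: str = "80") -> str:
    """Run `netstat -tunlp` and keep only lines mentioning the given port.
    -p shows process names only when run with sufficient privileges."""
    out = subprocess.run(["netstat", "-tunlp"], capture_output=True, text=True).stdout
    return "\n".join(line for line in out.splitlines() if f":{port}" in line)

print(listening_sockets("80"))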

3.2.5 Why TCP three-way handshake

TCP does not fundamentally distinguish between client and server, and establishing a connection is a two-way process: each side's request must be acknowledged by the other. So why aren't two handshakes enough?

  1. In the first handshake, the client sends a connection request to the server. After receiving it, the server knows the client can reach it.
  2. But the client does not yet know whether its request arrived, so the server must perform a second handshake to feed that information back to the client.
  3. Suppose the first handshake request is delayed in the network and only reaches the server after the client has already given up on the connection. The server will still send its second-handshake reply, but the client no longer wants this connection; with only two handshakes the server would consider it established and keep waiting for client data, wasting resources.
  4. With a three-way handshake, the client can instead send a RST message to tell the server to terminate the stale connection.

If this still isn't clear, here is an everyday analogy. You are walking in your neighborhood at night and see a pretty girl coming towards you. Because the street lights are dim, you can't be 100% sure it's someone you know, so you wave to check whether she recognizes you.

1. You wave to the girl first.

2. When a girl sees you waving at her, she will nod and smile at you.

3. She also needs to confirm that you are not actually looking at someone else, so the girl waves back at you.

4. You see the girl smile and say ack, which means she has successfully recognized you and you enter the established state.

5. The girl waves to you and syn, you also smile and ack in reply, and after receiving it, the girl also enters the established state.

Because the girl performed two actions in succession, first nodding and smiling, and then waving again, these two actions can be combined into one action, nodding and smiling while waving. So these four actions are simplified into three actions.

Your acquaintance with the girl

3.2.6 The significance of TCP three-way handshake

1. Avoid historical connections

The client may send multiple SYN packets when establishing a connection, and due to network congestion an old SYN may reach the server before the new one. The server replies SYN + ACK to whatever SYN it receives, old or new. With a three-way handshake, the client can tell from the acknowledgment number in the reply whether it belongs to a stale (historical) connection; if so, it sends a RST to the server to abort that connection.

2. Synchronize the initial sequence numbers of both parties

Both parties in the TCP protocol maintain their own sequence numbers and must let the other party know. This can only be achieved through a three-way handshake.

3. Avoid waste of server resources

With only a two-way handshake, if the client's SYN is delayed and the client therefore sends several SYN segments, the server would set up a connection for each one it receives, creating redundant invalid connections and wasting resources. With a three-way handshake, the client can detect the invalid connection and send a termination (RST) in the third step.

3.2.7 What should I do if the client suddenly hangs up during a TCP connection?

TCP also has a keep-alive timer. The server resets the timer every time it receives a request from the client. The time is usually set to 2 hours. If no data is received from the client in two hours, the server will send a probe segment, and then send it every 75 seconds. If there is still no response after sending 10 probe segments in a row, the server will think that the client has failed and then close the connection.
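These probe parameters are operating-system defaults and can also be tuned per socket. A hedged sketch of enabling keep-alive on a Python TCP socket; the TCP_KEEP* options shown are Linux-specific, and the numeric values are illustrative rather than the defaults quoted above:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)     # enable keep-alive probes

# Linux-only knobs (values here are just examples):
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 600)  # idle seconds before the first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)  # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10)    # probes before declaring the peer dead
```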

3.2.8 How does TCP avoid SYN attacks?

A TCP connection will go through three handshakes. After the first handshake, when the server receives a SYN message, it will send an ACK + SYN message and enter the SYN_RCVD state at the same time. If a hacker forges n different IPs to send requests, the server's SYN_RCVD queue will be full and it will eventually be unable to provide services to the outside world.

Solution:

  1. Limit the size of the SYN_RCVD (half-open connection) queue: once the server exceeds its processing capacity, new SYN requests are simply dropped or answered with RST.
  2. Shorten the SYN Timeout time: By shortening the time from receiving a SYN message to determining that the message is invalid and discarding the connection, the server load can be reduced.
  3. Enable SYN cookies: instead of keeping half-open state for every SYN, the server encodes the connection information into the initial sequence number of its SYN+ACK; resources are only allocated when the client's final ACK returns that cookie, so forged SYNs cannot exhaust the queue (a sketch of the corresponding Linux parameters follows).
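On Linux these three mitigations map onto kernel parameters. A hedged sketch that simply reads the current values from /proc (the paths are the standard Linux sysctl locations; adjust for your distribution):

```python
from pathlib import Path

# Standard Linux sysctl files behind the three mitigations above.
PARAMS = {
    "net.ipv4.tcp_max_syn_backlog": "/proc/sys/net/ipv4/tcp_max_syn_backlog",  # size of the half-open (SYN_RCVD) queue
    "net.ipv4.tcp_synack_retries":  "/proc/sys/net/ipv4/tcp_synack_retries",   # fewer retries ~ shorter effective SYN timeout
    "net.ipv4.tcp_syncookies":      "/proc/sys/net/ipv4/tcp_syncookies",       # 1 enables SYN cookies
}

for name, path in PARAMS.items():
    value = Path(path).read_text().strip() if Path(path).exists() else "n/a"
    print(f"{name} = {value}")
```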

3.2.9 TCP four-way wave

Either the client or the server can actively request to close the connection; TCP disconnects with four waves.

TCP four-way wave

  1. The client stops sending data and sends a message to release the connection. In the message, FIN = 1. Even if the FIN segment does not carry data, it consumes a sequence number. At this time, the sequence number seq = u, where u = the sequence number of the last byte of the previously transmitted data plus 1. The client enters the FIN-WAIT-1 state.
  2. The server receives the connection release segment and sends an acknowledgment: ACK=1, ack=u+1, with its own sequence number seq=v. The server enters the CLOSE-WAIT state and notifies the higher-level application process. The connection is now half-closed: the client has no more data to send, but if the server still sends data, the client must still accept it. This state lasts for the whole duration of CLOSE-WAIT.
  3. After the client receives the confirmation request from the server, it enters the FIN-WAIT-2 state and waits for the server to send a connection release message. Before this, it needs to receive the last data sent by the server.
  4. After the server sends the last data, it sends a connection release message to the client, FIN=1, ack=u+1. Since it is in a semi-closed state, the server is likely to have sent some more data. Assuming that the sequence number at this time is seq=w, the server enters the LAST-ACK state and waits for the client's confirmation.
  5. After receiving the connection release message from the server, the client must send a confirmation: ACK=1, ack=w+1, seq=u+1. The client then enters the TIME-WAIT state. Note that the TCP connection is not yet released: the client must wait 2MSL (twice the maximum segment lifetime) before revoking its TCB and entering the CLOSED state.
  6. As soon as the server receives the confirmation from the client, it immediately enters the CLOSED state. Similarly, after revoking the TCB, the TCP connection is terminated. It can be seen that the server terminates the TCP connection earlier than the client.
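The half-closed state described in step 2 is exactly what shutdown() exposes in socket APIs: each direction of the byte stream is closed independently. A hedged Python sketch of the active-close side:

```python
import socket

def close_gracefully(sock: socket.socket) -> bytes:
    """Actively close our sending direction, then drain whatever the peer still sends."""
    sock.shutdown(socket.SHUT_WR)   # sends our FIN: "I have no more data" (first wave)
    leftover = b""
    while True:                     # half-closed: we can still receive the peer's remaining data
        chunk = sock.recv(4096)
        if not chunk:               # an empty read means the peer's FIN arrived (third wave)
            break
        leftover += chunk
    # The kernel acknowledges both FINs automatically (second and fourth waves);
    # close() releases the socket, and the active closer lingers in TIME_WAIT.
    sock.close()
    return leftover
```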

Let's return to the example of you and the girl. After you have recognized each other and chatted for a few minutes, you decide to end the conversation; after all, if you chat for too long and your wife finds out, that would be awkward.

You wave goodbye to the girl

3.2.10 Why TCP waves four times

In fact, if you analyze the entire closing process, you will know why it must be waved four times instead of three times.

  1. When closing the connection, when the client sends a FIN to the server, it only means that the client will no longer send data but can still receive data.
  2. When the server receives the FIN from the client, it first returns an ACK. At this point the client will send no more data, but the server may still have data to send. Only when the server has finished sending does it send its own FIN to the client to indicate that it agrees to close the connection. Note that the server's ACK and FIN are sent separately.
  3. When the client receives the server's FIN, it replies with a final ACK, and eventually both the client and the server enter the CLOSED state.

3.2.11 Why does closing a TCP connection require the TIME_WAIT state?

MSL Definition:

Maximum Segment Lifetime (MSL) is the longest time a segment is allowed to exist in the network; any segment older than this is discarded. Segments can be discarded because the IP layer beneath TCP has a TTL field that limits the number of hops a packet may traverse, so MSL should be at least as long as the time it takes for the TTL to expire.

TIME_WAIT definition:

TIME_WAIT = 2 * MSL. The reason is that the final ACK must travel to the other side and, if it is lost, the retransmitted FIN must be able to travel back, so one round trip of at most 2 MSL has to be allowed for.

TIME_WAIT starts when the client sends its ACK after receiving the server's FIN. If that ACK does not reach the server within the TIME-WAIT period and the client receives a retransmitted FIN from the server, the 2MSL timer is restarted.

TIME_WAIT exists for the following reasons:

Prevent data packets from an old connection from being accepted by a new one: if segments from the previous connection are still wandering in the network because of delays, a TIME_WAIT period that is too short would allow a new connection on the same address and port to receive those stray segments. Waiting 2MSL lets them die out first.

Ensure that the connection is closed correctly: The purpose of TIME-WAIT is to wait for enough time to ensure that the final ACK can be received by the passive closing party, thereby helping it to close normally.

TIME_WAIT occurs in the following scenarios:

On a TCP server with high concurrency and short connections, the server will immediately and normally close the connection after processing the request. In this scenario, a large number of sockets will be in the TIME_WAIT state. If the client's concurrency continues to be high, due to limited ports and memory, some clients will be unable to connect.

In the Linux kernel, the TIME_WAIT duration is 60 seconds.

Avoid excessive TIME_WAIT:

  • Cancel the short connection and use the long connection mode instead.
  • Set a threshold. Once the threshold is exceeded, the system will reset all time_wait connections.
  • Modify the client program code.
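A narrower, related remedy worth knowing: a server restarted while its old socket still sits in TIME_WAIT cannot bind the port again unless it sets SO_REUSEADDR. A hedged sketch (port 8080 is arbitrary):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow binding even if a previous instance left the address in TIME_WAIT.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
srv.listen()
```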

3.2.12 How does TCP ensure reliable data transmission?

  • Checksum: Both the sent and received data will be checked. If they are inconsistent, then the transmission is incorrect.
  • Confirmation response sequence number: When TCP is transmitted, the data is numbered, and each time the receiver returns ACK, there is a confirmation sequence number.
  • Timeout retransmission: If the sender does not receive an ACK after sending data for a period of time, the data will be resent. It also has a built-in deduplication function.
  • Connection management: three-way handshake and four-way wave process.
  • Flow control: The TCP header contains a 16-bit window size. The receiver fills in its current receive window when returning an ACK, and the sender controls its sending rate according to the window size in that segment.
  • Congestion control: When data is first sent, the congestion window is 1. Each time an ACK is received, the congestion window is increased by 1. The smaller value of the congestion window and the received window is used as the actual sending window. If a timeout retransmission occurs, the congestion window is reset to 1. The purpose of this is to ensure the efficiency and reliability of the transmission process.
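To make the checksum bullet concrete, here is a hedged sketch of the standard 16-bit ones'-complement Internet checksum used by TCP, UDP and IP; real TCP additionally covers a pseudo-header, which is omitted here for brevity:

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of 16-bit words, as used by TCP/UDP/IP."""
    if len(data) % 2:                              # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

payload = b"demo payload 123"                      # even length keeps the example simple
csum = internet_checksum(payload)
# Receiver-side check: summing the data together with its checksum yields 0 if nothing was corrupted.
assert internet_checksum(payload + struct.pack("!H", csum)) == 0
```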

3.3 UDP

UDP provides a way for applications to send encapsulated IP data packets without establishing a connection. Its protocol is very simple, with only eight bytes in the header:

UDP Header

  • Two sixteen-bit port numbers: the source port and the destination port.
  • Packet length: This field = UDP header length + data length.
  • Checksum: The checksum of the entire data message. This field is used to detect errors in header information and data.

3.3.1 UDP Features

UDP does not group, reassemble, or reorder data packets; once a message is sent, there is no way to know whether it arrived safely or completely.

1. Connectionless

UDP does not perform a three-way handshake to establish a connection; it simply starts sending whenever it wants. It is only a carrier of datagrams and will not split or splice the messages it is given.

At the sending end, the application layer passes the data to the UDP protocol of the transport layer. UDP only adds a UDP header to the data to identify it as the UDP protocol, and then passes it to the network layer.

At the receiving end, the network layer passes the data up to the transport layer; UDP simply removes the UDP header and hands the data to the application layer without any splicing.

2. It has unicast, multicast and broadcast functions

UDP not only supports one-to-one transmission, but also supports one-to-many, many-to-many, and many-to-one. In other words, UDP provides unicast, multicast, and broadcast functions.

3. UDP is message-oriented

The sender's UDP adds a header to the message handed down by the application and then delivers it to the IP layer. UDP does not merge or split the messages handed down by the application layer, but retains the boundaries of these messages. Therefore, the application must choose a message of the appropriate size.

4. Unreliability

The unreliability starts with being connectionless: communication requires no connection setup, and data can be sent at any time, which is already unreliable by itself.

UDP transmits whatever data it is handed and keeps no copy for retransmission; when sending, it does not care whether the other side received the data correctly.

There is no congestion control either, so data is sent at a constant rate; on a poor network this may cause packet loss. This trade-off is acceptable in scenarios with high real-time requirements, such as video calls, where UDP is preferred.

5. Small header overhead

UDP has a small header overhead of only eight bytes, which is much less than TCP's at least twenty bytes, and is very efficient in transmitting data packets.
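All of these properties show up directly in the socket API: no listen/accept, no connection state, one datagram in and one datagram out. A minimal hedged sketch (127.0.0.1:9999 is an arbitrary demo address):

```python
import socket

ADDR = ("127.0.0.1", 9999)      # arbitrary demo address

# "Server": just bind and wait for one datagram; no listen()/accept(), no connection state.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(ADDR)

# "Client": send a single message; UDP neither splits nor merges it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", ADDR)

data, peer = receiver.recvfrom(2048)   # one recvfrom returns exactly one datagram
print(data, "from", peer)

sender.close()
receiver.close()
```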

3.3.2 Comparison between TCP and UDP

3.3.3 How TCP and UDP can share the same port

You may often be asked, why can TCP and UDP share the same port? This is because from the perspective of the network layer, it does not know the concept of port. TCP/UDP are wrapped in the IP protocol. The IP protocol only needs to know the hardware address corresponding to the IP to send the remote network packet to the destination host.

The concept of port is divided by the operating system. Because the kernel cannot send all network data to all processes, in order to distinguish which data should be allocated to which processes, ports are defined in the transport layer protocol. The port number in TCP and UDP protocols is 16 bits, so the operating system can only bind to 65535 ports.

If you look at the socket and bind functions in C language Socket programming, you will find that the system binds ports based on protocol + ip + port, so the same ip and port with different protocols can also be bound successfully.
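This is easy to observe directly: the same IP and port can be bound twice as long as the socket type (protocol) differs. A minimal sketch, with 127.0.0.1:9090 as an arbitrary example:

```python
import socket

ADDR = ("127.0.0.1", 9090)   # arbitrary demo address

tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # TCP
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # UDP

tcp_sock.bind(ADDR)   # succeeds
udp_sock.bind(ADDR)   # also succeeds: TCP and UDP have independent port namespaces
print("bound TCP and UDP to the same port without conflict")

tcp_sock.close()
udp_sock.close()
```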

4 TCP Advanced

4.1 TCP Retransmission Mechanism

To ensure that data arrives safely at the receiving end, TCP introduces timeout retransmission, fast retransmission, SACK, and D-SACK.

4.1.1 Timeout Retransmission

Timeout retransmission is time-based: a timer is started when data is sent, and if the receiver's ACK does not arrive within the limit, the data is resent. Either the loss of the data packet or the loss of the acknowledgment can trigger it. Let's first introduce two time-related parameters and a few rules.

  • RTT: Round-Trip Time, the time from when data is sent until its acknowledgment is received.
  • RTO: Retransmission TimeOut, the time to wait for an acknowledgment before retransmitting.
  • Dynamic: RTT changes dynamically with network fluctuations, and RTO changes dynamically with it.
  • RTO doubling: each time a retransmission times out again, the next RTO is doubled.

RTO and RTT

The relationship between RTT and RTO is very subtle.

  1. If RTO is too small, data that was not actually lost but simply not yet acknowledged gets retransmitted, which adds congestion and triggers even more timeout retransmissions.
  2. If RTO is too large, lost data waits a long time before being resent, hurting efficiency.

Therefore, in practice RTO should be slightly larger than RTT. If you are interested in the exact estimation rules, you can look them up; a sketch of the classic formula follows.
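For reference, the classic estimator (the one standardized in RFC 6298, shown here only as an illustrative sketch) keeps a smoothed RTT and an RTT variance and derives RTO from them:

```python
class RtoEstimator:
    """Hedged sketch of an RFC 6298 style RTO calculation (constants from the RFC)."""
    ALPHA, BETA, K = 1 / 8, 1 / 4, 4

    def __init__(self) -> None:
        self.srtt = None        # smoothed RTT
        self.rttvar = None      # RTT variance

    def sample(self, rtt: float) -> float:
        if self.srtt is None:                       # first measurement
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return max(1.0, self.srtt + self.K * self.rttvar)   # RTO, floored at 1 second per the RFC

est = RtoEstimator()
for rtt in (0.10, 0.12, 0.30, 0.11):    # made-up RTT samples in seconds
    print(f"RTT={rtt:.2f}s -> RTO={est.sample(rtt):.2f}s")
```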

4.1.2 Fast Retransmit

TCP has a cumulative confirmation mechanism. When the receiving end receives a segment with a larger sequence number than the expected one, it will repeat the confirmation signal of the most recently confirmed segment. We call this a duplicate ACK.

As shown in the figure, segment 1 is successfully received and confirmed ACK 2. The expected sequence number of the receiving end is 2. When segment 2 is lost, segment 3 arrives out of order, which does not match the expectations of the receiving end. The receiving end repeatedly sends redundant ACK 2.

Fast retransmission mechanism

If the sender receives three duplicate ACKs before the retransmission timer expires (i.e. four identical ACKs in total: the first normal one plus three duplicates), it knows which segment was lost and resends it immediately without waiting for the timer. Finally, once the receiver gets segment 2, since segments 3, 4 and 5 have already arrived, it replies ACK 6 directly.

Why three? Bear in mind that even if the sender transmits in order, segments can still arrive out of order at the receiver, and reordering also produces duplicate ACKs. So are the duplicate ACKs caused by reordering or by loss? Using three duplicate ACKs as the loss criterion is itself an estimate that balances the two cases.


Data reception

A is the sender and B is the receiver. The sequence numbers of A's to-be-sent segments are [N-1, N, N+1, N+2]. Assume that segment N-1 arrives successfully.

  1. If segment N is not lost but merely arrives out of order, there will always be at least 2 duplicate ACKs, and only about a 40% chance of 3.
  2. In the event of a loss, there must be 3 redundant ACKs.

Based on this probability, it is reasonable to select 3 redundant ACKs as the threshold. In actual packet capture, most fast retransmissions will occur after more than 3 redundant ACKs.
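The sender-side rule is easy to express in code: count identical ACKs and fire a retransmission at the third duplicate. A hedged toy sketch in which retransmit_segment is a hypothetical callback, not a real API:

```python
from typing import Callable

def make_dupack_detector(retransmit_segment: Callable[[int], None]):
    """Return a function that is fed incoming ACK numbers and triggers fast
    retransmit of the missing segment after 3 duplicate ACKs."""
    state = {"last_ack": None, "dup_count": 0}

    def on_ack(ack_no: int) -> None:
        if ack_no == state["last_ack"]:
            state["dup_count"] += 1
            if state["dup_count"] == 3:        # 3 duplicates = 4 identical ACKs in total
                retransmit_segment(ack_no)     # resend the segment starting at ack_no
        else:
            state["last_ack"], state["dup_count"] = ack_no, 0

    return on_ack

on_ack = make_dupack_detector(lambda seq: print(f"fast retransmit of segment starting at {seq}"))
for ack in (2, 2, 2, 2, 6):    # the scenario above: segment 2 lost, 3/4/5 arrive out of order
    on_ack(ack)
```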

Fast retransmit avoids waiting for the timeout, but the sender still cannot tell whether it should retransmit only the one missing segment or everything after it.

4.1.3 SACK

To solve this, Selective Acknowledgment (SACK) is used. The principle is simple: when the receiver replies, it adds a SACK field describing exactly which ranges of data it has already received. Based on this information, the sender can retransmit only the packets that are actually missing.

4.1.4 D-SACK

D-SACK is an extension of SACK, mainly used to report duplicate segments that have been received; it uses the same message format as SACK. Its core value is telling the sender whether the problem occurred on the sending path (data delayed) or on the reply path (ACK lost).

If a data packet A is delayed in the network long enough to trigger fast retransmit, the sender transmits a new copy of A; when the old, delayed A finally arrives as a duplicate, the receiver replies with a D-SACK, so the sender knows the duplication was caused by network delay rather than loss.

If the receiver's ACK is lost, the sender times out and retransmits; the receiver then replies with a D-SACK saying the data was received twice, so the sender knows it was the ACK, not the data, that went missing.

4.2 TCP Sliding Window

Without a sliding window mechanism, sending N segments means waiting for N separate acknowledgments:

Total transmission time = N segment transmission times + N acknowledgment transmission times.

TCP therefore introduces the concept of a window while still guaranteeing reliability. A sliding window lets us improve efficiency further: data within the window can keep being sent without waiting for each confirmation. The window is essentially a buffer the OS sets aside so that data can be transmitted in batches; as long as the receiver has not acknowledged it, the data stays in the buffer.

With the window, the N segment transmissions overlap into roughly one transmission time and the N acknowledgments overlap into roughly one response time.


The window size is 4000 bytes

The window size is usually determined by the receiver, who will inform the sender how much buffer it has to accept data. If the amount of data exceeds this, the receiver will not be able to receive it.

4.2.1 Send Sliding Window

Sliding Window

In state 1, the sender receives an ACK whose acknowledgment number is 2001; the data before 2001 is then marked as acknowledged, and the window slides forward into state 2.

  1. The left side of the window contains data that has been sent and received ACK from the server. This data can be deleted from the cache.
  2. The data in the window is actually divided into two categories: one is the data that has not received ACK, and the other is the data that has not been sent. Before receiving the ACK for the entire window, if the data is lost, the sender still needs to retransmit it. Therefore, the sender needs to have a cache to retain the data that may be retransmitted until the server ACK is received.
  3. After receiving the ACK from the server, the sender will slide the window to the position of the sequence number in the confirmation response. In this way, multiple segments can be sent simultaneously in sequence to improve communication performance. This mechanism is also called sliding window control.
  4. In window mode, the sender will also send data according to the capabilities of the receiver to perform flow control.
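A toy sketch of the sender-side bookkeeping described above: the window bounds how many unacknowledged bytes may be outstanding, and every cumulative ACK slides it forward (the segment and window sizes are arbitrary illustrative numbers):

```python
class SendWindow:
    """Minimal model of a sender's sliding window (no timers, no retransmission)."""
    def __init__(self, window_size: int, total_bytes: int) -> None:
        self.window_size = window_size
        self.total = total_bytes
        self.acked = 0        # everything before this byte is acknowledged (left edge)
        self.next_seq = 0     # next byte to send

    def sendable(self) -> int:
        """How many more bytes may be sent before we must wait for an ACK."""
        return min(self.acked + self.window_size, self.total) - self.next_seq

    def send(self, n: int) -> None:
        assert n <= self.sendable(), "would overrun the window"
        self.next_seq += n

    def on_ack(self, ack_no: int) -> None:
        """Cumulative ACK: everything before ack_no arrived; slide the window."""
        self.acked = max(self.acked, ack_no)

w = SendWindow(window_size=4000, total_bytes=10000)
w.send(4000)          # fill the window without waiting for individual ACKs
print(w.sendable())   # 0: the window is full
w.on_ack(2001)        # an ACK for bytes before 2001 arrives
print(w.sendable())   # 2001: the window's right edge moved forward to byte 6001
```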

4.2.2 Window data loss

The data loss here is actually similar to the retransmission mechanism mentioned above, which is mainly divided into two types

Case 1: the receiver gets the data but its ACK is lost. No special handling is needed, because ACKs are cumulative. For example, if ACK 3001 is lost but ACK 4001 reaches host A, that already proves the data up to 4000, including 2001-3000, arrived; an acknowledgment for a given sequence number means everything before it has been correctly delivered to host B.


Data received but ACK lost

Case 2: data is lost on the way. As shown in the figure below, packets 1001-2000 are lost while 2001-3000 and 3001-4000 arrive successfully, so the receiver keeps feeding back ACK 1001. When the sender sees ACK 1001 repeated again and again, it understands that packets 1001-2000 were lost and retransmits them. Once the receiver gets the missing 1001-2000, it replies ACK 4001 directly, because 2001-4000 were already received and held in its buffer.


Lost while sending

4.3 Congestion Control

The flow control above only involves the sender and the receiver, but the network itself is shared: other hosts' traffic can congest it. Congestion causes retransmissions, retransmissions add more traffic and more congestion, and the whole thing can spiral into a vicious circle.

To limit how much data the sender injects so that it does not flood the network, the sender maintains a congestion window. We already have the send window and the receive window; with the congestion window added, the effective send window swnd = min(congestion window cwnd, receive window rwnd). The congestion window changes dynamically: it grows while the network is not congested and shrinks when congestion appears. The signal for congestion is simple: if the sender does not receive an acknowledgment within the expected time (a retransmission timeout), the network is considered congested.

Congestion control is mainly achieved through slow start, fast retransmission, fast recovery and congestion avoidance.

4.3.1 Slow Start

After TCP establishes a connection, it goes through a slow start phase, increasing the amount of data in flight little by little. The rule of slow start is that every time the sender receives an ACK, the congestion window cwnd grows by 1. A state variable called the slow start threshold (ssthresh) marks the boundary between slow start and congestion avoidance:

When cwnd < ssthresh, the slow start algorithm is used.

When cwnd >= ssthresh, the congestion avoidance algorithm is used.

4.3.2 Congestion Avoidance Algorithm

By default, the slow start threshold ssthresh = 65535 bytes. Once the system enters congestion avoidance, each received ACK increases the congestion window by 1/cwnd, i.e. roughly one unit per round trip. The purpose of congestion avoidance is to turn slow start's exponential growth into linear growth.

4.3.3 Fast Retransmission

Even under congestion avoidance the window keeps growing, so eventually the network becomes congested and packets are lost. At that point the timeout retransmission or fast retransmission described earlier takes over.

  • Timeout retransmission: ssthresh = cwnd/2 At the same time, cwnd is reset to 1, and then slow start is restarted, returning to the starting point.
  • Fast retransmission: cwnd = cwnd/2 and ssthresh = cwnd and then enter the fast recovery algorithm.

4.3.4 Fast recovery

Fast recovery is used in conjunction with fast retransmission. When the sender receives three consecutive repeated confirmation requests, in order to avoid network congestion, it performs fast retransmission (cwnd = cwnd/2 and ssthresh = cwnd) and executes the fast recovery algorithm.

  • cwnd = ssthresh + 3
  • Retransmit lost packets
  • If duplicate ACK is received, cwnd is incremented by 1.
  • After receiving a new ACK, cwnd = ssthresh is set to enter the congestion avoidance algorithm.
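Putting slow start, congestion avoidance, fast retransmit and fast recovery together, a toy simulation of how cwnd evolves might look like the sketch below; all numbers are in units of MSS and purely illustrative:

```python
def simulate_cwnd(events):
    """Toy model of cwnd evolution. events contains 'ack', 'timeout' or 'dup3'
    (three duplicate ACKs). Units are MSS; the numbers are illustrative only."""
    cwnd, ssthresh = 1.0, 16.0
    for ev in events:
        if ev == "ack":
            if cwnd < ssthresh:
                cwnd += 1                 # slow start: doubles roughly every RTT
            else:
                cwnd += 1 / cwnd          # congestion avoidance: ~1 MSS per RTT
        elif ev == "timeout":
            ssthresh = cwnd / 2           # multiplicative decrease
            cwnd = 1                      # restart from slow start
        elif ev == "dup3":
            cwnd /= 2                     # fast retransmit reaction
            ssthresh = cwnd
            cwnd = ssthresh + 3           # fast recovery inflates by the 3 duplicate ACKs
        print(f"{ev:7s} cwnd={cwnd:6.2f} ssthresh={ssthresh:6.2f}")

simulate_cwnd(["ack"] * 6 + ["dup3"] + ["ack"] * 3 + ["timeout"] + ["ack"] * 2)
```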

5 References

Tech Brother Network: https://t.1yb.co/gJRx

Xiaolin Network: https://t.1yb.co/fQG3

TCP/IP explanation: https://developer..com/art/201906/597961.htm

Fast retransmit: https://blog.csdn.net/whgtheone/article/details/80983882
