In a ByteDance first-round interview, two classic questions were asked! Do you know what they are?

Hello everyone, I am Xiaolin.

A reader interviewing at ByteDance was asked two classic TCP questions:

The first question: Why are there a large number of connections in the TIME_WAIT state on the server?

The second question: Why are there a large number of connections in the CLOSE_WAIT state on the server?

These two questions come up often in interviews, mainly because they are also frequently encountered in real work.

This time, let’s talk about these two issues.

What causes a large number of TIME_WAIT connections on the server?

Let's first look at the TCP four-way close process (the "four waves") and see at which stage the TIME_WAIT state appears.

The following figure shows the four-way close process, with the client acting as the active closer.

TCP four-way close process

From the figure, we can see that the TIME_WAIT state only appears on the side that actively closes the connection, and that it lasts for 2MSL before the connection moves to the CLOSED state. On Linux, 2MSL is 60 seconds, so a connection stays in TIME_WAIT for a fixed 60 seconds.

Why is the TIME_WAIT state needed at all? (This is a classic topic, so let's review it briefly.) There are two main reasons:

  • It ensures that the party that passively closes the connection can close it correctly. During the four-way close, the final ACK sent by the active closer may be lost. In that case the passive closer retransmits its FIN. If the active closer were already in the CLOSED state, it would respond with an RST instead of an ACK, so the active closer must stay in TIME_WAIT rather than go straight to CLOSED.
  • It prevents data from an old connection from being wrongly accepted by a later connection with the same four-tuple. A TCP segment can be delayed, for example by a misbehaving router; during the delay the sender may retransmit it because of a timeout, and the delayed segment is delivered later once the route recovers. Such a segment is called a lost duplicate. If a new TCP connection with the same IP addresses and ports is established immediately after the old one closes, the new connection is called an incarnation of the old one. A lost duplicate from the old connection could arrive after it has terminated and be misinterpreted as belonging to the new incarnation. Keeping the TIME_WAIT state for 2MSL avoids this, because it guarantees that by the time a new connection is successfully established, any duplicate segments from the previous incarnation have already disappeared from the network.

Many people mistakenly believe that only the client can be in the TIME_WAIT state. That is wrong: TCP is a full-duplex protocol, and either side can close the connection first, so either side can end up in TIME_WAIT.

In short, remember: whoever closes the connection first is the active closer, and TIME_WAIT appears on the active closer's side.

Under what scenarios will the server actively disconnect?

If a large number of TCP connections in TIME_WAIT state appear on the server, it means that the server has actively disconnected many TCP connections.

The question is, under what scenarios will the server actively disconnect?

  • Scenario 1: HTTP Keep-Alive (persistent connections) is not used
  • Scenario 2: the HTTP persistent connection times out
  • Scenario 3: the number of requests on an HTTP persistent connection reaches the upper limit

Next, I will introduce them one by one.

Scenario 1: HTTP Keep-Alive (persistent connections) is not used

Let's first take a look at how the HTTP Keep-Alive mechanism is enabled.

In HTTP/1.0, Keep-Alive is disabled by default. If the browser wants to enable it, it must add the following to the request header:

 Connection: Keep-Alive

Then, when the server receives the request and responds, it also adds this header to the response:

 Connection: Keep-Alive

By doing this, the TCP connection is not interrupted but remains connected. When the client sends another request, it uses the same TCP connection. This continues until the client or server asks to disconnect.

Since HTTP/1.1, Keep-Alive has been enabled by default. Most browsers now use HTTP/1.1 by default, so Keep-Alive is enabled by default. Once the client and server reach an agreement, a persistent connection is established.

If you want to turn off HTTP Keep-Alive, you need to add Connection: close to the HTTP request or response header. In other words, as long as either the client or the server sends Connection: close in its HTTP headers, the persistent-connection mechanism cannot be used.
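For example, a client that wants the connection closed after this exchange might send headers like the following (purely illustrative; only the Connection header matters here):

 GET /index.html HTTP/1.1
 Host: example.com
 Connection: close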

After the persistent-connection mechanism is disabled, each request goes through the following process: establish a TCP connection -> request the resource -> return the response -> release the connection. This pattern is called an HTTP short connection, as shown in the following figure:

HTTP short connection

As noted above, as long as either party includes Connection: close in its HTTP headers, the persistent-connection mechanism cannot be used, and the connection will be closed after a single HTTP request/response is completed.

The question is, is it the client or the server that actively closes the connection?

The RFC document does not specify who closes the connection. Both the requester and the responder can actively close the TCP connection.

However, according to the implementation of most Web services, no matter which party disables HTTP Keep-Alive, the server actively closes the connection, and then a connection in TIME_WAIT state will appear on the server.

The client has disabled HTTP Keep-Alive, and the server has enabled HTTP Keep-Alive. Who actively closes the connection?

When the client disables HTTP Keep-Alive, the HTTP request header contains Connection: close, and the server will actively close the connection after sending the HTTP response.

Why is it designed this way? HTTP follows a request-response model, and the client always initiates the request. The whole point of HTTP Keep-Alive is to let subsequent requests from the client reuse the connection. If a request carries Connection: close, then once the server has sent its response for that request there will be no further reuse, so the most natural place to close the connection is at the "end" of that HTTP request-response cycle, which is the server side.

The client has HTTP Keep-Alive enabled, but the server has it disabled. Who is the active closer?

When the client enables HTTP Keep-Alive and the server disables HTTP Keep-Alive, the server will actively close the connection after sending the HTTP response.

Why is it designed this way? When the server actively closes the connection, it only needs to call close() once; the rest is handled by the kernel's TCP stack, so the whole process takes a single syscall. If instead the client were required to close the connection, the server, after writing the last response, would have to keep the socket around, call select/epoll to wait for a readable event, and then call read() once to learn that the peer has closed. That is two syscalls, one extra wake-up of the user-space program, and the socket stays open longer.

Therefore, when a large number of TIME_WAIT connections appear on the server, you can check whether HTTP Keep-Alive is enabled on both the client and the server. If Keep-Alive is disabled on either side, the server will actively close the connection after handling each HTTP request, and a large number of TIME_WAIT connections will appear on the server.

The solution to this scenario is very simple: enable the HTTP Keep-Alive mechanism on both the client and the server.

Scenario 2: the HTTP persistent connection times out

The characteristic of an HTTP persistent connection is that the TCP connection stays open as long as neither end explicitly asks to disconnect.

HTTP persistent connection can receive and send multiple HTTP requests/responses on the same TCP connection, avoiding the overhead of connection establishment and release.

Some readers may ask: with persistent connections, if the client completes an HTTP request and never sends another one, the TCP connection stays occupied; isn't that a waste of resources?

That's right. To avoid wasting resources, web server software generally provides a parameter to set a timeout on HTTP persistent connections, such as nginx's keepalive_timeout parameter.

Suppose the persistent-connection timeout is set to 60 seconds. nginx starts a timer for the connection, and if the client makes no new request within 60 seconds of completing its last HTTP request, the timer fires and nginx's callback closes the connection. At that point a connection in the TIME_WAIT state appears on the server.
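As a rough sketch, the corresponding nginx configuration would look like the following (keepalive_timeout is a real nginx directive; the 60-second value simply matches the example above):

 http {
     keepalive_timeout 60s;    # close an idle persistent connection after 60 seconds of inactivity
 }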

HTTP persistent connection timeout

When a large number of TIME_WAIT connections appear on the server, and many clients establish TCP connections but then stop sending data for a long time, the likely cause is that the persistent-connection timeout fired and the server actively closed those connections, leaving a large number of them in the TIME_WAIT state.

It is also worth checking for network problems: for example, if data sent by the client never reaches the server because of network issues, the persistent connection will time out for the same reason.

Scenario 3: the number of requests on an HTTP persistent connection reaches the upper limit

Web servers usually have a parameter that defines the maximum number of requests that can be served over a single HTTP persistent connection. When that limit is exceeded, the server actively closes the connection.

For example, the keepalive_requests parameter of nginx means that after an HTTP persistent connection is established, nginx will set a counter for this connection to record the number of client requests that have been received and processed on this HTTP persistent connection. If the maximum value of this parameter is reached, nginx will actively close the persistent connection, and a connection in the TIME_WAIT state will appear on the server.

The default value of the keepalive_requests parameter is 100, which means that each HTTP persistent connection can only run up to 100 requests. This parameter is often ignored by most people because when the QPS (requests per second) is not very high, the default value of 100 is sufficient.

However, in high-QPS scenarios, say 10,000 QPS or even 30,000 to 50,000 QPS, a keepalive_requests value of 100 means nginx closes connections very frequently, and a large number of TIME_WAIT connections appear on the server.

For this scenario, the solution is very simple. Just increase the keepalive_requests parameter of nginx.
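A minimal sketch of doing that (keepalive_requests is a real nginx directive; the value 10000 is only illustrative and should be tuned to your actual QPS):

 http {
     keepalive_requests 10000;    # allow up to 10000 requests per persistent connection
 }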

What are the dangers of too many TIME_WAIT states?

There are two main hazards of having too many connections in the TIME_WAIT state:

The first is that they occupy system resources, such as file descriptors, memory, CPU, and so on;

The second is that they occupy port resources. Port resources are limited: by default the usable ephemeral port range is 32768 to 61000, and it can be adjusted via the net.ipv4.ip_local_port_range parameter.
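For example, if port exhaustion is a concern, the range can be widened via that same sysctl (the values below are just an example; make sure the range does not overlap ports your own services listen on):

 net.ipv4.ip_local_port_range = 1024 65535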

Excessive TIME_WAIT on the client and server have different impacts.

If the client (the one that actively initiates the connection closing) has too many TIME_WAIT states and occupies all port resources, then it will not be able to initiate a connection to a server with the same "destination IP + destination PORT". However, the used port can still continue to initiate a connection to another server. For details, please refer to my article: Can client ports be reused?

Therefore, if the client (the party initiating the connection) establishes a connection with the server with the same "destination IP + destination PORT", when the client has too many connections in TIME_WAIT state, it will be limited by port resources. If all port resources are occupied, it will no longer be able to establish a connection with the server with the same "destination IP + destination PORT".

However, even in this scenario, as long as the connections are to different servers, the ports can be reused, so the client can still initiate connections to other servers. This is because the kernel locates a connection through the four-tuple (source IP, source port, destination IP, destination port) information, and will not cause connection conflicts due to the same client port.

If the server (the one that actively initiates the connection closing) has too many TIME_WAIT states, it will not lead to port resource limitations, because the server only listens on one port, and since a four-tuple uniquely identifies a TCP connection, the server can theoretically establish many connections. However, too many TCP connections will occupy system resources, such as file descriptors, memory resources, CPU resources, etc.

How to optimize TIME_WAIT state?

Here are several ways to mitigate TIME_WAIT, each with its own pros and cons:

  • Turn on the net.ipv4.tcp_tw_reuse and net.ipv4.tcp_timestamps options;
  • Adjust net.ipv4.tcp_max_tw_buckets;
  • Use SO_LINGER in the program to force the connection to be closed with an RST.

Method 1: net.ipv4.tcp_tw_reuse and tcp_timestamps

If tcp_tw_reuse is enabled, the socket in TIME_WAIT can be reused for new connections.

One thing to note is that tcp_tw_reuse only works for the client (the connection initiator): when the option is enabled and connect() is called, the kernel may pick a socket that has been in the TIME_WAIT state for more than 1 second and reuse it for the new connection.

 net.ipv4.tcp_tw_reuse = 1

Using this option also requires TCP timestamp support to be enabled:

 net.ipv4.tcp_timestamps = 1  (the default is 1)

The timestamp field lives in the options part of the TCP header and uses 8 bytes in total: the first 4-byte field records the time the segment was sent, and the second 4-byte field echoes the most recent timestamp received from the peer.

With timestamps in place, duplicate segments from an old connection can be discarded naturally because their timestamps have expired, which is what makes it safe to reuse sockets in the TIME_WAIT state.

Method 2: net.ipv4.tcp_max_tw_buckets

The default value of this parameter is 18000. Once the number of TIME_WAIT connections in the system exceeds this value, any connection that would subsequently enter TIME_WAIT is simply reset and torn down. This method is fairly brutal.

 net.ipv4.tcp_max_tw_buckets = 18000

Method 3: Use SO_LINGER in the program

We can change how close() behaves by setting a socket option:

 struct linger so_linger;
 so_linger.l_onoff  = 1;   /* enable SO_LINGER */
 so_linger.l_linger = 0;   /* linger time of 0: close() sends an RST immediately */
 setsockopt(s, SOL_SOCKET, SO_LINGER, &so_linger, sizeof(so_linger));

If l_onoff is non-zero and l_linger is 0, then calling close() sends an RST to the peer immediately. The connection skips the four-way close, and therefore the TIME_WAIT state, and is torn down directly.

This does make it possible to bypass the TIME_WAIT state, but it is a dangerous behavior and not something to recommend.

The methods introduced above all try to skip or shorten the TIME_WAIT state, which is not really a good idea. The TIME_WAIT state may seem unfriendly because it lingers, but it was designed to avoid real problems.

In fact, the book "UNIX Network Programming" says: TIME_WAIT is our friend; it is there to help us. Don't try to avoid it; instead, understand it.

If the server wants to avoid too many TIME_WAIT connections, it should avoid actively closing connections and let the clients close them instead; the TIME_WAIT cost is then spread across the many clients in different locations.

What causes a large number of CLOSE_WAIT connections on the server?

Let's look at the same figure again:

TCP four-way close process

From the figure we can see that CLOSE_WAIT is a state of the passive closer. If the passive closer never calls the close function, it never sends its FIN, so the connection stays in CLOSE_WAIT and cannot move on to LAST_ACK.

Therefore, when a large number of CLOSE_WAIT connections appear on the server, it means the server program has not called close() on those connections.

So what can cause the server program not to call close()? In that case you usually need to examine the code.

Let's first analyze the process of a common TCP server:

  • Step 1: create the listening socket, bind the port, and listen on it
  • Step 2: register the listening socket with epoll
  • Step 3: epoll_wait waits for connections to arrive; when a connection arrives, call accept to get the connected socket
  • Step 4: register the connected socket with epoll
  • Step 5: epoll_wait waits for events to occur
  • Step 6: when the peer closes the connection, we call close (a minimal sketch of this whole flow follows below)
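The following is a heavily simplified C sketch of this flow, only meant to show where the close() call in step 6 fits. Error handling, non-blocking sockets, and real request handling are omitted; the port number and buffer size are arbitrary:

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    /* Step 1: create the listening socket, bind the port, and listen on it */
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    /* Step 2: register the listening socket with epoll */
    int epfd = epoll_create1(0);
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[64];
    char buf[4096];

    for (;;) {
        /* Steps 3 and 5: wait for new connections or for events on existing ones */
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* Step 3: accept the new connection */
                int conn_fd = accept(listen_fd, NULL, NULL);
                /* Step 4: register the connected socket with epoll */
                ev.events = EPOLLIN;
                ev.data.fd = conn_fd;
                epoll_ctl(epfd, EPOLL_CTL_ADD, conn_fd, &ev);
            } else {
                ssize_t len = read(fd, buf, sizeof(buf));
                if (len <= 0) {
                    /* Step 6: the peer closed the connection (read() returns 0)
                     * or an error occurred; close our side, otherwise the
                     * connection stays in CLOSE_WAIT forever */
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                } else {
                    /* handle the request and write the response here */
                }
            }
        }
    }
}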

The possible reasons why the server did not call the close function are as follows.

The first reason: step 2 was not done, and the listening socket was never registered with epoll. In that case, when a new connection arrives the server cannot perceive the event, never obtains the connected socket, and therefore never gets a chance to call close() on it.

However, this is fairly unlikely: it is an obvious logic bug and would normally be caught early during code review.

The second reason: step 3 was not done, and accept was never called to obtain the connected sockets when new connections arrived. As a result, when a large number of clients actively disconnect, the server never gets a chance to call close() on those sockets, leaving a large number of connections on the server in the CLOSE_WAIT state.

This can happen if the server code gets stuck in some logic, or throws an exception, before it reaches the accept call.

The third reason: step 4 was not done, and the connected socket obtained via accept was never registered with epoll. As a result, when the server later receives the FIN, it cannot perceive the event and never gets a chance to call close().

This can happen if the server gets stuck in some logic, or throws an exception, before registering the connected socket with epoll. I have seen a practical write-up on troubleshooting this kind of problem; if you are interested, see: Analysis of the cause of a large number of CLOSE_WAIT connections caused by non-robust Netty code.

The fourth reason: step 6 was not done. When the server noticed that the client had closed the connection, it did not call close(). This may be because the code simply forgot to handle it, or because the code got stuck in some logic, such as a deadlock, before reaching the close call.

As you can see, when a large number of CLOSE_WAIT connections appear on the server, it is usually a code problem. You then need to work through the code step by step to locate it, and the main question to answer is why the server never called close.
