A Structured Summary of the TCP/IP Knowledge System

1. TCP Knowledge System

We analyze the TCP knowledge system for server development from three dimensions: performance rules, design principles, and pitfall avoidance.

2. Performance Rules

The performance rules can be roughly summarized as follows:

1. Reduce data transfer

As Zuo Erduo's article "How Programmers Can Use Technology to Make Money" also argues, reducing data transfer is very important for performance.
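One concrete way to shrink transfers is application-layer compression. Below is a minimal sketch using zlib's compress() and compressBound(); the payload string is a stand-in for real response data.

```c
/* Sketch: shrink a payload with zlib before sending it.
 * Build with: gcc compress_demo.c -lz */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void) {
    const char *payload = "imagine a large, repetitive response body here ...";
    uLong src_len = (uLong)strlen(payload);

    /* compressBound() returns the worst-case compressed size. */
    uLongf dst_len = compressBound(src_len);
    Bytef dst[dst_len];

    if (compress(dst, &dst_len, (const Bytef *)payload, src_len) != Z_OK) {
        fprintf(stderr, "compress failed\n");
        return 1;
    }
    printf("%lu bytes -> %lu bytes on the wire\n",
           (unsigned long)src_len, (unsigned long)dst_len);
    /* send(sock, dst, dst_len, 0) would follow in a real sender. */
    return 0;
}
```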

2. Set MTU according to the scenario

For intranet applications, tuning the MTU sensibly (for example, enabling jumbo frames) is a performance lever that should not be overlooked; for mobile applications, the MTU is generally set to 1492; for public-internet applications, the default of 1500 is usually kept.
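For reference, a program can read an interface's current MTU with the SIOCGIFMTU ioctl on Linux; a minimal sketch follows (the interface name "eth0" is an assumption, adjust for your host).

```c
/* Sketch: query an interface's MTU via SIOCGIFMTU (Linux). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct ifreq ifr;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);  /* assumed interface */

    if (ioctl(fd, SIOCGIFMTU, &ifr) == 0)
        printf("MTU of %s: %d\n", ifr.ifr_name, ifr.ifr_mtu);
    else
        perror("SIOCGIFMTU");

    close(fd);
    return 0;
}
```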

3. Utilizing TCP offload

For applications with high bandwidth consumption, consider using TCP offload (e.g., TSO) to shift segmentation work from the CPU to the NIC and improve performance.
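Whether offload features such as TSO are active is usually checked with `ethtool -k`; the sketch below performs the same query programmatically via the SIOCETHTOOL ioctl. This is Linux-specific, and the interface name is an assumption.

```c
/* Sketch: check whether TCP segmentation offload (TSO) is enabled,
 * the programmatic equivalent of `ethtool -k eth0`. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct ifreq ifr;
    struct ethtool_value eval;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);  /* assumed interface */
    eval.cmd = ETHTOOL_GTSO;                      /* query TSO state */
    ifr.ifr_data = (char *)&eval;

    if (ioctl(fd, SIOCETHTOOL, &ifr) == 0)
        printf("TSO on %s: %s\n", ifr.ifr_name, eval.data ? "on" : "off");
    else
        perror("SIOCETHTOOL");

    close(fd);
    return 0;
}
```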

4. TCP NODELAY

For server programs, it is now generally recommended to enable TCP_NODELAY (i.e., disable Nagle's algorithm). If small packets need to be merged, do the merging at the application layer instead.

For details, see https://en.wikipedia.org/wiki/Nagle%27s_algorithm
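A minimal sketch of enabling the option on a connected socket:

```c
/* Sketch: disable Nagle's algorithm on a connected TCP socket. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int set_nodelay(int sock) {
    int one = 1;
    /* TCP_NODELAY sends small writes immediately instead of waiting
     * to coalesce them; if batching is needed, merge small messages
     * in the application layer instead. */
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```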

5. Use appropriate congestion control algorithms

Consider what happens to a data packet as it passes through a router queue. There are three cases: in the ideal case, the packet arrives at the router and is forwarded immediately without queuing; in the second case, it has to wait in the queue for a while before it can be sent out; in the third case, the router's queue is already full and the packet is dropped.

Sending too much data too fast can lead to the third case.

Throughput comparisons between CUBIC (the Linux default) and BBR under packet loss show that BBR can sustain throughput at loss rates of up to roughly 20%, whereas CUBIC's throughput collapses far sooner. BBR is therefore much more resistant to network jitter than CUBIC.

The fundamental reasons for BBR's advantage are:

  • It makes full use of the available bandwidth on links with a certain packet loss rate
  • It keeps router queue occupancy low, thereby reducing latency

It is generally recommended to use the BBR algorithm in situations where packet loss is not caused by network congestion, such as mobile applications.

BBR is likewise worth evaluating for scenarios with large bandwidth and long RTTs (i.e., a large bandwidth-delay product).
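On Linux, the congestion control algorithm can be selected per socket with the TCP_CONGESTION option, assuming the tcp_bbr module is loaded (the system-wide default can be set with `sysctl net.ipv4.tcp_congestion_control=bbr`). A minimal sketch:

```c
/* Sketch: select BBR for one socket via TCP_CONGESTION (Linux). */
#include <string.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int use_bbr(int sock) {
    const char *algo = "bbr";  /* requires the tcp_bbr kernel module */
    return setsockopt(sock, IPPROTO_TCP, TCP_CONGESTION,
                      algo, strlen(algo));
}
```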

6. Use REUSEPORT

For short-connection workloads (such as PHP applications), Linux's SO_REUSEPORT mechanism lets multiple processes accept on the same port, so the server does not fall behind when connection requests arrive faster than a single process can accept them. The database middleware Cetus, which we developed, uses the REUSEPORT mechanism to successfully absorb the impact of short-connection workloads.
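A minimal sketch of the pattern: each worker process creates its own listening socket with SO_REUSEPORT set before bind(), and the kernel distributes incoming connections among them.

```c
/* Sketch: a listening socket that several worker processes can share
 * via SO_REUSEPORT (Linux >= 3.9). */
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

int make_shared_listener(unsigned short port) {
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    struct sockaddr_in addr;

    /* Must be set before bind(); every worker does the same. */
    setsockopt(sock, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    if (listen(sock, 128) < 0)
        return -1;
    return sock;
}
```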

3. Design Principles

1. Avoid TCP head-of-line (HOL) blocking

Prefer multiple connections over a single connection when transferring large amounts of data: on a single connection, one lost segment blocks delivery of everything queued behind it. A sketch of striping data across connections follows.
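A minimal sketch of the idea, striping one large buffer across several already-connected sockets; the four-connection split is illustrative, and the receiver is assumed to reassemble the stripes.

```c
/* Sketch: stripe one large buffer across several connected sockets so a
 * lost segment only stalls its own stream, not the whole transfer. */
#include <stddef.h>
#include <sys/socket.h>

#define NCONN 4  /* illustrative */

void send_striped(int socks[NCONN], const char *buf, size_t len) {
    size_t chunk = len / NCONN;
    for (int i = 0; i < NCONN; i++) {
        size_t off = (size_t)i * chunk;
        size_t n = (i == NCONN - 1) ? len - off : chunk;
        send(socks[i], buf + off, n, 0);  /* sketch: ignores partial sends */
    }
}
```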

2. Transmission should be as smooth as possible without jitter

If data transmission is jittery, it can easily lead to the following problems:

  • Memory bloat
  • Unstable performance
  • Inefficient compression algorithms

When developing the database middleware Cetus, we capped the amount of data transmitted in each send. With the same compression algorithm, Cetus achieves a much better compression ratio than MySQL.
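A simplified illustration of capping how much data is handed to the kernel per send: write from a large buffer in fixed-size slices rather than in one burst. The 16 KB chunk size is illustrative.

```c
/* Sketch: send a large buffer in bounded slices to keep the outgoing
 * flow smooth instead of bursty. */
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define CHUNK 16384  /* illustrative slice size */

int send_smooth(int sock, const char *buf, size_t len) {
    size_t off = 0;
    while (off < len) {
        size_t n = (len - off > CHUNK) ? CHUNK : len - off;
        ssize_t sent = send(sock, buf + off, n, 0);
        if (sent < 0)
            return -1;  /* real code would handle EINTR/EAGAIN */
        off += (size_t)sent;
    }
    return 0;
}
```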

3. TCP stream transmission

TCP streaming is mainly used in middleware services.

Without TCP streaming, the middleware starts sending data to the client only after it has received the complete response from the server. Many database middlewares work this way, and it results in huge memory consumption inside the middleware.

With TCP streaming, the middleware forwards response data to the client as it arrives, which not only reduces latency but also reduces memory consumption (since there is no need to retain the entire response).

Server middleware programs should implement TCP streaming wherever possible; otherwise, problems such as memory explosion can occur. A minimal sketch of the idea follows.
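The streaming style in miniature: forward each chunk to the client as soon as it arrives from the server, instead of buffering the whole response first. Memory use stays bounded by the relay buffer, no matter how large the response is.

```c
/* Sketch: relay a server response to a client chunk by chunk. */
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

void relay_stream(int server_sock, int client_sock) {
    char buf[16384];
    ssize_t n;

    /* Memory consumption is capped at sizeof(buf), regardless of
     * how large the full response is. */
    while ((n = recv(server_sock, buf, sizeof(buf), 0)) > 0) {
        if (send(client_sock, buf, (size_t)n, 0) < 0)
            break;  /* sketch: real code would retry partial sends */
    }
}
```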

4. Upper-layer application pipeline mechanism

TCP itself has no pipelining mechanism, but upper-layer applications can pipeline requests to improve server throughput.

Without pipelining, the client must receive the server's response before it can send the next request.

With pipelining, the client sends multiple requests back-to-back without waiting for responses.

To TCP, request 1, request 2 and request 3 are just one continuous byte stream (and likewise the three responses); to the upper-layer application, there are still 3 distinct requests and 3 distinct responses.

Many protocols and systems use pipelining to improve throughput. HTTP/1.1 defines request pipelining (and HTTP/2 generalizes the idea with stream multiplexing), and Redis uses pipelining to raise application throughput.
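A minimal sketch of application-level pipelining over an illustrative text protocol; the GET lines are placeholders, and the response handling is deliberately naive.

```c
/* Sketch: fire all requests first, then read the responses in order —
 * one round trip of waiting instead of three. */
#include <string.h>
#include <sys/socket.h>

void pipelined_round(int sock) {
    const char *reqs[] = { "GET key1\r\n", "GET key2\r\n", "GET key3\r\n" };
    char resp[512];

    /* Send all requests without waiting for any response. */
    for (int i = 0; i < 3; i++)
        send(sock, reqs[i], strlen(reqs[i]), 0);

    /* Then drain the responses. */
    for (int i = 0; i < 3; i++)
        recv(sock, resp, sizeof(resp), 0);  /* naive: real code must parse
                                               message boundaries */
}
```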

5. Merge small data

When TCPCopy is running, intercept returns the TCP/IP headers of response packets to tcpcopy. A TCP/IP header is typically only a few dozen bytes, so performing one write per response-packet header would be very inefficient. To improve transmission efficiency, intercept merges the TCP/IP header information of several response packets and sends them together.
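The same batching idea can be expressed with writev(), which hands the kernel many small buffers in a single system call. This sketch shows the general technique with illustrative sizes, not intercept's actual code.

```c
/* Sketch: coalesce many small headers into one system call. */
#include <sys/uio.h>

#define NHDR 16      /* illustrative batch size */
#define HDR_LEN 40   /* a TCP/IP header is a few dozen bytes */

void send_batched(int sock, const char headers[NHDR][HDR_LEN]) {
    struct iovec iov[NHDR];
    for (int i = 0; i < NHDR; i++) {
        iov[i].iov_base = (void *)headers[i];
        iov[i].iov_len  = HDR_LEN;
    }
    /* One writev() instead of 16 separate write() calls. */
    writev(sock, iov, NHDR);  /* sketch: ignores short writes */
}
```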

4. Avoiding Pitfalls

1. Add a keepalive mechanism

The TCP keepalive mechanism can be used to detect whether the connection is still alive. For details, please refer to "Dealing with Reset rogue interference: TCP keepalive".
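A minimal sketch of enabling and tuning keepalive on a Linux socket; the timing values are illustrative.

```c
/* Sketch: enable TCP keepalive and tighten its timing (Linux knobs). */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int enable_keepalive(int sock) {
    int on = 1, idle = 60, intvl = 10, cnt = 3;

    /* Turn the mechanism on, then tune it: probe after 60 s of
     * silence, every 10 s thereafter, giving up after 3 failures. */
    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,  sizeof(idle));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,   sizeof(cnt));
    return 0;
}
```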

2. MTU

An MTU mismatch along the path (for example, jumbo frames enabled on some hosts but not others) is a classic source of hard-to-diagnose stalls.

Reference: https://wiki.archlinux.org/index.php/Jumbo_frames

3. Ensure that the network is unobstructed

Cloud environments, middlebox software, TCP offload, and load balancers all have problems of one kind or another. If such problems are not resolved promptly, they greatly affect application performance and make troubleshooting much harder.

We can usually track these problems down by capturing packets.

The following is a case where a bug in the load balancer itself made the network unavailable.

Because the load balancer did not strictly balance at the granularity of TCP sessions, packets belonging to the same TCP session were delivered to different machines, and the application reported request timeouts.

In the packet captures, the initial packets of the connection went to machine 180, while later packets of the same connection went to machine 176.

When a load balancer has this kind of bug, it causes great trouble for users, and the root cause is hard to pin down. At that point you either have to replace the load balancer or have the vendor fix the bug; otherwise the upper-layer application will keep reporting network timeouts and similar problems.

5. Conclusion

For server developers, only after understanding the TCP knowledge system can they develop with ease and avoid some potential pitfalls.
