Introduction

With the rise of microservices and distributed systems, the call chains between services have grown increasingly complex. To keep our own services stable and highly available, we need flow control measures for requests that exceed our capacity. Think of tourist attractions during the May Day and National Day holidays: once a site is full, visitor flow has to be restricted. Likewise, our services must restrict traffic in high-concurrency, high-volume scenarios such as flash sales, big promotions, 618, Double Eleven, and in the face of malicious attacks, crawlers, and so on. Intercepting requests that exceed a service's processing capacity and capping the traffic it accepts is called service throttling, or rate limiting. Let's walk through it.

Two Rate Limiting Methods

Common rate limiting methods fall into two categories: request-based rate limiting and resource-based rate limiting.
Request-based rate limiting looks at the problem from the perspective of external requests. There are two common approaches. The first is to cap a cumulative total, that is, to set an upper bound on some accumulated metric. The most common is capping the total number of users the system serves: for example, a live-stream room is capped at 1 million viewers, and no new viewer can enter once that number is reached; or a flash sale with only 100 items caps participation at 10,000 users, and anyone beyond that is rejected outright. The second is to cap a rate, that is, to bound a metric within a time window: for example, at most 10,000 users may access the system per minute, or peak requests may not exceed 100,000 per second. A minimal total-cap sketch follows the list below.

Advantages:
Disadvantages:
Applications:
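As a concrete illustration of the total-cap approach, here is a minimal sketch in Java. The class name, capacity, and method names are all illustrative, not from the original article:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal total-cap limiter: admits at most `capacity` users in total,
// matching the flash-sale example above (cap participation at 10,000).
public class TotalCapLimiter {
    private final long capacity;
    private final AtomicLong admitted = new AtomicLong();

    public TotalCapLimiter(long capacity) {
        this.capacity = capacity;
    }

    // Returns true if the user is admitted, false once the cap is reached.
    public boolean tryAdmit() {
        while (true) {
            long current = admitted.get();
            if (current >= capacity) {
                return false; // cap reached, reject
            }
            if (admitted.compareAndSet(current, current + 1)) {
                return true; // admitted
            }
            // lost a CAS race with another thread, retry
        }
    }

    public static void main(String[] args) {
        TotalCapLimiter limiter = new TotalCapLimiter(10_000);
        System.out.println(limiter.tryAdmit()); // true until 10,000 admissions
    }
}
```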
While request-based rate limiting looks at the system from the outside, resource-based rate limiting looks at it from the inside: identify the key internal resources that affect performance and cap their usage. Common internal resources include connection counts, file handles, thread counts, and request queues. For example, new requests can be rejected once CPU usage exceeds 80% (see the sketch after this list).

Advantages:
Disadvantages:
Applications:
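A minimal sketch of the CPU-based rejection example in Java, assuming JDK 14+ for `getCpuLoad()` (older JDKs expose the equivalent `getSystemCpuLoad()`); the class and threshold are illustrative:

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

// Minimal resource-based limiter: rejects new requests when system CPU load
// exceeds a threshold. The 0.8 threshold matches the 80% example above.
public class CpuLoadLimiter {
    private final OperatingSystemMXBean os =
            (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
    private final double maxLoad;

    public CpuLoadLimiter(double maxLoad) {
        this.maxLoad = maxLoad;
    }

    public boolean allowRequest() {
        // getCpuLoad() returns 0.0..1.0, or a negative value if unavailable
        double load = os.getCpuLoad();
        return load < 0 || load < maxLoad; // fail open when load is unknown
    }

    public static void main(String[] args) {
        CpuLoadLimiter limiter = new CpuLoadLimiter(0.8);
        System.out.println(limiter.allowRequest() ? "accepted" : "rejected");
    }
}
```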
Four Rate Limiting Algorithms

There are four common rate limiting algorithms. Their implementation principles, strengths, and weaknesses differ, so in practice you should choose based on the business scenario.
The fixed time window algorithm counts the number of requests or the resource consumption within a fixed time period; once the count exceeds the limit, rate limiting kicks in, as shown in the figure below. A minimal sketch follows this list.

Advantages:
Disadvantages: the critical point problem; traffic concentrated just before and just after a window boundary falls into two different windows, so the combined short-term burst can be up to twice the configured limit.
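Here is a minimal fixed-window sketch in Java; names and parameters are illustrative:

```java
// Minimal fixed-window limiter: at most `limit` requests per `windowMillis`.
public class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window starts
            count = 0;         // reset the counter
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false; // limit reached within this window
    }
}
```

For example, `new FixedWindowLimiter(100, 1000)` allows at most 100 requests per second, but note the critical point problem above: 100 requests at the end of one window plus 100 at the start of the next gives 200 in a very short span.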
To solve the critical point problem, the sliding time window algorithm was introduced. Its idea is to let consecutive statistical periods partially overlap, so that two bursts close together in time can no longer land in two entirely separate windows, as shown in the figure below. A sketch follows this list.

Advantages:
Disadvantages:
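A minimal sliding-window sketch in Java, tracking individual request timestamps for clarity (production implementations usually use fixed sub-buckets to bound memory); names are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sliding-window limiter: at most `limit` requests in any
// `windowMillis`-long interval, regardless of window boundaries.
public class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Evict timestamps that have slid out of the window.
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```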
The leaky bucket algorithm puts incoming requests into a "bucket" (a message queue and the like), and the business processing unit (a thread, process, application, etc.) takes requests out of the bucket at its own pace. When the bucket is full, new requests are discarded, as shown in the figure below. A sketch follows this list.

Advantages:
Disadvantages:
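A minimal leaky bucket sketch in Java, using a bounded blocking queue as the bucket and a background worker as the processing unit; the drain interval and names are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal leaky bucket: requests queue up in a bounded "bucket" and a worker
// drains them at a fixed rate; when the bucket is full, requests are dropped.
public class LeakyBucket {
    private final BlockingQueue<Runnable> bucket;

    public LeakyBucket(int capacity, long drainIntervalMillis) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    bucket.take().run();               // take one request
                    Thread.sleep(drainIntervalMillis); // fixed outflow rate
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Returns false (request discarded) when the bucket is full.
    public boolean submit(Runnable request) {
        return bucket.offer(request);
    }
}
```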
The leaky bucket algorithm mainly suits scenarios with instantaneous bursts of high traffic (such as sign-ins at midnight or on-the-hour flash sales). When a flood of requests pours in within a few minutes, buffering them in the bucket, even if processing is slow, tends to yield better business results and user experience than discarding users' requests.
The token bucket algorithm differs from the leaky bucket in that what goes into the bucket is not a request but a "token", the permit required before a request may be processed. In other words, when the system receives a request it must first obtain a token from the token bucket; only with a token in hand can it proceed, and if no token is available the request is discarded. The implementation principle is shown in the figure below, and a sketch follows this list.

Advantages:
Disadvantages:
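A minimal token bucket sketch in Java, refilling tokens lazily on each call; names and parameters are illustrative:

```java
// Minimal token bucket: tokens are refilled at a steady rate up to `capacity`;
// each request must take one token, and bursts up to `capacity` are allowed.
public class TokenBucket {
    private final long capacity;       // max tokens the bucket can hold
    private final double refillPerMs;  // tokens added per millisecond
    private double tokens;
    private long lastRefill = System.currentTimeMillis();

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Lazily add the tokens accumulated since the last call.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false; // no token available, reject the request
    }
}
```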
The token bucket algorithm mainly suits two typical scenarios. One is controlling the rate of calls to a third-party service so the downstream is not overwhelmed; for example, Alipay must control the rate at which it calls bank interfaces. The other is controlling one's own processing rate to avoid overload; for example, if stress testing shows the system's maximum capacity is 100 TPS, a token bucket can cap the processing rate accordingly. For reference, a ready-made token-bucket-style limiter from the Java ecosystem is sketched below.
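Guava's `RateLimiter` (not mentioned in the original article, added here as a practical reference) provides smooth token-bucket-style limiting; this sketch assumes the Guava library is on the classpath:

```java
import com.google.common.util.concurrent.RateLimiter;

// Guava's RateLimiter hands out permits at a configured steady rate.
public class GuavaRateLimitDemo {
    public static void main(String[] args) {
        RateLimiter limiter = RateLimiter.create(100.0); // 100 permits per second

        // Blocking style: waits until a permit is available.
        limiter.acquire();

        // Non-blocking style: rejects immediately when no permit is available.
        if (limiter.tryAcquire()) {
            System.out.println("request processed");
        } else {
            System.out.println("request rejected");
        }
    }
}
```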
Five Rate Limiting Strategies

1. Reject. When request traffic reaches the rate limiting threshold, excess requests are rejected outright. The design can reject requests from particular sources, such as specified domain names, IP addresses, clients, applications, or users.
2. Queue. Excess requests are placed in a buffer or delay queue to absorb a short-term traffic surge, and the backlog is processed gradually after the peak passes.
3. Prioritize. Assign priorities to requests from different sources and process higher-priority requests first, such as those from VIP customers or critical business applications (for example, transaction services take priority over log services). A sketch combining queuing with prioritization follows.
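A minimal Java sketch of strategies 2 and 3 together: excess requests wait in a priority queue, and a worker drains higher-priority requests first. Capacity bounding is omitted for brevity (`PriorityBlockingQueue` is unbounded), and all names are illustrative:

```java
import java.util.concurrent.PriorityBlockingQueue;

// Buffering with prioritization: excess requests wait in a priority queue and
// higher-priority requests (lower number) are drained first.
public class PrioritizedRequestQueue {
    static class Request implements Comparable<Request> {
        final int priority; // 0 = highest, e.g. transactions; 9 = lowest, e.g. logs
        final Runnable task;

        Request(int priority, Runnable task) {
            this.priority = priority;
            this.task = task;
        }

        @Override
        public int compareTo(Request other) {
            return Integer.compare(this.priority, other.priority);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();

    public void submit(int priority, Runnable task) {
        queue.offer(new Request(priority, task));
    }

    // A worker calls this in a loop to drain the backlog after the peak.
    public void processNext() throws InterruptedException {
        queue.take().task.run();
    }
}
```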
4. Adjust dynamically. Monitor system metrics, evaluate the load, and adjust the rate limiting threshold at runtime through a registry, configuration center, or similar facility. A sketch follows.
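A sketch of a dynamically adjustable limiter in Java: the per-second threshold lives in a volatile field that a configuration-center listener (assumed, not shown) can update at runtime; the rest is the fixed-window logic from earlier:

```java
// Fixed-window limiter whose threshold can be changed while running,
// e.g. by a listener registered with a configuration center.
public class DynamicLimiter {
    private volatile int limitPerSecond;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public DynamicLimiter(int initialLimit) {
        this.limitPerSecond = initialLimit;
    }

    // Called by a config listener when operators change the threshold.
    public void updateLimit(int newLimit) {
        this.limitPerSecond = newLimit;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 1000) {
            windowStart = now;
            count = 0;
        }
        if (count < limitPerSecond) {
            count++;
            return true;
        }
        return false;
    }
}
```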
5. Monitor, warn, and scale out. Given a solid service monitoring system and an automated deployment pipeline, the monitoring system can track the system's running state and raise alerts by email, text message, and so on when service pressure spikes or traffic surges in a short time. When certain conditions are met, the related services can be deployed and released automatically, achieving dynamic capacity expansion.
Three Rate Limiting Positions

1. Access layer. Use Nginx, an API gateway, and the like to rate-limit by domain name or IP and intercept illegal requests.
2. Service layer. Each service can apply its own single-machine or cluster-wide rate limiting, or call a third-party rate limiting service, such as Alibaba's Sentinel framework.
3. Basic service layer. Rate limiting can also be applied at the basic service layer.
Summary

This article has surveyed service rate limiting from a macro perspective: two rate limiting methods, three positions where rate limiting can be applied, four common rate limiting algorithms, and five rate limiting strategies. One final note: configuring rate limits sensibly requires knowing the system's throughput, so rate limiting usually goes hand in hand with capacity planning and stress testing. When external requests approach or reach the system's maximum threshold, rate limiting is triggered, and other measures such as degradation are taken to keep the system from being overwhelmed.

Reference: http://www.studyofnet.com/555653372.html