Want to handle tens of millions of traffic? You should do this!

Your boss asks you to handle tens of millions of requests. How do you design the architecture? First of all, we need to establish some principles for the architecture design.

1. Achieving High Concurrency

Service splitting: Split the entire project into multiple sub-projects or modules, divide and conquer, and expand the project horizontally.

Service orientation: solve service registration and discovery once service calls become complex.

Message queue: decoupling and asynchronous processing.

Cache: the extra concurrency that various caches can absorb.

2. Achieving High Availability

Clustering, rate limiting, and service degradation.

3. Business Design

Idempotence: a single request and multiple identical requests for the same operation produce the same result, with no side effects from repeated clicks, just as the number 1 in mathematics stays 1 no matter how many times it is raised to a power. The simplest example is payment. A user buys a product and the payment is deducted successfully, but the network fails while the result is being returned. The money has already been deducted, yet the user clicks the button again; a second deduction happens and returns success. The user then checks the balance, finds too much money deducted, and sees two transaction records.
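The payment scenario above can be sketched in a few lines. This is a minimal, illustrative example, not the article's implementation: the class and method names (`PaymentService`, `deduct`, `requestId`) are assumptions, and a real system would persist the deduplication record rather than keep it in memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PaymentService {
    // Remembers the result of each request ID so that retries return the
    // original outcome instead of deducting again.
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    public String deduct(String requestId, int amountCents) {
        // putIfAbsent makes check-and-record atomic: only the first call
        // with a given requestId performs the deduction.
        String previous = processed.putIfAbsent(requestId, "DEDUCTED " + amountCents);
        if (previous != null) {
            return previous; // repeated click: same result, no second deduction
        }
        return "DEDUCTED " + amountCents;
    }

    public static void main(String[] args) {
        PaymentService service = new PaymentService();
        String first = service.deduct("order-42", 500);
        String retry = service.deduct("order-42", 500); // user clicks again
        System.out.println(first.equals(retry)); // prints true: no extra side effect
    }
}
```

The key design choice is that the idempotency key (`requestId`) is generated once on the client, so a network timeout followed by a retry carries the same key and maps onto the same stored result.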

Anti-duplicate submission: prevent the same data from being submitted repeatedly.

Besides business-level checks and disabling a button once it has been clicked, duplicate submissions can also be prevented on the server side:

Generate a unique random token on the server side and save it in the current user's Session. Send the token to the client's form and store it there in a hidden field. When the form is submitted, the token is submitted along with it, and the server checks whether the token from the client matches the one it generated. If they do not match, the submission is a duplicate and the server can ignore it. If they match, the server processes the form and then clears the token from the user's Session.

The server will refuse to process a submitted form in any of the following cases: 1) the Token stored in the Session does not match the Token submitted with the form; 2) there is no Token in the current user's Session; 3) the submitted form data contains no Token.
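The token check described above can be sketched as follows. This is a minimal sketch, not a real servlet API: the "Session" is simulated with a plain `Map`, and `FormTokenGuard`, `issueToken`, and `acceptSubmit` are illustrative names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class FormTokenGuard {

    // Step 1: generate a random token, store it in the session, and
    // return it so it can be embedded in the form's hidden field.
    public static String issueToken(Map<String, Object> session) {
        String token = UUID.randomUUID().toString();
        session.put("form_token", token);
        return token;
    }

    // Step 2: on submit, accept the form only if the client's token matches
    // the stored one, then clear the stored token so a resubmit fails.
    public static boolean acceptSubmit(Map<String, Object> session, String submittedToken) {
        Object stored = session.get("form_token");
        if (stored == null || submittedToken == null || !stored.equals(submittedToken)) {
            return false; // missing or mismatched token: refuse the request
        }
        session.remove("form_token"); // consume the token
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();
        String token = issueToken(session);
        System.out.println(acceptSubmit(session, token)); // first submit: true
        System.out.println(acceptSubmit(session, token)); // duplicate submit: false
    }
}
```

Because the token is removed the moment a submission is accepted, a second submission with the same token falls into the mismatch branch and is rejected.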

State Machine

In software design, "state machine" generally refers to the finite-state machine (FSM), also known as the finite-state automaton, or state machine for short: a mathematical model of a finite number of states together with the transitions and actions between them.
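A state machine also helps with the idempotence discussed earlier: an event is only valid from specific states, so replaying it changes nothing. The sketch below is an assumed order-lifecycle example; the states, events, and the `OrderFsm` name are illustrative, not taken from the article.

```java
public class OrderFsm {
    enum State { CREATED, PAID, SHIPPED, COMPLETED, CANCELLED }

    private State state = State.CREATED;

    // Each event is legal only from one specific state; any other
    // combination is rejected, which makes repeated events harmless.
    public synchronized boolean fire(String event) {
        switch (event) {
            case "pay":     return move(State.CREATED, State.PAID);
            case "ship":    return move(State.PAID, State.SHIPPED);
            case "receive": return move(State.SHIPPED, State.COMPLETED);
            case "cancel":  return move(State.CREATED, State.CANCELLED);
            default:        return false; // unknown event
        }
    }

    private boolean move(State from, State to) {
        if (state != from) return false; // illegal transition, ignore
        state = to;
        return true;
    }

    public static void main(String[] args) {
        OrderFsm order = new OrderFsm();
        System.out.println(order.fire("pay"));    // true: CREATED -> PAID
        System.out.println(order.fire("pay"));    // false: already PAID
        System.out.println(order.fire("cancel")); // false: only CREATED orders can cancel
    }
}
```

In a database-backed system the same idea becomes a conditional update, e.g. `UPDATE orders SET state = 'PAID' WHERE id = ? AND state = 'CREATED'`, so a duplicate payment callback affects zero rows.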

Here we will focus on the concept of rate limiting, with examples.

The purpose of rate limiting is to protect system availability by capping the rate of concurrent access or the number of requests within a time window; once the limit is reached, further requests can be denied. It is like a phone pre-sale: if you plan to sell 30,000 units, you only need to accept 30,000 user requests. The remaining requests can be filtered out with a prompt such as "The server is currently busy, please try again later."

Rate-limiting methods:

1. Limit the instantaneous number of concurrent connections: for example, at the entry layer use nginx's ngx_http_limit_conn_module to limit the number of connections from the same source IP and prevent malicious attacks.

2. Limit the total number of concurrent connections: cap overall concurrency by sizing the database connection pool and thread pools.

3. Limit the average rate within a time window: at the interface level, control concurrency by limiting the access rate of each interface.

4. Other methods: limit the call rate to remote interfaces, or limit the consumption rate of an MQ consumer.

Common rate-limiting algorithms

1. Sliding window protocol: a classic flow-control technique used to improve throughput.

The origin of the sliding window protocol:

Sliding window is a flow-control technique. In early network communication, both parties sent data directly without regard for network congestion; since nobody was aware of the congestion, everyone transmitted at once, intermediate nodes became congested and dropped packets, and no one could get data through. The sliding window mechanism was introduced to solve this: the sender and the receiver each maintain a sequence of data frames, which is called a window.

Definition: the sliding window protocol is used by TCP for flow control during network data transmission, to avoid congestion. It allows the sender to transmit multiple packets before stopping to wait for acknowledgement. Because the sender does not have to stop and wait after every packet, the protocol speeds up data transmission and improves network throughput.

Send window: the range of frame sequence numbers the sender is allowed to transmit continuously. Within it, the sender can keep sending data without waiting for acknowledgements (controlled by setting the window size).

Receive window: the range of frames the receiver is allowed to accept. Frames that fall inside the receive window must be processed by the receiver; frames outside it are discarded. The number of frames the receiver may accept at a time is called the receive window size.


Demo address: https://media.pearsoncmg.com/aw/ecskurosecompnetwork_7/cw/content/interactiveanimations/selective-repeat-protocol/index.html
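Applied to request limiting rather than to TCP frames, the same windowing idea can be sketched as a counter over the most recent time window. This is an assumed adaptation for illustration; `SlidingWindowLimiter` and its parameters are not from the article. Time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowLimiter {
    private final int maxRequests;    // allowed requests per window
    private final long windowMillis;  // window length in milliseconds
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow(long nowMillis) {
        // Slide the window: drop timestamps older than windowMillis.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxRequests) {
            return false; // window is full, reject this request
        }
        timestamps.addLast(nowMillis);
        return true;
    }

    public static void main(String[] args) {
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(2, 1000);
        System.out.println(limiter.allow(0));    // true
        System.out.println(limiter.allow(100));  // true
        System.out.println(limiter.allow(200));  // false: 2 requests already in window
        System.out.println(limiter.allow(1100)); // true: the first request slid out
    }
}
```

Unlike a fixed-interval counter, the window here slides continuously, so a burst straddling an interval boundary cannot sneak through at double the configured rate.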

2. Leaky Bucket: The leaky bucket algorithm can forcibly limit the data transmission rate.

The idea of the leaky bucket algorithm is very simple. Requests first enter the leaky bucket, and the bucket drains at a fixed rate; when the inflow is too large, the excess overflows. The leaky bucket therefore enforces a hard cap on the output rate: the input side does not need to care about the output side's speed, much like an MQ, where the producer only puts messages into the queue without worrying about whether consumers have received them.

As for the overflowing water, i.e. the filtered-out requests, they can be discarded outright or saved temporarily in some way, for example added to a queue, much like the four rejection policies a Java thread pool applies to overflow tasks.

3. Token bucket: a rate-limiting algorithm.

Many application scenarios require not only limiting the average transmission rate but also allowing a certain degree of burst. The leaky bucket algorithm is a poor fit there, and the token bucket algorithm is more suitable. Its principle: the system puts tokens into a bucket at a constant rate; a request must first take a token from the bucket before it can be processed, and when the bucket is empty, the service is denied.

Example parameters: rate = 2 (the number of tokens put into the bucket per second), bucket size = 100.

Here is a small demo of the token bucket, using Guava's RateLimiter:

    import java.io.IOException;
    import java.util.Random;
    import java.util.concurrent.CountDownLatch;

    import com.google.common.util.concurrent.RateLimiter;

    public class TokenDemo {

        // qps: queries handled per second; tps: transactions handled per second.
        // create(10) builds a limiter whose qps is 10, i.e. 10 tokens per second.
        RateLimiter rateLimiter = RateLimiter.create(10);

        public void doSomething() {
            // Try to take a token; true means a token was obtained.
            if (rateLimiter.tryAcquire()) {
                System.out.println("Normal processing");
            } else {
                System.out.println("Processing failed");
            }
        }

        public static void main(String[] args) throws IOException {
            /*
             * CountDownLatch is implemented with an internal counter, initialized
             * to the number of operations the threads will wait for.
             * A thread that must wait for those operations calls await(), which
             * puts it to sleep until they complete.
             * Each finished operation calls countDown(), reducing the internal
             * counter by 1.
             * When the counter reaches 0, all threads sleeping in await() are
             * woken up and resume their tasks.
             */
            CountDownLatch latch = new CountDownLatch(1);
            Random random = new Random(10);
            TokenDemo tokenDemo = new TokenDemo();
            for (int i = 0; i < 20; i++) {
                new Thread(() -> {
                    try {
                        latch.await();
                        Thread.sleep(random.nextInt(1000));
                        tokenDemo.doSomething();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }).start();
            }
            latch.countDown();
            System.in.read();
        }
    }

Execution result:

    Normal processing
    Normal processing
    Normal processing
    Normal processing
    Normal processing
    Processing failed
    Normal processing
    Processing failed
    Processing failed
    Processing failed
    Normal processing
    Processing failed
    Normal processing
    Processing failed
    Normal processing
    Normal processing
    Normal processing
    Normal processing
    Processing failed
    Processing failed
As you can see, when there are not enough tokens, acquiring a token fails, which achieves the rate-limiting effect.
