Cache + HASH = high concurrency? You think the high concurrency architecture is too simple!

[51CTO.com original article] In the Internet era, high concurrency, like high availability, has become a standard feature of systems. If a system cannot claim a queries-per-second (QPS) figure above 10,000, its owners are almost too embarrassed to mention it (even though the actual number of calls per day may not exceed 100). During the Double 11 shopping festival in particular, e-commerce companies enthusiastically share the technical architectures with which they withstood the flood of traffic, and almost all of them boil down to cache + hash (HASH), as if that were the core technology of high concurrency. Of course, if you take that at face value, you are not far from falling into a pit.

Cache + hash = high concurrency?

As the saying goes, knowing yourself and your enemy ensures victory in a hundred battles. Let’s first take a look at the high concurrency technologies that we often see.

Resource staticization

The flash-sale page for a promotional event is a standard high-concurrency scenario. During the event, a single page receives enormous traffic, with QPS reaching hundreds of thousands or even millions. The core solution is staticization, backed by machines and bandwidth. If no CDN is deployed, all traffic falls on the same IP; the usual answer is to serve static files with Nginx, and the capacity of a single machine then depends mainly on its bandwidth and performance. If there is more traffic, LVS (or F5) plus a cluster is used. For most mid- and back-end applications, Nginx is enough, and at the core it is still caching technology.

Nowadays, caching tools such as memcache and Redis (and other KV stores) are very mature as caches, can generally be deployed as clusters, and are very simple to operate; they have become almost synonymous with high concurrency.
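
To make the idea concrete, here is a minimal in-process sketch of such a KV cache in Go (a stand-in for the memcache/redis idea, not a client for either; all names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// kvCache is a concurrency-safe in-process KV cache: the same
// get-by-key idea that memcache/redis offer as a network service.
type kvCache struct {
	mu   sync.RWMutex
	data map[string]string
}

func newKVCache() *kvCache {
	return &kvCache{data: make(map[string]string)}
}

func (c *kvCache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

func (c *kvCache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

func main() {
	c := newKVCache()
	c.Set("hot-item:1001", "precomputed page fragment")
	if v, ok := c.Get("hot-item:1001"); ok {
		fmt.Println(v)
	}
}
```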

Read-write separation and database/table sharding

Read-write separation is another high-concurrency architecture we often see. Because reads generally far outnumber writes, the master database handles writes while the slave databases serve reads, which immediately improves the database's concurrency. If that is still not enough, database and table sharding can distribute the data across the machines of many databases, further reducing the pressure on any single machine and thereby achieving high concurrency.

When sharding databases and tables, hashing is often used: a chosen field is hashed to decide which database and table a row lands in. With read-write separation, hashing is likewise used to spread read operations across different machines and reduce the load on any single one. Whenever big data or high concurrency is involved, hashing seems indispensable: random insertion is O(1), random lookup is O(1), and apart from consuming some extra space, a hash table has almost no drawbacks. In this era of cheap memory, hash tables have become a standard feature of high-concurrency systems.
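
A minimal sketch of hash-based sharding in Go, assuming the sharded field is a user ID and the shard count is fixed (both assumptions are illustrative):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const numShards = 8 // illustrative: 8 database shards

// shardFor hashes the sharding key (here, a user ID) and maps it
// to one of numShards databases.
func shardFor(userID string) int {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return int(h.Sum32() % numShards)
}

func main() {
	for _, id := range []string{"user-1001", "user-1002", "user-1003"} {
		fmt.Printf("%s -> db_%d\n", id, shardFor(id))
	}
}
```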

During e-commerce promotions, tens of millions of people visit the site at the same time without bringing it down, which leads many to believe that this is all high concurrency amounts to. Generalizing the scenario, any concurrent access whose data can be prepared in advance can be handled with cache + hash. For a site like 12306, whose data cannot be prepared in advance, cache + hash can still improve the user experience, with the actual service provided asynchronously (the much-complained-about CAPTCHA is, in fact, part of an asynchronous queuing scheme).
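
A minimal sketch of that queuing idea in Go, assuming a bounded channel as the queue and a small worker pool (both are illustrative choices, not 12306's actual design):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// request represents one user's purchase attempt.
type request struct{ userID string }

func main() {
	queue := make(chan request, 100) // bounded queue: excess requests wait in line
	var wg sync.WaitGroup

	// A small worker pool drains the queue at a rate the backend can sustain.
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for req := range queue {
				time.Sleep(10 * time.Millisecond) // stand-in for the real (slow) processing
				fmt.Println("processed", req.userID)
			}
		}()
	}

	// Front end: enqueue quickly and immediately tell the user "you are in line".
	for i := 0; i < 10; i++ {
		queue <- request{userID: fmt.Sprintf("user-%d", i)}
	}
	close(queue)
	wg.Wait()
}
```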

Is this actually the case? Obviously it is not that simple. Let's take a look at the elements that a highly concurrent system really needs to consider.

Reasonable data structure

Everyone is familiar with the search-suggestion feature. In a large search system like Google or Baidu, or an e-commerce system like JD.com or Taobao, the suggestion service is called several times more often than the search service itself, because every keystroke triggers one call to it. This counts as a standard high-concurrency system, right? So how is it implemented?

Many people immediately think of a cache + hash scheme: store the suggestion terms in a Redis cluster, and for each incoming request look up the key directly and return the corresponding value. It costs some memory, but trading space for time seems acceptable. In practice, however, nobody does it this way.

This kind of search suggestion is usually implemented with a trie, which does not consume much memory and answers a lookup in O(k), where k is the length of the input string. That looks worse than a hash table's O(1), but it avoids network overhead, saves a great deal of memory, and in practice is not much slower than cache + hash. A core data structure suited to the scenario at hand is the key to a high-concurrency system. Cache + hash can also be regarded as a data structure, but it is not the right one for every high-concurrency scenario. The key to designing a high-concurrency system therefore lies in designing a reasonable data structure, not in copying an architecture.
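
A minimal trie sketch in Go for prefix suggestions (assuming byte-keyed children and a result limit; all names are illustrative):

```go
package main

import "fmt"

// node is one trie node; children are keyed by byte for simplicity.
type node struct {
	children map[byte]*node
	isWord   bool
}

func newNode() *node { return &node{children: make(map[byte]*node)} }

type trie struct{ root *node }

func newTrie() *trie { return &trie{root: newNode()} }

// Insert is O(k) in the key length k.
func (t *trie) Insert(word string) {
	n := t.root
	for i := 0; i < len(word); i++ {
		c := word[i]
		if n.children[c] == nil {
			n.children[c] = newNode()
		}
		n = n.children[c]
	}
	n.isWord = true
}

// Suggest walks the prefix in O(k), then collects stored words below it.
func (t *trie) Suggest(prefix string, limit int) []string {
	n := t.root
	for i := 0; i < len(prefix); i++ {
		if n = n.children[prefix[i]]; n == nil {
			return nil
		}
	}
	var out []string
	var dfs func(n *node, word string)
	dfs = func(n *node, word string) {
		if len(out) >= limit {
			return
		}
		if n.isWord {
			out = append(out, word)
		}
		for c, child := range n.children {
			dfs(child, word+string(c))
		}
	}
	dfs(n, prefix)
	return out
}

func main() {
	t := newTrie()
	for _, w := range []string{"iphone", "ipad", "ipod", "imac"} {
		t.Insert(w)
	}
	fmt.Println(t.Suggest("ip", 10)) // e.g. [iphone ipad ipod] (map order varies)
}
```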

Continuously optimize code performance

With the data structure above, the system is designed and put online; it runs acceptably, but feels as though it has not reached its limit. At this point, do not reach straight for external tools (such as a cache) to improve performance. What you should do is keep optimizing the code itself: go over it again and again, find the spots where performance can be improved, and improve them. When you designed the system, you could already estimate its theoretical concurrency. Take the search suggestion above: if the average query is 6 characters, one lookup takes about 6 trie steps, or roughly 2-3 milliseconds. On an 8-core machine with multi-threaded code, that works out to about 3,200 requests per second (1000 ms ÷ 2.5 ms × 8). If the system falls short of that, there must be a problem in the code.

At this stage you may need some tools. The go tool pprof that ships with Golang works very well for locating performance hot spots.
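
A minimal sketch of wiring pprof into an HTTP service; net/http/pprof is the standard-library package, while the /suggest handler is illustrative:

```go
package main

import (
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	http.HandleFunc("/suggest", func(w http.ResponseWriter, r *http.Request) {
		// Illustrative handler: the real service would query the trie here.
		w.Write([]byte("ok"))
	})
	// While a stress test runs against the service, profile it with:
	//   go tool pprof http://localhost:8080/debug/pprof/profile?seconds=30
	http.ListenAndServe(":8080", nil)
}
```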

Alternatively, you can print out the time of each module, perform stress testing to see which module is time-consuming, and then carefully check the code of that module to optimize the algorithm and data structure.

This is a long process, and it is the one described in "Refactoring: Improving the Design of Existing Code": an excellent system needs continuous optimization and refactoring at the code level. Building a high-concurrency system therefore means continuously optimizing the code's performance, steadily approaching the theoretical value computed at design time.

Consider external general methods

If both of the above are done and concurrency has essentially reached the theoretical value, yet still needs to improve, then consider external, general-purpose methods, such as adding an LRU cache to bring hot-word queries down to O(1) and squeeze out further performance. Do not use such external methods before the system's own performance has been fully squeezed, because once they are applied there is little room left for further optimization.
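
A minimal LRU-cache sketch built on the standard library's container/list (the capacity and key names are illustrative):

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key, value string
}

// lru is a fixed-capacity least-recently-used cache; Get and Put are O(1).
type lru struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *lru) Get(key string) (string, bool) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el) // mark as most recently used
		return el.Value.(*entry).value, true
	}
	return "", false
}

func (c *lru) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity { // evict the least recently used entry
		back := c.order.Back()
		delete(c.items, back.Value.(*entry).key)
		c.order.Remove(back)
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
}

func main() {
	c := newLRU(2)
	c.Put("iphone", "suggestions-for-iphone")
	c.Put("ipad", "suggestions-for-ipad")
	c.Get("iphone")                       // touch "iphone" so "ipad" becomes LRU
	c.Put("imac", "suggestions-for-imac") // evicts "ipad"
	_, ok := c.Get("ipad")
	fmt.Println(ok) // false
}
```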

In addition, operations techniques should be considered, such as conventional load balancing and cluster deployment, which raise a service's concurrency through operations and deployment rather than through code.

51CTO’s opinion

In fact, code is the key to high availability: the robustness of the code determines how available the system is. Beyond that, attention must be paid to data-structure design and the ability to tune code. Many people sneer at data structures, thinking they hardly matter in modern development, but for back-end development they are a very important skill.

Find the right data structure and keep optimizing the code: this improves system performance in a controllable way and leaves room for further optimization and ever better concurrency. If external caching is brought in from the very beginning and performance then falls short, there will be little room left to optimize, because an external system is difficult to modify.

[51CTO original article, please indicate the original author and source as 51CTO.com when reprinting on partner sites]
