1. Introduction

When using Redis, we often encounter BigKey (hereinafter "big key") and HotKey (hereinafter "hot key"). If big keys and hot keys are not discovered and handled in time, they are likely to degrade service performance, harm the user experience, and even cause large-scale failures.

2. Definition of big keys and hot keys

Definitions of big keys and hot keys often appear in companies' internal Redis development and usage specifications, and in the many Redis best-practice articles on the Internet. The exact thresholds in these materials differ, but the dimensions they judge by are consistent: big keys are determined by data size and number of members, while hot keys are determined by the frequency and number of requests they receive.

1 What is a big key

We usually call a key that holds a large amount of data or a large number of members a big key. For example:

A STRING key whose value is 5 MB (the data is too large)

2 What is a hot key

When a key receives significantly more accesses than other keys, we call it a hot key. For example:

A Redis instance receives 10,000 requests per second in total, and a single key receives 7,000 of them (its access volume is significantly higher than that of other keys)

3. Problems caused by big keys and hot keys

Big keys and hot keys cause a variety of problems for Redis. The most common are performance degradation, access timeouts, and data imbalance.
1 Common problems caused by big keys

The client finds that Redis is slow.

2 Common problems caused by hot keys

Hot keys take up a large share of Redis CPU time, which degrades performance and affects other requests.

4. Causes of big keys and hot keys

Using Redis in scenarios it is not suited for, so that the value of a key becomes too large, for example using a STRING key to store large binary file data (big key).

5. Finding big keys and hot keys in Redis

Analyzing big keys and hot keys is not difficult. There are many ways to analyze the keys in Redis and find the "problem" keys, such as Redis's built-in functions, open source tools, and the key analysis function in the Alibaba Cloud Redis console.

1 Use Redis built-in functions to discover big keys and hot keys

Some of Redis's built-in commands and tools can help us find these problematic keys. When you already have a specific key in mind, you can use the following commands to analyze it.

Analyze a target key with Redis built-in commands

You can use the DEBUG OBJECT command to analyze a key. Given the name of a key, it returns a large amount of information about it, in which the value of serializedlength is the serialized length of the key. You can use this value to decide whether the key meets your criteria for a big key. Note that the serialized length of a key is not equal to the space it actually occupies in memory. In addition, DEBUG OBJECT is a debugging command with a high running cost. While it runs, all other requests to Redis are blocked until it finishes, and its running time depends on the serialized length of the key passed in. It is therefore not recommended for analyzing big keys in an online environment, where it may cause failures.
Since Redis 4.0, the MEMORY USAGE command has been available for analyzing the memory usage of a key. Its execution cost is lower than that of DEBUG OBJECT, but because its time complexity is O(N), it still carries a risk of blocking when used on big keys.

We recommend analyzing keys in a less risky way. Redis provides a command for each data structure that returns its length or number of members, as shown in the following table:

Data structure   Command   Complexity   Result
STRING           STRLEN    O(1)         Length of the value in bytes
LIST             LLEN      O(1)         Number of elements
HASH             HLEN      O(1)         Number of fields
SET              SCARD     O(1)         Number of members
ZSET             ZCARD     O(1)         Number of members
STREAM           XLEN      O(1)         Number of entries

With these built-in commands we can analyze a key conveniently and safely without affecting the online service. However, since what they return is not the actual memory usage of the key, the results are not precise and should be used only as a reference.

Discover big keys with the --bigkeys option of the official Redis client redis-cli

If you do not have a specific key in mind and want a tool to find the big keys across an entire Redis instance, the --bigkeys option of redis-cli can help. With this option, redis-cli traverses all the keys in the instance and returns a summary report. The advantages of this approach are convenience and safety, but the disadvantage is obvious: the analysis cannot be customized. --bigkeys only reports the single largest key for each of Redis's six data structures. If you want to analyze only STRING keys, or find every HASH key with more than 10 members, --bigkeys cannot help.

Many open source projects on GitHub implement an enhanced version of --bigkeys whose results can be customized through configuration. You can also combine SCAN and TYPE with the commands in the table above to implement an instance-level big key analysis tool yourself.
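The SCAN + TYPE approach described above can be sketched roughly as follows. This is a minimal illustration, not a finished tool: the function name `find_big_keys`, the `thresholds` argument, and the `scan_count` default are all hypothetical, and `client` is assumed to be a redis-py style object exposing `scan_iter`, `type`, and the per-type length methods (`strlen`, `llen`, `hlen`, `scard`, `zcard`, `xlen`).

```python
from typing import Dict, List, Tuple

# Map each Redis type name to the client method that returns its size.
# The methods correspond to STRLEN, LLEN, HLEN, SCARD, ZCARD and XLEN.
SIZE_COMMANDS: Dict[str, str] = {
    "string": "strlen",
    "list": "llen",
    "hash": "hlen",
    "set": "scard",
    "zset": "zcard",
    "stream": "xlen",
}


def find_big_keys(client, thresholds: Dict[str, int],
                  scan_count: int = 100) -> List[Tuple[object, str, int]]:
    """Return (key, type, size) for every key whose size exceeds the
    threshold configured for its type, largest first.

    `thresholds` is a hypothetical per-type limit, e.g. {"hash": 5000};
    types without an entry are skipped.
    """
    results = []
    # scan_iter walks the keyspace incrementally with SCAN instead of KEYS,
    # so the server is never blocked by one long-running command.
    for key in client.scan_iter(count=scan_count):
        key_type = client.type(key)
        if isinstance(key_type, bytes):  # clients without decode_responses
            key_type = key_type.decode()
        method = SIZE_COMMANDS.get(key_type)
        limit = thresholds.get(key_type)
        if method is None or limit is None:
            continue  # unsupported type, expired key ("none"), or no limit set
        size = getattr(client, method)(key)
        if size > limit:
            results.append((key, key_type, size))
    results.sort(key=lambda item: item[2], reverse=True)
    return results
```

With redis-py this could be called as `find_big_keys(redis.Redis(decode_responses=True), {"string": 10 * 1024, "hash": 5000})`; the limits shown are placeholders to be replaced with your own big key criteria. Because the sizes are lengths and member counts rather than memory usage, the output has the same reference-only accuracy as the commands it is built on.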
Similarly, because of how this solution works and what it returns, its results are neither precise nor real-time, so they should be used only as a reference.

Discover hot keys with the --hotkeys option of the official Redis client redis-cli

Since Redis 4.0, redis-cli has provided the --hotkeys option for instance-level hot key analysis. It reports the access frequency of the keys in the instance. Its disadvantage is that the output report cannot be customized, and the large amount of information makes analysis more complicated. In addition, this solution requires the maxmemory-policy parameter of redis-server to be set to an LFU policy (allkeys-lfu or volatile-lfu).

Locate hot keys in the business layer

Every access to Redis comes from the business layer, so we can add code there to record accesses to Redis and summarize them asynchronously. The advantage of this solution is that it detects hot keys accurately and promptly. The disadvantage is that it increases the complexity of the business code and may cost some performance.

Use the MONITOR command to find hot keys in an emergency

Redis's MONITOR command faithfully prints every request that Redis receives, including the time, client information, command, and key. In an emergency, we can run MONITOR briefly and redirect its output to a file. After stopping MONITOR, we can classify and analyze the requests in the file to find the hot keys during that period. Because MONITOR consumes some of Redis's CPU, memory, and network resources, it may make things worse for a Redis instance that is already under high pressure.
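The offline analysis of a captured MONITOR file described above might look like the following sketch. It assumes the usual MONITOR line layout of a timestamp, a bracketed database and client address, and then the quoted command and arguments, and it treats the first argument of every command as the key. That is a simplification: multi-key commands and commands whose first argument is not a key will be miscounted. The function names `count_keys` and `top_keys` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches one double-quoted token, allowing backslash escapes inside it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def count_keys(monitor_lines: Iterable[str]) -> Counter:
    """Count how often each key appears as the first argument of a command."""
    counts: Counter = Counter()
    for line in monitor_lines:
        # Drop the 'timestamp [db client]' header; lines without it
        # (such as the initial OK) are ignored.
        _, separator, rest = line.partition("] ")
        if not separator:
            continue
        tokens = _QUOTED.findall(rest)
        if len(tokens) < 2:
            continue  # command without arguments, e.g. PING
        counts[tokens[1]] += 1  # tokens[0] is the command name
    return counts


def top_keys(monitor_lines: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequently accessed keys with their counts."""
    return count_keys(monitor_lines).most_common(n)
```

For example, after capturing a few seconds of traffic with `redis-cli monitor > monitor.log`, calling `top_keys(open("monitor.log"))` lists the keys that were requested most often during the capture window.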
At the same time, this asynchronous collect-then-analyze approach has poor timeliness, and its accuracy depends on how long MONITOR runs. In most online scenarios the command cannot be run for long, so the accuracy of this solution is limited.

2 Discover big keys using open source tools

Redis is so popular that it is easy to find open source solutions to the problem we face here: obtaining accurate analysis reports without affecting online services.

Use redis-rdb-tools to find big keys in a customized way

If you want to analyze the actual memory usage of all keys in a Redis instance accurately and according to your own criteria, without affecting online services, and get a concise, easy-to-understand report afterwards, redis-rdb-tools is a very good choice. This tool performs customized analysis on Redis RDB files, and because the analysis is done offline, it does not affect online services. This is both its greatest advantage and its greatest disadvantage: offline analysis means that the results are less timely, and analyzing a large RDB file may take a long time.

3 Discover big keys and hot keys with the public cloud's Redis analysis service

If you want to analyze all the keys in a Redis instance in real time, find the existing big keys and hot keys, see which big keys and hot keys have appeared over the instance's running timeline, and form a comprehensive and accurate judgment of the running status of the whole instance, the public cloud Redis console can meet this need.

CloudDBA in the Alibaba Cloud Redis console

CloudDBA is Alibaba Cloud's intelligent database service system. It supports real-time analysis and discovery of Redis big keys and hot keys. The big key and hot key analysis is built on the key analysis function of the Alibaba Cloud Redis kernel.
This function discovers and outputs information about big keys and hot keys directly from the Redis kernel, so its results are accurate and efficient, with almost no impact on performance. You can open it by clicking "Key Analysis" in CloudDBA, as shown in Figure 1-1:

Figure 1-1: Alibaba Cloud Redis Console CloudDBA

The Key Analysis function has two pages, which analyze the keys in the Redis instance over different time dimensions:

Real-time: Starts analyzing the current instance immediately and displays all big keys and hot keys that currently exist.

6. Handling big keys and hot keys

Now that we have found the problem keys in Redis by various means, we should deal with them immediately to prevent them from causing problems later.

1 Common methods for dealing with big keys

Split the big key

For example, split a HASH key with tens of thousands of members into multiple HASH keys and keep the number of members of each key within a reasonable range. In a Redis Cluster deployment, splitting big keys can significantly improve the memory balance between nodes.

Clean up the big key

Move data that Redis is not suited for to other storage and delete it from Redis. Note that an overly large key may interrupt synchronization in a Redis cluster. Since 4.0, Redis has provided the UNLINK command, which removes the given keys without blocking and reclaims their memory gradually in the background. With UNLINK you can safely delete big keys, even very large ones.

Always monitor the Redis memory level

Big key problems that appear suddenly can catch us off guard, so discovering and handling them before they cause trouble is an important way to maintain service stability.
We can use the monitoring system to set reasonable Redis memory alarm thresholds that warn us when big keys may be forming, for example: Redis memory usage exceeds 70%, or Redis memory grows by more than 20% within 1 hour. With this kind of monitoring we can solve problems before they occur. For example, if a failure in a LIST consumer program causes the number of members in the corresponding key to keep growing, the alarm acts as an early warning that lets us avoid a failure.

Regularly clean up invalid data

For example, we may keep writing large amounts of data into a HASH structure incrementally while ignoring how long that data remains valid. The accumulated invalid data eventually turns the key into a big key. Such data can be cleaned up by scheduled tasks. In these scenarios it is recommended to use HSCAN together with HDEL, which cleans up invalid data without blocking.

Use Alibaba Cloud's Tair (Redis Enterprise Edition) service to avoid cleaning up invalid data

If you have too many HASH keys and a large number of their members are invalid and need cleaning up, the combination of many keys and a large amount of invalid data means that scheduled tasks can no longer clean up in time. Alibaba Cloud's Tair service solves this problem well.

Tair is Alibaba Cloud's enterprise edition of Redis. It has all the features of Redis (including its high performance) and provides many additional advanced features.

TairHash is a hash data structure that can set an expiration time and a version on individual fields. It supports rich data interfaces and high processing performance like Redis Hash, and it removes the limitation that a hash can only set an expiration time on the key as a whole.
This greatly improves the flexibility of the hash data structure and simplifies business development in many scenarios. TairHash uses an efficient Active Expire algorithm to check and delete expired fields without affecting response time.

Making good use of such advanced features can remove a great deal of Redis operation, maintenance, and troubleshooting work and reduce the complexity of business code, so that operations staff can devote their energy to more valuable work and developers have more time to write more valuable code.

2 Common methods for handling hot keys

Replicate hot keys in Redis Cluster

In Redis Cluster, because of migration granularity, the requests for a hot key cannot be spread across nodes, so the pressure on a single node cannot be reduced by migration alone. In this case, the hot key can be copied and the copies migrated to other nodes. For example, for the hot key foo, create three keys named foo2, foo3, and foo4 with exactly the same content, and migrate them to other nodes to relieve the hot key pressure on the single node.

The disadvantage of this solution is that the application code must be modified accordingly. The additional keys also bring a data consistency challenge: updating one key becomes updating several keys at the same time. In most cases this solution is recommended only as a temporary fix for an urgent problem.

Use a read-write separation architecture

If the hot key is caused by read requests, read-write separation is a good solution. With a read-write separation architecture, the read pressure on each Redis instance can be reduced by adding more slave nodes.
However, read-write separation increases the complexity of both the business code and the Redis cluster architecture. We need to provide a forwarding layer (such as Proxy or LVS) in front of the slave nodes for load balancing, and we must also consider the higher failure rate that comes with a much larger number of slave nodes. The change in cluster architecture makes monitoring, operation and maintenance, and fault handling more challenging.

However, all of this is extremely simple with the Alibaba Cloud Redis service, which works out of the box. When business needs change, the service also lets users adapt by adjusting the cluster architecture through configuration changes, for example: master-slave to read-write separation, read-write separation to cluster, master-slave to a cluster that supports read-write separation, or directly from the community edition to the enterprise edition of Redis (Tair), which supports many advanced features.

The read-write separation architecture also has disadvantages. With extremely large request volumes, replication delay is inevitable, and stale data may be read. It is therefore not suitable for scenarios where both read and write pressure are high and data consistency requirements are strict.

Use the QueryCache feature of Alibaba Cloud Tair

QueryCache is one of the enterprise-level features of the Alibaba Cloud Tair (Redis Enterprise Edition) service. Its principle is shown in Figure 2-1:

Figure 2-1: Tair QueryCache principle

Alibaba Cloud Database Redis identifies the hot keys in the instance using efficient sorting and statistical algorithms.
After this function is enabled, the proxy node caches hot key requests and their query results according to the configured rules (it caches only the query results for the hot key, not the entire key). When the same request arrives within the cache validity period, the proxy returns the result directly to the client without contacting the backend Redis shard. This improves read speed, reduces the performance impact of the hot key on the data shard, and avoids request skew.

At this point, identical requests from clients no longer reach the Redis backend behind the proxy, because the proxy returns the data directly. Requests for hot keys are thus moved from one Redis node to multiple proxies, which greatly reduces the hot key pressure on that Redis node. Tair's QueryCache also provides a number of commands for viewing and managing the cache, such as querycache keys to view all cached hot keys and querycache listall to obtain all cached commands.

Tair QueryCache's intelligent hot key detection and cache linkage also reduce the workload of operations and development teams.

Compared with traditional Redis synchronization middleware, the Alibaba Cloud Redis global distributed cache offers high reliability, high throughput, low latency, and highly accurate synchronization.
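To make the caching idea behind QueryCache more concrete, the sketch below shows a small read-through cache with a validity period, written as application-side Python. It is only an illustration of the principle and not how Tair implements QueryCache: the class name `HotKeyCache` and its parameters are hypothetical, it caches every key it is asked for rather than only detected hot keys, and reads can be stale for up to `ttl_seconds`.

```python
import time
from typing import Any, Callable, Dict, Tuple


class HotKeyCache:
    """Read-through cache that keeps each result for a short validity period."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader    # fetches the value from the backend, e.g. Redis
        self._ttl = ttl_seconds  # how long a cached result stays valid
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Still within the validity period: answer without the backend.
            return entry[1]
        # Missing or expired: load from the backend and remember the result.
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With redis-py, the loader could simply be the client's `get` method, for example `HotKeyCache(client.get, ttl_seconds=0.5)`. Repeated reads of the same key inside the validity period are then served locally, which is the same effect the proxy-side cache has on the backend shard.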