[51CTO.com original article] The Global Software and Operation Technology Summit (WOT2018), hosted by 51CTO, was held in Beijing on May 18-19, 2018. The summit focused on 12 core topics, including artificial intelligence, big data, the Internet of Things, and blockchain, and brought together 60 front-line experts from China and abroad. It was a high-end technology event and a platform for top IT professionals to learn and expand their networks.

At the "High Concurrency and Real-time Processing" session on the afternoon of the 19th, Wang Xiaobo, CTO of the Tongcheng-Elong Air Ticket Business Group, delivered a keynote speech titled "Cache Governance in High Concurrency Scenarios". He discussed how to make cache better suited to high concurrency, how to use cache correctly, and how to resolve cache problems through governance. After the session, 51CTO reporters compiled the content of his speech.

Wang Xiaobo said that in high-concurrency scenarios, many people regard cache (high-speed cache memory) as a panacea that can "extend life". Wherever there is high concurrency pressure, a cache is deployed to solve the problem. Yet sometimes, even with a cache in place, the system still stalls and crashes. Is the cache technology itself at fault? No. The real cause is that cache governance has not been done well.

Take a look at the pitfalls that Tongcheng has encountered

Wang Xiaobo gave a fairly systematic account of the "pitfalls" that Tongcheng had run into. To relieve the pressure of high concurrency, Tongcheng initially chose memcache (a distributed cache system), later switched to a Redis architecture (a data structure server that can be used as a database cache), and deployed nearly 200 servers. But the situation did not improve.
The system often crashed, the scripts called by applications were messy, resources across multi-instance deployments were unbalanced, and the data was fragile and easily lost. To manage these servers, Tongcheng adopted a master-slave + keepalived mode (keepalived is software that works like a layer 3, 4, and 5 switching mechanism) and chose to upgrade gradually from single-machine Redis to clustered Redis. They soon found that once clusters were deployed in large numbers, the operations team had no workable way to maintain them. Although the clusters could be run uniformly through scripts, they were uncontrollable, and many operations procedures were liable to bring down the high-concurrency system, which directly affected the business side as a whole. "The system could crash at any time, and the operation and maintenance team was about to collapse," Wang Xiaobo recalled.

What is the crux of all these problems? Wang Xiaobo concluded that the biggest problem lies in the lack of cache usage standards among technical staff. People often forget the cache's shortcomings and remember only its advantage: it is fast. He gave an example. In one system failure summary report, a technician wrote that he had not expected Redis, which had only 30,000 lines of code in its initial state, to deliver such magical capabilities. This mindset makes programmers feel that they are holding a hammer, so everything they see looks like a nail. In other words, they want to use cache to solve every requirement they come across. Misled in this way, people began to build cache-based log collectors, cache-based countdowns, cache-based counters, and cache-based order systems. Once these features appeared, people were intoxicated by the speed and ignored the question of how to keep the cache running normally.

What are the real faults of cache? Wang Xiaobo summarized them in four points: First, over-reliance, which is the most prominent point.
Sometimes the cache is not needed, yet technicians insist on using it. Second, data persisted to disk; third, excessive capacity; and fourth, cache avalanche.

Why do these faults occur? Wang Xiaobo believes that the biggest problem is abuse, misuse, and laziness on the part of users. In addition, thousands of cache servers are operated without any usage rules, operations staff do not understand development, and developers do not understand operations. As a result, cache is used without design or control, and too many server resources are wasted. These are all common phenomena.

What kind of cache do people need, and what kind of cache governance? Wang Xiaobo believes that, from a developer's point of view, what people really want is a "magic box" that can meet all kinds of high-concurrency requirements. Put simply, developers should not need to care whether the cache is large or small, good or bad. Because developers have limited knowledge of cache technology, the greatest danger is indiscriminate use. It is worth noting that many developers overlook the fact that much of the data in a cache is not always hot data, and they do not make adequate estimates before high concurrency arrives, so bottlenecks are discovered too late, once the application is already running.

Tongcheng's "Phoenix Nirvana"

To make cache truly effective and cope with high concurrency, the Tongcheng technical team eventually developed the Phoenix solution. In the initial design, they wanted the architecture to offer developers a simple SDK on the application side. A developer only needs to declare the project and the related data scenario to receive a key. With this key, the SDK assigns the developer a new cache store on which Redis can run, and the entire scheduling platform can call it very quickly.
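The declare-a-scenario, receive-a-key flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Tongcheng's implementation: the class names `Scheduler`, `CacheStore`, and `CacheClient`, the store names, and the project and scenario strings are all hypothetical, and plain in-memory dictionaries stand in for Redis instances.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class CacheStore:
    """In-memory stand-in for one Redis-backed cache store (hypothetical)."""
    name: str
    data: dict = field(default_factory=dict)


class Scheduler:
    """Hypothetical scheduling platform: issues keys and assigns stores."""

    def __init__(self, store_names):
        self._stores = [CacheStore(n) for n in store_names]
        self._assignments = {}  # access key -> CacheStore

    def declare(self, project, scenario):
        """Register a project and data scenario; return its access key."""
        digest = hashlib.sha256(f"{project}:{scenario}".encode()).hexdigest()
        access_key = digest[:16]
        if access_key not in self._assignments:
            # Assign the store that currently serves the fewest scenarios.
            counts = {s.name: 0 for s in self._stores}
            for assigned in self._assignments.values():
                counts[assigned.name] += 1
            store = min(self._stores, key=lambda s: counts[s.name])
            self._assignments[access_key] = store
        return access_key

    def resolve(self, access_key):
        """Look up the store assigned to an access key."""
        return self._assignments[access_key]


class CacheClient:
    """Hypothetical SDK client: the developer only ever holds an access key."""

    def __init__(self, scheduler, access_key):
        self._store = scheduler.resolve(access_key)
        self._prefix = access_key  # namespaces entries per scenario

    def set(self, key, value):
        self._store.data[f"{self._prefix}:{key}"] = value

    def get(self, key, default=None):
        return self._store.data.get(f"{self._prefix}:{key}", default)


# Usage: declare a scenario, then read and write without knowing the store.
scheduler = Scheduler(["store-a", "store-b"])
key = scheduler.declare("flight-search", "fare-quotes")
client = CacheClient(scheduler, key)
client.set("PEK-SHA", 520)
print(client.get("PEK-SHA"))  # prints 520
```

The point of the design, as the talk describes it, is that the developer never chooses or addresses a cache server directly, so the platform stays free to rebalance or replace the store behind a key.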
In addition, Phoenix can monitor comprehensively, starting from the client call. More importantly, it can prevent cache collapse and scale dynamically up and down.

Later, a proxy layer was added to the Phoenix solution. The time cost of developing clients in multiple languages is too high, and upgrading the client inside an application is a major problem. Wang Xiaobo revealed that upgrading middleware embedded in applications is almost always a huge headache: once upgraded, the system has to be retested and can easily crash again. Therefore, it is better to exert control through a local cache, and disk can serve as cache for infrequently used data.

Finally, containers were added to the Phoenix solution. With container deployment in place, Tongcheng's overall monitoring, data migration, and scaling scheduling became more flexible and easier to operate, through multiple small clusters plus single nodes, cluster division by scenario, and real-time balanced scheduling of data. Taking data migration as an example, Tongcheng has developed a complete migration system that covers traffic scaling and data scaling, both vertical and horizontal, all with a relatively good level of fully automatic processing.

The above content was compiled by a 51CTO reporter based on the interview with Wang Xiaobo, CTO of the Tongcheng-Elong Air Ticket Business Group, at the WOT2018 Global Software and Operation Technology Summit. For more information about WOT, please visit .com.

[51CTO original article, please indicate the original author and source as 51CTO.com when reprinting on partner sites]