One. Why do this?

1 What is slow SQL?

Here it means MySQL slow queries: SQL statements that run for longer than the long_query_time value. MySQL keeps several familiar logs: the binary log (binlog), the relay log (relaylog), the redo log (redolog) and the rollback log (undolog). Slow queries have their own log as well, the slow query log (slowlog), which records statements whose response time exceeds the threshold. Do not let the name mislead you into thinking that the slow query log only records SELECT statements. It also records DML statements such as INSERT and UPDATE whose execution time exceeds the threshold set by long_query_time.
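To make the point about DML concrete, here is a minimal sketch that reads slow query log text and lists what was recorded. The sample entries, table name and column names are made up for illustration; only the "# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ..." header layout follows the format MySQL normally writes, and real logs may carry extra fields depending on version and settings.

```python
import re

# Matches the statistics header that precedes each slow-log entry.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)"
    r"\s+Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)

# Hypothetical sample entries (values and table names are invented).
SAMPLE_LOG = """\
# Time: 2021-03-01T02:00:01.000000Z
# User@Host: app[app] @ [10.0.0.1]  Id: 11
# Query_time: 3.204511  Lock_time: 0.000120 Rows_sent: 20  Rows_examined: 1800000
SET timestamp=1614564001;
SELECT * FROM orders WHERE status = 1 ORDER BY gmt_create LIMIT 20;
# Time: 2021-03-01T02:00:05.000000Z
# User@Host: app[app] @ [10.0.0.1]  Id: 12
# Query_time: 1.508002  Lock_time: 0.000090 Rows_sent: 0  Rows_examined: 950000
SET timestamp=1614564005;
UPDATE orders SET status = 2 WHERE buyer_nick = 'demo';
"""


def parse_slow_log(text):
    """Return one dict per entry: timing stats plus the SQL statement."""
    entries, current = [], None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif current is not None and not line.startswith(("#", "SET timestamp")):
            # Statement text may span several lines; join them with spaces.
            current["sql"] = (current["sql"] + " " + line).strip()
    return entries


entries = parse_slow_log(SAMPLE_LOG)
for entry in entries:
    verb = entry["sql"].split()[0]
    print(f'{verb:<6} {entry["query_time"]:.3f}s rows_examined={entry["rows_examined"]}')
```

Both the SELECT and the UPDATE show up, because the log is keyed on execution time rather than on statement type.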
For the AliSQL-X-Cluster, or XDB, that we use, the slow query log is enabled by default and long_query_time is set to 1 second.

2 Why do slow queries cause failures?

A truly slow SQL statement usually comes with a large number of scanned rows, sorting in temporary files, or frequent disk flushes, all of which directly drive up disk IO. Normal SQL then turns slow as well, and execution timeouts spread over a wide area.

After Double 11 last year, in response to the problems exposed on the technical side, Cainiao's CTO line launched a number of special governance projects, each sponsored by a CTO-D. The large team I belong to was responsible for the special project on slow SQL governance.

Two. To what extent

1 How do we measure how severe an application's slow SQL problem is?

Micro average

sum(aone application slow SQL execution times)
-----------------------
sum(aone application SQL execution times)

In one extreme case, every SQL statement the application executes is slow and the value is 1. In the other, none of them is slow and the value is 0.

The problem with this indicator is its poor discrimination. When SQL QPS is very high and most statements are not slow, the occasional slow SQL is drowned out.

A further question is whether an occasional slow SQL statement is really slow SQL. We have met many statements in the slow query log that are not slow in regular operation and only became slow because of other slow SQL statements, MySQL disk jitter, optimizer choices and so on.

Macro average
This algorithm assumes that any slow SQL statement that gets captured has been executed a certain number of times, which reduces the impact of false slow SQL. When an application's QPS is very low, that is, it executes very few SQL statements per day, a single false slow SQL can still skew the statistics.

Execution times

sum(aone application slow SQL execution times)
-----------------------
7

Number of slow SQL templates

The dimensions above are all bounded by a time window. To trace how slow SQL has been handled over time, we also introduced the global number of slow SQL templates.

count(distinct(aone application slow SQL template))

2 Objectives

Core applications: resolve all slow SQL.

3 CTO report

Based on a weighted average of the multi-dimensional indicators above, applications are ranked from low to high with the CTO-D as the unit, and the top 3 are highlighted and reported weekly.

Three. Why me?

I guess it is related to my background. I come from C/C++, and at my previous company I was responsible for the design and implementation of the company-level multi-site active-active architecture, so I understand MySQL fairly well. It may also be because I have no conflict of interest: my small team's business has only just started and has no slow SQL of its own, so I can step into the various business lines.

Four. Action support

1 Group MySQL conventions

Excerpt from the index conventions:

[Mandatory] Do not join more than three tables. The data types of the joined fields must be exactly the same. When querying multiple tables, make sure the join fields are indexed.
Note: Even when joining only two tables, pay attention to table indexes and SQL performance.

[Mandatory] When creating an index on a varchar field, you must specify the index length. There is no need to index the whole field. Choose the index length according to how well the actual text discriminates.
Note: Index length and discrimination pull against each other. For string data, an index of length 20 generally reaches a discrimination of 90% or more. You can use count(distinct left(column name, index length))/count(*) to measure it.

[Mandatory] Left-fuzzy and full-fuzzy matching are strictly prohibited in page searches. If you need them, use a search engine instead.
Note: Index files follow the leftmost-prefix matching rule of the B-Tree. If the leftmost value is not fixed, the index cannot be used.

[Recommended] Guard against implicit conversions caused by mismatched field types, which can invalidate the index.

[Reference] Avoid the following extremes when creating indexes:
1) Better too many than too few: believing that every query needs its own index.
2) Being stingy with indexes: believing that indexes consume space and seriously slow down updates and inserts.
3) Rejecting unique indexes: believing that uniqueness must always be handled in the application layer by "check first, then insert".

2 DB change standards

DDL must control the speed of change and pay attention to gray release and concurrency control, and changes must be released within the specified change window.

Five. Some optimization cases I took part in

1 Uneven data distribution

1) Unreasonable database and table sharding

The business data is split across 8 databases with 16 tables each. Checking the table space shows that in each database almost all of the data sits in two tables. The sharding strategy is flawed and the business growth was overestimated. I have reservations about this.

2) Improper indexing

A composite index idx_logistics_corp_id_special_id was created on the single table, but even so its discrimination is still too low.
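The discrimination check from the convention, count(distinct left(column name, index length))/count(*), can also be tried offline on a sample of values before touching the database. The sketch below is a minimal illustration in Python. The sample strings and rows are invented and are not the data from this case; only the column names logistics_corp_id, special_id and transport_type_id come from the text.

```python
def prefix_selectivity(values, prefix_len):
    """Mirror count(distinct left(col, prefix_len)) / count(*) for non-NULL strings."""
    if not values:
        return 0.0
    return len({v[:prefix_len] for v in values}) / len(values)


def composite_selectivity(rows, columns):
    """Share of distinct value combinations over the given columns."""
    if not rows:
        return 0.0
    return len({tuple(row[c] for c in columns) for row in rows}) / len(rows)


# Hypothetical string column: a longer prefix discriminates better.
names = ["warehouse-east-01", "warehouse-east-02",
         "warehouse-west-01", "warehouse-west-02"]
for length in (10, 14, 17):
    print(length, prefix_selectivity(names, length))

# Hypothetical rows: compare two candidate composite indexes.
rows = [
    {"logistics_corp_id": 1, "special_id": 0, "transport_type_id": 101},
    {"logistics_corp_id": 1, "special_id": 0, "transport_type_id": 102},
    {"logistics_corp_id": 1, "special_id": 0, "transport_type_id": 103},
    {"logistics_corp_id": 2, "special_id": 0, "transport_type_id": 201},
    {"logistics_corp_id": 2, "special_id": 0, "transport_type_id": 202},
    {"logistics_corp_id": 2, "special_id": 1, "transport_type_id": 203},
]
low = composite_selectivity(rows, ["logistics_corp_id", "special_id"])
high = composite_selectivity(rows, ["logistics_corp_id", "transport_type_id"])
print(low, high)
```

A value close to 1 means almost every row is distinct under that prefix or column combination. A value close to 0 means the index would still leave many rows to scan for each lookup.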
According to experiments and business feedback, the combination of the (logistics_corp_id, transport_type_id) fields discriminates very well, and the business also has a scenario that queries by transport_type_id alone.

2 Indexing issues
The table in this case has an index on store_code, so we assumed there was no problem and nothing to optimize. The execution plan, however, showed that MySQL chose a full table scan. In practice for this case, we found that once the number of ranges in the query exceeds 200, the optimizer no longer uses the index on that field. Finally, after pulling the related query SQL from the recent period and combining it with the data distribution of the business, we found that an index on (is_deleted, quantity) solves the problem.

To determine how much of the index is used in the execution plan, check key_len against the key_len calculation formula (>=5.6.4).
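As a rough aid for reading key_len, the sketch below applies the commonly cited byte-size rules: character columns take length times bytes per character, VARCHAR adds 2 length bytes, a nullable column adds 1 byte, and from 5.6.4 on the temporal types use a smaller base size plus fractional-second bytes. The function names, the charset table and the column definitions in the example are assumptions for illustration. The types of is_deleted, quantity and store_code are not given in the text, so check the result against EXPLAIN on your own MySQL version.

```python
import math

# Bytes per character for a few common character sets (assumed subset).
CHARSET_BYTES = {"latin1": 1, "gbk": 2, "utf8": 3, "utf8mb4": 4}

# Fixed-width sizes in bytes for common non-temporal types, plus DATE.
FIXED_BYTES = {"tinyint": 1, "smallint": 2, "mediumint": 3,
               "int": 4, "bigint": 8, "date": 3}

# Base sizes of temporal types since 5.6.4; fractional seconds add more.
TEMPORAL_BYTES = {"time": 3, "timestamp": 4, "datetime": 5}


def column_key_len(col_type, length=0, charset="utf8mb4", nullable=False, fsp=0):
    """Estimate the key_len contribution of one indexed column.

    For a prefix index, pass the prefix length as `length`.
    """
    col_type = col_type.lower()
    if col_type == "char":
        size = length * CHARSET_BYTES[charset]
    elif col_type == "varchar":
        size = length * CHARSET_BYTES[charset] + 2  # 2 bytes store the length
    elif col_type in TEMPORAL_BYTES:
        # fsp 1-2 adds 1 byte, 3-4 adds 2 bytes, 5-6 adds 3 bytes
        size = TEMPORAL_BYTES[col_type] + math.ceil(fsp / 2)
    else:
        size = FIXED_BYTES[col_type]
    return size + (1 if nullable else 0)  # nullable columns need 1 flag byte


def index_key_len(columns):
    """Sum the contributions of the index columns the plan actually uses."""
    return sum(column_key_len(**col) for col in columns)


# Hypothetical column definitions for the indexes mentioned in the text.
is_deleted = {"col_type": "tinyint", "nullable": False}
quantity = {"col_type": "int", "nullable": True}
store_code = {"col_type": "varchar", "length": 32, "charset": "utf8mb4"}

print(index_key_len([is_deleted]))            # only the first column is used
print(index_key_len([is_deleted, quantity]))  # both columns are used
print(index_key_len([store_code]))
```

Comparing the key_len reported by EXPLAIN with these totals shows how many leading columns of a composite index the plan really uses: if the reported value matches only the first column, the remaining columns are not being used for the lookup.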
3 Being affected by others

Even though the index was used and the index fields are highly discriminative, the query still had to scan 20 million rows. At the same time, SQL that is normally fast turned into slow queries. After checking the DB data disk access and the other instances on the shared physical machine, we found that around the time of the problem another database had many slow SQL statements that needed sorting, and its temporary files had just written 2 GB. Multiple MySQL instance leader nodes are deployed on the same physical machine. Docker isolates CPU, MEM and other resources, but buffer IO isolation is not yet in place.

4 Unable to resolve

Slow queries met day to day can usually be solved by summarizing and analyzing the high-frequency queries and working with the business to derive suitable indexes, but this is not a panacea. For example, the number of indexes may keep growing. In some scenarios, such as combined queries over multiple fields with no mandatory field, supporting every combination through indexes is clearly unreasonable.

For query scenarios, it is good practice to make the more discriminative fields mandatory. When there are many query combinations, consider a storage system or search engine with better search support.

Six. Routine governance

As each CTO-D line dug deeper into governance, all indicators improved markedly. For example, the core applications have cleared their slow queries, and some of the slow SQL with the greatest impact has been resolved. My team's ranking has also moved from the bottom 3 at the start to the top 3.

Slow SQL governance has become routine. Slow SQL work orders are pushed every week, the owners pick them up, handle them and close them, and a habit and atmosphere of regularly clearing to zero has basically taken hold. The special project on slow SQL governance has also been praised by name many times.
Seven. Conclusion

This is a belated summary. Looking back now, I think the process of formulating the strategy, analyzing the problems and solving them is worth sharing.