1. Background: The Battle of Benchmarks

"If you don't agree, run a benchmark" has become a running joke in the smartphone industry, but honestly, benchmarking is one of the most important evaluation methods in operating-system work. The Linux kernel community, for example, routinely uses benchmark scores to judge the value of an optimization patch, and outlets such as Phoronix are devoted specifically to Linux benchmarking. One more point worth making up front: achieving high benchmark scores is a demonstration of real engineering strength, built on a deep understanding of the kernel.

The story in this article began with a routine performance analysis. While evaluating tuned, an automated performance-tuning tool, we found that its server profile makes only minor changes to the Linux kernel scheduler parameters, yet those changes dramatically improve hackbench scores. Interesting, isn't it? Let's find out why together.

2. Related Knowledge

2.1 The CFS Scheduler

Most threads and processes in Linux (roughly speaking, all except real-time tasks) are scheduled by the Completely Fair Scheduler (CFS), one of the core components of the kernel. (In Linux, threads and processes differ only slightly, so "process" is used throughout.) At the heart of CFS is a red-black tree, which tracks the accumulated runtime of the processes in the system and serves as the basis for selecting the next process to run. CFS also supports priorities, group scheduling (built on the well-known cgroup mechanism), bandwidth throttling, and other features to meet a variety of advanced requirements.

2.2 Hackbench

Hackbench is a stress-testing tool for the Linux kernel scheduler.
Its main job is to create a specified number of scheduling-entity pairs (threads or processes), have them pass data through sockets or pipes, and measure the time the entire run takes.

2.3 CFS Scheduler Parameters

This article focuses on the following two parameters, both important factors in hackbench performance. System administrators can set them with the sysctl command.

Minimum granularity: kernel.sched_min_granularity_ns

Modifying kernel.sched_min_granularity_ns affects the length of the CFS scheduling period. If kernel.sched_min_granularity_ns = m, then when the system has a large number of runnable processes, the larger m is, the longer the CFS scheduling period becomes. As Figure 1 shows, each process may run on the CPU for a different amount of time, but sched_min_granularity_ns guarantees a minimum running time for each process (at equal priority). The larger sched_min_granularity_ns is, the longer each process can run at a stretch.

Figure 1: sched_min_granularity_ns diagram

Wake-up preemption granularity: kernel.sched_wakeup_granularity_ns

kernel.sched_wakeup_granularity_ns keeps newly awakened processes from preempting the running process too frequently. The larger it is, the less often awakened processes preempt. As Figure 2 shows, three processes, process-{1,2,3}, are awakened. Process-3's runtime is greater than that of curr (the process currently running on the CPU), so it cannot preempt. Process-2's runtime is less than curr's, but the difference is smaller than sched_wakeup_granularity_ns, so it cannot preempt either. Only process-1 can preempt curr. Therefore, the smaller sched_wakeup_granularity_ns is, the faster a process responds after being awakened (the shorter its wait).

Figure 2: sched_wakeup_granularity_ns diagram

3. Hackbench Working Modes
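As a rough illustration of how these two knobs behave, here is a simplified Python model. This is my own sketch, not kernel code: the real kernel scales both values by task weight via calc_delta_fair, which is ignored here, and the default values are illustrative.

```python
# Simplified model of two CFS decisions influenced by the sysctl knobs
# discussed above. All values are in nanoseconds; defaults are illustrative.

SCHED_LATENCY_NS = 24_000_000        # kernel.sched_latency_ns (assumed)
MIN_GRANULARITY_NS = 3_000_000       # kernel.sched_min_granularity_ns (m)
WAKEUP_GRANULARITY_NS = 4_000_000    # kernel.sched_wakeup_granularity_ns (w)

def sched_period(nr_running: int,
                 latency: int = SCHED_LATENCY_NS,
                 min_gran: int = MIN_GRANULARITY_NS) -> int:
    """Mimics the kernel's __sched_period(): with many runnable tasks the
    period stretches so that every task still gets at least min_gran of
    CPU time -- which is why a larger m means a longer period."""
    nr_latency = latency // min_gran
    if nr_running > nr_latency:
        return nr_running * min_gran
    return latency

def wakeup_preempts(curr_vruntime: int, wakee_vruntime: int,
                    wakeup_gran: int = WAKEUP_GRANULARITY_NS) -> bool:
    """Mimics wakeup_preempt_entity() without weight scaling: the wakee
    preempts only if it lags behind curr by more than the wake-up
    granularity -- a larger w means less frequent preemption."""
    return curr_vruntime - wakee_vruntime > wakeup_gran
```

With these defaults, 16 runnable tasks stretch the period to 16 x 3 ms = 48 ms, and a wakee whose runtime trails curr's by only 2 ms (less than w = 4 ms) does not preempt, matching process-2 in Figure 2.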
Hackbench runs in either process mode or thread mode; the main difference is whether the test creates processes or threads. The description below uses thread mode. Hackbench creates an even number of threads, split into two kinds: senders and receivers.

Figure 3: Hackbench working mode

Active context switching: when there is no data in the buffer, the receiver blocks and voluntarily yields the CPU, going to sleep.

4. Sources of Hackbench's Performance Sensitivity

In the hackbench-socket test, tuned modified the two CFS parameters sched_min_granularity_ns and sched_wakeup_granularity_ns, which produced significant performance differences. Next, we adjust these two scheduling parameters ourselves for a more in-depth analysis.

5. Dual-Parameter Tuning

Note: for brevity, m denotes kernel.sched_min_granularity_ns and w denotes kernel.sched_wakeup_granularity_ns below.

To explore how the two parameters jointly affect the scheduler, we fix one parameter at a time, study how varying the other changes performance, and use systems knowledge to explain the mechanism behind what we observe.

5.1 Fixing sched_wakeup_granularity_ns

Figure 4: Fix w, adjust m

In the figure above, with the parameter w fixed, the curves divide into three regions according to how they change with m: region A (1 ms~4 ms), region B (4 ms~17 ms), and region C (17 ms~30 ms). In region A, all four curves fall rapidly; in region B, they oscillate with large fluctuations; finally, in region C, they level off. From the background in Section 2, we know that m affects how long a process runs, which means it also affects the process's "passive context switches".
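The sender/receiver pattern described above can be reproduced with a toy Python pair. This is a sketch of a single hackbench pair under thread mode, not the real benchmark; the function and parameter names are my own.

```python
import socket
import threading
import time

def run_pair(messages: int = 1000, size: int = 100) -> tuple[int, float]:
    """One toy sender/receiver pair: the sender pushes fixed-size messages
    through a socketpair; the receiver blocks whenever the buffer is empty,
    which is the 'active context switch' described in the text."""
    rx, tx = socket.socketpair()
    payload = b"x" * size

    def sender() -> None:
        for _ in range(messages):
            tx.sendall(payload)
        tx.close()                  # EOF tells the receiver to stop

    t = threading.Thread(target=sender)
    start = time.perf_counter()
    t.start()
    total = 0
    while True:
        data = rx.recv(65536)       # blocks -> voluntarily yields the CPU
        if not data:
            break
        total += len(data)
    t.join()
    rx.close()
    return total, time.perf_counter() - start
```

run_pair(1000, 100) should report 100,000 bytes received; hackbench itself simply multiplies this pattern across many pairs and reports the total wall-clock time.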
In region A, preemption is too frequent, and most preemptions are pointless because the peer has no data to write or no buffer space available, which causes a large number of redundant "active context switches". Here, a larger w gives the sender/receiver more time to write or consume data, reducing the peer process's pointless "active context switches".

5.2 Fixing sched_min_granularity_ns

Figure 5: Fix m, adjust w

In the figure above, with the parameter m fixed, the curves again divide into three regions. In region A, the same phenomenon as in Figure 4 appears: curves with a larger m are barely affected by w, while those with a smaller m perform better as w increases.

5.3 Performance Trend Overview

Below is a heat-map overview of the experimental data, showing the interplay between m and w at a glance, for readers who want to dig further. Its three regions differ slightly from those in Figures 4 and 5.

Figure 6: Overview

5.4 Optimal Parameter Pair (for hackbench)

From the analysis in the two sections above, for workloads dominated by "active context switching", such as hackbench, a larger m (for example, 15~20 ms) is a good choice.

6. Reflections and Extensions

In desktop scenarios, applications tend to be more interactive, and their quality of service shows up mainly in how quickly they respond to user actions. A smaller sched_wakeup_granularity_ns can therefore be chosen to improve application interactivity.