Compared with Apache's synchronous blocking IO model, Nginx performs better thanks to its non-blocking, event-driven IO. Nginx is lightweight, consumes few system resources, and handles high concurrency naturally. Today we will briefly discuss the thread model of nginx. Please note that this is distinct from its process model.
Everyone should be familiar with the process model of nginx. In simple terms, it consists of one master process and multiple worker processes (the number of workers is set in the configuration). The master process does not handle requests itself; it reads the configuration and manages the workers, while each worker process accepts connections and carries out the entire cycle of request processing and response. Nginx runs in multi-process mode.

Nginx introduced multi-threading in version 1.7.11, but these threads are used only for local file operations in the aio model. The motivation is to use a non-blocking approach to improve the efficiency and concurrency of file IO. This multi-threading therefore does not mean that nginx handles proxied requests with threads (that part is done via epoll); the threads are used to serve local static files.

Several basic directives are involved here: sendfile, aio, and directio, all of which relate to local file operations. Let's look at their meanings one by one.

sendfile

This directive has the same semantics as the sendfile() system call, whose purpose is to make sending local files through a socket more efficient. The official nginx blog describes how combining the thread pool with aio can yield up to 9x the performance. The sendfile mechanism also has a more memorable name: zero copy. So how does it differ from the traditional way of reading a file and writing it to the network? Disk, network card, and memory are three different transmission media. If you read a file locally and send it through a socket, it usually goes through the following steps:
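A minimal sketch of the process model described above, as it would appear in nginx.conf (values are illustrative, not recommendations):

```nginx
# One master process manages this many worker processes;
# "auto" sets the count to the number of CPU cores.
worker_processes  auto;

events {
    # Each worker multiplexes many connections via the epoll event loop.
    use epoll;
    worker_connections  1024;
}
```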
As can be seen, the transfer involves multiple data copies, a limitation imposed by the design of the operating system. The main purpose of sendfile is to reduce this copying and thereby improve sending efficiency. sendfile is a Linux system-level call: the kernel sends the file data to the socket directly (the network card fetches it via DMA, direct memory access), eliminating the two copies between kernel space and user space (kernel buffer to user buffer, and user buffer back to the socket buffer). The sendfile_max_chunk parameter limits the maximum amount of data transferred by each sendfile() call; without a limit, one fast connection could monopolize the entire worker process. The default has historically been unlimited, which is rather overbearing (newer nginx versions, 1.21.4 and later, default to 2m). For nginx, serving static local files (usually small files) this way is very efficient. It is recommended to enable sendfile for static content such as HTML and images.
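A hedged sketch of enabling the two directives just discussed; the chunk size here is illustrative, not a tuned value:

```nginx
http {
    # Serve local files with the zero-copy sendfile() path.
    sendfile            on;

    # Cap the data sent per sendfile() call so a single fast client
    # cannot monopolize a worker process.
    sendfile_max_chunk  512k;
}
```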
directio

This directive enables the use of the O_DIRECT flag (FreeBSD, Linux), the F_NOCACHE flag (macOS), or the directio() function (Solaris). It is aimed at large files, whereas sendfile is aimed at small files. You specify a size threshold with directio; files exceeding that size are read with direct IO instead of sendfile. By design, directio shares the basic idea of sendfile but bypasses the kernel page cache, transferring data directly via DMA, and any page-aligned memory used is released after the transfer. directio is therefore suited to large files that are read infrequently: for high-frequency reads it does not improve efficiency, because nothing is cached for reuse and every read goes through DMA again. Due to this trade-off, the directive defaults to off.
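A sketch of the split described above: small files take the sendfile path, large files bypass the page cache. The location path and the 4m threshold are illustrative assumptions:

```nginx
location /video/ {
    # Files at or above 4m are read with O_DIRECT (no page cache);
    # smaller files are still served via sendfile.
    sendfile   on;
    directio   4m;
}
```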
aio

The semantics here are what the name suggests: asynchronous file IO. Nginx turns this feature off by default, and it requires a sufficiently recent Linux kernel (2.6.22+). On Linux, directio can only read blocks aligned on 512-byte boundaries; the unaligned tail of a file is read in blocking mode, and if the beginning of the file is unaligned, the entire file is read in a blocking manner. Alignment here refers to the position of the file data within memory pages.

When both aio and sendfile are enabled, the aio mechanism is used for files larger than the directio threshold, while files smaller than that threshold are sent directly with sendfile (aio is not involved). In simple terms, aio uses multi-threaded asynchronous reads for large files to improve IO efficiency, but in practice there may be no improvement: reading large files cannot take advantage of the cache and is inherently time-consuming, so even with multiple threads, request latency is unpredictable, especially under high concurrency. What aio certainly does improve is IO concurrency.

Multi-threading is disabled by default and must be enabled at build time with the --with-threads configure option. This feature is only available on platforms that support epoll or kqueue. The thread pool is declared with the thread_pool directive and referenced in the aio directive, further enriching our configuration file.
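The pieces above can be combined as follows. The pool name, location path, and directio threshold are illustrative; threads=32 and max_queue=65536 are the documented defaults for thread_pool:

```nginx
# Declared at the main (top) level of nginx.conf.
thread_pool default threads=32 max_queue=65536;

http {
    server {
        location /downloads/ {
            sendfile   on;

            # Offload file reads to the "default" thread pool.
            aio        threads=default;

            # With both enabled: files >= 8m go through aio + O_DIRECT,
            # smaller files are sent with sendfile.
            directio   8m;
        }
    }
}
```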
When all threads in the pool are busy, new tasks are placed in a waiting queue. The queue size is set with the max_queue parameter of thread_pool and defaults to 65536; once the queue is full, further requests fail with an error.

Nginx officially claims that multi-threaded mode improves performance by 9x in the aio file-reading scenario, but I still have some doubts about that benchmark. Multi-threading plus aio can indeed improve file IO read performance to a certain extent, but for large files the gain may not be as large as hoped, constrained as it is by the inherent characteristics of the underlying Linux platform, unless nginx itself performs additional caching of file data. For now, xjjdog offers the following suggestions (for reference):