In the previous article we covered the five network I/O models defined by the Unix standards, along with their core differences and respective trade-offs. The I/O multiplexing model in particular shines in high-concurrency scenarios, and it is the model Netty uses. So how does Netty implement I/O multiplexing? Netty is, at heart, a wrapper framework: its essence is the Java NIO package (NIO here meaning "New I/O", though people often read it as "Non-blocking I/O"), which in turn relies on I/O multiplexing. This article therefore serves double duty as prerequisite knowledge and as a high-frequency interview topic (insert joking dog-head meme here). It should take about 5 minutes to read and focuses on answering two questions: how the classic multiplexing mechanisms (select, poll, epoll) are implemented, and how the three compare.
1. Implementation of the I/O multiplexing model

This is the diagram we used in the previous article on I/O multiplexing; let's use it to review the model. The I/O streams (file descriptors) of multiple connections can be registered with a multiplexer (selector); a single process then calls select, which monitors all of the registered I/O at once.

An analogy: in the BIO model, one teacher (application process/thread) can handle only one student (I/O stream) at a time, so 10 students need 10 teachers giving one-on-one help. In the I/O multiplexing model, we give the teacher a class monitor (the multiplexer, i.e. the Selector). The monitor watches all 10 students in the class to see who wants to ask a question; as soon as a student raises a hand, the monitor reports it to the teacher, who then handles that student's question. One teacher now suffices, and the teacher only needs to act on the monitor's reports to deal with each student's question in time.

Let's take a closer look at the three classic implementations of I/O multiplexing: select, poll, and epoll.
2. select

Linux exposes the select multiplexing mechanism to developers through the select function. What the function does is: monitor multiple file descriptors at once, by polling, for any of three kinds of I/O events: readable, writable, or exceptional condition. It returns the number of file descriptors on which events occurred, and reports which descriptors saw which kind of event through its three fd-set parameters (readfds, writefds, and errorfds, the last also called exceptfds).
Let's use the teacher/monitor/student analogy to understand this process.
Note that with select, the teacher only learns that some students have changed state; to find out which ones, he must poll the whole class roster (the fd sets) to find the students with raised hands and then deal with them. The drawbacks of select follow directly:

- Every call must copy the fd sets from user space into the kernel, which becomes expensive with many descriptors.
- Both the kernel and the caller must scan all descriptors linearly to find the ready ones, so the cost grows with the number of monitored descriptors, not the number of active ones.
- The number of descriptors per fd_set is capped by FD_SETSIZE (1024 on Linux by default).
3. poll

The implementation of poll is very similar to select's, so we will not repeat it and will only cover the differences. The main difference is how the fd set is described: poll uses an array of pollfd structures instead of the fd_set bitmaps, and the caller passes the array length explicitly, so there is no hard FD_SETSIZE-style limit on the number of descriptors. However, if the array grows very large, performance still degrades. Otherwise the two work on the same principle: all descriptors are polled and handled according to their status, so the two share essentially the same defects.

4. epoll

epoll is short for "event poll": it is implemented around events and is a Linux-specific I/O multiplexing facility. It differs substantially from select/poll in both implementation and usage:
Unlike select, which uses three separate fd_sets to represent read/write/exception I/O changes, epoll defines a single epoll_event structure as the logical encapsulation of an I/O change of any kind, called an event.

4.1 The three core functions of epoll

epoll splits what select/poll did in one call into three functions: epoll_create (create an epoll instance), epoll_ctl (add, modify, or remove monitored descriptors), and epoll_wait (wait for ready events).
4.2 How epoll is implemented

When a process calls epoll_create(), the kernel creates an eventpoll structure. Two member variables in this structure are central to how epoll works: a red-black tree (rbr) holding all monitored descriptors, and a ready list (rdlist) holding the descriptors with pending events.
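A simplified paraphrase of those two fields of struct eventpoll in fs/eventpoll.c is shown below. The field names rdllist and rbr come from the kernel source (the text above writes "rdlist"); the surrounding list and rb-tree types are reduced to stand-ins here so the sketch is self-contained, and the real structure has many more members:

```c
/* Simplified stand-ins for the kernel's list and rb-tree types. */
struct list_head { struct list_head *next, *prev; };
struct rb_node   { struct rb_node *left, *right; };
struct rb_root   { struct rb_node *rb_node; };

/* Paraphrase of the epoll-relevant fields of struct eventpoll
 * (fs/eventpoll.c). NOT the real kernel definition. */
struct eventpoll {
    struct list_head rdllist; /* ready list: fds with pending events */
    struct rb_root   rbr;     /* red-black tree of all monitored fds */
};
```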
epoll_ctl() adds a newly monitored event to the red-black tree rbr, and at the same time registers a callback with the kernel's interrupt handler for that descriptor: when the interrupt for this handle arrives, put the descriptor on the ready list. So the moment a file descriptor becomes ready, the kernel's callback mechanism activates it quickly, and the triggered event is appended by the callback to the eventpoll ready list rdlist. When epoll_wait() is called to check whether any event has occurred, it only has to check whether the rdlist list in the eventpoll object is non-empty. If there is data in the list, the corresponding ready events are copied into the events array argument of epoll_wait(), where the user can read them.
This is why epoll_wait is so efficient: a call to epoll_wait does not need to hand the whole set of connection handles to the kernel, and the kernel does not need to traverse all connections, only the ready list.

4.3 Does epoll use shared memory?

Many blogs claim that it does:
But is that really the case? The source code keeps no secrets, so let's look at it directly: see fs/eventpoll.c, https://github.com/torvalds/linux/blob/master/fs/eventpoll.c. Following the epoll_wait call chain through the source, the copy happens in put_user. So events are indeed copied from kernel space to user space, and no shared memory is used.

5. Comparing the three implementations

With the analysis above, the implementations of select/poll/epoll should be clear. Their main differences, in summary: select is capped by FD_SETSIZE and scans all descriptors linearly; poll removes the size cap via the pollfd array but still scans linearly; epoll uses per-descriptor callbacks plus a ready list, so its cost tracks the number of active descriptors rather than the number monitored. Overall, epoll outperforms select/poll. That said, when a large share of connections are active all the time, epoll_wait may not come out ahead, because the readiness callbacks fire too frequently. epoll is therefore best suited to scenarios with a large number of connections but only a small number of active ones.