Interviewer: What is your understanding of IO multiplexing?

"IO multiplexing" is a common term in network programming. Many widely used systems and frameworks rely on it, including Redis, Kafka, Netty, and Nginx. So what is IO multiplexing, which technologies implement it, and how do they differ? This article gives a brief overview.

1. What is IO multiplexing?

IO multiplexing is a technology that allows a single thread to manage multiple network connections. It enables the server to efficiently handle a large number of concurrent connections without creating a separate thread or process for each connection.

Imagine tens of thousands of clients connected at the same time. Without IO multiplexing, the server would need tens of thousands of threads, one per connection. Because only a few CPU cores are available to run them, this leads to heavy IO contention and a high cost of switching between threads.
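To make the idea concrete, here is a minimal Python sketch in which one thread serves several connections through the standard selectors module. The function name serve_all is hypothetical, and socket.socketpair() is used as a stand-in for accepted client connections; a real server would accept TCP connections instead.

```python
import selectors
import socket


def serve_all(num_clients=5):
    """Serve num_clients connections from a single thread (illustrative)."""
    selector = selectors.DefaultSelector()
    clients = []
    for i in range(num_clients):
        # Assumption: socketpair() stands in for an accepted client connection.
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ)
        client_side.sendall(f"hello from client {i}".encode())
        clients.append(client_side)

    pending = num_clients
    while pending:
        # One call waits on every registered connection at once.
        events = selector.select(timeout=1.0)
        if not events:
            break  # nothing became ready in time; stop the demo
        for key, _mask in events:
            conn = key.fileobj
            request = conn.recv(4096)
            conn.sendall(request.upper())  # trivial request handler
            selector.unregister(conn)
            conn.close()
            pending -= 1

    replies = [client.recv(4096).decode() for client in clients]
    for client in clients:
        client.close()
    selector.close()
    return replies
```

Calling serve_all(3) handles all three connections in the calling thread; no additional threads or processes are created.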

2. IO multiplexing technology implementation

Commonly used IO multiplexing implementations include select, poll, epoll, and kqueue. Each is introduced below.

2.1 select

  • Features: select is the earliest IO multiplexing interface and is supported on almost all platforms. A single call monitors multiple file descriptors and waits until any of them becomes readable or writable.
  • Limitations:

Limited number of file descriptors: usually 1024. The limit comes from the FD_SETSIZE constant, which is fixed when the program is compiled. On some systems it can be raised by rebuilding with a larger value, but larger descriptor sets consume more resources on every call.

Inefficiency: Each call to select copies the file descriptor set into the kernel, the kernel checks every descriptor in it, and the result is copied back to user space, where the application has to scan it again to find the ready descriptors. This is very inefficient for a large number of file descriptors.

No edge-triggered mode: select supports only level-triggered mode.
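The calling pattern of select can be sketched with Python's select.select, which wraps the same system call. The helper name ready_with_select and the socket pairs are hypothetical, chosen only to keep the example self-contained.

```python
import select
import socket


def ready_with_select(num_pairs=3, senders=(0, 2)):
    """Return the indexes of the sockets that select() reports as readable."""
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    for i in senders:
        pairs[i][1].sendall(b"ping")  # make pair i readable

    watched = [reader for reader, _writer in pairs]
    # The complete list of watched sockets is passed in on every call.
    readable, _writable, _errored = select.select(watched, [], [], 1.0)
    ready_indexes = sorted(watched.index(sock) for sock in readable)

    for reader, writer in pairs:
        reader.close()
        writer.close()
    return ready_indexes
```

Note that the full watched list has to be supplied again for each call, which is the per-call cost described above.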

Edge-triggered mode vs. level-triggered mode

  1. Edge-triggered mode: When a file descriptor changes from unreadable (or unwritable) to readable (or writable), the kernel notifies the application only once. If the application does not process all available data right away (for example, it does not read everything in the buffer), the kernel will not notify it again, even though the file descriptor is still readable, until a new event occurs on the descriptor (for example, more data arrives).

Advantages: Fewer repeated notifications and therefore fewer system calls, which improves efficiency, especially in scenarios that transfer large amounts of data.

Disadvantages: After receiving an event, the application must read or write as much data as possible, otherwise the remaining data may sit unprocessed because no further notification arrives. In practice this means using non-blocking descriptors and reading or writing until the call reports that it would block. Edge-triggered mode is therefore harder to program correctly.

  2. Level-triggered mode: The kernel keeps notifying the application as long as the file descriptor is readable (or writable), regardless of whether it has been notified before. If the application does not process all the data at once, it is notified again on the next wait.

Advantages: Programming is relatively simple. Even if the application does not fully handle an event, it receives the same event again on the next poll as long as the state of the file descriptor has not changed.

Disadvantages: May cause more system calls, because the kernel keeps notifying the application even after the data has been partially processed, which can reduce efficiency.
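The difference between the two modes can be observed with a small experiment, sketched here with Python's select.epoll. The function name count_notifications is hypothetical; the sketch assumes Linux and returns None where epoll is unavailable. The application deliberately never reads the data, so any repeated notification comes from the trigger mode alone.

```python
import select
import socket


def count_notifications(edge_triggered, polls=3):
    """Count how many of `polls` waits report the socket as readable
    when the pending data is never read. Returns None without epoll."""
    if not hasattr(select, "epoll"):
        return None  # epoll is Linux-only

    reader, writer = socket.socketpair()
    flags = select.EPOLLIN
    if edge_triggered:
        flags |= select.EPOLLET

    ep = select.epoll()
    ep.register(reader.fileno(), flags)
    writer.sendall(b"unread data")  # socket becomes readable exactly once

    hits = 0
    for _ in range(polls):
        if ep.poll(timeout=0.1):
            hits += 1  # data is intentionally left in the buffer

    ep.close()
    reader.close()
    writer.close()
    return hits
```

On Linux, the level-triggered registration is reported on every wait, while the edge-triggered one is reported only on the first.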

2.2 poll

  • Features: poll is functionally very similar to select, but it is not bound by the fixed FD_SETSIZE limit; the number of descriptors is bounded only by the process's open-file limit. poll uses an array of pollfd structures to represent the set of file descriptors to be monitored.
  • Limitations: Although poll removes select's limit on the number of file descriptors, it has the same performance problem: each call copies the whole descriptor array to the kernel, the kernel scans all of it, and the results are copied back to user space on return.
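A minimal sketch of the same pattern with Python's select.poll follows. The function name ready_with_poll and the "active"/"idle" labels are hypothetical; the sketch returns None on platforms where poll is not available.

```python
import select
import socket


def ready_with_poll():
    """Register two sockets with poll() and report which one is readable."""
    if not hasattr(select, "poll"):
        return None  # poll() is not available on every platform

    reader, writer = socket.socketpair()
    idle_reader, idle_writer = socket.socketpair()
    names = {reader.fileno(): "active", idle_reader.fileno(): "idle"}

    poller = select.poll()
    for fd in names:
        poller.register(fd, select.POLLIN)

    writer.sendall(b"ping")       # only the first pair receives data
    events = poller.poll(1000)    # timeout is given in milliseconds
    ready = [names[fd] for fd, mask in events if mask & select.POLLIN]

    for sock in (reader, writer, idle_reader, idle_writer):
        sock.close()
    return ready
```

Unlike select, the registration here is not tied to a fixed-size bit set, but the kernel still has to walk every registered descriptor on each call.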

2.3 epoll

  • Features: epoll is an efficient, Linux-specific IO multiplexing mechanism that addresses the main shortcomings of select and poll. It manages file descriptors with three system calls: epoll_create creates an epoll instance, epoll_ctl adds, modifies, or removes the file descriptors to be monitored, and epoll_wait waits for events to occur.
  • Advantages:

No fixed limit: The number of monitored file descriptors is not capped by a fixed constant; it is bounded only by system resources.

Efficient: Descriptors are registered with the kernel once, and only the ready ones are returned to user space, which avoids copying and scanning the full list on every call.

Flexible: Supports both edge-triggered and level-triggered modes.
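The three-step workflow can be sketched with Python's select.epoll, whose methods map onto epoll_create, epoll_ctl, and epoll_wait. The function name epoll_round_trip is hypothetical, and the sketch assumes Linux; it returns None elsewhere.

```python
import select
import socket


def epoll_round_trip(num_conns=4, active=(1, 3)):
    """Register num_conns sockets once, then wait; only ready ones come back."""
    if not hasattr(select, "epoll"):
        return None  # epoll is Linux-only

    pairs = [socket.socketpair() for _ in range(num_conns)]
    ep = select.epoll()                                  # epoll_create
    fd_to_index = {}
    for i, (reader, _writer) in enumerate(pairs):
        ep.register(reader.fileno(), select.EPOLLIN)     # epoll_ctl (add)
        fd_to_index[reader.fileno()] = i

    for i in active:
        pairs[i][1].sendall(b"x")

    events = ep.poll(1.0)                                # epoll_wait
    ready = sorted(fd_to_index[fd] for fd, _mask in events)

    ep.unregister(pairs[0][0].fileno())                  # epoll_ctl (delete)
    ep.close()
    for reader, writer in pairs:
        reader.close()
        writer.close()
    return ready
```

The interest list is built once with register; each later wait passes no descriptor list at all and receives only the descriptors that are ready.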

2.4 kqueue

  • Features: kqueue is an IO multiplexing mechanism introduced by the FreeBSD operating system and later adopted by macOS (formerly Mac OS X) and other BSD-based operating systems. kqueue can handle multiple types of events at the same time, including but not limited to file descriptor events and signal events.
  • Advantages:

More general: It supports event notification not only for file descriptors but also for other types of events.

Excellent performance: Similar to epoll, only active file descriptors are reported to the application, which improves efficiency.
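For comparison, here is a rough sketch of the equivalent registration and wait using Python's select.kqueue. The function name ready_with_kqueue is hypothetical, and the sketch only does real work on BSD-family systems such as FreeBSD and macOS; elsewhere it returns None.

```python
import select
import socket


def ready_with_kqueue():
    """Register one socket for read events and wait for it with kqueue."""
    if not hasattr(select, "kqueue"):
        return None  # kqueue exists only on BSD-family systems

    reader, writer = socket.socketpair()
    kq = select.kqueue()
    change = select.kevent(
        reader.fileno(),
        filter=select.KQ_FILTER_READ,  # other filters cover signals, timers, etc.
        flags=select.KQ_EV_ADD,
    )
    kq.control([change], 0, 0)         # apply the change, fetch no events

    writer.sendall(b"ping")
    events = kq.control(None, 1, 1.0)  # wait up to 1 second for one event
    matched = any(
        ev.ident == reader.fileno() and ev.filter == select.KQ_FILTER_READ
        for ev in events
    )

    kq.close()
    reader.close()
    writer.close()
    return matched
```

The filter field is what lets a single kqueue mix descriptor events with other event types.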

3. Comparison

The differences between select, poll, epoll and kqueue are as follows:

Technology | Supported platforms  | Connection limit | IO efficiency | Data copy method
select     | Cross-platform       | Default 1024     | O(N)          | Copy on each call
poll       | Cross-platform       | None             | O(N)          | Copy on each call
epoll      | Linux-specific       | None             | O(1)          | Copy only on epoll_ctl
kqueue     | macOS, FreeBSD, etc. | None             | O(1)          | Varies from system to system, but generally efficient
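Because availability differs by platform, portable code often lets a library choose the mechanism. As a small illustration, Python's selectors module exposes one class per mechanism and a DefaultSelector alias for the most efficient one available; the helper names below are hypothetical.

```python
import selectors


def available_backends():
    """List the selector classes that this platform's Python build provides."""
    candidates = [
        "SelectSelector",
        "PollSelector",
        "EpollSelector",
        "KqueueSelector",
        "DevpollSelector",
    ]
    return [name for name in candidates if hasattr(selectors, name)]


def default_backend():
    """Return the class name that DefaultSelector resolves to here."""
    selector = selectors.DefaultSelector()
    try:
        return type(selector).__name__
    finally:
        selector.close()
```

SelectSelector is always present, while the others appear only where the operating system supports the underlying call.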

Questions for further thought

What is a "file descriptor"? Why does IO multiplexing require "data copying"?
