Interviewer: What is your understanding of IO multiplexing?

"IO multiplexing" is a common term in network programming. Many well-known systems rely on it, including Redis, Kafka, Netty, and Nginx. So what is IO multiplexing, which technologies implement it, and how do those technologies differ? This article gives a brief overview.

1. What is IO multiplexing?

IO multiplexing is a technology that allows a single thread to manage multiple network connections. It enables the server to efficiently handle a large number of concurrent connections without creating a separate thread or process for each connection.

Imagine a server with tens of thousands of clients. Without IO multiplexing, it would need tens of thousands of threads, one per connection. The machine has only a few CPU cores, so those threads would compete for IO and the system would spend much of its time on context switching rather than on useful work.
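The idea can be sketched in a few lines of Python. This is a minimal illustration, not a production server: it assumes Python's standard `selectors` module, uses `socket.socketpair()` to stand in for real client connections, and the function name `serve_once` is made up for this example. One thread registers several connections with one selector and handles whichever ones become readable.

```python
import selectors
import socket


def serve_once(num_clients=3):
    """One thread echoes upper-cased data for several connections."""
    sel = selectors.DefaultSelector()
    # Each pair is (client end, server end); socketpair stands in for accept().
    pairs = [socket.socketpair() for _ in range(num_clients)]
    replies = {}
    try:
        for i, (client, server_side) in enumerate(pairs):
            server_side.setblocking(False)
            sel.register(server_side, selectors.EVENT_READ, data=i)
            client.sendall(f"hello from {i}".encode())

        while len(replies) < num_clients:
            events = sel.select(timeout=1.0)
            if not events:
                break  # nothing became readable in time
            for key, _mask in events:
                conn = key.fileobj
                data = conn.recv(1024)
                conn.sendall(data.upper())
                sel.unregister(conn)
                # Read the reply on the client end of the same pair.
                replies[key.data] = pairs[key.data][0].recv(1024)
        return replies
    finally:
        sel.close()
        for client, server_side in pairs:
            client.close()
            server_side.close()
```

Calling `serve_once(3)` returns one reply per connection, all produced by the same thread.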

2. IO multiplexing technology implementation

Commonly used IO multiplexing implementations include select, poll, epoll, and kqueue. Each is described below.

2.1 select

  • Features : select is the oldest IO multiplexing interface and is supported on almost all platforms. A single call monitors multiple file descriptors and blocks until at least one of them becomes readable or writable, or until the timeout expires.
  • Limitations :

The number of file descriptors is limited : the descriptor set has a fixed size, FD_SETSIZE, which is usually 1024. Raising it generally means rebuilding the program with a larger value, which not every platform supports, and larger sets make every call more expensive.

Inefficiency : Each call to select copies the descriptor sets into the kernel, the kernel checks every descriptor in them, and the modified sets are copied back to user space, where the application must scan them again to find the ready descriptors. The cost grows linearly with the number of monitored descriptors, even when only a few are active.

Does not support edge trigger mode : only level trigger mode is supported.
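As a small sketch of the select call described above, the following uses Python's `select.select`, which wraps the system call. The helper name `readable_with_select` and the use of `socket.socketpair()` in place of real connections are assumptions made for illustration. Note that the full list of watched sockets is handed over on every call, which is the repeated copying mentioned in the limitations.

```python
import select
import socket


def readable_with_select(timeout=1.0):
    """Watch three connections and report which ones have data to read."""
    pairs = [socket.socketpair() for _ in range(3)]
    try:
        watched = [reader for _writer, reader in pairs]
        pairs[1][0].send(b"ping")  # only the middle connection receives data
        # The whole watched list is passed in again on every call.
        readable, _writable, _errors = select.select(watched, [], [], timeout)
        return [watched.index(sock) for sock in readable]
    finally:
        for writer, reader in pairs:
            writer.close()
            reader.close()
```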

Edge Trigger Mode vs. Level Trigger Mode

  1. Edge-triggered mode : The kernel notifies the application only once, at the moment a file descriptor changes from not ready to ready (for example, from unreadable to readable). If the application does not consume all available data after that notification (for example, it reads only part of the buffer), the kernel does not notify it again, even though the descriptor is still readable, until a new event occurs (for example, more data arrives).

Advantages : Fewer notifications and fewer wake-ups, which improves efficiency, especially when large amounts of data are transferred.

Disadvantages : After each event the application must read or write until the call would block, which requires non-blocking descriptors. Otherwise the remaining data can sit in the buffer unnoticed until the next event arrives. Edge-triggered code is therefore harder to write correctly.

  2. Level-triggered mode : The kernel reports a file descriptor on every wait call for as long as it remains readable (or writable), regardless of whether it has been reported before. If the application processes only part of the data, the next wait call reports the descriptor again.

Advantages : Programming is relatively simple. Even if the application does not fully handle an event, the same descriptor is reported again on the next wait call as long as its state has not changed.

Disadvantages : May cause more system calls, because the kernel keeps reporting a descriptor whose data has only been partially processed, which can reduce efficiency.
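The difference between the two modes can be observed directly. The sketch below assumes Linux and Python's `select.epoll` (epoll itself is covered in section 2.3); the function names `count_notifications` and `drain` are hypothetical. The first function writes data once, never reads it, and counts how many consecutive wait calls report the socket. The second shows the read-until-it-would-block loop that edge-triggered code needs.

```python
import select
import socket


def count_notifications(edge_triggered, rounds=3):
    """Count wait calls that report a socket whose data is never read."""
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    ep = select.epoll()
    try:
        mask = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
        ep.register(reader.fileno(), mask)
        writer.send(b"hello")
        hits = 0
        for _ in range(rounds):
            # Deliberately leave the data unread between wait calls.
            if ep.poll(0.1):
                hits += 1
        return hits
    finally:
        ep.close()
        writer.close()
        reader.close()


def drain(sock, chunk=2):
    """Read from a non-blocking socket until no more data is available."""
    data = b""
    while True:
        try:
            part = sock.recv(chunk)
        except BlockingIOError:
            return data  # buffer is empty for now
        if not part:
            return data  # peer closed the connection
        data += part
```

With level triggering every wait call reports the unread data; with edge triggering only the first one does, which is why an edge-triggered handler should call something like `drain` before waiting again.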

2.2 poll

  • Features : poll is functionally very similar to select, but it has no fixed limit on the number of file descriptors. Instead of fixed-size sets, poll takes an array of pollfd structures describing the descriptors to monitor, so the practical limit is the number of files the process may open.
  • Limitations : Although poll removes the descriptor limit of select, it has the same performance problem: every call copies the whole descriptor array into the kernel, the kernel checks every entry, and the results are copied back to user space when the call returns.
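A comparable sketch with poll, assuming a Unix-like system where Python's `select.poll` is available. The helper name `readable_with_poll` is made up for this example, and `socket.socketpair()` again stands in for real connections. Descriptors are registered by number together with an event mask, which mirrors the pollfd array.

```python
import select
import socket


def readable_with_poll(timeout_ms=1000):
    """Register three connections with poll and report the readable ones."""
    pairs = [socket.socketpair() for _ in range(3)]
    poller = select.poll()
    try:
        fd_to_index = {}
        for i, (_writer, reader) in enumerate(pairs):
            # Each registration corresponds to one pollfd entry.
            poller.register(reader.fileno(), select.POLLIN)
            fd_to_index[reader.fileno()] = i
        pairs[2][0].send(b"ping")  # only the last connection receives data
        events = poller.poll(timeout_ms)
        return [fd_to_index[fd] for fd, mask in events if mask & select.POLLIN]
    finally:
        for writer, reader in pairs:
            writer.close()
            reader.close()
```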

2.3 epoll

  • Features : epoll is an efficient, Linux-specific IO multiplexing mechanism that addresses the main shortcomings of select and poll. epoll manages file descriptors through three system calls: epoll_create creates an epoll instance, epoll_ctl adds, modifies, or removes the file descriptors to be monitored, and epoll_wait waits for events to occur.
  • Advantages :

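No fixed limit : There is no built-in cap on the number of file descriptors. The practical limit comes from system resources, such as the number of files a process may open.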

Efficient : Only active file descriptors are passed to user space, reducing unnecessary copy operations.

Powerful functions : Supports two working modes: edge trigger and level trigger.
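The three calls map closely onto Python's `select.epoll` object, which the sketch below uses. It assumes Linux; the function name `active_connections` and its parameters are hypothetical. Fifty connections are registered once, two of them receive data, and the wait call returns only those two.

```python
import select
import socket


def active_connections(total=50, active=(7, 42)):
    """Register many connections once; the wait call returns only active ones."""
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()  # corresponds to epoll_create
    try:
        fd_to_index = {}
        for i, (_writer, reader) in enumerate(pairs):
            ep.register(reader.fileno(), select.EPOLLIN)  # epoll_ctl, add
            fd_to_index[reader.fileno()] = i
        for i in active:
            pairs[i][0].send(b"x")
        events = ep.poll(1.0)  # epoll_wait: only ready descriptors come back
        return sorted(fd_to_index[fd] for fd, _mask in events)
    finally:
        ep.close()
        for writer, reader in pairs:
            writer.close()
            reader.close()
```

Because registration is separate from waiting, repeated calls to `ep.poll` do not resend the list of monitored descriptors.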

2.4 kqueue

  • Features : kqueue is an IO multiplexing mechanism introduced by the FreeBSD operating system and later adopted by macOS (formerly Mac OS X) and other BSD-based operating systems. kqueue can handle several types of events through one interface, including file descriptor events, signals, timers, and process events.
  • Advantages :

More powerful : It not only supports event notification of file descriptors, but also can handle other types of events.

Excellent performance : Similar to epoll, only active file descriptors are processed, thus improving efficiency.
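A rough kqueue sketch follows, assuming Python's `select.kqueue` and `select.kevent`, which exist only on BSD-family systems such as macOS and FreeBSD. The function name `kqueue_ready` is hypothetical, and it returns None on platforms without kqueue (for example Linux) so that the example does not fail there.

```python
import select
import socket


def kqueue_ready(timeout=1.0):
    """Return True if kqueue reports the watched socket as readable."""
    if not hasattr(select, "kqueue"):
        return None  # platform without kqueue, for example Linux
    writer, reader = socket.socketpair()
    kq = select.kqueue()
    try:
        change = select.kevent(
            reader.fileno(),
            filter=select.KQ_FILTER_READ,
            flags=select.KQ_EV_ADD,
        )
        kq.control([change], 0, 0)  # register only, fetch no events
        writer.send(b"x")
        events = kq.control(None, 8, timeout)  # wait for up to 8 events
        return any(
            ev.ident == reader.fileno() and ev.filter == select.KQ_FILTER_READ
            for ev in events
        )
    finally:
        kq.close()
        writer.close()
        reader.close()
```

Other event types are selected through the filter argument, which is how one kqueue instance can wait on more than just file descriptors.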

3. Differences and comparison

The differences between select, poll, epoll and kqueue are as follows:

Technology name | Supported platforms  | Connection limit | IO efficiency | Data copy method
select          | Cross-platform       | Default 1024     | O(N)          | Copy on each call
poll            | Cross-platform       | None             | O(N)          | Copy on each call
epoll           | Linux-specific       | None             | O(1)          | Copy only on epoll_ctl
kqueue          | macOS, FreeBSD, etc. | None             | O(1)          | Varies from system to system, but generally efficient

Questions for further thought

What is a "file descriptor"? Why does IO multiplexing require "data copying"?
