An Analysis of the Go Network Library gnet



Introduction

Previously, we analyzed Go's native network model and some of its source code. In the vast majority of scenarios, the native netpoller is sufficient.

However, with massive numbers of concurrent connections, the native model starts one goroutine per connection: 10 million connections means 10 million goroutines.

This provides room for optimization in these special scenarios, which is probably one of the reasons why tools like gnet and cloudwego/netpoll were created.

In essence, their underlying cores are the same: both are built on epoll (on Linux). They differ mainly in how each library handles an event once it occurs.

This article focuses on gnet. I won't cover basic usage here; gnet ships with a demo repository you can try out yourself.

Architecture

The diagram below is taken directly from the gnet official website:

gnet uses the "master-slave multi-reactor" model: a main reactor thread listens on the port for new connections, and when a client connects, a load-balancing algorithm assigns the connection to one of the sub reactor threads. That sub thread then handles all of the connection's read and write events and its eventual teardown.

The picture below makes it clearer.

Core Structure

Let's first explain some core structures of gnet.

The engine is the top-level structure of the program; its key fields are described below, followed by a simplified sketch.

  • ln is the listener bound to the service's listening port once the server starts.
  • lb is the load balancer: when a client connects, it selects a sub thread and hands the connection over to that thread for processing.
  • mainLoop is the main thread, and its structure is an eventloop. The sub threads use the same eventloop structure, but their responsibilities differ: the main thread listens for client connection events on the port and lets the load balancer assign each new connection to a sub thread, while each sub thread registers the connections assigned to it (usually more than one), waits for their subsequent read and write events, and processes them.
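Here is a minimal, illustrative sketch of how these fields fit together. The names follow the article's description rather than gnet's exact source, and net.Listener stands in for gnet's own listener type.

```go
package sketch

import "net"

// eventloop is sketched in the next section; an empty placeholder keeps
// this snippet compilable on its own.
type eventloop struct{}

// loadBalancer picks a sub event loop for each incoming connection.
type loadBalancer interface {
	register(el *eventloop)
	next(remote net.Addr) *eventloop
}

// engine is a simplified, illustrative view of gnet's top-level structure.
type engine struct {
	ln       net.Listener // listener bound to the service port
	lb       loadBalancer // hands each new connection to a sub event loop
	mainLoop *eventloop   // main reactor: accepts connections, nothing else
}
```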

Next, look at the eventloop structure; again, a simplified sketch follows the list.

  • netpoll.Poller: each eventloop owns one epoll (Linux) or kqueue (macOS/BSD) instance.
  • buffer is the scratch buffer used when reading inbound messages.
  • connCount records the number of TCP connections currently held by the eventloop.
  • udpSockets and connections manage all UDP sockets and TCP connections owned by this eventloop, respectively. Note that both are maps whose int keys are file descriptors (fd -> conn).
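A hedged sketch of those fields; the poller type here is only a placeholder for gnet's internal netpoll.Poller, and conn is stubbed out until the next section.

```go
package sketch

import "sync/atomic"

// conn is sketched in the next section; a placeholder keeps this compilable.
type conn struct{}

// poller stands in for gnet's netpoll.Poller, which wraps one epoll
// (Linux) or kqueue (macOS/BSD) instance.
type poller struct{ fd int }

// eventloop is a simplified, illustrative view of a single reactor loop.
type eventloop struct {
	poller      *poller       // one epoll/kqueue instance per loop
	buffer      []byte        // scratch buffer reused when reading inbound messages
	connCount   atomic.Int32  // number of TCP connections owned by this loop
	udpSockets  map[int]*conn // fd -> UDP socket
	connections map[int]*conn // fd -> TCP connection
}
```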

Then there is the conn structure.

There are several fields here:

  • buffer: stores the most recent data sent by this conn's peer (the client). For example, if the peer sends three times in a row, buffer holds only the third chunk at that point, as the source comments note.
  • inboundBuffer: stores earlier data from the peer that the user has not yet consumed; it is implemented as a ring buffer.
  • outboundBuffer: stores data that has not yet been sent to the peer, such as the server's response to the client. Since the conn's fd is non-blocking, when a write call reports that it cannot write any more, the remaining data is parked here first.

In effect, each connection has its own independent buffer space, which avoids the locking that centralized memory management would require, and the ring buffers increase space reuse.
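A hedged sketch of the conn fields described above; ringBuffer here is only a placeholder for gnet's real elastic ring buffer.

```go
package sketch

// ringBuffer stands in for gnet's elastic ring-buffer implementation.
type ringBuffer struct{ buf []byte }

// conn is a simplified, illustrative view of a single TCP connection.
type conn struct {
	fd             int        // underlying socket file descriptor
	buffer         []byte     // bytes from the most recent read on this connection
	inboundBuffer  ringBuffer // earlier bytes the application has not consumed yet
	outboundBuffer ringBuffer // bytes queued because the socket was not writable
}
```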

That’s the overall structure.

Core Logic

When the program starts, the number of eventloops, i.e. the number of sub threads, is determined by the options the user sets. On Linux, that is also how many epoll instances will be created.

The total number of epoll instances in the program is therefore count(sub) + 1 (the main listener's).
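A minimal sketch of that calculation, assuming the common pattern of "use the configured value, otherwise fall back to the CPU count"; numEventLoops is an illustrative helper, not gnet's actual function.

```go
package sketch

import "runtime"

// numEventLoops sketches how the sub-reactor count could be derived: use
// the user-configured value when set, otherwise the number of CPU cores.
// The total number of pollers is then this value plus one for the main
// listener's poller.
func numEventLoops(configured int) int {
	if configured > 0 {
		return configured
	}
	return runtime.NumCPU()
}
```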

As described above, the configured number of eventloops is created, and each of them is registered with the load balancer.

When a new connection arrives, one of the eventloops is selected for it according to the configured algorithm (gnet provides round-robin, least-connections, and source-address hashing).
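As an illustration, here is a minimal round-robin picker; roundRobinBalancer and pick are hypothetical names, not gnet's implementation.

```go
package sketch

import "sync/atomic"

type eventloop struct{}

// roundRobinBalancer sketches the simplest strategy: hand connections to
// the registered sub loops in turn.
type roundRobinBalancer struct {
	loops []*eventloop
	next  atomic.Uint64
}

func (b *roundRobinBalancer) register(el *eventloop) {
	b.loops = append(b.loops, el)
}

func (b *roundRobinBalancer) pick() *eventloop {
	// Advance the counter and wrap around the registered loops.
	n := b.next.Add(1) - 1
	return b.loops[int(n%uint64(len(b.loops)))]
}
```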

Let's look at the main thread first. (Since I'm using a Mac, the implementation of IO multiplexing later is kqueue code, but the principle is the same.)

Polling waits for network events to arrive. It takes a closure parameter, or more precisely a callback that runs when an event arrives; as its name suggests, the main reactor's callback handles new connections.

As for the Polling function itself, the logic is simple: a for loop waits for events to arrive and then processes them.
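A minimal Linux epoll version of that loop (the article's environment is kqueue, but the shape is the same); the polling function and its callback signature are illustrative, not gnet's actual API.

```go
//go:build linux

package sketch

import "golang.org/x/sys/unix"

// polling blocks until events arrive, then invokes the callback for each
// ready fd. gnet's real Polling also handles the wake-up event, timeouts,
// and event-list resizing.
func polling(epfd int, callback func(fd int, events uint32) error) error {
	events := make([]unix.EpollEvent, 128)
	for {
		n, err := unix.EpollWait(epfd, events, -1)
		if err != nil {
			if err == unix.EINTR {
				continue // interrupted by a signal; wait again
			}
			return err
		}
		for i := 0; i < n; i++ {
			if err := callback(int(events[i].Fd), events[i].Events); err != nil {
				return err
			}
		}
	}
}
```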

The main thread sees two types of events:

One is an ordinary network event on an fd (for example, a new connection on the listener).

The other is an event triggered explicitly via NOTE_TRIGGER.

A NOTE_TRIGGER event signals that there are tasks waiting in the loop's task queue, so the loop goes and executes them.
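A hedged sketch of that idea: asyncTaskQueue is an illustrative stand-in (gnet uses a lock-free queue internally), showing only enqueue plus drain. NOTE_TRIGGER is the kqueue mechanism; on Linux an eventfd plays the same role.

```go
package sketch

import "sync"

// asyncTaskQueue sketches the wake-up pattern: other goroutines enqueue
// tasks and poke the loop's wake-up event; the loop drains and runs them
// on its own goroutine.
type asyncTaskQueue struct {
	mu    sync.Mutex
	tasks []func() error
}

// enqueue adds a task; the caller would then trigger the loop's wake-up
// event so a blocked EpollWait/Kevent call returns promptly.
func (q *asyncTaskQueue) enqueue(task func() error) {
	q.mu.Lock()
	q.tasks = append(q.tasks, task)
	q.mu.Unlock()
}

// drain is what the event loop runs when the wake-up event fires.
func (q *asyncTaskQueue) drain() error {
	q.mu.Lock()
	tasks := q.tasks
	q.tasks = nil
	q.mu.Unlock()
	for _, task := range tasks {
		if err := task(); err != nil {
			return err
		}
	}
	return nil
}
```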

If an ordinary network event arrives, the closure callback runs; for the main thread, that is the accept-connection logic mentioned above.

The accept logic is very simple: obtain the connection's fd, set the fd to non-blocking mode (think about what would happen to the loop if the fd were left blocking), then select a sub thread via the load-balancing algorithm and hand the connection over to it through the register function.
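A hedged sketch of that accept path on Linux; acceptConn, pick, and the register stub are illustrative names, not gnet's code.

```go
//go:build linux

package sketch

import "golang.org/x/sys/unix"

// eventloop and its register method are sketched in the next snippet;
// this stub keeps the snippet compilable on its own.
type eventloop struct{}

func (el *eventloop) register(fd int) error { return nil }

// acceptConn sketches the main reactor's callback when the listener fd is
// readable: accept the socket, make it non-blocking (a blocking fd would
// stall the whole loop on a later read or write), then hand it to the sub
// loop chosen by the load balancer.
func acceptConn(listenerFD int, pick func() *eventloop) error {
	nfd, _, err := unix.Accept(listenerFD)
	if err != nil {
		if err == unix.EAGAIN {
			return nil // nothing to accept right now
		}
		return err
	}
	if err := unix.SetNonblock(nfd, true); err != nil {
		unix.Close(nfd)
		return err
	}
	el := pick()            // load balancer chooses a sub event loop
	return el.register(nfd) // hand the new fd over to that loop
}
```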

Register does two things. First, it adds the connection to the sub thread's epoll or kqueue instance with a read flag, so the loop is notified of readable events.

Second, it puts the connection into the connections map (fd -> conn).

This way, when an event later arrives on that sub thread, the fd carried by the event is enough to look up the corresponding connection and process it.
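A minimal sketch of those two steps on Linux, with illustrative types; gnet's real register also handles UDP sockets, updates the connection counter, and fires the user's on-open callback.

```go
//go:build linux

package sketch

import "golang.org/x/sys/unix"

type conn struct{ fd int }

type eventloop struct {
	epfd        int           // this sub loop's epoll instance
	connections map[int]*conn // fd -> conn, consulted when events fire
}

// register sketches what happens when a sub loop adopts a new connection:
// watch the fd for readable events and remember the fd -> conn mapping.
func (el *eventloop) register(fd int) error {
	ev := unix.EpollEvent{Events: unix.EPOLLIN, Fd: int32(fd)}
	if err := unix.EpollCtl(el.epfd, unix.EPOLL_CTL_ADD, fd, &ev); err != nil {
		return err
	}
	el.connections[fd] = &conn{fd: fd}
	return nil
}
```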

If it is a readable event, the sub thread reads the socket into the eventloop's scratch buffer, points the conn's buffer at the newly read bytes, and invokes the user's event handler; whatever the handler does not consume is moved into inboundBuffer for later.
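A hedged sketch of that read path; onReadable and the handle callback are illustrative, and a plain byte slice stands in for the real ring buffer.

```go
//go:build linux

package sketch

import (
	"io"

	"golang.org/x/sys/unix"
)

type conn struct {
	fd            int
	buffer        []byte // most recent bytes read from the peer
	inboundBuffer []byte // earlier, still-unconsumed bytes (a ring buffer in gnet)
}

// onReadable reads once from the non-blocking fd into the loop's scratch
// buffer, exposes the bytes to the user's handler, and keeps whatever the
// handler did not consume.
func onReadable(c *conn, scratch []byte, handle func(c *conn) (consumed int)) error {
	n, err := unix.Read(c.fd, scratch)
	if err != nil {
		if err == unix.EAGAIN {
			return nil // nothing to read right now
		}
		return err
	}
	if n == 0 {
		return io.EOF // peer closed; gnet tears the connection down here
	}
	c.buffer = scratch[:n]
	consumed := handle(c)
	// Anything left over is stashed in inboundBuffer for the next round.
	c.inboundBuffer = append(c.inboundBuffer, c.buffer[consumed:]...)
	c.buffer = nil
	return nil
}
```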

That more or less wraps up the analysis.

Summary

In gnet, you can see that basically all operations are lock-free.

That is because event handling uses non-blocking operations and each fd (conn) is processed serially by the loop that owns it. Each conn operates only on its own buffer space, and all events triggered in one round are fully processed before the next wait begins, which resolves the concurrency problem at this level.

Of course, users also need to pay attention to a few issues. For example, if you want to process logic asynchronously in a custom EventHandler, you must not spawn a goroutine and then read the connection's current data inside it, as in the first (wrong) sketch below.

Instead, you should copy the data out first and only then process it asynchronously, as in the second sketch.
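A sketch of the wrong and right patterns; Conn here is a stand-in for gnet's connection interface (the exact method names depend on the gnet version), and process is a hypothetical worker function.

```go
package sketch

// Conn is a stand-in for gnet's connection interface; only the one method
// relevant to this pitfall is sketched.
type Conn interface {
	Next(n int) ([]byte, error) // returns bytes still owned by the event loop
}

// wrong captures the connection and reads it from another goroutine; by the
// time that goroutine runs, the event loop may have reused or released the
// underlying buffers, so this races and can observe someone else's data.
func wrong(c Conn, process func([]byte)) {
	go func() {
		data, _ := c.Next(-1)
		process(data)
	}()
}

// right copies the bytes out while still on the event-loop goroutine and
// only hands the private copy to the worker goroutine.
func right(c Conn, process func([]byte)) {
	data, _ := c.Next(-1)
	buf := make([]byte, len(data))
	copy(buf, data)
	go process(buf)
}
```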

As mentioned in the project's issues, connections are stored in a map[int]*conn. gnet's target scenario is massive numbers of concurrent connections, so this requires a lot of memory; moreover, storing pointers in a huge map places a heavy burden on the GC, since unlike an array it is not a contiguous block of memory that is cheap to scan.

Another point: when buffered data is handed to the user, as shown above, it is essentially copied, so there is noticeable copy overhead. ByteDance's netpoll (cloudwego/netpoll) implements a Nocopy Buffer to address this, which I will study another day.
