Do you really understand the connection control in Dubbo?

This article is reprinted from the WeChat public account "Kirito's Technology Sharing" by kiritomoe. To reprint this article, please contact the Kirito's Technology Sharing public account.

Preface

I just found out that WeChat public accounts have a tagging function, so I tagged all my Dubbo-related articles. After careful counting, this is my 41st original Dubbo article. If you want to see my other Dubbo articles, you can click on the topic tags.

This is an article I have wanted to write for a long time. Recently a friend in a group shared an article about Dubbo connections, which reminded me of the topic, so today I want to talk about connection control in Dubbo. The phrase "connection control" may not ring a bell right away, but you are probably familiar with the following configuration:

  <dubbo:reference interface="com.foo.BarService" connections="10" />

If you still don't understand the usage of connection control in Dubbo, you can refer to the official document: https://dubbo.apache.org/zh/docs/advanced/config-connections/. By the way, the official documentation of Dubbo has undergone a major change recently, and many familiar documents are almost nowhere to be found. Orz.

As we all know, the Dubbo protocol communicates over long-lived connections by default, and the connections configuration determines how many long connections are established between a consumer and a provider. However, the official documentation only describes how to use this feature; it does not explain when connection control should be configured. That question is the main topic of this article.

This article will also cover some knowledge points related to long connections.

Usage

Let's start with a simple Dubbo demo. We start one consumer (192.168.4.226) and one provider (192.168.4.224) and configure a direct connection between them.

Consumer:

  <dubbo:reference id="userService" check="false"
                   interface="org.apache.dubbo.benchmark.service.UserService"
                   url="dubbo://192.168.4.224:20880" />

Provider:

  <dubbo:service interface="org.apache.dubbo.benchmark.service.UserService" ref="userService" />
  <bean id="userService" class="org.apache.dubbo.benchmark.service.UserServiceServerImpl" />

Persistent connections are invisible and intangible, so we need some way to observe them. After starting the provider and the consumer, you can check the TCP connection status with the following commands:

  • On Mac, you can use: lsof -i:20880
  • On Linux, you can use: netstat -ano | grep 20880

Provider:

  [root ~]# netstat -ano | grep 20880
  tcp6       0      0 192.168.4.224:20880     :::*                    LISTEN      off (0.00/0/0)
  tcp6    2502      0 192.168.4.224:20880     192.168.4.226:59100     ESTABLISHED off (0.00/0/0)

Consumer:

  [root@ ~]# netstat -ano | grep 20880
  tcp6     320    720 192.168.4.226:59110     192.168.4.224:20880     ESTABLISHED on (0.00/0/0)

Through the above observations we can discover several facts.

The TCP connection above exists merely because the provider and the consumer were started; note that I did not trigger any call. In other words, Dubbo's default strategy is to establish the connection at address discovery time, not at call time. You can change this behavior with lazy="true", which delays establishing the connection until the first call.

  <dubbo:reference id="userService" check="false"
                   interface="org.apache.dubbo.benchmark.service.UserService"
                   url="dubbo://${server.host}:${server.port}"
                   lazy="true" />

You can also see that there is only one long connection between the consumer and the provider. 20880 is the default port the Dubbo provider listens on, much like Tomcat's default port 8080, while 59110 is a port randomly chosen by the consumer. (From conversations with other developers, I've found that many people don't realize the consumer also occupies a port.)

Today's protagonist, connection control, controls the number of these long connections. For example, we can configure it as follows:

  <dubbo:reference id="userService" check="false"
                   interface="org.apache.dubbo.benchmark.service.UserService"
                   url="dubbo://192.168.4.224:20880"
                   connections="2" />

Start the consumer again and observe the connections:

Provider:

  [root@ ~]# netstat -ano | grep 20880
  tcp6       0      0 192.168.4.224:20880     :::*                    LISTEN      off (0.00/0/0)
  tcp6    2508     96 192.168.4.224:20880     192.168.4.226:59436     ESTABLISHED on (0.00/0/0)
  tcp6    5016    256 192.168.4.224:20880     192.168.4.226:59434     ESTABLISHED on (0.00/0/0)

Consumer:

  [root@ ~]# netstat -ano | grep 20880
  tcp6       0   2520 192.168.4.226:59436     192.168.4.224:20880     ESTABLISHED on (0.00/0/0)
  tcp6      48   1680 192.168.4.226:59434     192.168.4.224:20880     ESTABLISHED on (0.00/0/0)

As you can see, there are now two long connections.

When do you need to configure multiple long connections?

Now we know how to control the number of connections, but how many long connections should we configure, and when? I could simply say "it depends on your production situation", but if you read my public account regularly, you know that is not my style. What is my style? Benchmark!

Before writing this article, I briefly discussed this topic with several colleagues and netizens. There was no real conclusion beyond "a single connection and multiple connections give different throughput". See earlier Dubbo GitHub discussions such as https://github.com/apache/dubbo/pull/2457, a PR whose discussion I took part in. To be honest, I was skeptical at the time; my view was that multiple connections would not necessarily improve the throughput of the service (a fairly conservative position, not an absolute one).

Next, let's let a benchmark do the talking. The test project is an old friend: the dubbo-benchmark project officially provided by Dubbo.

  • Test project address: https://github.com/apache/dubbo-benchmark.git
  • Test environment: 2 Alibaba Cloud Linux 4c8g ECS

The test project has been introduced in a previous article, so I won't go into detail here. The test plan is also very simple: run two rounds of benchmarks and observe the throughput of the test methods with connections=1 and connections=2 respectively.

Let's get right to it. I'll skip the test steps and show the results directly.

connections=1

  Benchmark           Mode  Cnt      Score      Error  Units
  Client.createUser  thrpt    3  22265.286 ± 3060.319  ops/s
  Client.existUser   thrpt    3  33129.331 ± 1488.404  ops/s
  Client.getUser     thrpt    3  19916.133 ± 1745.249  ops/s
  Client.listUser    thrpt    3   3523.905 ±  590.250  ops/s

connections=2

  Benchmark           Mode  Cnt      Score      Error  Units
  Client.createUser  thrpt    3  31111.698 ± 3039.052  ops/s
  Client.existUser   thrpt    3  42449.230 ± 2964.239  ops/s
  Client.getUser     thrpt    3  30647.173 ± 2551.448  ops/s
  Client.listUser    thrpt    3   6581.876 ±  469.831  ops/s

From these results, the difference between a single connection and multiple connections looks huge, nearly a factor of two! It seems connection control works wonders, but is that really the case?

I didn't quite believe this result after the first run. I had tested multiple connections in other ways before, and I had taken part in the 3rd Middleware Performance Challenge, which left me with the impression that a single connection usually gives the best performance; even accounting for hardware differences, the gap should not be a factor of two. With this doubt in mind, I started to look for problems in my test scenario.

Finding the problem in the test plan

After discussing it with Flash, his words finally helped me pinpoint the problem. I wonder whether you can spot it right away too.

The biggest problem with the previous test plan was poor control of variables: when the number of connections changes, the number of IO threads actually doing work changes as well.

Dubbo uses Netty for its long-connection communication, so to understand the relationship between long connections and IO threads we need Netty's threading model. In a nutshell, one Netty IO worker thread can serve many channels, but each channel is bound to exactly one IO thread: once a channel is established, all of its IO is handled by that single thread. Let's look at how Dubbo sets up the worker thread groups of NettyClient and NettyServer.

Client org.apache.dubbo.remoting.transport.netty4.NettyClient:

  private static final EventLoopGroup NIO_EVENT_LOOP_GROUP =
          eventLoopGroup(Constants.DEFAULT_IO_THREADS, "NettyClientWorker");

  @Override
  protected void doOpen() throws Throwable {
      final NettyClientHandler nettyClientHandler = new NettyClientHandler(getUrl(), this);
      bootstrap = new Bootstrap();
      bootstrap.group(NIO_EVENT_LOOP_GROUP)
              .option(ChannelOption.SO_KEEPALIVE, true)
              .option(ChannelOption.TCP_NODELAY, true)
              .option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
      ...
  }

Constants.DEFAULT_IO_THREADS is hard-coded in org.apache.dubbo.remoting.Constants:

  int DEFAULT_IO_THREADS = Math.min(Runtime.getRuntime().availableProcessors() + 1, 32);

On my 4c8g machine, the default is 5.

Server org.apache.dubbo.remoting.transport.netty4.NettyServer:

  protected void doOpen() throws Throwable {
      bootstrap = new ServerBootstrap();

      bossGroup = NettyEventLoopFactory.eventLoopGroup(1, "NettyServerBoss");
      workerGroup = NettyEventLoopFactory.eventLoopGroup(
              getUrl().getPositiveParameter(IO_THREADS_KEY, Constants.DEFAULT_IO_THREADS),
              "NettyServerWorker");

      final NettyServerHandler nettyServerHandler = new NettyServerHandler(getUrl(), this);
      channels = nettyServerHandler.getChannels();

      ServerBootstrap serverBootstrap = bootstrap.group(bossGroup, workerGroup)
              .channel(NettyEventLoopFactory.serverSocketChannelClass())
              .option(ChannelOption.SO_REUSEADDR, Boolean.TRUE)
              .childOption(ChannelOption.TCP_NODELAY, Boolean.TRUE)
              .childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
      ...
  }

The server side, however, is configurable. For example, we can control the number of server IO threads through the protocol configuration:

  <dubbo:protocol name="dubbo" host="${server.host}" server="netty4" port="${server.port}" iothreads="5" />

If it is not set, the logic is the same as on the client: cores + 1 threads.

And here lies the problem. Since I did not set the IO thread count at all, both client and server use 5 IO threads by default. With connections=1, Netty binds channel1 to one IO thread; with connections=2, Netty binds channel1 and channel2 to NettyWorkerThread-1 and NettyWorkerThread-2 in turn, so two IO threads are doing the work. Such a comparison is of course unfair.
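To make the binding concrete, here is a minimal, self-contained Netty sketch. This is my own demo code, not Dubbo source; it starts a throwaway local server so nothing needs to be listening on 20880. It opens two client channels against a 5-thread worker group and prints which EventLoop each channel is pinned to:

  import io.netty.bootstrap.Bootstrap;
  import io.netty.bootstrap.ServerBootstrap;
  import io.netty.channel.Channel;
  import io.netty.channel.ChannelInitializer;
  import io.netty.channel.EventLoopGroup;
  import io.netty.channel.nio.NioEventLoopGroup;
  import io.netty.channel.socket.SocketChannel;
  import io.netty.channel.socket.nio.NioServerSocketChannel;
  import io.netty.channel.socket.nio.NioSocketChannel;

  public class ChannelEventLoopBindingDemo {

      public static void main(String[] args) throws Exception {
          EventLoopGroup bossGroup = new NioEventLoopGroup(1);
          // 5 IO threads, mirroring Dubbo's default of cores + 1 on a 4-core machine
          EventLoopGroup workerGroup = new NioEventLoopGroup(5);
          try {
              // throwaway local server so the demo is self-contained
              Channel server = new ServerBootstrap()
                      .group(bossGroup, workerGroup)
                      .channel(NioServerSocketChannel.class)
                      .childHandler(new ChannelInitializer<SocketChannel>() {
                          @Override
                          protected void initChannel(SocketChannel ch) { /* no handlers needed */ }
                      })
                      .bind("127.0.0.1", 0).sync().channel();

              Bootstrap client = new Bootstrap()
                      .group(workerGroup)
                      .channel(NioSocketChannel.class)
                      .handler(new ChannelInitializer<SocketChannel>() {
                          @Override
                          protected void initChannel(SocketChannel ch) { /* no handlers needed */ }
                      });

              // two "connections", as with connections=2 in Dubbo
              Channel ch1 = client.connect(server.localAddress()).sync().channel();
              Channel ch2 = client.connect(server.localAddress()).sync().channel();

              // each channel is registered to exactly one EventLoop for its whole lifetime;
              // with 5 worker threads and the default round-robin chooser they usually differ
              System.out.println("ch1 event loop: " + ch1.eventLoop());
              System.out.println("ch2 event loop: " + ch2.eventLoop());
              System.out.println("same IO thread? " + (ch1.eventLoop() == ch2.eventLoop()));
          } finally {
              bossGroup.shutdownGracefully();
              workerGroup.shutdownGracefully();
          }
      }
  }

With connections=1, only one worker thread ever does IO for this provider; with connections=2, two of them can work in parallel, which is exactly the hidden variable in the earlier benchmark.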

Real-world relevance also matters here. Production is almost always a distributed scenario in which the number of connections far exceeds the number of IO threads, so the situation in my test, where the number of channels was smaller than the number of IO threads, basically never occurs in practice.

The fix is simple: control the variables so that the number of IO threads stays the same and only the number of connections changes. For the server, iothreads=1 can be set in the protocol configuration; for the client, since the value is hard-coded in the source, I had to modify the source and repackage it locally so that the client IO thread count can also be specified with a -D parameter.
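For reference, here is a minimal sketch of the kind of local patch this requires. The system property name netty.client.io.threads is my own hypothetical choice; the stock NettyClient simply hard-codes Constants.DEFAULT_IO_THREADS.

  // Hypothetical local change in org.apache.dubbo.remoting.transport.netty4.NettyClient:
  // read the client IO thread count from a -D system property, falling back to the default.
  private static final int CLIENT_IO_THREADS =
          Integer.getInteger("netty.client.io.threads", Constants.DEFAULT_IO_THREADS);

  private static final EventLoopGroup NIO_EVENT_LOOP_GROUP =
          eventLoopGroup(CLIENT_IO_THREADS, "NettyClientWorker");

The benchmark consumer can then be started with, for example, -Dnetty.client.io.threads=1 to pin the client to a single IO thread.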

After this change, I obtained the following test results:

1 IO thread 1 connection

  Benchmark           Mode  Cnt      Score      Error  Units
  Client.createUser  thrpt    3  22265.286 ± 3060.319  ops/s
  Client.existUser   thrpt    3  33129.331 ± 1488.404  ops/s
  Client.getUser     thrpt    3  19916.133 ± 1745.249  ops/s
  Client.listUser    thrpt    3   3523.905 ±  590.250  ops/s

1 IO thread 2 connections

  Benchmark           Mode  Cnt      Score      Error  Units
  Client.createUser  thrpt    3  21776.436 ± 1888.845  ops/s
  Client.existUser   thrpt    3  31826.320 ± 1350.434  ops/s
  Client.getUser     thrpt    3  19354.470 ±  369.486  ops/s
  Client.listUser    thrpt    3   3506.714 ±   18.924  ops/s

As you can see, simply increasing the number of connections does not increase the throughput of the service. This result is much more in line with my expectations.

Summary

Judging from these tests, bigger is not always better for some configuration parameters. I have analyzed similar examples before, such as multi-threaded file writing; only theoretical analysis plus actual testing leads to a convincing conclusion. Of course, individual tests can also go wrong by missing a key detail: if I had not eventually noticed the implicit coupling between the number of IO threads and the number of connections, I could easily have drawn the wrong conclusion that throughput grows in proportion to the number of connections. By the same token, the final conclusion of this article is not necessarily airtight either; it may still be imperfect, and comments and suggestions are welcome.

Finally, back to the original question: when should you configure Dubbo's connection control? In my experience, the number of connections in a production environment is usually very large. Pick an online host and get a rough count with netstat -ano | grep 20880 | wc -l; it is generally far greater than the number of IO threads, so there is no need to configure multiple connections per consumer. The number of connections and throughput do not grow linearly together.

That the Dubbo framework has this capability and that you actually need it are two completely different things. I believe most readers are past the stage of adopting technology for its novelty. If one day you do need to control the number of connections for some special purpose, you will sincerely appreciate how powerful Dubbo is and how many extension points it offers.

Is Dubbo's connection control really completely useless? Not entirely. My test scenarios are still very limited, and different hardware may produce different results. For example, in the third middleware performance challenge, I achieved the best result with two connections, not a single connection.

Finally, if you only use Dubbo to maintain your microservice architecture, in most cases you don’t need to pay attention to the connection control feature. Just spend more time moving bricks. That’s it, I’m moving bricks too.
