1. Introduction to URL
Before discussing the specific optimizations to address push performance, we first need to understand something closely related to them: the URL.

1. Definition
Even without mentioning Dubbo, most of us are familiar with the concept of a URL. Uniform Resource Locators (RFC 1738, "Uniform Resource Locators (URL)") is probably one of the best-known RFC specifications, and its definition is simple: the resources available on the Internet can be represented by simple strings, and the document describes the syntax and semantics of such strings, which are called "Uniform Resource Locators" (URLs).

A standard URL can contain at most the following parts:
protocol://username:password@host:port/path?key=value&key=value

Some typical URLs:
http://www.facebook.com/friends?param1=value1&param2=value2

Of course, some strings that do not follow this convention are also treated as URLs:
192.168.1.3:20880

2. URLs in Dubbo
Dubbo uses similar URLs, mainly to transfer data between its extension points. The specific parameters that make up this URL object are as follows:
Some typical Dubbo URLs:
dubbo://192.168.1.6:20880/moe.cnkirito.sample.HelloService?timeout=3000

Arguably, an implementation in any domain can be described as a kind of URL. Dubbo uses the URL to describe metadata and configuration information uniformly throughout the entire framework.

2. Dubbo 2.7

1. URL Structure
In Dubbo 2.7, the structure of the URL is very simple: a single class covers everything, as shown in the following figure.

2. Address push model
Next, let's look at the address push model in Dubbo 2.7. The main performance issues are caused by the following process.

The main process in the above figure is:
(1) The user adds or removes a Provider instance of DemoService (commonly seen during scale-out or scale-in, network fluctuations, and so on);
(2) ZooKeeper pushes all instances of DemoService to the Consumer side;
(3) The Consumer side regenerates every URL in full from the data pushed by ZooKeeper.

With this approach, the impact on the Consumer side is relatively small when there are few Provider instances. When an interface has a large number of Provider instances, however, a large number of URLs are created unnecessarily. Dubbo 3.0 makes a series of optimizations aimed mainly at this push process, which we explain in detail below.

3. Dubbo 3.0

1. URL Structure
Optimizing the address push model is inseparable from optimizing the URL. The following figure shows the new URL structure that Dubbo 3.0 uses as part of optimizing the address push model.

From the figure, we can see that several important attributes of the Dubbo 2.7 URL no longer exist in Dubbo 3.0 and are replaced by the URLAddress and URLParam classes. The original parameters attribute has been moved to params in URLParam, and the other attributes have been moved to URLAddress and its subclasses.
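To make the split concrete, here is a minimal Java sketch of a URL that only holds references to a separate address part and parameter part. The class names (SketchUrlAddress, SketchUrlParam, SketchUrl) and their fields are simplified stand-ins invented for illustration; they are not Dubbo's actual URLAddress and URLParam implementations.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT Dubbo's real classes.

/** Address part: protocol, host, port and path. */
final class SketchUrlAddress {
    final String protocol;
    final String host;
    final int port;
    final String path;

    SketchUrlAddress(String protocol, String host, int port, String path) {
        this.protocol = protocol;
        this.host = host;
        this.port = port;
        this.path = path;
    }
}

/** Parameter part: an immutable key/value map. */
final class SketchUrlParam {
    private final Map<String, String> params;

    SketchUrlParam(Map<String, String> params) {
        this.params = Collections.unmodifiableMap(new HashMap<>(params));
    }

    String get(String key) {
        return params.get(key);
    }
}

/** The URL itself only references the two parts, so either part can be shared. */
final class SketchUrl {
    final SketchUrlAddress address;
    final SketchUrlParam param;

    SketchUrl(SketchUrlAddress address, SketchUrlParam param) {
        this.address = address;
        this.param = param;
    }

    String getParameter(String key) {
        return param.get(key);
    }

    @Override
    public String toString() {
        return address.protocol + "://" + address.host + ":" + address.port + "/" + address.path;
    }
}
```

Because the parameter part is a separate immutable object in this sketch, many URLs that differ only in host can point at one shared instance instead of each carrying its own copy.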
Next, we introduce the three new subclasses of URL. Of these, InstanceAddressURL belongs to application-level address discovery and is not covered in this chapter. The main difference between the other two is that ServiceConfigURL is the URL generated when the program reads its configuration file, while ServiceAddressURL is the URL generated when the registry pushes information (such as providers).

It is worth explaining why the subclass DubboServiceAddressURL exists. In the current structure, ServiceAddressURL has only this one subclass, so all the properties of both could simply live in ServiceAddressURL. So why is the subclass needed? Dubbo 3.0 is designed to be compatible with the HSF framework, so ServiceAddressURL was extracted as an abstraction that the HSF framework can inherit from in the form of HSFServiceAddressURL. That class does not appear in the code at present, so we only mention it briefly here.

So let's discuss why Dubbo 3.0 changed to this data structure, and how the structure relates to the optimization of the address push model.

2. Optimization of address push model
As the class diagram in the previous section shows, although the original attributes have been moved to URLAddress and URLParam, the URL subclasses still carry a few extra attributes. These were added for optimization, so let's look at what each one does.

ServiceConfigURL: This subclass adds the attribute property, which is essentially a redundant copy of the params in URLParam with the value type changed from String to Object. This avoids the cost of converting the format every time the code reads a parameter.

ServiceAddressURL: This subclass and its own subclasses add the overrideURL and consumerURL attributes. consumerURL holds the consumer-side configuration, and overrideURL holds the values written through dynamic configuration in Dubbo Admin. When getParameter() is called on the URL, the lookup priority is overrideURL > consumerURL > urlParam. In Dubbo 2.7, dynamic configuration replaces the attributes inside every URL, and that cost is not negligible when there are a large number of URLs. The overrideURL avoids this cost because all URLs share the same object.
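The lookup priority described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Dubbo's code: the class LayeredUrlSketch and its three plain maps are hypothetical, standing in for overrideURL, consumerURL and urlParam respectively.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the three maps stand in for overrideURL, consumerURL and urlParam.
final class LayeredUrlSketch {
    private final Map<String, String> overrideParams; // shared by all URLs of a service
    private final Map<String, String> consumerParams; // shared consumer-side configuration
    private final Map<String, String> urlParams;      // parameters pushed for this provider

    LayeredUrlSketch(Map<String, String> overrideParams,
                     Map<String, String> consumerParams,
                     Map<String, String> urlParams) {
        // The shared layers are referenced, not copied, so a dynamic configuration
        // change touches one object instead of every URL.
        this.overrideParams = overrideParams;
        this.consumerParams = consumerParams;
        this.urlParams = new HashMap<>(urlParams);
    }

    /** Looks a key up in priority order: override, then consumer, then the URL's own params. */
    String getParameter(String key) {
        String value = overrideParams.get(key);
        if (value != null) {
            return value;
        }
        value = consumerParams.get(key);
        if (value != null) {
            return value;
        }
        return urlParams.get(key);
    }
}
```

In this sketch, changing a value in the shared override map is immediately visible through every URL that references it, which is the effect the text attributes to sharing one overrideURL object.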
Caching is the focus of Dubbo 3.0's URL optimization, and it is also the optimization aimed most directly at the address push model. Let's look at how the multi-level cache is implemented.

The multi-level cache lives mainly in the CacheableFailbackRegistry class, which directly extends FailbackRegistry. Taking ZooKeeper as an example, compare the inheritance structure in Dubbo 2.7 with that in Dubbo 3.0. You can see that CacheableFailbackRegistry adds three cache attributes: stringAddress, stringParam, and stringUrls. The following figure describes the scenarios in which these three caches are used.

This solution caches data along three dimensions (URL string cache, URL address cache, and URL parameter cache). As a result, cached data can be reused in most cases, which reduces the cost of repeated ZooKeeper notifications.
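A minimal sketch of how three such caches might cooperate is shown below. It is an illustration under stated assumptions rather than the CacheableFailbackRegistry code: the class UrlCacheSketch, its nested types, and the rule of splitting the raw string at '?' are all hypothetical, and cache eviction is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a three-level URL cache; not Dubbo's actual implementation.
final class UrlCacheSketch {
    static final class Address {
        final String raw;
        Address(String raw) { this.raw = raw; }
    }

    static final class Param {
        final String raw;
        Param(String raw) { this.raw = raw; }
    }

    static final class Url {
        final Address address;
        final Param param;
        Url(Address address, Param param) {
            this.address = address;
            this.param = param;
        }
    }

    // Level 1: whole raw URL string -> URL object.
    private final Map<String, Url> stringUrls = new HashMap<>();
    // Level 2: address part of the string -> parsed address.
    private final Map<String, Address> stringAddress = new HashMap<>();
    // Level 3: parameter part of the string -> parsed parameters.
    private final Map<String, Param> stringParam = new HashMap<>();

    /** Converts a pushed list of raw URL strings, reusing cached objects where possible. */
    List<Url> toUrls(List<String> rawUrls) {
        List<Url> result = new ArrayList<>();
        for (String raw : rawUrls) {
            Url url = stringUrls.get(raw);
            if (url == null) {
                int q = raw.indexOf('?');
                String addressPart = q < 0 ? raw : raw.substring(0, q);
                String paramPart = q < 0 ? "" : raw.substring(q + 1);
                // Even on a miss of the whole string, either half may already be cached.
                Address address = stringAddress.computeIfAbsent(addressPart, Address::new);
                Param param = stringParam.computeIfAbsent(paramPart, Param::new);
                url = new Url(address, param);
                stringUrls.put(raw, url);
            }
            result.add(url);
        }
        return result;
    }
}
```

With this shape, a full push that adds one provider only creates objects for the new entry; unchanged entries hit the string cache, and providers with identical parameters share one parsed parameter object.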
Besides the optimizations above, there are two smaller ones.

The first is that a URL can be parsed directly from the bytes of the encoded URL string. In Dubbo 2.7, every encoded URL string had to be decoded before it could be parsed into a URL object, so parsing the encoded form directly removes the overhead of the decoding step.

The second is that a delay has been added to the notification mechanism that runs after a URL change. The following figure uses ZooKeeper as an example to explain the implementation details. In this solution, when the Consumer receives a change notification from ZooKeeper, it deliberately sleeps for a period of time. When the sleep ends, only the last change is retained, and the Consumer uses that last change to update the listening instance, which avoids the overhead of creating a large number of URLs for intermediate states.
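The "wait, then keep only the last change" idea can be sketched with a scheduled task. This is a minimal sketch under stated assumptions, not Dubbo's implementation: DelayedNotifierSketch, its constructor parameters, and the 200 ms delay in the demo are all hypothetical, and a scheduled executor is swapped in for the sleep described in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch: coalesces bursts of change notifications into the latest one.
final class DelayedNotifierSketch<T> {
    private final AtomicReference<T> latest = new AtomicReference<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ScheduledExecutorService executor;
    private final long delayMillis;
    private final Consumer<T> listener;

    DelayedNotifierSketch(ScheduledExecutorService executor, long delayMillis, Consumer<T> listener) {
        this.executor = executor;
        this.delayMillis = delayMillis;
        this.listener = listener;
    }

    /** Records the newest snapshot; schedules one delayed delivery if none is pending. */
    void onChange(T snapshot) {
        latest.set(snapshot);
        if (scheduled.compareAndSet(false, true)) {
            executor.schedule(() -> {
                scheduled.set(false);
                T value = latest.getAndSet(null);
                if (value != null) {
                    listener.accept(value); // only the last snapshot reaches the listener
                }
            }, delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Small demo: three rapid changes result in a single delivery of the last one. */
    static List<String> runDemo() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());
        DelayedNotifierSketch<String> notifier =
                new DelayedNotifierSketch<>(executor, 200, delivered::add);
        notifier.onChange("providers-v1");
        notifier.onChange("providers-v2");
        notifier.onChange("providers-v3");
        executor.shutdown(); // the already scheduled delayed task still runs by default
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(delivered);
    }
}
```

The trade-off in a design like this is latency for throughput: the Consumer sees a change up to one delay period later, but a burst of intermediate provider lists never gets turned into URL objects.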
In the old implementation, strings with the same value in different URLs, such as protocol and path, are stored at different addresses in the heap. When there are a large number of providers, the Consumer-side heap fills with duplicate strings, resulting in low memory utilization. Another optimization is therefore provided: string reuse. Its implementation is very simple; let's look at the corresponding code snippet.

public class URLItemCache {

As the code snippet shows, string reuse simply uses a Map to store the cached values. When the same string is used again, the existing object is fetched from the Map and returned to the caller, which reduces the number of duplicate strings in heap memory.

3. Optimize results
Here I quote two figures from the article "Dubbo 3.0 Outlook: Service Discovery Supports Millions of Clusters, Bringing Scalable Microservice Architecture" to illustrate the optimization results.

The figure below simulates the load on the Consumer side caused by continuous changes in interface data when there are 2.2 million Provider interfaces. We can see that the Consumer side is almost entirely occupied by Full GC, which seriously affects performance.

Next, let's look at the stress test results in the same environment after the URL optimizations in Dubbo 3.0, as shown in the following figure. We can clearly see that the number of Full GCs has dropped to only 3, which greatly improves performance. The article contains other comparisons that are not quoted here; interested readers can refer to it directly.

About the Author: Wu Zhiguo, active contributor to the Apache Dubbo community