With microservices, many problems that used to be simple have become complicated, and global ID generation is one of them. I needed this at work recently, so I looked into several common global ID generation strategies and put together a brief comparison for reference.

Once a database has been split into multiple databases and tables, the original auto-increment primary key is no longer convenient to use, and a new approach is needed. That is the situation Song Ge's requirement came from. Let's take a look.

1. Two approaches

Generally speaking, there are two different approaches to this problem: let the database handle it itself, or handle it in Java code.
These two approaches correspond to different solutions. Let's look at them one by one.

2. Letting the database handle it

Letting the database handle it means that when I insert data I still do not think about the primary key and keep relying on the database's auto-increment feature. Obviously, the default auto-increment behavior can no longer be used as is, so we need a new solution.

2.1 Modify database configuration

After splitting, assume that MyCat is used as the database middleware in front of db1, db2, and db3.

If db1, db2, and db3 each keep auto-incrementing their own primary keys, then from MyCat's point of view the primary key is no longer auto-incrementing: keys repeat across the databases, and the data users query through MyCat has conflicting primary keys.

Once the cause is clear, the rest is easy to solve. We can directly modify the starting value and step size of MySQL's primary key auto-increment. First, we can view the values of the two related variables, auto_increment_increment and auto_increment_offset, with SQL such as SHOW VARIABLES LIKE 'auto_increment%'.
As you can see, the starting value and the step size of the auto-increment are both 1. The starting value is easy to change: it can be set when defining the table. The step size is changed through the auto_increment_increment variable, for example with SET @@auto_increment_increment=9; to make the step 9.
After the modification, checking the variable again shows that it has changed. Now when we insert data, the primary key no longer increases by 1 each time but by 9. As for the auto-increment starting value, it is easy to set: you can specify it when creating the table.
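The effect of the two settings is simple arithmetic: with starting value s and step k, successive keys are s, s+k, s+2k, and so on. A minimal Java sketch of that arithmetic follows. SteppedSequence is a hypothetical class written for illustration; it only models the number series and does not talk to MySQL.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: models the series start, start + step, start + 2*step, ... */
final class SteppedSequence {
    private final long step;
    private long next;

    SteppedSequence(long start, long step) {
        if (start < 1 || step < 1) {
            throw new IllegalArgumentException("start and step must be >= 1");
        }
        this.next = start;
        this.step = step;
    }

    /** Returns the current key and advances by one step. */
    long nextId() {
        long id = next;
        next += step;
        return id;
    }

    /** Convenience helper: the first n keys for a given start and step. */
    static List<Long> firstKeys(long start, long step, int n) {
        SteppedSequence seq = new SteppedSequence(start, step);
        List<Long> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keys.add(seq.nextId());
        }
        return keys;
    }
}
```

With a start of 1 and a step of 9, firstKeys(1, 9, 3) yields 1, 10, 19. Three series that share a step of 3 but start at 1, 2, and 3 never produce the same number, which is what makes a per-database start value useful.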
Since MySQL lets us change both the starting value and the step size of auto-increment, suppose I have db1, db2, and db3. I can set the auto-increment starting value of the tables in these three databases to 1, 2, and 3 respectively and the step size to 3, so that the three databases together produce auto-incrementing keys that never collide. However, this approach is not elegant: it is troublesome to set up and inconvenient to expand later (adding another database means changing the step on every existing one), so it is not recommended.

2.2 MySQL+MyCat+ZooKeeper

If you happen to use MyCat as your database and table sharding tool, combining it with ZooKeeper can also give you a globally auto-incrementing primary key. As distributed database middleware, MyCat hides the details of the database cluster so that we can operate the cluster just like a stand-alone database. It has its own solutions for primary key auto-increment:
Here we mainly look at solution 4. The configuration steps are as follows:
server.xml
schema.xml: enable auto-increment for the table and set the primary key to id.
Configure the ZooKeeper information in myid.properties:
sequence_conf.properties: note that the table name must be written in uppercase.
Finally, restart MyCat, drop the previously created table, and create a new table for testing.

This method is more convenient and scales well. If you have chosen MyCat as your database and table sharding tool, this is the best solution.

Both methods introduced above handle primary key auto-increment at the database or database middleware level, so our Java code needs no extra work. Next, let's look at several solutions that have to be handled in Java code.

3. Java code processing

3.1 UUID

The most obvious choice is UUID (Universally Unique Identifier). The standard textual form of a UUID contains 32 hexadecimal digits split into five groups by hyphens, in the form 8-4-4-4-12, for 36 characters in total. It is built into Java and is easy to use. Its biggest advantage is that it is generated locally and consumes no network resources.

However, anyone who has worked as a developer in a company knows that it is not used much in company projects. The reasons are as follows:
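One way to make the usual objections concrete is to look at what Java actually produces. The sketch below is an illustration rather than the author's original list; UuidDemo is a hypothetical helper around the standard java.util.UUID class.

```java
import java.util.UUID;

/** Hypothetical helper showing what a locally generated UUID looks like. */
final class UuidDemo {

    /** Generates a random (version 4) UUID locally; no network or database call is involved. */
    static String newId() {
        return UUID.randomUUID().toString();
    }

    /** The same value without hyphens: only the 32 hexadecimal digits. */
    static String compact(String uuid) {
        return uuid.replace("-", "");
    }
}
```

The result is a 36-character string (32 without hyphens), which is far larger than an 8-byte numeric key, and successive random UUIDs have no ordering, so new rows land at arbitrary positions in a primary key index instead of being appended at the end.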
Therefore, UUID is not the best solution.

3.2 SNOWFLAKE

The Snowflake algorithm is a distributed primary key generation algorithm published by Twitter. It guarantees that primary keys from different processes do not repeat and that primary keys from the same process are ordered.

Within a single process, uniqueness is ensured first by the timestamp bits and, when the timestamp is the same, by the sequence bits. Because the timestamp bits increase monotonically, if the servers' clocks are roughly synchronized, the generated primary keys can be considered roughly ordered across a distributed environment. This keeps inserts into indexed fields efficient, for example the primary key of MySQL's InnoDB storage engine.

In binary, a primary key generated by the Snowflake algorithm has four parts, from high to low: a 1-bit sign bit, a 41-bit timestamp, a 10-bit worker process ID, and a 12-bit sequence number.
Sign bit (1 bit)

The reserved sign bit is always zero.

Timestamp (41 bits)

A 41-bit timestamp can hold 2 to the 41st power milliseconds, and one year is 365 * 24 * 60 * 60 * 1000 milliseconds. Evaluating Math.pow(2, 41) / (365 * 24 * 60 * 60 * 1000L) gives approximately 69.73 years. The epoch of ShardingSphere's snowflake algorithm starts at 0:00 on November 1, 2016, so it can be used until 2086, which should meet the requirements of most systems.

Worker process ID (10 bits)

This ID is unique within a Java process. In a distributed deployment, each worker process should have a different ID. The default value is 0, and it can be set through properties.

Sequence number (12 bits)

This sequence is used to generate different IDs within the same millisecond. If more than 4096 (2 to the power of 12) IDs are requested within one millisecond, the generator waits until the next millisecond to continue generating.

Note: this algorithm has a clock rollback problem. If the server clock is moved backwards, duplicate IDs can be generated. The default distributed primary key generator therefore provides a maximum tolerated clock rollback in milliseconds. If the rollback exceeds that threshold, the program reports an error; if it is within the tolerated range, the generator waits until the clock catches up with the time of the last generated primary key before continuing to work. The default value of the maximum tolerated clock rollback is 0 milliseconds, and it can be set through properties.

Below Song Ge gives a tool class for the snowflake algorithm for reference:
Usage is as follows:
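As a stand-in for those listings, here is a minimal Java sketch of the bit layout described above. It is not the author's tool class and not ShardingSphere's implementation: SnowflakeSketch and its method names are hypothetical, the epoch is taken as midnight UTC on November 1, 2016 (the time zone is an assumption), and clock rollback is handled by simply throwing, which corresponds to a tolerance of 0 milliseconds.

```java
import java.time.Instant;

/** Hypothetical sketch: 1 sign bit | 41 timestamp bits | 10 worker bits | 12 sequence bits. */
final class SnowflakeSketch {
    // Assumed epoch: 2016-11-01T00:00:00Z (the time zone is an assumption).
    static final long EPOCH = Instant.parse("2016-11-01T00:00:00Z").toEpochMilli();
    static final int WORKER_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;   // 1023
    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long workerId;
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            // Clock moved backwards: this sketch tolerates 0 ms and fails fast.
            throw new IllegalStateException("clock moved backwards by " + (lastMillis - now) + " ms");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 4096 sequence values used in this millisecond: spin until the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0L;
        }
        lastMillis = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    /** Extracts the wall-clock time (epoch millis) encoded in an ID. */
    static long timestampOf(long id) {
        return (id >>> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH;
    }

    /** Extracts the worker ID encoded in an ID. */
    static long workerOf(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_WORKER_ID;
    }

    /** Extracts the per-millisecond sequence number encoded in an ID. */
    static long sequenceOf(long id) {
        return id & SEQUENCE_MASK;
    }
}
```

A caller would create one generator per process with a distinct worker ID, for example new SnowflakeSketch(1), and call nextId() whenever a key is needed. The three static helpers only exist to show that each part of the layout can be read back out of the ID.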
3.3 LEAF

Leaf is Meituan's open source distributed ID generation system. It grew out of the order ID generation needs of Meituan's business lines. In Meituan's early days, some businesses generated IDs directly with database auto-increment, some with a Redis cache, and some with UUIDs. Each of these methods has its own problems, so Meituan decided to build a distributed ID generation service to meet the need.

Leaf currently covers many of Meituan Dianping's internal business lines, including finance, catering, takeaway, hotel and travel, and Maoyan Movies. On a 4C8G VM, called through the company's RPC framework, stress tests show a QPS of nearly 50,000/s with a TP999 of 1 ms. (TP stands for Top Percentile, a statistical measure like the mean or the median. TP50, TP90, and TP99 are commonly used in system performance monitoring and refer to the value at the 50th, 90th, and 99th percentile.)

Leaf can currently be used in two different ways: segment mode and SNOWFLAKE mode. You can enable both modes at the same time or only one of them (both are disabled by default).

After cloning Leaf from GitHub, its configuration file is at leaf-server/src/main/resources/leaf.properties. The meaning of each configuration item is as follows:

As you can see, segment mode requires a database, while SNOWFLAKE mode requires ZooKeeper.

3.3.1 Segment mode

Segment mode is still based on the database, but the idea has changed a bit, as follows:
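The general idea behind segment allocation, as Leaf's own documentation describes it, is that the service reserves a whole range of IDs from the database in one update and then hands them out from memory, going back to the database only when the range is used up. A minimal Java sketch under that assumption follows. FakeSegmentStore is a hypothetical in-memory stand-in for the database table, the class and method names are invented for illustration, and Leaf's double-buffer prefetching is omitted.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the database row that records the highest ID reserved so far. */
final class FakeSegmentStore {
    private final AtomicLong maxId = new AtomicLong(0);

    /** Reserves the next `step` IDs in a single call and returns the new maximum. */
    long reserve(int step) {
        return maxId.addAndGet(step);
    }
}

/** Hands out IDs from a locally cached segment and refills it only when exhausted. */
final class SegmentIdGenerator {
    private final FakeSegmentStore store;
    private final int step;
    private long next = 1;      // next ID to hand out
    private long max = 0;       // last ID of the current segment (inclusive)
    private int storeCalls = 0; // how many times the store was contacted

    SegmentIdGenerator(FakeSegmentStore store, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        this.store = store;
        this.step = step;
    }

    synchronized long nextId() {
        if (next > max) {
            // Current segment is used up: reserve (max - step, max] from the store.
            max = store.reserve(step);
            next = max - step + 1;
            storeCalls++;
        }
        return next++;
    }

    synchronized int storeCalls() {
        return storeCalls;
    }
}
```

With a step of 100, generating 1,000 IDs touches the store only 10 times. Two generators that share one store receive disjoint ranges, so IDs stay unique across instances, although they are no longer strictly sequential between them.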
If we use segment mode, we first need to create a data table. The script is as follows:
The meanings of the fields in this table are as follows:
After the configuration is complete, start the project and visit http://localhost:8080/api/segment/get/leaf-segment-test (the leaf-segment-test at the end of the path is the business tag) to get an ID. The monitoring page for segment mode is available at http://localhost:8080/cache.

Advantages and disadvantages of segment mode:

Advantages

Disadvantages
3.3.2 SNOWFLAKE mode

SNOWFLAKE mode needs to be used with ZooKeeper, but its dependency on ZooKeeper is weak. After starting ZooKeeper, we can configure the ZooKeeper information for SNOWFLAKE mode as follows:
Then restart the project. After it starts successfully, an ID can be obtained from the following address:
3.4 Redis generation

This mainly relies on Redis's INCRBY command, and I don't think there is much more to say about it.

3.5 ZooKeeper processing

ZooKeeper can also do this, but it is more troublesome and not recommended.

4. Summary

In summary, if your project already uses MyCat, you can use MyCat+ZooKeeper; otherwise Leaf is recommended, and either of its modes is acceptable.

This article is reprinted from the WeChat public account "江南一点雨". To reprint this article, please contact the 江南一点雨 public account.