Introduction

Many business systems, such as IM systems and order systems, need to generate globally unique IDs in a distributed environment. What are the common ways to generate them?

UUID

"Generated locally and very quickly, but hard to read and unordered."

UUIDs work well for identifying things like pictures, but they are a poor fit for a database primary key.

Database auto-increment primary key

"When we first started building our IM system, we created a separate table and used its auto-increment ID as the message ID."

Because the auto-increment ID comes from its own table, this approach does not interfere with sharding the message data across databases and tables.

Zookeeper

"Every time you want a new ID, create a persistent sequential node; the node number returned by the create operation is the new ID. Then delete the nodes whose numbers are smaller than your own."

This method can produce only a limited number of IDs because the node number has relatively few digits.

Redis

「This can be achieved with the INCR command」

Create a key named userId with an initial value of 0. Each time an ID is needed, increment the key by 1 and use the returned value.
Every ID obtained this way costs a network round trip to Redis, so the scheme can be improved as follows: reserve a whole segment of userId values in one call by obtaining the segment's maximum value, cache the segment locally, and hand out IDs from it one at a time. When the segment is almost used up, fetch another one. If a user service instance crashes, at most one small segment of userId values goes unused.
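The segment idea can be sketched in Java as below. This is a minimal illustration, not a production allocator: the class name SegmentIdAllocator and the step parameter are hypothetical, and an in-memory AtomicLong stands in for a Redis INCRBY call on the userId key. For brevity the sketch fetches a new segment only once the current one is fully exhausted, rather than prefetching when it is nearly used up.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

// Hands out IDs from a locally cached segment; contacts the shared counter
// only once per `step` IDs.
class SegmentIdAllocator {
    // Assumption: adds `step` to the shared counter and returns the new value,
    // the way a Redis INCRBY on the userId key would.
    private final LongUnaryOperator incrBy;
    private final long step;
    private long next = 1; // next ID to hand out
    private long max = 0;  // last ID of the current segment; 0 means "no segment yet"

    SegmentIdAllocator(LongUnaryOperator incrBy, long step) {
        this.incrBy = incrBy;
        this.step = step;
    }

    synchronized long nextId() {
        if (next > max) {
            // Segment exhausted: reserve the range (max - step, max] in one call.
            max = incrBy.applyAsLong(step);
            next = max - step + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong fakeRedis = new AtomicLong(0); // stand-in for the Redis key
        SegmentIdAllocator serviceA = new SegmentIdAllocator(fakeRedis::addAndGet, 1000);
        SegmentIdAllocator serviceB = new SegmentIdAllocator(fakeRedis::addAndGet, 1000);

        System.out.println(serviceA.nextId()); // 1    (A reserved 1..1000)
        System.out.println(serviceB.nextId()); // 1001 (B reserved 1001..2000)
        System.out.println(serviceA.nextId()); // 2    (served from A's local cache)
    }
}
```

IDs from different instances interleave, so they are unique but not globally ordered; if serviceA crashed here, the IDs 3 through 1000 would simply never be issued.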
Snowflake Algorithm

"The snowflake algorithm is the most common solution. It provides global uniqueness and a generally increasing order, so its IDs can be used as database primary keys."

The Snowflake algorithm is a distributed primary key generation algorithm published by Twitter. It guarantees that primary keys from different processes never collide and that primary keys from the same process are ordered. Within one process, uniqueness is guaranteed first by the timestamp bits and, when the timestamp is the same, by the sequence bits. Because the timestamp increases monotonically, and provided the servers' clocks are roughly synchronized, the generated keys can be treated as generally ordered across a distributed environment. That keeps inserts into indexed columns efficient, for example the primary key of MySQL's InnoDB storage engine.

In binary, a primary key generated by the snowflake algorithm has four parts, from high to low: a 1-bit sign bit, a 41-bit timestamp, a 10-bit worker process ID, and a 12-bit sequence number.

「Sign bit (1 bit)」

The reserved sign bit is always zero.

「Timestamp bits (41 bits)」

A 41-bit timestamp can hold 2^41 milliseconds. One year contains 365 * 24 * 60 * 60 * 1000 milliseconds. By calculation, we know that:

2^41 / (365 * 24 * 60 * 60 * 1000) ≈ 69.73
The result is approximately 69.73 years. The epoch of ShardingSphere's snowflake algorithm starts at 0:00 on November 1, 2016, so it can be used until 2086, which should meet the needs of most systems.

「Worker process bits (10 bits)」

This value is unique within a Java process. In a distributed deployment, every worker process must have a different ID. The default value is 0, and it can be set through properties.

"Generally, these 10 bits are split into two 5-bit segments."

The first 5 bits are the data center ID, which can represent up to 2^5 (32) data centers. The last 5 bits are the machine ID, which can represent 2^5 (32) machines in each data center.

"Therefore, this service can be deployed on at most 2^10 machines, that is, 1024 machines."

「Sequence number bits (12 bits)」

The sequence number distinguishes IDs generated within the same millisecond. Once 4096 (2 to the power of 12) IDs have been generated in one millisecond, the generator waits until the next millisecond before continuing.

Now that we understand the idea, let's implement the algorithm.
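A minimal sketch of such a generator in Java follows. It uses the bit layout described above with the 10 worker bits split into a 5-bit data center ID and a 5-bit machine ID. The class name SnowflakeIdGenerator is hypothetical, the epoch is assumed to be 2016-11-01 in UTC, and a clock rollback is handled simply by throwing, so treat it as an illustration rather than a hardened implementation.

```java
import java.time.Instant;

class SnowflakeIdGenerator {
    // Assumption: custom epoch of 2016-11-01 00:00 UTC.
    private static final long EPOCH = Instant.parse("2016-11-01T00:00:00Z").toEpochMilli();

    private static final int DATACENTER_BITS = 5;
    private static final int MACHINE_BITS = 5;
    private static final int SEQUENCE_BITS = 12;

    private static final long MAX_DATACENTER_ID = (1L << DATACENTER_BITS) - 1; // 31
    private static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1;       // 31
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;       // 4095

    private static final int MACHINE_SHIFT = SEQUENCE_BITS;                       // 12
    private static final int DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS;     // 17
    private static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    private final long datacenterId;
    private final long machineId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    SnowflakeIdGenerator(long datacenterId, long machineId) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER_ID) {
            throw new IllegalArgumentException("datacenterId must be in [0, " + MAX_DATACENTER_ID + "]");
        }
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in [0, " + MAX_MACHINE_ID + "]");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock rollback: refuse to generate rather than risk duplicates.
            throw new IllegalStateException("Clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            // Same millisecond: advance the sequence; on overflow wait for the next millisecond.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                now = waitForNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    // Busy-waits until the system clock passes the given millisecond.
    private static long waitForNextMillis(long last) {
        long now = System.currentTimeMillis();
        while (now <= last) {
            now = System.currentTimeMillis();
        }
        return now;
    }

    public static void main(String[] args) {
        SnowflakeIdGenerator generator = new SnowflakeIdGenerator(1, 1);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId());
        }
    }
}
```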
"This code divides the workerid into datacenterid and machineid. If we don't need to make a distinction in our business, we can just use the 10-digit workerid." WorkerID generationWe can use Zookeeper's ordered nodes to ensure the global uniqueness of the id. For example, I create a permanent ordered node with the following command:
"The IP and port can be the IP and port of the application. You set the rules. Just don't repeat them." The 0000000000 in /test/ip-port-0000000000 is our workerid Let me tell you about a workerid duplication problem we encountered in our original production environment. The way to generate workid is very concise.
"Can you see why this code causes duplicate workids?" It uses the number of uid child nodes as the workid. When two applications execute the first line of code at the same time, the number of child nodes is the same, and the obtained workerId will be repeated. What’s interesting is that this code ran fine for several years until the operation and maintenance team improved the application release efficiency a little bit, and then the online version started to report errors. This is because the application was initially released in serial, but later changed to parallel release. "When using the snowflake algorithm, clock rollback may occur. It is recommended to use an open source framework, such as Meituan's Leaf." The snowflake algorithm has been used in many middlewares, such as Seata, to generate a globally unique transaction ID. This article is reprinted from the WeChat public account "Java Knowledge Hall", written by Li Limin. Please contact the Java Knowledge Hall public account to reprint this article. |