Accident review: We duplicated the order ID!

Accident review: We duplicated the order ID!

[[428490]]

introduce

In many business systems, we often encounter the need to generate globally unique distributed IDs, such as IM systems, order systems, etc. So what are the methods for generating globally unique distributed IDs?

UUID

  1. // 3eece1c6-5b57-4bce-a306-6c49e44a1f90
  2. UUID.randomUUID().toString()

"Local generation, fast generation speed, but poor recognition and no order"

Can be used to identify pictures, etc., but cannot be used as a database primary key

Database auto-increment primary key

"When we first started working on the IM system, we created a separate table to obtain the auto-increment ID as the message ID." Creating a separate table to obtain the auto-increment ID will not affect the message sub-library and sub-table

Zookeeper

"Every time you want to generate a new ID, create a persistent sequential node, create the node number returned by the operation, which is the new ID, and then delete the node that is smaller than your own node."

This method can generate fewer IDs because there are fewer digits.

Redis

「This can be achieved using the incr command」

Set a key to userId with a value of 0. Each time you get userId, add 1 to it and get it again.

  1. set userId 0
  2. incr usrId //Return 1

Each time you get an ID, there will be a network interaction process with redis, so it can be improved to the following form

Directly obtain the maximum value of a userId, cache it locally and slowly accumulate it. When the maximum value of the userId is almost reached, obtain another segment. If a user service crashes, at most a small segment of the userId will not be used.

  1. set userId 0
  2. incr usrId //Return 1
  3. incrby userId 1000 //Return 10001

Snowflake Algorithm

"The snowflake algorithm is the most common solution, which satisfies global uniqueness and increasing trend, so it can be used as the database primary key."

The Snowflake algorithm is a distributed primary key generation algorithm published by Twitter. It can ensure the non-duplication of primary keys of different processes and the orderliness of primary keys of the same process.

In the same process, it is first guaranteed to be non-repetitive through the time bit, and if the time is the same, it is guaranteed through the sequence bit. At the same time, because the time bit is monotonically increasing, and if each server is roughly synchronized, the generated primary key can be considered to be generally ordered in a distributed environment, which ensures the efficiency of inserting index fields. For example, the primary key of MySQL's Innodb storage engine.

The primary key generated by the snowflake algorithm has four parts in binary representation, which are divided into 1-bit sign bit, 41-bit timestamp bit, 10-bit work process bit and 12-bit sequence number bit from high to low.

「Sign bit (1 bit)」

The reserved sign bit is always zero.

「Timestamp bit (41bit)」

The number of milliseconds that a 41-bit timestamp can hold is 2 to the 41th power. The number of milliseconds used in a year is: 365 * 24 * 60 * 60 * 1000. By calculation, we know that:

  1. Math.pow(2, 41) / (365 * 24 * 60 * 60 * 1000L);

The result is approximately equal to 69.73 years. The time epoch of ShardingSphere's snowflake algorithm starts at 0:00 on November 1, 2016, and can be used until 2086. I believe it can meet the requirements of most systems.

「Work process bit (10bit)」

This flag is unique within a Java process. If it is a distributed application deployment, the id of each working process should be different. The default value is 0 and can be set through properties.

"Generally, these 10 bits will be split into two 5-bit segments."

The first 5 bits represent the computer room ID, which can represent up to 2^5 computer rooms (32 computer rooms). The last 5 bits represent the machine ID, which can represent 2^5 machines (32 machines) in each computer room.

"Therefore, this service can be deployed on a maximum of 2^10 machines, or 1024 machines."

「Serial number bit (12 bits)」

This sequence is used to generate different IDs in the same millisecond. If the number generated in this millisecond exceeds 4096 (2 to the power of 12), the generator will wait until the next millisecond to continue generating.

Now that we understand the implementation idea, let's implement the algorithm again

  1. public class SnowFlake {
  2.  
  3. /**
  4. * Starting timestamp
  5. */
  6. private final static long START_STMP = 1480166465631L;
  7.  
  8. /**
  9. * The number of bits each part occupies
  10. */
  11. private final static long SEQUENCE_BIT = 12; //Number of digits occupied by the sequence number
  12. private final static long MACHINE_BIT = 5; //Number of bits used by the machine ID
  13. private final static long DATACENTER_BIT = 5; //The number of bits occupied by the data center
  14.  
  15. /**
  16. * The maximum value of each part
  17. */
  18. private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
  19. private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
  20. private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);
  21.  
  22. /**
  23. * The displacement of each part to the left
  24. */
  25. private final static long MACHINE_LEFT = SEQUENCE_BIT;
  26. private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
  27. private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;
  28.  
  29. private long datacenterId; //Data center
  30. private long machineId; //machine ID
  31. private long sequence = 0L; //Serial number
  32. private long lastStmp = -1L; //Last timestamp
  33.  
  34. public SnowFlake(long datacenterId, long machineId) {
  35. if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
  36. throw new IllegalArgumentException( "datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0" );
  37. }
  38. if (machineId > MAX_MACHINE_NUM || machineId < 0) {
  39. throw new IllegalArgumentException( "machineId can't be greater than MAX_MACHINE_NUM or less than 0" );
  40. }
  41. this.datacenterId = datacenterId;
  42. this.machineId = machineId;
  43. }
  44.  
  45. /**
  46. * Generate the next ID
  47. *
  48. * @return  
  49. */
  50. public synchronized long nextId() {
  51. long currStmp = getNewstmp();
  52. // Clock callback occurs
  53. if (currStmp < lastStmp) {
  54. throw new RuntimeException( "Clock moved backwards. Refusing to generate id" );
  55. }
  56.  
  57. if (currStmp == lastStmp) {
  58. //In the same millisecond, the serial number increases automatically
  59. sequence = ( sequence + 1) & MAX_SEQUENCE;
  60. //The number of sequences in the same millisecond has reached the maximum
  61. if ( sequence == 0L) {
  62. currStmp = getNextMill();
  63. }
  64. } else {
  65. //In different milliseconds, the sequence number is set to 0
  66. sequence = 0L;
  67. }
  68.  
  69. lastStmp = currStmp;
  70.  
  71. return (currStmp - START_STMP) << TIMESTMP_LEFT //Timestamp part
  72. | datacenterId << DATACENTER_LEFT //Data center part
  73. | machineId << MACHINE_LEFT //machine identification part
  74. | sequence ; //sequence number part
  75. }
  76.  
  77. private long getNextMill() {
  78. long mill = getNewstmp();
  79. while (mill <= lastStmp) {
  80. mill = getNewstmp();
  81. }
  82. return mill;
  83. }
  84.  
  85. private long getNewstmp() {
  86. return System.currentTimeMillis();
  87. }
  88.  
  89. public   static void main(String[] args) {
  90. SnowFlake snowFlake = new SnowFlake(2, 3);
  91.  
  92. for ( int i = 0; i < (1 << 12); i++) {
  93. System. out .println(snowFlake.nextId());
  94. }
  95.  
  96. }
  97. }

"This code divides the workerid into datacenterid and machineid. If we don't need to make a distinction in our business, we can just use the 10-digit workerid."

WorkerID generation

We can use Zookeeper's ordered nodes to ensure the global uniqueness of the id. For example, I create a permanent ordered node with the following command:

  1. # Create a root node
  2. create /test ''  
  3. # Create a permanent ordered node
  4. create -s /test/ip-port- ''  
  5. # Return Created /test/ip-port-0000000000

"The IP and port can be the IP and port of the application. You set the rules. Just don't repeat them."

The 0000000000 in /test/ip-port-0000000000 is our workerid

Let me tell you about a workerid duplication problem we encountered in our original production environment. The way to generate workid is very concise.

  1. // uid is an ordered persistent node in zookeeper
  2. List<String> pidListNode = zkClient.getChildren( "uid" );
  3. String workerId = String.valueOf(pidListNode. size ());
  4. zkClient.create ( "uid" , new byte[0], CreateMode.PERSISTENT_SEQUENTIAL);

"Can you see why this code causes duplicate workids?"

It uses the number of uid child nodes as the workid. When two applications execute the first line of code at the same time, the number of child nodes is the same, and the obtained workerId will be repeated.

What’s interesting is that this code ran fine for several years until the operation and maintenance team improved the application release efficiency a little bit, and then the online version started to report errors. This is because the application was initially released in serial, but later changed to parallel release.

"When using the snowflake algorithm, clock rollback may occur. It is recommended to use an open source framework, such as Meituan's Leaf."

The snowflake algorithm has been used in many middlewares, such as Seata, to generate a globally unique transaction ID.

This article is reprinted from the WeChat public account "Java Knowledge Hall", written by Li Limin. Please contact the Java Knowledge Hall public account to reprint this article.

<<:  Don’t be too eager to “eat meat” with 5G messaging

>>:  Cure the difficulty of choosing! What are the differences between 5G, Wi-Fi 6, and Wi-Fi 6E?

Recommend

Canada to launch 3500MHz 5G spectrum auction

According to foreign media reports, Canada will l...

Understanding Internet Protocol Security — IPSec

​IPSec (Internet Protocol Security) is a security...

SmartHost adds block storage (large hard drive VPS), 256GB for only $1

The day before yesterday, I received an email fro...

What’s going on? Can I use 5G network without a 5G package?

A few days ago, the 5G logo appeared on the mobil...

6G research should be prepared for a rainy day

Since 5G is still in the development and deployme...