ZooKeeper achieves eventual consistency for distributed transactions through the ZAB consistency protocol.

ZAB Protocol Introduction

ZAB stands for ZooKeeper Atomic Broadcast. It is a consistency protocol with crash-recovery support, designed specifically for the distributed coordination service ZooKeeper. Based on this protocol, ZooKeeper implements a master-slave system architecture that keeps the replicas in the cluster consistent.

ZAB's message broadcast uses an atomic broadcast protocol that resembles two-phase commit. For each client request, the Leader server generates a corresponding transaction proposal and sends it to all Follower servers in the cluster. It then collects their votes and finally commits the transaction, as shown in the figure.

Compared with classic two-phase commit, ZAB removes the abort logic: every Follower server either acknowledges the Leader's transaction proposal normally or abandons the Leader server. In addition, the Leader can start committing a proposal as soon as more than half of the Follower servers have responded with an ACK, rather than waiting for all of them.

The Leader server assigns each transaction proposal a globally monotonically increasing ID called the transaction ID (ZXID). Because ZAB must preserve the strict causal order of messages, proposals are processed in ZXID order.

During message broadcast, the Leader server allocates a queue for each Follower server, puts the transaction proposals into these queues in turn, and sends them according to a FIFO policy. After receiving a transaction proposal, each Follower server writes it to local disk as a transaction log entry and, once the write succeeds, returns an ACK to the Leader server.
When the Leader server receives ACKs from more than half of the Follower servers, it sends a COMMIT message and completes the transaction commit. After receiving the COMMIT message, each Follower server commits the transaction as well.

The atomic broadcast protocol is used to guarantee the consistency of distributed data: a committed transaction is stored consistently on more than half of the nodes.

Message Broadcast

You can think of the message broadcast mechanism as a simplified version of the 2PC protocol. It ensures the sequential consistency of transactions through the following mechanism.

When a client submits a transaction request, the Leader node generates a transaction Proposal for the request and sends it to all Follower nodes in the cluster. Once it has received feedback from more than half of the Follower nodes, it commits the transaction.

Because ZAB needs Acks from only more than half of the Follower nodes before committing, data can be left inconsistent across nodes if the Leader node crashes. ZAB uses crash recovery to handle this inconsistency.

Message broadcast uses TCP for communication, which preserves the order in which transactions are sent and received.

When broadcasting, the Leader node assigns a globally increasing ZXID (transaction ID) to each transaction Proposal, and Proposals are processed in ZXID order.

The Leader node allocates a queue for each Follower node, places transactions into the queue in ZXID order, and sends them according to the queue's FIFO rule.

After receiving a transaction Proposal, the Follower node writes the transaction to local disk as a transaction log entry and, on success, returns an Ack message to the Leader node.
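The broadcast flow described so far (proposal, per-Follower FIFO queue, Ack after the log write, commit on a majority of Acks) can be sketched in a few lines of Python. This is a toy in-memory model under simplifying assumptions, not ZooKeeper code: the `Leader` and `Follower` classes and their method names are hypothetical, the "disk" is a list, the network is a direct method call, and a plain integer stands in for the ZXID.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical in-memory follower: logs proposals, then applies commits."""
    name: str
    alive: bool = True
    log: list = field(default_factory=list)        # stands in for the on-disk transaction log
    committed: list = field(default_factory=list)  # zxids applied after a COMMIT

    def on_proposal(self, zxid, data):
        if not self.alive:
            return False              # an unreachable follower sends no Ack
        self.log.append((zxid, data))
        return True                   # Ack only after the log write succeeds

    def on_commit(self, zxid):
        if self.alive:
            self.committed.append(zxid)


class Leader:
    def __init__(self, followers):
        self.followers = followers
        # One FIFO queue per follower, filled in zxid order.
        self.queues = {f.name: deque() for f in followers}
        self.next_zxid = 1            # simplified: a plain counter instead of a 64-bit ZXID
        self.committed = []

    def broadcast(self, data):
        zxid = self.next_zxid
        self.next_zxid += 1
        # Phase 1: enqueue the proposal for every follower, then drain each queue in FIFO order.
        for f in self.followers:
            self.queues[f.name].append((zxid, data))
        acks = 0
        for f in self.followers:
            queue = self.queues[f.name]
            while queue:
                z, d = queue.popleft()
                if f.on_proposal(z, d):
                    acks += 1
        # Phase 2: commit once more than half of the followers have Acked.
        if acks > len(self.followers) // 2:
            self.committed.append(zxid)
            for f in self.followers:
                f.on_commit(zxid)
            return True
        return False                  # no abort message is sent; the proposal just stays uncommitted


followers = [Follower("f1"), Follower("f2"), Follower("f3", alive=False)]
leader = Leader(followers)
print(leader.broadcast("set /a 1"))   # True: 2 of 3 followers Acked
```

Note that the sketch drops messages for a dead follower instead of retrying them, and it leaves an uncommitted proposal in the logs of the followers that received it. That leftover state is exactly what crash recovery has to clean up.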
The Leader commits the transaction after receiving Ack feedback from more than half of the Follower nodes, and at the same time broadcasts a Commit message to all Follower nodes. After receiving the Commit, each Follower node commits the transaction.

Crash recovery

If the Leader crashes during message broadcast, can data consistency still be guaranteed? When the Leader crashes, the cluster enters crash recovery mode, which mainly has to handle the following two situations.
To address this issue, ZAB defines two principles:
How does ZAB ensure that transactions already committed by the Leader are committed everywhere, and that transactions that were skipped are discarded? The core of the answer is the ZXID. When recovering after a crash, the largest ZXID is selected as the snapshot to recover from. The advantage is that the separate checks for committing and discarding transactions can be omitted, which improves efficiency.

Data Synchronization

After the leader election completes and before normal work starts, the Leader server confirms whether all transaction proposals in its transaction log (that is, the committed ones) have been committed by more than half of the machines, in other words, whether data synchronization is complete. The data synchronization process of the ZAB protocol is as follows.

The Leader server prepares a queue for each Follower server. It sends the transactions that a Follower has not yet synchronized to that Follower one by one as transaction proposals, and follows each proposal message with a commit message to indicate that the transaction has been committed.

Once a Follower server has synchronized all of its missing transaction proposals from the Leader server and applied them to its local database, the Leader server adds that Follower to the list of truly available Followers.

ZXID's Design

A ZXID is a 64-bit number, as shown in the figure below.

The lower 32 bits are a simple monotonically increasing counter: each time the Leader server generates a new transaction proposal, it adds 1 to the counter.

The upper 32 bits hold the epoch, which distinguishes different Leader servers. Each time a new Leader server is elected, it takes the largest ZXID from its local log, extracts the epoch value from it, adds 1, and uses the result as the new epoch. The lower 32 bits are then reset, and ZXIDs are generated starting from 0.
(As I understand it, the epoch identifies a Leader server's term. Each time a new Leader server is elected, the epoch value is updated, indicating that the new Leader handles transaction requests during that period.) The ZAB protocol uses epoch numbers to distinguish changes of Leader term, which effectively prevents different Leader servers from using the same ZXID.

Below is the core code that my Leader node uses to generate the zxid.
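Independently of any particular implementation, the 64-bit layout described above (epoch in the upper 32 bits, counter in the lower 32 bits) can be illustrated with a short Python sketch. This is not the ZooKeeper source and not the author's code. All function names are hypothetical, and counter overflow within one epoch is deliberately ignored.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1   # 0xFFFFFFFF, selects the lower 32 bits


def make_zxid(epoch, counter):
    """Pack the epoch into the upper 32 bits and the counter into the lower 32 bits."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def epoch_of(zxid):
    """Upper 32 bits: which leader term produced this zxid."""
    return zxid >> COUNTER_BITS


def counter_of(zxid):
    """Lower 32 bits: position of the proposal within that term."""
    return zxid & COUNTER_MASK


def next_zxid(zxid):
    """Same leader, new proposal: add 1 to the counter (overflow not handled here)."""
    return make_zxid(epoch_of(zxid), counter_of(zxid) + 1)


def first_zxid_of_new_leader(largest_logged_zxid):
    """New leader: take the epoch of the largest logged zxid, add 1, restart the counter at 0."""
    return make_zxid(epoch_of(largest_logged_zxid) + 1, 0)


z = make_zxid(3, 7)
print(hex(z))                            # 0x300000007
print(hex(next_zxid(z)))                 # 0x300000008
print(hex(first_zxid_of_new_leader(z)))  # 0x400000000
```

Because the epoch occupies the high bits, every ZXID issued by a newer Leader compares greater than any ZXID issued by an older one, so ordinary integer comparison is enough to order proposals across Leader changes.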
ZAB protocol implementation

The process of writing data

Below I have sorted out the process of writing data in the ZooKeeper source code, as shown in the following figure:

References

https://www.cnblogs.com/veblen/p/10985676.html
https://zookeeper.apache.org