This article is reprinted from the WeChat public account "The Road to Advanced Architecture", written by Zhang Zhang. Please contact the WeChat public account "The Road to Advanced Architecture" to reprint this article. Hello, everyone. I am Zhang Zhang, the author of the public account "The Road to Improving Architecture". I recently talked with my teammates about the consensus algorithm Raft, so I looked up the previously published articles and reviewed them. 1. Overview of Raft AlgorithmWhen we have only one service node, there is no problem of node consensus. When there are multiple different service nodes, the problem of distributed consistency will be introduced. Raft is a protocol for achieving distributed consensus. Consensus means that multiple nodes reach a consensus on something, even in the event of some node failures, network delays, or network partitions. Main application scenarios:
What is the main problem to be solved? Distributed storage systems usually improve system availability by maintaining multiple copies, but the cost is one of the core issues of distributed storage systems: maintaining data consistency among multiple copies. 2. Raft Algorithm Implementation ProcessTo improve understanding, Raft divides the consistency algorithm into several parts, including leader selection, log replication, and safety, and uses stronger consistency to reduce the states that must be considered. This article uses a short story as an example to facilitate quick understanding. 2.1 Leader ElectionThe department needs to set up a new service team, and now there are three students A, B, and C. In order to facilitate the unified allocation of resources and management needs in the later stage, it is now necessary to elect a group leader from among the three students. A feels that he is capable of being a leader, so he says to B and C, "Come and vote for me, I want to be a leader." At this time, A becomes a candidate and has cast a vote for himself in advance. 1) If B and C have never thought about becoming a leader, they will say "OK, vote for you" → A gets 3 votes and is elected leader 2) If B had thought about becoming the leader before, B voted for himself and C voted for A → A received 2 votes (more than half of the 3 votes) and was elected leader 3) If B and C have voted for themselves → A, B, and C each get their own vote, the election fails and a new one is initiated 4) If B had thought about becoming the leader before, and C had already voted for B → B received 2 votes (more than half of the 3 people) and was elected leader From the above election process, we can find that a node is definitely in one of the following three states at any time:
The transition process of these three states is shown in the following figure: Election Process Step 1: Follower becomes Candidate If Followers cannot hear the Leader's opinion, they can become Candidates. Step 2: Candidates win votes Vote for yourself and send a voting request to other nodes. The nodes will respond after receiving the request. Step 3: Wait for other nodes to reply If a candidate receives more than half of the votes from the nodes (including its own vote), it becomes the Leader. If the candidate is informed that a leader has been generated, it will switch to follower automatically If more than half of the votes are not received within a period of time, the candidate status will be maintained and the election will be re-initiated. Step 4: Candidate wins the election The new Leader will immediately send a message to all nodes to prevent other nodes from triggering a new election. 2.2 Log SynchronizationAfter the leader election in 2.1 above, the group leader has been selected. Here we assume that A has been elected as the leader. It can take on some operation tasks proposed by the docking party (called the client). It is stipulated that each demand connection must be approved by the group leader. When an employee makes an operation request, the leader records it after receiving it and synchronizes it with other students in the group. Only after other students have confirmed this demand will the leader confirm the operation and synchronize the execution result to the employee (Follower node). Log Replication After the Leader election process, a new Leader node is generated, and all changes to the system must be implemented through the Leader node. Step 1: Leader appends log entry Every change to the system is added as an entry to the node's log Step 2: Leader issues Append Entries RPC in parallel and waits for response The Leader will wait until more than half of the nodes have written the entry, the Leader node commits it, and then the Leader notifies the Follower that the entry has been submitted. Step 3: The leader gets the majority of responses and applies entry to the state machine State machine: It can be understood as a deterministic application. The so-called determinism means that as long as the input is the same, any state machine will calculate the same output. Step 4: Leader replies to Client and notifies Follower to apply log The cluster has now reached a consensus on the system status Schematic diagram of log-based replicated state machine: Several questions about the application process Q1 What if the Client requests access to the Follower node? Answer: The Follower node will forward the request to the Leader node. Q2 What should be done when the logs of the Leader and Follower are inconsistent? answer: 1) The Leader handles log inconsistencies by forcing Followers to copy its logs. Inconsistent logs on Followers will be overwritten by the Leader’s log. 2) In order to make the Followers’ logs consistent with its own, the Leader needs to find the place where the Followers’ logs are consistent with its own, and then overwrite the Followers’ entries after that location. 3) The Leader will try from the back to the front, and try the previous log entry each time AppendEntries fails, until it successfully finds the consistent position of each Follower's log, and then overwrites the entries of the Followers after that position one by one. 2.3 Security AssuranceIn order to ensure the stability of the team's operation, there are several default requirements: 2.3.1 Election Security That is, at most one leader is elected in any term. If there is more than one leader in the system at the same time, it is called a brain split, which will result in data overwrite and loss. A team is only allowed to have one leader at a time (except in special circumstances such as election failure). Otherwise, multiple leaders handling demands and issuing orders at the same time may easily lead to inconsistent pace within the team. In Raft, two points guarantee this property: 1) A node can only cast one vote at most during a term; 2) Only the node that obtains the majority vote will become the leader. 2.3.2 Log Matching Integrity If two students in the same team are currently working on the same task, then their previous work records should also be consistent. That is: the same initial state + the same operation = the same end state The Leader encapsulates the client requests into log entries and copies these log entries to other Follower nodes. Everyone applies these requests in sequence, so the final state is definitely consistent. Conclusion of Raft log synchronization: 1) If two entries in different logs have the same index and term, then the commands they store are the same. 2) If two entries in different logs have the same index and term, then all previous entries are exactly the same. 2.3.3 Leader Data Integrity The successor leader of the team must be aware of the previous work content of the team, because all work records during the leader’s tenure will be handed over. If a log entry is committed in a certain term, then this log entry will definitely appear in the logs of all higher-term leaders. Raft log overwriting rules: 1) A log is considered committed only when it is copied to the majority node 2) A node can become a leader only after receiving majority votes, and one of the prerequisites for node A to vote for node B is that B’s log cannot be older than A’s log. ConclusionAll algorithm implementation principles are actually reflections of real social working models. Understanding complex consistency algorithms by connecting them to actual cases in life can help us achieve twice the result with half the effort. This article aims to give you a simple introduction to the raft protocol. If you are interested in learning more, I recommend two good links: 1) Raft visual test and Raft implemented in various languages: https://raft.github.io/ 2) Raft Algorithm - Animated Demo (a good introductory tutorial): http://thesecretlivesofdata.com/raft/ |
<<: What exactly does the Communications Design Institute do?
>>: By 2028, the global 5G infrastructure market will reach US$80.5 billion
Private 5G is the next evolution of networks for ...
As traditional industrial control systems and equ...
The strategic combination of 5G and WiFi6 network...
edgeNAT has released the latest promotion for Aug...
On November 25-26, 2016, the WOT 2016 Big Data Te...
Keepalive is a high-availability component that i...
background Technology, life, opinions, originalit...
On February 14, the Ministry of Industry and Info...
As one of the earliest attempts at 5G commercial ...
Earlier this month, we shared the news that HostY...
With the changes in traffic flows used in modern ...
A few days ago, a netizen left a message asking h...
5G has been commercially available for a year, an...
[51CTO.com original article] "Huawei Enterpr...
Preface Pods can communicate with each other with...