This article is reprinted from the WeChat public account "Program New Vision"; the author is Ugly Fat Man Second Senior Brother. To reprint this article, please contact the WeChat public account "Program New Vision".

Most projects today are moving toward distributed architectures. Once a system becomes distributed, it has to deal with more complex scenarios and solutions. For example, if your system uses Elasticsearch or ZooKeeper clusters, do you understand the "split-brain" phenomenon in a cluster? Do you know how these systems solve it? If not, your understanding of distributed systems is still superficial, and this article is for you. We will use ZooKeeper as the example to show how split-brain is handled in a distributed system.

What is split-brain?

Clustered systems such as Elasticsearch and ZooKeeper share a common feature: they have a "brain". An Elasticsearch cluster has a Master node, and a ZooKeeper cluster has a Leader node. (ZooKeeper is used as the example from here on.)

The Master or Leader is usually chosen by election. When the network is healthy, a Leader is elected without trouble. However, when network communication between two computer rooms fails, the election mechanism may choose a Leader in each network partition. When the network is restored, how should the two Leaders synchronize their data? Which one should the other nodes follow? This is the "split-brain" phenomenon.

In plain terms, split-brain means that one "brain" has split into two or more. Imagine a person with several brains that act independently of each other: the body would flail about and stop obeying orders.

Now that the basic concept is clear, let's use a ZooKeeper cluster to analyze how split-brain occurs.
Split-brain in a ZooKeeper cluster

In practice we rarely encounter split-brain with ZooKeeper, because ZooKeeper takes measures to reduce or avoid it. Those measures are discussed later. For now, assume that ZooKeeper takes no such measures, and see how the problem arises.

Suppose 6 zkServer instances form a cluster, deployed across 2 computer rooms:

[Figure: 6 zkServer instances in 2 computer rooms, 3 in each]

Under normal circumstances the cluster has only one Leader. When the Leader fails, the other five servers elect a new one.

If the network between computer room 1 and computer room 2 fails, and we ignore ZooKeeper's majority mechanism for the moment, the following happens:

[Figure: each computer room has its own Leader]

The three servers in computer room 2 detect that there is no Leader and elect a new one. The original cluster is now divided into two clusters, and two "brains" exist at the same time. This is the split-brain phenomenon.

Because the one original cluster has become two, and both serve clients, the data in the two clusters can become inconsistent over time. When the network is restored, several questions arise: which Leader stays in charge, how the data is merged, and how data conflicts are resolved.

Of course, all of this happens only under our assumption that ZooKeeper takes no measures to prevent split-brain. So how does ZooKeeper actually deal with the problem?

ZooKeeper's majority rule

There are many ways to prevent split-brain, and ZooKeeper uses the "majority rule" by default. Under the majority rule, a zkServer can become Leader only if it obtains more than half of the votes during leader election. In the ZooKeeper source code, this check is implemented by the QuorumMaj class.
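The majority check can be sketched in a few lines of Java. This is a simplified reconstruction based on the names this article mentions (QuorumMaj, containsQuorum, half, and a set of voters); the class name QuorumMajSketch and the main method are illustrative assumptions, and the code should not be read as the verbatim ZooKeeper source.

```java
import java.util.Set;

// Simplified sketch of a majority-quorum verifier, reconstructed from the
// article's description. Not the verbatim ZooKeeper source.
class QuorumMajSketch {
    // Half of the voting members, using integer division (6 / 2 = 3, 5 / 2 = 2).
    private final int half;

    // n is the number of voting servers in the cluster.
    QuorumMajSketch(int n) {
        this.half = n / 2;
    }

    // True only if strictly more than half of the servers are in the set.
    boolean containsQuorum(Set<Long> set) {
        return set.size() > half;
    }

    public static void main(String[] args) {
        QuorumMajSketch six = new QuorumMajSketch(6);
        // 3 votes out of 6 is exactly half, which is not enough.
        System.out.println(six.containsQuorum(Set.of(1L, 2L, 3L)));     // false
        // 4 votes out of 6 is more than half.
        System.out.println(six.containsQuorum(Set.of(1L, 2L, 3L, 4L))); // true
    }
}
```

The strict `>` comparison is the important design choice: two disjoint groups can never both hold more than half of the servers, so at most one of them can elect a Leader.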
The QuorumMaj object is constructed with the number of valid nodes in the cluster. Its containsQuorum method determines whether a zkServer has obtained more than half of the votes, where set.size() is the number of votes that zkServer has received.

Two points matter here: first, how half is calculated; second, how the vote count is compared against half (it must be strictly greater).

Take the 6 servers in the figure above as an example: half = 6 / 2 = 3, so a server needs at least 4 votes to become Leader. When the network between the two computer rooms is cut, each room has only 3 servers, so neither side can elect a Leader. The entire cluster then has no Leader.

[Figure: neither computer room can elect a Leader]

Without a Leader, ZooKeeper cannot serve clients, so this layout must be avoided when designing and building a cluster.

Now suppose the two computer rooms are deployed 3:2 instead of 3:3, that is, three servers in computer room 1 and two servers in computer room 2. In this case half = 5 / 2 = 2 (integer division), so more than 2 votes are needed to elect a Leader. Computer room 1 can elect a Leader normally. Computer room 2 has only 2 servers and cannot. The entire cluster therefore has exactly one Leader.

The same holds when the layout is reversed. If computer room 1 has two servers and computer room 2 has three, then after a network partition computer room 2 can elect a Leader and computer room 1 cannot.

With the majority mechanism, a ZooKeeper cluster ends up with either no Leader or exactly one Leader, which avoids the split-brain problem.

Besides preventing split-brain, the majority mechanism also makes elections fast.
Because the majority mechanism does not need to wait for every zkServer to vote for the same candidate before a Leader is elected, it is also called the fast leader election algorithm.

Competition between old and new Leaders

The majority rule prevents split-brain caused by a partition between computer rooms, but there is another case: a Leader that only appears to be dead.

Suppose a Leader appears to die and the remaining followers elect a new Leader. The old Leader then comes back and still considers itself the Leader. Its write requests to the followers will be rejected.

This works because ZooKeeper maintains a variable called epoch. Every time a new Leader is elected, the epoch is incremented, and the new value identifies that Leader's term of office. Once followers have acknowledged the new Leader and know its epoch, they reject every request whose epoch is lower than the current Leader's epoch.

Can some followers be unaware of the new Leader? It is possible, but they can never be a majority, otherwise the new Leader could not have been elected. ZooKeeper writes also follow the quorum mechanism, so a write that does not get majority support is invalid. Even if the old Leader still believes it is the Leader, it has no effect.

Why should ZooKeeper cluster nodes be deployed in an odd number?

The majority rule was described above. Since ZooKeeper uses this strategy by default, another question follows: how many nodes should a cluster have? The ZooKeeper clusters we see usually have an odd number of nodes. Why?

First, as long as more than half of the machines in the cluster are working properly, the cluster as a whole can serve clients. With that in mind, let's list a few cluster sizes and look at the fault tolerance of each.

If there are 2 nodes, the cluster becomes unavailable as soon as 1 node fails.
At this point, the fault tolerance of the cluster is 0.

If there are 3 nodes and 1 node goes down, 2 working nodes remain, which is more than half, so a re-election can take place and service continues. The fault tolerance of the cluster is 1.

If there are 4 nodes and 1 node goes down, 3 nodes remain, which is more than half, so a re-election can take place. If another node goes down and only 2 nodes remain, election and service can no longer proceed. The fault tolerance of the cluster is still 1.

By the same reasoning, the fault tolerance of 5 nodes is 2, and the fault tolerance of 6 nodes is also 2.

The fault tolerance of 3 nodes and 4 nodes is the same, as is that of 5 nodes and 6 nodes. In general, clusters of 2n-1 and 2n nodes both tolerate n-1 failures. The extra node adds no fault tolerance, but it does add one more participant in elections and communication. To save resources and keep the cluster efficient, there is no reason to add it. This is why clusters should be deployed with an odd number of nodes.

Common ways to resolve split-brain

The majority rule used by ZooKeeper was described above. Here we summarize the common methods for dealing with the split-brain problem.

Method 1: Quorums

For a cluster with 3 nodes, Quorums = 2, which means the cluster can tolerate the failure of 1 node. One Leader can still be elected and the cluster remains available. For a cluster with 4 nodes, Quorums = 3: a quorum needs at least 3 nodes (more than half of 4), so the fault tolerance of the cluster is still 1. If 2 nodes fail, the entire cluster becomes unavailable. This is the method ZooKeeper uses by default to prevent split-brain.

Method 2: Add a heartbeat line

Use more than one communication path in the cluster, so that the failure of a single path does not leave the nodes unable to communicate. For example, add a second heartbeat line. With only one heartbeat line, if it is disconnected, heartbeat reports can no longer be received and the other party is judged to be dead.
If there are two heartbeat lines and one is disconnected, heartbeat reports can still be received over the other, which keeps the cluster running normally. The heartbeat lines can themselves be set up for HA (high availability): the two lines monitor each other, and the standby line takes over as soon as the active one is disconnected. Under normal conditions the standby line stays idle, which saves resources.

Method 3: Enable disk locks

With a disk lock, only one Leader in the cluster can hold the lock and serve clients, which avoids data confusion. There is a problem, however: if the Leader node goes down, it cannot release the lock on its own initiative, and the other followers will never be able to obtain the shared resources. For this reason a "smart" lock was designed for HA setups. The party currently providing service enables the disk lock only when it finds that all heartbeat lines are disconnected (that is, it can no longer detect the other party). At all other times the lock is not held.

Method 4: Arbitration mechanism

One consequence of split-brain is that follower nodes do not know which Leader to connect to. An arbitrator can solve this. For example, a reference IP address is provided. When the heartbeat is lost, each node pings the reference IP. If the ping fails, the problem lies with that node's own network. The node must then withdraw from the competition for resources, release the shared resources it holds, and leave service to a node that is in better working condition.

These methods can be combined to reduce the occurrence of split-brain in a cluster, but they cannot rule it out completely. For example, if two machines in the arbitration mechanism fail at the same time, the cluster has no usable Leader, and manual intervention is required.
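The quorum arithmetic used throughout this article (quorum sizes, fault tolerance for 2 to 6 nodes, and the 3:3 versus 3:2 partition examples) can be checked with a small sketch. The class QuorumMath and its methods quorumSize, tolerance, and canElectLeader are hypothetical names introduced only for illustration; none of them is part of ZooKeeper.

```java
// Illustrative helper for the quorum arithmetic discussed in this article.
// All names here are hypothetical; nothing below is ZooKeeper API.
class QuorumMath {
    // Smallest number of votes that is strictly more than half of n.
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    // Number of nodes that may fail while a quorum can still be formed.
    static int tolerance(int n) {
        return n - quorumSize(n);
    }

    // Whether a partition that can reach `reachable` of `n` nodes can elect a Leader.
    static boolean canElectLeader(int reachable, int n) {
        return reachable >= quorumSize(n);
    }

    public static void main(String[] args) {
        // Prints quorum size and fault tolerance for cluster sizes 2 through 6.
        for (int n = 2; n <= 6; n++) {
            System.out.println(n + " nodes: quorum=" + quorumSize(n)
                    + ", tolerance=" + tolerance(n));
        }
        // 6 nodes split 3:3, then 5 nodes split 3:2.
        System.out.println("3:3 split: " + canElectLeader(3, 6) + " / " + canElectLeader(3, 6));
        System.out.println("3:2 split: " + canElectLeader(3, 5) + " / " + canElectLeader(2, 5));
    }
}
```

Under these definitions, clusters of 3 and 4 nodes both tolerate 1 failure, and clusters of 5 and 6 nodes both tolerate 2, which matches the argument for odd-sized clusters. In the 3:3 split neither side can elect a Leader; in the 3:2 split only the side with 3 nodes can.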
Summary

We often say that our systems are distributed, but do we really understand the scenarios and solutions that come with distribution? Did the analysis of the split-brain scenario and its solutions in this article teach you something new? Let's keep learning together.