Hello, everyone, I am Zhibeijun. Almost all large websites are distributed systems, and distributed systems keep growing in importance. You will also find that many of the systems you use every day already put distributed theory into practice.

Preface

A distributed system has three main properties: consistency, availability, and partition tolerance.

Consistency means that after an update succeeds, every node holds exactly the same data at the same moment.

Availability means that the system returns data within a normal response time whenever a user accesses it.

Partition tolerance means that the system can still provide service when a node fails or the network splits into partitions.

The CAP theorem says that these three properties cannot all be fully achieved at once. At most two of them can be satisfied at the same time, so the only possible combinations are CA, CP, and AP.

1. Partition Tolerance

A distributed system has multiple subsystems, and the subnetwork each subsystem runs in can be called a zone (a partition). Data has to be synchronized between nodes, and the subsystems also have to communicate with each other. Partition tolerance means that if a node fails, the service as a whole is not seriously affected.

If a service runs on a single node, it can be said to satisfy CA. C holds because the data read at any point in time is the same. A holds because the service can be provided as long as the node is available. With only one node there is nothing to partition, so the CA conditions are always met.

In a distributed system, we almost always need the whole system to remain usable when a single node fails. Therefore, P in CAP can be treated as a requirement that must always hold.

2. Consistency

Consistency means that when all nodes in a distributed system are accessed at the same time, they return exactly the same data. From the client's perspective, this is about whether concurrent accesses see consistent data.
From the server's perspective, it is a synchronization problem between data nodes: the nodes communicate to keep their data up to date.

There are three levels of consistency: strong, weak, and eventual.

Strong consistency requires that once data is updated, the change is immediately synchronized and visible on all other nodes.

Weak consistency allows some nodes to return stale data after an update.

Eventual consistency allows stale reads for a period of time, but requires that all data in the distributed system converges to the same value afterwards.

3. Availability

Availability is comparatively easy to understand: we can get data from the system at any time. The system serves users normally, with no failed operations, access timeouts, and so on.

Summary

Since network communication inevitably involves delays and packet loss, partition tolerance is generally a must. A distributed system therefore has to trade off consistency against availability, that is, choose between C and A.

CP without A puts consistency first. Every request requires strong consistency between services, and a partition can extend the synchronization time indefinitely. If the network fails badly and messages are lost, the user experience will be poor, because access can only be provided after all data is consistent. HBase and Redis are commonly given as examples of systems that require data consistency. ZooKeeper also follows the CP principle: after the Leader node goes down, the cluster holds an election, and during the election the whole cluster is unavailable. The Followers then have to synchronize data with the new Leader, so ZooKeeper can only be used again once it has recovered.

AP without C requires high availability and partition tolerance, so consistency has to be given up. When a partition or node fails, each node can only serve requests from its local data in order to stay highly available and avoid service interruption.
A typical example is ticket grabbing. There were tickets one second ago, but by the time I had entered the verification code and clicked again, they were sold out. Cue a stream of silent cursing. Eureka adopts the AP principle, sacrificing consistency for the sake of availability.

After making concessions on consistency, some systems settle for eventual consistency and do their best to keep data consistent across the whole system.

There is no fixed choice between consistency and availability in distributed systems. The decision has to be made according to the business scenario. Hype counts for nothing; what matters is whether the choice fits.
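To make the CP versus AP trade-off from the summary concrete, here is a minimal sketch in Python. It is a toy in-memory simulation, not how ZooKeeper, Eureka, or any real system is implemented. The names `Cluster`, `Unavailable`, `partitioned`, and `heal` are assumptions made up for this illustration, and reconciling by "highest version wins" is a deliberately simple stand-in for real conflict resolution.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""


class Cluster:
    """Toy two-replica store (hypothetical, for illustration only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        # Each replica maps key -> (version, value).
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False  # True while "a" and "b" cannot communicate

    def _check_available(self):
        # CP: if the replicas cannot synchronize, consistency cannot be
        # guaranteed, so availability is given up and the request is refused.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("replicas cannot synchronize")

    def write(self, replica, key, value):
        self._check_available()
        version = self.replicas[replica].get(key, (0, None))[0] + 1
        # AP during a partition: only the local replica is updated.
        targets = [replica] if self.partitioned else list(self.replicas)
        for name in targets:
            self.replicas[name][key] = (version, value)

    def read(self, replica, key):
        self._check_available()
        return self.replicas[replica][key][1]

    def heal(self):
        """End the partition and converge on the highest version per key."""
        self.partitioned = False
        keys = set().union(*self.replicas.values())
        for key in keys:
            newest = max(
                (store[key] for store in self.replicas.values() if key in store),
                key=lambda entry: entry[0],
            )
            for store in self.replicas.values():
                store[key] = newest


# The ticket-grabbing scenario on an AP cluster.
ap = Cluster("AP")
ap.write("a", "tickets_left", 1)       # replicated to both a and b
ap.partitioned = True                  # the network splits
ap.write("a", "tickets_left", 0)       # last ticket sold via replica a
stale = ap.read("b", "tickets_left")   # replica b still answers with 1
ap.heal()                              # partition ends, replicas converge
fresh = ap.read("b", "tickets_left")   # replica b now answers with 0
```

In the AP run, replica `b` keeps answering during the partition but shows a ticket that is already gone, and only catches up after `heal()`, which is eventual consistency in miniature. Creating the cluster with `Cluster("CP")` instead makes every `read` and `write` raise `Unavailable` while `partitioned` is true, which is the price of never returning stale data.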