CAP: Can It Do Without the P?

Hello, everyone, I am Zhibeijun.

Almost every large website is a distributed system, and distributed systems keep growing in importance. Many of the services you use every day already put distributed-systems theory into practice.

Preface

A distributed system has three core properties: consistency, availability, and partition tolerance.

Consistency means that once an update succeeds, every node holds exactly the same data at the same moment.

Availability means that the system returns data within a normal response time whenever a user accesses it.

Partition tolerance means that when a node fails or the network partitions, the distributed system as a whole can still provide service.

The CAP theorem says, in essence, that these three properties cannot all be achieved in full. At most two can be satisfied at the same time, which gives the combinations CA, CP, and AP.

1. Partition Tolerance

A distributed system consists of multiple subsystems, and the subnetwork each subsystem sits in can be called a partition. Nodes need to synchronize data with each other, and subsystems need to communicate. Partition tolerance means that if a node fails, or the link between partitions breaks, the service as a whole is not seriously affected.

A single-node service can be said to satisfy CA.

C holds because there is only one copy of the data, so every read at a given moment sees the same value.

A holds because the service can be provided as long as that node is up.

With only one node there is no partition to tolerate, so the CA conditions are always met.

In a distributed system, however, we almost always need the system as a whole to stay usable when a single node fails. P in CAP can therefore be treated as a given.

2. Consistency

Consistency means that if every node in a distributed system is read at the same moment, they all return the same data.

From the client's perspective, it means that concurrent reads see consistent data.

From the server's perspective, it is a synchronization problem between data nodes: the nodes communicate to keep their data up to date in real time.

Consistency itself comes in three levels: strong, weak, and eventual.

Strong consistency requires that once data is updated, the change is immediately synchronized and visible on every other node.

Weak consistency allows some nodes to keep returning old data after an update.

Eventual consistency allows the latest data to be unavailable on some nodes for a while, but requires that all data in the distributed system converge to the same value after that period.
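To make the difference between strong and eventual consistency concrete, here is a minimal sketch. The `Cluster` and `Replica` classes are hypothetical toys invented for illustration and do not model any real database: a strong write updates every replica before returning, while an eventual write returns as soon as the first node has it and leaves the rest to a later `sync()`.

```python
class Replica:
    """One node holding a single value."""

    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    """Hypothetical toy cluster: node 0 is the primary, the rest are followers."""

    def __init__(self, size=3):
        self.replicas = [Replica(f"node-{i}") for i in range(size)]
        self.pending = []  # updates not yet copied to the followers

    def write_strong(self, value):
        # Strong consistency: the write returns only after every replica has it.
        for replica in self.replicas:
            replica.value = value

    def write_eventual(self, value):
        # Eventual consistency: acknowledge once the primary has the value,
        # and queue the update for the followers.
        self.replicas[0].value = value
        self.pending.append(value)

    def sync(self):
        # Background replication: followers apply queued updates in order.
        for value in self.pending:
            for replica in self.replicas[1:]:
                replica.value = value
        self.pending.clear()

    def read_all(self):
        return [replica.value for replica in self.replicas]


cluster = Cluster()
cluster.write_strong("v1")
strong_view = cluster.read_all()     # every node agrees immediately

cluster.write_eventual("v2")
stale_view = cluster.read_all()      # followers still return the old value

cluster.sync()
converged_view = cluster.read_all()  # all nodes agree again after replication
```

Weak consistency would look like the eventual case without any promise that `sync()` ever runs.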

3. Availability

Availability is comparatively easy to understand: we can get data from the system at any time. The system serves users normally, without operation failures, access timeouts, and the like.

Summary

Because network communication inevitably involves delays and packet loss, partition tolerance is generally a must. A distributed system therefore has to trade consistency against availability, that is, choose between C and A.

CP without A puts consistency first. Every request requires strong consistency across services, and a partition can extend the synchronization time indefinitely. If the network fails badly and messages are lost, the user experience suffers, because access can only be provided once all data is consistent. Redis and HBase, for example, are commonly described as prioritizing data consistency. ZooKeeper also follows the CP principle: when the Leader node goes down, the cluster holds an election, and during the election the Followers are unavailable and data must be resynchronized, so ZooKeeper can only be used again once it has recovered.
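The CP choice can be sketched as a cluster that refuses writes whenever it cannot reach a majority of its nodes. The `CPCluster` class and the majority-quorum rule below are assumptions made for this illustration; this is not how Redis, HBase, or ZooKeeper is actually implemented.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request to protect consistency."""


class CPCluster:
    """Hypothetical toy cluster that only accepts writes a majority can confirm."""

    def __init__(self, node_names):
        self.data = {name: None for name in node_names}
        self.reachable = set(node_names)  # nodes the coordinator can contact

    def partition(self, lost_nodes):
        # Simulate a network partition that cuts off some nodes.
        self.reachable -= set(lost_nodes)

    def heal(self):
        # The partition ends and every node is reachable again.
        self.reachable = set(self.data)

    def write(self, value):
        quorum = len(self.data) // 2 + 1
        if len(self.reachable) < quorum:
            # CP choice: fail the request instead of accepting a write
            # that a majority of nodes cannot confirm.
            raise Unavailable("no majority reachable")
        for name in self.reachable:
            self.data[name] = value
        return True


cluster = CPCluster(["a", "b", "c"])
cluster.write("v1")

cluster.partition(["b", "c"])
try:
    cluster.write("v2")
    accepted = True
except Unavailable:
    accepted = False  # the write is rejected while the partition lasts

cluster.heal()
recovered = cluster.write("v3")  # writes succeed again once the majority is back
```

The data never diverges here, but the price is exactly the one described above: while the partition lasts, the user sees failures.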

AP without C puts availability first: the system must stay highly available and partition tolerant, so consistency has to be given up. When a partition cuts nodes off from each other, each node can only serve requests from its local data to avoid interrupting service. A typical example is grabbing tickets online: there were tickets a second ago, but by the time I had entered the verification code and clicked again, they were sold out. Infuriating. Eureka adopts the AP principle, sacrificing consistency for the sake of availability.
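The ticket example can be sketched in a few lines. Everything here is hypothetical: `APNode`, `sell_one`, and `reconcile` are names invented for this illustration, and reconciling by taking the lowest count is a deliberately naive rule, not what Eureka or any real ticketing system does.

```python
class APNode:
    """Hypothetical node that always answers from its local copy."""

    def __init__(self, name, tickets_left):
        self.name = name
        self.tickets_left = tickets_left  # local copy of the shared counter

    def read(self):
        # AP choice: always answer from local data, even if it may be stale.
        return self.tickets_left


def sell_one(node, peers, partitioned):
    """Sell a ticket on `node`; replicate to peers only if the network allows."""
    if node.tickets_left == 0:
        return False
    node.tickets_left -= 1
    if not partitioned:
        for peer in peers:
            peer.tickets_left = node.tickets_left
    return True


def reconcile(nodes):
    # After the partition heals, converge on the lowest count any node has seen.
    lowest = min(n.tickets_left for n in nodes)
    for n in nodes:
        n.tickets_left = lowest


a, b = APNode("a", 1), APNode("b", 1)
sell_one(a, [b], partitioned=True)  # the last ticket is sold on node a
stale = b.read()                    # node b still advertises one ticket
reconcile([a, b])
fresh = b.read()                    # after healing, node b shows none left
```

Node b never stops answering, which is the availability guarantee, but for a while it answers with a ticket that no longer exists.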

Having given ground on consistency, some systems settle for eventual consistency and do their best to bring data across the whole system back into agreement.

There is no fixed combination of consistency and availability in distributed systems. The choice has to be made according to the business scenario. Impressive-sounding guarantees are worth nothing on their own; what matters is whether the choice fits.
