A practical guide to running databases across regions and Kubernetes clusters

Translator | Kang Shaojing

Planning | Yun Zhao

Among the many NoSQL stores, Cassandra is one of the most popular choices for enterprises and developers. It combines the architecture introduced by Amazon Dynamo with the Bigtable data model, and its advantages are clear: high scalability and high availability, a column-family implementation with no single point of failure, very high write throughput with good read throughput, secondary indexes for search, tunable consistency, flexible replication, and a flexible schema. This article also tackles a genuinely tricky problem, running a database (or any other application) across Kubernetes clusters, and is offered here for your reference.

Global applications require a data layer that is as distributed as the users they serve, and Apache Cassandra handles this well, currently serving the data needs of companies such as Apple, Netflix, and Sony. Traditionally, the data layer of a distributed application has been run by a dedicated team that manages the deployment and operation of thousands of nodes, both on-premises and in the cloud.

To relieve much of this burden from DevOps teams, we have developed many of these practices and patterns in K8ssandra, leveraging the common control plane provided by Kubernetes (K8s). There is a catch, however: without proper care and planning up front, running a database (or any application) across multiple regions or K8s clusters is tricky.

To show you how to do this, let's start with a Cassandra deployment running in a single region on a single K8s cluster. It consists of six Cassandra nodes spread across three availability zones within the region, with two Cassandra nodes in each zone. In this example we will use Google Cloud Platform (GCP) region names, but everything here applies equally to other clouds and even to on-premises environments.
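
To make the starting point concrete, here is a minimal sketch of that single-region deployment using the CassandraDatacenter resource from cass-operator, which K8ssandra builds on. The region and zone names, storage class, and Cassandra version are assumptions for illustration; check the fields against the operator version you are running.

```yaml
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: multi-region        # logical Cassandra cluster name, shared by every datacenter
  serverType: cassandra
  serverVersion: "4.0.1"           # assumed version
  size: 6                          # six Cassandra nodes in total
  racks:                           # one rack per availability zone; two nodes land in each
    - name: rack1
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-west1-a
    - name: rack2
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-west1-b
    - name: rack3
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-west1-c
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard-rwo   # assumed GCP storage class
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
```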

This is where we are now:

Existing deployment of cloud databases

The goal is to have two regions, each with a Cassandra data center. In our cloud-managed K8s deployment here, this translates to two K8s clusters - each with a separate control plane, but using a common virtual private cloud (VPC) network. By extending the Cassandra cluster to multiple data centers, we gain redundancy in the event of a regional outage, and improve response time and latency for client applications when accessing data locally.

This is our goal: to have two regions, each with its own Cassandra datacenter.

On the surface, it seems like we could accomplish this by simply spinning up another K8s cluster, applying the same K8s YAML with a few tweaks to the availability zone names, and calling it done. After all, these resources are all K8s objects of very similar shape, so shouldn't this just work? Maybe. Depending on your environment, this approach might even work.
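
In practice, "the same YAML with a few tweaks" might look like the sketch below: the datacenter name and zone labels change, and the new datacenter is pointed at seed nodes in the first cluster. The east-coast region, the seed IPs, and the omitted storage settings are all placeholders for illustration. Whether those remote seed addresses are actually reachable, and stay meaningful over time, is exactly where the trouble starts.

```yaml
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc2                        # second datacenter, deployed in the east-coast cluster
spec:
  clusterName: multi-region        # must match dc1 so both datacenters join one Cassandra cluster
  serverType: cassandra
  serverVersion: "4.0.1"
  size: 6
  racks:
    - name: rack1
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-east1-b
    - name: rack2
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-east1-c
    - name: rack3
      nodeAffinityLabels:
        topology.kubernetes.io/zone: us-east1-d
  additionalSeeds:                 # hypothetical seed addresses in the west-coast cluster
    - 10.200.1.5
    - 10.200.2.7
  # storageConfig omitted for brevity; it mirrors dc1
```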

Plenty of these issues never come up if you avoid a fully distributed database deployment. But unfortunately, things are rarely that simple. Even if some of these hurdles are easy to clear, plenty of seemingly harmless details can go wrong and leave the deployment in a degraded state. Your choice of cloud provider, K8s distribution, command-line flags, and yes, even DNS can all send you down a dark path. So let's explore some of the most common issues you might run into, so you can avoid them.

Common obstacles

Even if a deployment starts out running just fine, you may hit a roadblock or two as you grow into a multi-cloud environment, upgrade to another K8s version, or start using different distributions and tools. When it comes to distributed databases, there is a lot more going on under the hood. Understanding how K8s runs containers across a range of hardware will help you shape a high-level solution, one that ultimately meets your exact needs.

Cassandra nodes require unique IP addresses

The first hurdle you may run into involves basic networking. Going back to our first cluster, let's take a look at the networking layers involved.

In the VPC shown below, we have a Classless Inter-Domain Routing (CIDR) range that represents the addresses of the K8s worker instances. Within the K8s cluster there is a separate address space in which the pods and their containers run. A Pod is a collection of containers that share certain resources, such as storage, network, and process space. In some cloud environments these subnets are bound to specific availability zones, so each subnet that K8s workers launch into has its own CIDR range. There may be other VMs in the VPC, but in this example we will treat K8s as the only tenant.

CIDR range used by VPC with K8s layer

In our case, we have 10.100.x.x for the worker nodes and 10.200.x.x at the K8s level. Each K8s worker gets a slice of 10.200.x.x as the CIDR range for the Pods running on that single instance.
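
That per-worker slice is visible on the Node objects themselves. On clusters whose CNI hands out node-level pod CIDRs, a trimmed Node manifest looks roughly like this (the node name and exact ranges are illustrative):

```yaml
apiVersion: v1
kind: Node
metadata:
  name: gke-west-default-pool-abc123   # hypothetical worker name
spec:
  podCIDR: 10.200.3.0/24               # this worker's slice of the cluster's 10.200.0.0/16 pod range
  podCIDRs:
    - 10.200.3.0/24
status:
  addresses:
    - type: InternalIP
      address: 10.100.0.12             # the worker's own VPC address from 10.100.0.0/16
```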

Thinking back to our target architecture, what happens if two clusters use the same or overlapping CIDR ranges? You probably remember error messages like these from when you first started working with networking:

Common error messages when trying to connect two networks

K8s does not surface errors like these. No warning pops up to tell you that the clusters cannot communicate effectively.

If you have one cluster with an IP space and another cluster with the same IP space or overlapping locations, how does each cluster know when a particular packet needs to leave its own address space and be routed through the VPC network to the other cluster and then enter that cluster's network?

Normally, nothing warns you here either. There are ways to work around overlapping ranges, but at a high level, if you have overlap you are asking for trouble. The point is that you need to understand the address space of each cluster and then carefully plan how those IPs are allocated and used. That allows the Linux kernel (where K8s routing happens) and the VPC network layer to forward and route packets as needed.

But what if you don't have enough IPs? In some cases, you can't give each pod its own IP address. So in this case, you need to take a step back and decide which services absolutely must have unique addresses and which services can run together in the same address space. For example, if your database needs to be able to communicate with every other pod, then it may need its own unique address. However, if the application tiers on the east and west coasts are just communicating with their local data tier, they can have their own dedicated K8s cluster with the same address range to avoid conflicts.

Flat Network

In our reference deployment, we dedicate non-overlapping ranges to the infrastructure layers of the K8s clusters: anything that must be uniquely addressable gets its own range, and services never communicate across overlapping CIDR ranges. Ultimately, what we are doing here is flattening the network.

With non-overlapping IP ranges, we can now route packets to pods in either cluster. In the image above, you can see that the west coast node network is 10.100 and the east coast is 10.150. Each K8s cluster then has its own pod IP space, 10.200 versus 10.250, and those ranges are segmented across workers just as before.
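
Where those ranges get pinned down depends on how the clusters are built. As one example, on kubeadm-based clusters the pod and service CIDRs are fixed at cluster creation time; a sketch using the kubeadm v1beta3 configuration API and the ranges above might look like this (the service subnets are additional placeholders that must also avoid overlap):

```yaml
# West coast cluster
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.200.0.0/16         # pod IPs in the west
  serviceSubnet: 10.96.0.0/16      # hypothetical; keep service ranges distinct as well
---
# East coast cluster
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.250.0.0/16         # pod IPs in the east
  serviceSubnet: 10.97.0.0/16
```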

How to handle Cassandra routing between datacenters

We now have plenty of IP addresses, and those addresses are unique. So how do we handle routing, communication, and discovery across all of this? A packet leaving cluster A has no inherent knowledge of how to reach cluster B. When we try to send a packet across the cluster boundary, the local Linux network stack recognizes that the destination is not on that host or on any host within the local K8s cluster, and forwards the packet to the VPC network. From there, our cloud provider must have a routing table entry that says where the packet needs to go.

In some cases, this works out of the box: the VPC route table is updated with the pod and service CIDR ranges, indicating which hosts packets for those ranges should be routed to. In other environments, including hybrid and on-premises deployments, this may take the form of advertising routes to the network layer via BGP. Yahoo! Japan has a great article that covers exactly this deployment method.
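
In the BGP case, the CNI itself can advertise the pod and service ranges to the upstream routers. For example, if Calico were the CNI, the peering would be declared with a BGPPeer resource roughly like the one below; the peer address and AS number are placeholders, and other CNIs have their own equivalents.

```yaml
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router
spec:
  peerIP: 10.100.0.1      # hypothetical VPC or top-of-rack router
  asNumber: 64513         # the router's AS number (placeholder)
```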

However, these options may not always be the best answer; it depends on what your multi-cluster architecture looks like. Is it contained within a single cloud provider? Is it hybrid or multi-cloud, combining on-premises infrastructure with one or more cloud providers? While you can certainly make these approaches work across all of these environments, you can count on it taking a lot of time and maintenance.

Some solutions to consider

Overlay Network

A simpler answer is to use an overlay network, in which you build a separate IP address space for your application, in this case a Cassandra database, and run it on top of the existing K8s network using proxies, sidecars, and gateways. We won't go into that in depth in this post, but we have some great content on how to connect stateful workloads across K8s clusters that shows how to do this at a high level.

So, what's next? The packets are flowing, but now there are some new K8s shenanigans to deal with. Assuming you've got your network ready and have all the appropriate routing in place, there is at least some connectivity between these clusters at the IP layer. You have IPs connecting Pods, and Cluster 1 can talk to Pods in Cluster 2, but you now have some new things to consider.

Service Discovery

In K8s networking, identity is ephemeral. Pods may be rescheduled due to cluster events and receive new network addresses. In some applications this is not a problem. In others, such as databases, the network address is the identity, which can lead to unexpected behavior. Although IP addresses may change, the storage attached to each pod, and the data it represents, remains constant over time. We must have a way to maintain the mapping of addresses to applications. This is where service discovery comes in.

In most cases, service discovery is implemented within K8s via DNS. Even though a pod’s IP address may change, it can have a persistent DNS-based identity that is updated as cluster events occur. This sounds great, but as we move into the multi-cluster world, we must ensure that our services can be discovered across cluster boundaries. As a pod in Cluster 1, I should be able to get the address of a pod in Cluster 2.
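
Within a single cluster, that persistent, DNS-based identity usually comes from a headless Service; each pod it selects gets a stable DNS name even as its IP changes. A generic sketch is shown below (operators such as cass-operator create equivalent services for you, so the names here are only illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cassandra-seeds
  namespace: cassandra
spec:
  clusterIP: None          # headless: DNS returns the pod IPs directly
  selector:
    app: cassandra         # assumed pod label
  ports:
    - name: intra-node
      port: 7000           # Cassandra inter-node port
```

The catch is that a name like cassandra-seeds.cassandra.svc.cluster.local only resolves inside the cluster that owns it, which is the problem the next two approaches address.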

DNS Stub

One solution to this challenge is DNS stubs. In this configuration, we configure the K8s DNS service to route requests for a specific domain suffix to a remote cluster. With a fully qualified domain name, we can forward DNS lookup requests to the appropriate cluster for resolution and eventual routing.
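
As a concrete sketch, with CoreDNS the stub is simply an extra server block in the Corefile that forwards the remote cluster's domain to that cluster's DNS service. The domain cluster2.local and the resolver address 10.250.0.10 are assumptions, and a real default Corefile carries more plugins than shown here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
    }
    # Names under the east-coast cluster's domain are resolved by its DNS service.
    cluster2.local:53 {
        errors
        cache 30
        forward . 10.250.0.10
    }
```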

The problem here is that each cluster needs a distinct DNS suffix, set via kubelet flags, and that is not an option in every K8s distribution or managed offering. Some users work around this by configuring a stub domain keyed on the namespace name as part of the FQDN. This works, but it is a bit of a hack and not a proper way to set the cluster suffix.

Managed DNS

Another solution similar to DNS stubs is to use a managed DNS product. In the case of GCP, there is the Cloud DNS product, which can replicate local DNS entries to the VPC level for resolution by external clusters or even VMs within the same VPC. This option provides many benefits, including:

Eliminates the overhead of managing cluster-hosted DNS servers: Cloud DNS requires no scaling, monitoring, or management of DNS instances because it is a fully managed Google service.

Local resolution of DNS queries on each Google K8s Engine (GKE) node: similar to NodeLocal DNSCache, Cloud DNS caches DNS responses locally, providing low-latency and highly scalable DNS resolution.

Integration with Google Cloud's operations suite: provides DNS monitoring and logging.

VPC-scope DNS: provides K8s service resolution across multiple clusters and environments within the VPC.

Replicated managed DNS for multi-cluster service discovery

Cloud DNS abstracts away much of the traditional overhead: the cloud provider manages scaling, monitoring, and security patches, along with everything else you would expect from a managed offering. Some providers, GKE among them, also offer a node-local DNS cache, which reduces latency by answering lookups from a cache running on each node rather than waiting on a remote DNS server for every response.

In the long run, if you are only in a single cloud, a managed service dedicated to DNS will work fine. However, if you have clusters spanning multiple cloud providers and on-premises environments, a managed offering may only be part of the solution.

The Cloud Native Computing Foundation (CNCF) provides a variety of options, and there are a large number of open source projects that have made great progress in helping alleviate these pain points, especially in cross-cloud, multi-cloud, or hybrid cloud scenarios.

About the Translator

Kang Shaojing, 51CTO community editor, currently works in the communications industry in a low-level driver development role. He has studied data structures and Python, and is interested in operating systems, databases, and related fields.

Original title: Taking Your Database Beyond a Single Kubernetes Cluster, author: Christopher Bradford

Link: https://dzone.com/articles/taking-your-database-beyond-a-single-kubernetes-cl
