【51CTO.com Quick Translation】In 2014, two years after Instagram was acquired by Facebook, Instagram's engineering team migrated the company's infrastructure from Amazon Web Services (AWS) to Facebook's data centers. Facebook operates multiple data centers in Europe and the United States, but until recently Instagram used only those in the United States.
The main reason Instagram expanded its infrastructure across the ocean is that it was running out of capacity in the United States. As the service continued to grow, Instagram reached a point where it needed to consider leveraging Facebook's data centers in Europe. Local data centers bring another benefit: lower latency for European users, which should translate into a better Instagram experience.

In 2015, Instagram expanded from one data center to three to gain much-needed resiliency: the engineering team did not want a repeat of the 2012 AWS outage, when a major storm in Virginia brought down nearly half of Instagram's instances. Scaling from three data centers to five was easy: we simply increased the replication factor and copied the data to the new regions. Scaling to a data center far away on another continent, however, was much harder.

Understanding the infrastructure

Our infrastructure can generally be divided into two types:
Everyone loves stateless services: they are easy to deploy and scale, and can be started whenever and wherever needed. But we also need stateful services, such as Cassandra, to store user data. Running Cassandra with too many replicas not only increases the complexity of maintaining the database but also wastes capacity, not to mention how slow quorum requests become when they must cross the ocean.

Instagram also uses TAO, a distributed data store for the social graph, as a storage system. We run TAO with a single master per shard, and no slave handles writes for its shard; all writes are forwarded to the shard's master region. Because all writes happen in the master region, located in the United States, write latency from Europe would be unbearable. You may have noticed that our fundamental problem here is the speed of light.

Potential solutions

Can we reduce the time a request takes to travel across the ocean, or even make the round trip disappear entirely? There are two ways to attack this.

1. Partition Cassandra

To keep quorum requests from crossing the ocean, we are considering splitting the dataset into two parts: Cassandra_EU and Cassandra_US. If European users' data is stored in the Cassandra_EU partition and US users' data in the Cassandra_US partition, users' requests will not have to travel long distances to reach their data.

For example, suppose there are five data centers in the US and three in the EU. If we deployed Cassandra in Europe by simply replicating the current cluster, the replication factor would be 8, and quorum requests would have to contact 5 of the 8 replicas. But if we can split the data into two groups, we get a Cassandra_US partition with a replication factor of 5 and a Cassandra_EU partition with a replication factor of 3, and each partition can operate independently without affecting the other.
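The replication-factor arithmetic above can be sketched as a quick calculation. This is only an illustration of Cassandra's quorum rule (a majority of replicas, i.e. floor(RF/2) + 1), not Instagram's actual configuration:

```python
def quorum(replication_factor: int) -> int:
    """Cassandra QUORUM: a majority of replicas must acknowledge."""
    return replication_factor // 2 + 1

# One combined cluster spanning both continents:
combined_rf = 5 + 3          # 5 US replicas + 3 EU replicas
print(quorum(combined_rf))   # 5 of 8 replicas -- some acknowledgments
                             # must always cross the ocean

# Two independent partitions:
print(quorum(5))             # Cassandra_US: 3 of 5 replicas, all in the US
print(quorum(3))             # Cassandra_EU: 2 of 3 replicas, all in the EU
```

With the split, every quorum can be satisfied by replicas on the same continent, so no request has to wait on a trans-Atlantic round trip.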
At the same time, the quorum requests for each partition can stay on their own continent, which solves the round-trip latency problem.

2. Restrict TAO writes to the local region

To reduce TAO write latency, we can restrict all EU writes to the local region. This looks almost the same to the end user: when we send a write to TAO, TAO updates locally and does not block on sending the write synchronously to the master database; instead, it queues the write in the local region. In the region where the write originated, the data is available from TAO immediately; in other regions, it becomes available after propagating out from that region. This is similar to regular writes today, where data propagates out from the master region.

Different services may have different bottlenecks, but by focusing on reducing or eliminating cross-ocean traffic, we can tackle them one by one.

Lessons learned

As with every infrastructure project, we learned some important lessons along the way. Here are a few of the main ones.
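The local-write scheme from potential solution 2 above can be sketched as follows. The class and method names are hypothetical; TAO's real implementation is very different, and this only illustrates the idea of applying a write locally while queueing it for asynchronous propagation to the master region:

```python
import queue

class LocalRegionStore:
    """Illustrative sketch (not TAO's actual API): apply a write to
    local state immediately, and queue it for later asynchronous
    propagation to the master region instead of blocking on it."""

    def __init__(self, region: str):
        self.region = region
        self.data = {}                 # state visible to local readers
        self.outbound = queue.Queue()  # writes awaiting propagation

    def write(self, key, value):
        self.data[key] = value             # visible locally right away
        self.outbound.put((key, value))    # shipped to master region later

    def read(self, key):
        return self.data.get(key)

eu = LocalRegionStore("EU")
eu.write("user:42:bio", "hello")
print(eu.read("user:42:bio"))   # -> hello (no trans-Atlantic round trip)
print(eu.outbound.qsize())      # -> 1 write queued for the master region
```

The trade-off, as the text notes, is that other regions see the write only after it propagates, much like replicas already see master-region writes today.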
Original title: How Instagram is scaling its infrastructure across the ocean, author: Sherry Xiao [Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]