When designing and building a distributed storage cluster, how should the cluster network be planned?

@baimmi China UnionPay Co., Ltd.

Because data is confidential and sensitive, service isolation is very important within the data center. Data access must be strictly controlled, and the business and management networks must be isolated from each other. The management network segment and the tenant network are interconnected at Layer 3. Tenants reach the storage system's portal through the management network segment and issue management operations such as add, delete, modify, and query. The business network segment carries business data: once storage space is mounted to a front-end business system as a volume over the business network segment, the storage service is delivered on that segment.

In the usual distributed storage design, the management traffic and the business traffic of the storage system sit on two independent network segments that do not affect each other. Data is transferred only on the business network segment. The two segments meet only at the servers themselves; there is no network route between them.

In the business network segment, each server is connected to two switches with two network cables, and the same is done in the management network segment. High network reliability comes from active/standby network card pairs at the node level and active/standby switch pairs at the cluster level. The two network segments are isolated on separate physical network cards; where that is not possible, separate VLANs are used instead.
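The cabling rule above can be checked mechanically before anything is racked. The following is a minimal sketch, assuming a hypothetical cabling plan expressed as `(server, segment, nic, switch)` tuples; the node, NIC, and switch names are invented for illustration and do not come from the original answer.

```python
from collections import defaultdict

# Hypothetical cabling plan: (server, segment, nic, switch).
LINKS = [
    ("node1", "business", "eth0", "sw-biz-a"),
    ("node1", "business", "eth1", "sw-biz-b"),
    ("node1", "management", "eth2", "sw-mgmt-a"),
    ("node1", "management", "eth3", "sw-mgmt-b"),
    ("node2", "business", "eth0", "sw-biz-a"),
    ("node2", "business", "eth1", "sw-biz-a"),  # both cables land on one switch
    ("node2", "management", "eth2", "sw-mgmt-a"),
    ("node2", "management", "eth3", "sw-mgmt-b"),
]


def find_redundancy_gaps(links):
    """Return (server, segment) pairs lacking two NICs on two distinct switches."""
    nics = defaultdict(set)
    switches = defaultdict(set)
    for server, segment, nic, switch in links:
        nics[(server, segment)].add(nic)
        switches[(server, segment)].add(switch)
    return sorted(key for key in nics
                  if len(nics[key]) < 2 or len(switches[key]) < 2)


def find_shared_nics(links):
    """Return (server, nic) pairs that carry more than one segment."""
    segments = defaultdict(set)
    for server, segment, nic, _switch in links:
        segments[(server, nic)].add(segment)
    return sorted(key for key, segs in segments.items() if len(segs) > 1)


print(find_redundancy_gaps(LINKS))  # [('node2', 'business')]
print(find_shared_nics(LINKS))      # []
```

A non-empty result from `find_shared_nics` would mean the two segments share a physical card, which is the case where the text falls back to VLAN isolation.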

According to the bucket effect (a bucket holds only as much water as its shortest stave allows), the performance ceiling of a system is usually set by its weakest link. When the cluster uses a hybrid storage configuration, a standard 10 Gbps network can handle the load-balancing and data-reconstruction traffic of a fairly large cluster. When the cluster uses an all-flash architecture, however, disk performance rises sharply and the standard 10 Gbps network may become the system's weakest link. A 56 Gbps InfiniBand network, or an even faster 100 Gbps network, comes close to non-blocking communication and removes the internal switching bottleneck of the storage system. InfiniBand also keeps communication latency very low, on the order of microseconds or less, so compute and storage traffic is delivered promptly. Combined with the fast reads and writes of SSDs, this yields considerable performance.
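The weakest-link argument can be made concrete with a rough back-of-the-envelope comparison of a node's aggregate disk throughput against its network bandwidth. This is only a sketch: the disk counts and per-disk throughput figures below are illustrative assumptions, not measurements from the original answer, and protocol overhead and replication traffic are ignored.

```python
def bottleneck(disks_per_node, disk_mb_per_s, nic_gbps, nic_count=1):
    """Compare a node's aggregate disk throughput with its network bandwidth.

    Returns (limiting_component, disk_total_mb_s, network_total_mb_s).
    """
    disk_total = disks_per_node * disk_mb_per_s
    # 1 Gbps = 1000 Mbit/s = 125 MB/s at line rate (overhead ignored).
    net_total = nic_gbps * 125 * nic_count
    limiting = "network" if net_total < disk_total else "disk"
    return limiting, disk_total, net_total


# Assumed hybrid node: 8 HDDs at about 120 MB/s each, one 10 Gbps link.
print(bottleneck(8, 120, 10))   # ('disk', 960, 1250)

# Assumed all-flash node: 8 SATA SSDs at about 500 MB/s each, one 10 Gbps link.
print(bottleneck(8, 500, 10))   # ('network', 4000, 1250)

# Same all-flash node on a 56 Gbps link.
print(bottleneck(8, 500, 56))   # ('disk', 4000, 7000)
```

With these assumed figures the 10 Gbps link keeps up with the hybrid node but caps the all-flash node at under a third of what its disks could deliver, which is the situation the paragraph describes.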

@Liu Dongdongsoft Group

When designing and building a distributed storage cluster, the cluster network is often the bottleneck. Distributed storage depends heavily on network bandwidth because all data exchange travels over the network, so a fast and reliable network environment is required.

The specific plan is as follows:

1. Use 10G network interfaces wherever possible, connected over optical ports, and use 40G interfaces for uplinks.

2. Make network equipment as redundant as possible: configure at least two 10G optical access switches.

3. Besides exchanging large amounts of data, a distributed storage cluster may also run virtual machine replication and synchronization over the network; how much depends on the number of virtual machines hosted in the system and the number of active operations. A Gigabit-only network will be overwhelmed in this situation, especially during virtual machine reconstruction and synchronization.

4. Place each traffic type (distributed storage network, management network, virtual machine migration network, virtual machine production network, and so on) in its own VLAN, and use shares as a quality of service (QoS) mechanism to keep performance at the desired level when traffic types contend for bandwidth.

5. Separate VLANs also protect the distributed storage cluster network from interference: an IP address conflict on the distributed storage cluster network can make the entire cluster unavailable.

6. To achieve maximum security and performance, distributed storage cluster network traffic should be isolated to its own Layer 2 segment.

7. Configure link aggregation on the network adapters as an availability and redundancy measure.
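The VLAN separation and IP-conflict points in the list above lend themselves to a simple pre-deployment check. The sketch below uses Python's standard `ipaddress` module; the VLAN IDs, subnets, and host addresses are hypothetical values chosen for illustration, not part of the original plan.

```python
import ipaddress
from collections import Counter

# Hypothetical plan: traffic type -> (VLAN ID, subnet).
PLAN = {
    "storage":    (100, "10.10.100.0/24"),
    "management": (200, "10.10.200.0/24"),
    "migration":  (300, "10.10.44.0/24"),
    "production": (400, "10.10.0.0/18"),   # overlaps the migration subnet
}

# Hypothetical host addresses on the storage network.
STORAGE_HOSTS = {
    "node1": "10.10.100.11",
    "node2": "10.10.100.12",
    "node3": "10.10.100.11",  # conflicts with node1
}


def shared_vlans(plan):
    """Return VLAN IDs assigned to more than one traffic type."""
    counts = Counter(vlan for vlan, _cidr in plan.values())
    return sorted(vlan for vlan, n in counts.items() if n > 1)


def overlapping_subnets(plan):
    """Return pairs of traffic types whose subnets overlap."""
    nets = sorted((name, ipaddress.ip_network(cidr))
                  for name, (_vlan, cidr) in plan.items())
    return [(a, b)
            for i, (a, net_a) in enumerate(nets)
            for b, net_b in nets[i + 1:]
            if net_a.overlaps(net_b)]


def address_problems(hosts, cidr):
    """Return (duplicate addresses, hosts whose address is outside the subnet)."""
    net = ipaddress.ip_network(cidr)
    counts = Counter(hosts.values())
    duplicates = sorted(ip for ip, n in counts.items() if n > 1)
    outside = sorted(host for host, ip in hosts.items()
                     if ipaddress.ip_address(ip) not in net)
    return duplicates, outside


print(shared_vlans(PLAN))                                  # []
print(overlapping_subnets(PLAN))                           # [('migration', 'production')]
print(address_problems(STORAGE_HOSTS, PLAN["storage"][1])) # (['10.10.100.11'], [])
```

A check like this only catches conflicts in the plan itself; addresses assigned outside the plan still need to be kept off the storage segment by the Layer 2 isolation described in item 6.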

@Garyy China Continent Insurance

Fully redundant network paths: the virtual network layer uses technologies such as multi-NIC bonding so that the failure of a single network card does not interrupt service.

The network is divided into three communication planes: business, storage, and management. To keep each plane reliable, the planes are isolated from one another with VLANs and similar technologies, so that a failure in one plane does not affect the other two.

Business plane: carries the traffic of the virtual machines' virtual network cards and delivers business applications to the outside.

Storage plane: carries iSCSI storage traffic and provides storage resources to virtual machines. It does not communicate with virtual machines directly; access goes through the virtualization platform.

Management plane: carries management traffic for the entire cloud computing system, including business deployment and system loading.

Network card load sharing: each communication plane (business, storage, management) uses dual network cards in bonding mode. Once the two cards are bound into a single logical network card, they work together. They share the server's traffic load, and when one card fails the other immediately takes over the entire load, so the switchover is seamless and service is not interrupted.
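The load-sharing and takeover behavior of a bonded pair can be illustrated with a toy model. This is a simplified sketch, not the behavior of any particular bonding driver: the `Bond` class, its method names, and the modulo-based flow placement are all assumptions made for illustration.

```python
class Bond:
    """Toy model of NICs bonded into one logical interface in load-sharing mode."""

    def __init__(self, nics):
        # Every member NIC starts with its link up.
        self.healthy = {nic: True for nic in nics}

    def set_link(self, nic, up):
        """Record a link state change for one member NIC."""
        self.healthy[nic] = up

    def pick_nic(self, flow_id):
        """Map a flow to one of the healthy NICs; raise if none is left."""
        alive = [nic for nic, up in self.healthy.items() if up]
        if not alive:
            raise RuntimeError("all bonded NICs are down")
        return alive[flow_id % len(alive)]


bond = Bond(["eth0", "eth1"])
print([bond.pick_nic(f) for f in range(4)])  # ['eth0', 'eth1', 'eth0', 'eth1']

bond.set_link("eth0", False)                 # simulate a NIC failure
print([bond.pick_nic(f) for f in range(4)])  # ['eth1', 'eth1', 'eth1', 'eth1']
```

While both cards are healthy the flows are spread across them; after one fails, every flow lands on the survivor, which matches the takeover described above. Service is lost only when both members of the pair are down.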
