The battle of data center network switching equipment architecture

Switching is one of the core technologies in networking, used extensively at Layer 2 and Layer 3 of the data center. Switches, the typical product of this technology, are everywhere in the data center and are essential equipment for building data center LANs. Switches fall into two categories: box switches and frame switches. The key difference is that a box switch is usually only 1~2U high and contains a single switching chip, or a few chips interconnected directly, with no bridge chip required. A frame switch, by contrast, holds multiple plug-in line cards, each about 1U high, and needs bridge chips to forward data between the cards.


Frame switches sit at the aggregation and core layers of data center networks, where they forward massive volumes of traffic. They embody the evolution of switching technology, especially in how data is exchanged between line cards, where designs are continually refined to raise switching capacity and reduce cost. From the standpoint of the switching fabric alone, two main architectures exist today: the traditional one based on packet routing and forwarding, and the newer one based on cell forwarding. Each has its own strengths and weaknesses, which has sparked a battle over which architecture to choose.

Switching based on packet routing and forwarding

This approach has existed since frame switches first appeared. Multiple line cards are connected through bridge chips, and the number of bridge chips directly determines the forwarding bandwidth between cards. Early designs placed the bridge chips on a single centralized card, but its forwarding capacity proved too limited to sustain line-rate traffic between multiple line cards, and as the number of slots in frame devices grew, the per-card line-rate capacity fell further and further behind. Designers therefore moved to a multi-card approach: several bridge cards, each carrying multiple bridge chips, connect the line cards. Each bridge card supplies a share of the bandwidth to every line card, and together they provide enough bandwidth for each card to forward at line rate. This architecture dominated data center networks for a decade, and almost all frame devices were built on it.
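The bandwidth arithmetic behind the multi-bridge-card design can be sketched with a small calculation. All of the numbers below (port counts, speeds, number of bridge cards) are illustrative assumptions, not figures from the article:

```python
# Sketch: how several bridge (fabric) cards together give one line card
# enough aggregate bandwidth to forward at line rate.
# All numbers are illustrative assumptions.

LINE_CARD_PORTS = 48     # assumed ports per line card
PORT_SPEED_GBPS = 10     # assumed speed of each port, in Gbps
BRIDGE_CARDS = 6         # assumed number of bridge cards in the chassis
BRIDGE_SLOT_GBPS = 100   # assumed bandwidth each bridge card offers one slot

# Line-rate requirement for one line card toward the fabric:
required = LINE_CARD_PORTS * PORT_SPEED_GBPS      # 480 Gbps
# Fabric bandwidth available to that slot across all bridge cards:
available = BRIDGE_CARDS * BRIDGE_SLOT_GBPS       # 600 Gbps

headroom = available / required
print(f"required={required} Gbps, available={available} Gbps, "
      f"headroom={headroom:.2f}x")
```

With these assumed figures the fabric has 1.25x headroom over the line-rate requirement; the next section explains why such headroom matters once traffic is load-shared by hashing.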

However, as data center traffic kept growing and all ports on the line cards came into use, it emerged that under certain traffic patterns these cards could not reach line rate, contradicting the theoretical test results. This behavior is inherent to the architecture. Traffic leaving a line card must be routed to the different bridge cards based on packet characteristics, so as to load-share across them, because no single bridge card can meet the line-rate demands of all the line cards in the chassis. Since the characteristics of arriving packets are never perfectly uniform, the traffic hashed to the different bridge cards will not be perfectly uniform either. If the bridge cards are designed with little or no bandwidth headroom, even slight imbalance causes congestion and packet loss, and the line card falls short of line-rate forwarding. This situation is not rare in production networks; when it occurs, the only remedies are to adjust the hashing algorithm (which may not help) or to replace the device with one of larger fabric capacity, leaving as much headroom as possible.
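The unevenness described above can be illustrated with a toy simulation. Flow identifiers are hashed to pick a bridge card, and a skewed traffic mix (a few heavy flows among many light ones) can push one card past its capacity even though total traffic fits within the aggregate fabric. The flow names, rates, and capacities below are all hypothetical, and CRC32 merely stands in for whatever hash a real ASIC uses:

```python
# Sketch: hash-based load sharing across bridge cards can be uneven.
# All flow names, rates, and capacities are illustrative assumptions;
# zlib.crc32 stands in for a real device's internal hash function.
import zlib
from collections import defaultdict

BRIDGE_CARDS = 4
CARD_CAPACITY = 100.0   # assumed Gbps each bridge card can carry

# Hypothetical traffic mix: many small flows plus a few heavy ones.
flows = [(f"flow-{i}", 0.5) for i in range(400)]    # 200 Gbps total
flows += [(f"big-{i}", 30.0) for i in range(4)]     # 120 Gbps total

load = defaultdict(float)
for name, rate in flows:
    # Route each flow by hashing its identifier, as the line card
    # routes packets by their header characteristics.
    card = zlib.crc32(name.encode()) % BRIDGE_CARDS
    load[card] += rate

for card in range(BRIDGE_CARDS):
    status = "congested" if load[card] > CARD_CAPACITY else "ok"
    print(f"bridge card {card}: {load[card]:5.1f} Gbps  {status}")
```

Total offered load here is 320 Gbps against 400 Gbps of aggregate fabric capacity, yet the per-card loads come out unequal; with thin headroom, the busiest card is the one that drops packets.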

Cell-based switching

Because the packet-routing architecture has this inherent defect, the cell-based forwarding architecture was born. It also uses multiple bridge cards, but the line card can fragment each incoming packet into multiple cells of identical size and spray them across the bridge cards. Since every bridge card receives cells of the same size, the load on the bridge cards is always uniform, and the unevenness problem of hash-based routing disappears. This approach completely eliminates the congestion caused by uneven traffic distribution and has become the new mainstream switching architecture. However, cell-based forwarding has inherent technical defects of its own.

The line card fragments every packet (typically into fixed 64-byte or 128-byte cells, padding the final cell to the full 64 or 128 bytes if needed), and after the bridge cards forward the cells, the egress card must reassemble them into the complete packet. None of this is needed with packet routing, so it adds forwarding overhead. Compared with packet routing, this architecture therefore has lower forwarding efficiency and higher latency: the last cell of many packets must be padded with empty data, and every cell carries its own internal forwarding header, so extra fabric bandwidth is consumed and wasted. Cell forwarding also amplifies the impact of failures. Because nearly every packet has cells passing through every bridge card, a fault on one bridge card affects forwarding across the entire device. With packet routing this does not happen: a faulty bridge card affects only the flows hashed to it, and traffic on the other bridge cards continues unaffected.
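The padding and per-cell header overhead described above can be quantified with a short sketch. The 64-byte cell size matches the article; the 8-byte per-cell header is an assumed figure for illustration, not a vendor specification:

```python
# Sketch: bandwidth overhead of cell fragmentation. Each packet is cut
# into fixed-size cells, the last cell is padded to the full cell size,
# and every cell carries an internal forwarding header.
# CELL_HEADER is an assumed value for illustration.
import math

CELL_SIZE = 64     # cell payload size in bytes (64 or 128 per the article)
CELL_HEADER = 8    # assumed per-cell internal header, in bytes

def fabric_bytes(packet_len: int) -> int:
    """Bytes actually carried across the fabric for one packet,
    including padding of the last cell and per-cell headers."""
    cells = math.ceil(packet_len / CELL_SIZE)
    return cells * (CELL_SIZE + CELL_HEADER)

for pkt in (64, 65, 128, 1500):
    carried = fabric_bytes(pkt)
    print(f"{pkt:4d}-byte packet -> {carried:4d} fabric bytes "
          f"({carried / pkt - 1:.0%} overhead)")
```

The worst case is a packet one byte longer than a cell boundary (e.g. 65 bytes), which occupies two full cells plus two headers; this wasted internal bandwidth is the price paid for perfectly uniform load on the bridge cards.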

Cell forwarding also complicates troubleshooting. Once a packet enters the line card, it travels to the bridge cards as cells, so the packet content cannot be observed on a bridge card at all; and since every bridge card sees cells of identical length, it is hard to tell whether a fault lies in the line card or a bridge card. Often the only way to isolate it is to swap parts and retest. The packet-routing architecture makes this easy: statistics keyed on packet characteristics can be collected at the internal ports to pinpoint where the problem is, so the cause can be found quickly and maintenance is convenient. This has led many people to turn back to the packet-routing switching architecture.

As the discussion above shows, the two switching architectures each have advantages and disadvantages, and neither can replace the other. Both are mature, proven in practical deployment, and similar in design cost. Which architecture to deploy in a data center (a single device cannot implement both at once) must be decided pragmatically, according to what the data center values most.

If the data center's traffic is moderate and its packet characteristics are fairly simple and evenly distributed, packet-routing switches, which are easier to maintain, are worth considering. If traffic is extremely heavy, nearly all line card ports are in use, and bandwidth utilization is very high, cell-based switches are recommended to avoid falling short of line rate. Both architectures will coexist for a long time to come, leaving data centers to choose between them.
