The battle of data center network switching equipment architecture

Switching is one of the core technologies in networking, used extensively at Layer 2 and Layer 3 of the data center. Switches, the typical product of this technology, are everywhere in the data center and are essential equipment for building data center LANs. Switches fall into two categories: box switches and frame switches. The key difference is that a box switch is usually only 1~2U high and contains a single switching chip, or a few chips interconnected directly, with no bridge chip required. A frame switch, by contrast, holds multiple plug-in line cards, each about 1U high, and needs bridge chips to forward data between the cards.


Frame switches sit at the aggregation and core layers of data center networks, where they forward massive volumes of traffic. They embody the evolution of switching technology, especially in how data is exchanged between line cards, where designs are continually refined to raise switching capacity and reduce cost. From the standpoint of the switching fabric alone, two main architectures exist today: the traditional one based on packet routing and forwarding, and the newer one based on cell forwarding. Each has its own strengths and weaknesses, which has sparked a battle over which architecture to choose.

Switching based on packet routing and forwarding

This approach has existed since frame switches first appeared. Multiple line cards are connected through bridge chips, and the number of bridge chips directly determines the forwarding bandwidth between cards. Early designs placed the bridge chips on a single centralized card, but its forwarding capacity proved too limited to sustain line-rate traffic between multiple line cards, and as the number of slots in frame devices grew, the per-card line-rate capacity fell further and further behind. Designers therefore moved to a multi-card approach: several bridge cards, each carrying multiple bridge chips, connect the line cards. Each bridge card supplies a share of the bandwidth to every line card, and together they provide enough bandwidth for each card to forward at line rate. This architecture dominated data center networks for a decade, and almost all frame devices were built on it.
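The bandwidth arithmetic behind the multi-bridge-card design can be sketched with a small calculation. All of the numbers below (port counts, speeds, number of bridge cards) are illustrative assumptions, not figures from the article:

```python
# Sketch: how several bridge (fabric) cards together give one line card
# enough aggregate bandwidth to forward at line rate.
# All numbers are illustrative assumptions.

LINE_CARD_PORTS = 48     # assumed ports per line card
PORT_SPEED_GBPS = 10     # assumed speed of each port, in Gbps
BRIDGE_CARDS = 6         # assumed number of bridge cards in the chassis
BRIDGE_SLOT_GBPS = 100   # assumed bandwidth each bridge card offers one slot

# Line-rate requirement for one line card toward the fabric:
required = LINE_CARD_PORTS * PORT_SPEED_GBPS      # 480 Gbps
# Fabric bandwidth available to that slot across all bridge cards:
available = BRIDGE_CARDS * BRIDGE_SLOT_GBPS       # 600 Gbps

headroom = available / required
print(f"required={required} Gbps, available={available} Gbps, "
      f"headroom={headroom:.2f}x")
```

With these assumed figures the fabric has 1.25x headroom over the line-rate requirement; the next section explains why such headroom matters once traffic is load-shared by hashing.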

However, as data center traffic kept growing and all ports on the line cards came into use, it emerged that under certain traffic patterns these cards could not reach line rate, contradicting the theoretical test results. This behavior is inherent to the architecture. Traffic leaving a line card must be routed to the different bridge cards based on packet characteristics, so as to load-share across them, because no single bridge card can meet the line-rate demands of all the line cards in the chassis. Since the characteristics of arriving packets are never perfectly uniform, the traffic hashed to the different bridge cards will not be perfectly uniform either. If the bridge cards are designed with little or no bandwidth headroom, even slight imbalance causes congestion and packet loss, and the line card falls short of line-rate forwarding. This situation is not rare in production networks; when it occurs, the only remedies are to adjust the hashing algorithm (which may not help) or to replace the device with one of larger fabric capacity, leaving as much headroom as possible.
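The unevenness described above can be illustrated with a toy simulation. Flow identifiers are hashed to pick a bridge card, and a skewed traffic mix (a few heavy flows among many light ones) can push one card past its capacity even though total traffic fits within the aggregate fabric. The flow names, rates, and capacities below are all hypothetical, and CRC32 merely stands in for whatever hash a real ASIC uses:

```python
# Sketch: hash-based load sharing across bridge cards can be uneven.
# All flow names, rates, and capacities are illustrative assumptions;
# zlib.crc32 stands in for a real device's internal hash function.
import zlib
from collections import defaultdict

BRIDGE_CARDS = 4
CARD_CAPACITY = 100.0   # assumed Gbps each bridge card can carry

# Hypothetical traffic mix: many small flows plus a few heavy ones.
flows = [(f"flow-{i}", 0.5) for i in range(400)]    # 200 Gbps total
flows += [(f"big-{i}", 30.0) for i in range(4)]     # 120 Gbps total

load = defaultdict(float)
for name, rate in flows:
    # Route each flow by hashing its identifier, as the line card
    # routes packets by their header characteristics.
    card = zlib.crc32(name.encode()) % BRIDGE_CARDS
    load[card] += rate

for card in range(BRIDGE_CARDS):
    status = "congested" if load[card] > CARD_CAPACITY else "ok"
    print(f"bridge card {card}: {load[card]:5.1f} Gbps  {status}")
```

Total offered load here is 320 Gbps against 400 Gbps of aggregate fabric capacity, yet the per-card loads come out unequal; with thin headroom, the busiest card is the one that drops packets.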

Cell-based switching

Because the packet-routing architecture has this inherent defect, the cell-based forwarding architecture was born. It also uses multiple bridge cards, but the line card can fragment each incoming packet into multiple cells of identical size and spray them across the bridge cards. Since every bridge card receives cells of the same size, the load on the bridge cards is always uniform, and the unevenness problem of hash-based routing disappears. This approach completely eliminates the congestion caused by uneven traffic distribution and has become the new mainstream switching architecture. However, cell-based forwarding has inherent technical defects of its own.

The line card fragments every packet (typically into fixed 64-byte or 128-byte cells, padding the final cell to the full 64 or 128 bytes if needed), and after the bridge cards forward the cells, the egress card must reassemble them into the complete packet. None of this is needed with packet routing, so it adds forwarding overhead. Compared with packet routing, this architecture therefore has lower forwarding efficiency and higher latency: the last cell of many packets must be padded with empty data, and every cell carries its own internal forwarding header, so extra fabric bandwidth is consumed and wasted. Cell forwarding also amplifies the impact of failures. Because nearly every packet has cells passing through every bridge card, a fault on one bridge card affects forwarding across the entire device. With packet routing this does not happen: a faulty bridge card affects only the flows hashed to it, and traffic on the other bridge cards continues unaffected.
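The padding and per-cell header overhead described above can be quantified with a short sketch. The 64-byte cell size matches the article; the 8-byte per-cell header is an assumed figure for illustration, not a vendor specification:

```python
# Sketch: bandwidth overhead of cell fragmentation. Each packet is cut
# into fixed-size cells, the last cell is padded to the full cell size,
# and every cell carries an internal forwarding header.
# CELL_HEADER is an assumed value for illustration.
import math

CELL_SIZE = 64     # cell payload size in bytes (64 or 128 per the article)
CELL_HEADER = 8    # assumed per-cell internal header, in bytes

def fabric_bytes(packet_len: int) -> int:
    """Bytes actually carried across the fabric for one packet,
    including padding of the last cell and per-cell headers."""
    cells = math.ceil(packet_len / CELL_SIZE)
    return cells * (CELL_SIZE + CELL_HEADER)

for pkt in (64, 65, 128, 1500):
    carried = fabric_bytes(pkt)
    print(f"{pkt:4d}-byte packet -> {carried:4d} fabric bytes "
          f"({carried / pkt - 1:.0%} overhead)")
```

The worst case is a packet one byte longer than a cell boundary (e.g. 65 bytes), which occupies two full cells plus two headers; this wasted internal bandwidth is the price paid for perfectly uniform load on the bridge cards.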

Cell forwarding also complicates troubleshooting. Once a packet enters the line card, it travels to the bridge cards as cells, so the packet content cannot be observed on a bridge card at all; and since every bridge card sees cells of identical length, it is hard to tell whether a fault lies in the line card or a bridge card. Often the only way to isolate it is to swap parts and retest. The packet-routing architecture makes this easy: statistics keyed on packet characteristics can be collected at the internal ports to pinpoint where the problem is, so the cause can be found quickly and maintenance is convenient. This has led many people to turn back to the packet-routing switching architecture.

As the discussion above shows, the two switching architectures each have advantages and disadvantages, and neither can replace the other. Both are mature, proven in practical deployment, and similar in design cost. Which architecture to deploy in a data center (a single device cannot implement both at once) must be decided pragmatically, according to what the data center values most.

If the data center's traffic is moderate and its packet characteristics are fairly simple and evenly distributed, packet-routing switches, which are easier to maintain, are worth considering. If traffic is extremely heavy, nearly all line card ports are in use, and bandwidth utilization is very high, cell-based switches are recommended to avoid falling short of line rate. Both architectures will coexist for a long time to come, leaving data centers to choose between them.
