Using DDC to build AI networks? This may just be a beautiful illusion

ChatGPT, AIGC, large models... a string of dazzling terms has emerged, and the commercial value of AI has drawn wide public attention. As training models grow in scale, the data center network that underpins AI computing power has also become a hot topic. Improving computing efficiency, building high-performance networks... major vendors are each showing what they can do, striving to open up an "F1 racetrack" for AI networks within the broader Ethernet industry.

In this AI arms race, DDC has made a high-profile entrance and, seemingly overnight, become synonymous with revolutionary technology for building high-performance AI networks. But is it really as good as it looks? Let us analyze it in detail and judge it calmly.

Introduced in 2019, DDC in essence replaces chassis routers with box routers

With the rapid growth of data center network (DCN) traffic, the need to upgrade data center interconnect (DCI) networks has become increasingly urgent. However, the capacity of a chassis-based (frame-type) DCI router can only be expanded as far as the chassis size allows. Chassis devices also draw a lot of power, so expanding them places high demands on cabinet power and heat dissipation and makes retrofits expensive. Against this background, in 2019 AT&T submitted a specification for box routers based on merchant silicon to OCP and proposed the concept of DDC (Disaggregated Distributed Chassis). Simply put, DDC uses a cluster of several low-power box devices to replace the hardware units of a chassis device, such as its service line cards and fabric cards, and interconnects the boxes with cables. The entire cluster is managed by a centralized or distributed NOS (network operating system), with the aim of breaking through the performance and power bottlenecks of a single-chassis DCI device.

DDC's claimed advantages include:

Breaking through the expansion limits of chassis devices: Capacity is expanded by clustering multiple devices, without being restricted by chassis size.

Reducing single-point power consumption: Multiple low-power box devices are deployed in a distributed manner, which avoids concentrated power draw and lowers cabinet power and heat dissipation requirements.

Improving bandwidth utilization: Unlike the hash-based switching of traditional Ethernet networks, DDC uses cell switching and load-balances per cell, which helps improve bandwidth utilization.

Alleviating packet loss: The device's large buffers are used to meet the high convergence ratios of DCI scenarios. VOQ (Virtual Output Queue) technology first places packets received from the network into separate virtual output queues. A credit mechanism then confirms that the receiving end has enough buffer space before those packets are sent, which reduces packet loss caused by egress congestion.
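The interplay between VOQ and credits can be illustrated with a toy model. The Python sketch below is not DDC's actual scheduler: the class names (EgressPort, IngressVOQ), the buffer size, and the simplification of one credit per packet are all assumptions made for illustration. It only shows the principle that the ingress side holds packets in per-destination queues and releases them once the egress side has confirmed free buffer space.

```python
from collections import deque


class EgressPort:
    """Hypothetical egress port that grants credits only for free buffer slots."""

    def __init__(self, buffer_slots):
        self.free = buffer_slots

    def grant(self, wanted):
        # Never grant more credits than there is free buffer space.
        granted = min(wanted, self.free)
        self.free -= granted
        return granted

    def drain(self, slots):
        # Packets leave on the wire, so their buffer slots become free again.
        self.free += slots


class IngressVOQ:
    """Hypothetical ingress device with one virtual output queue per egress."""

    def __init__(self):
        self.queues = {}

    def enqueue(self, egress_id, packet):
        self.queues.setdefault(egress_id, deque()).append(packet)

    def send(self, egress_id, egress):
        # Ask for credits first, then send exactly as many packets as granted.
        queue = self.queues.get(egress_id, deque())
        credits = egress.grant(len(queue))
        return [queue.popleft() for _ in range(credits)]


egress = EgressPort(buffer_slots=2)
voq = IngressVOQ()
for i in range(5):
    voq.enqueue("port0", f"pkt{i}")

first = voq.send("port0", egress)   # only 2 credits are available
egress.drain(2)                     # egress transmits, freeing 2 slots
second = voq.send("port0", egress)
print(first)   # ['pkt0', 'pkt1']
print(second)  # ['pkt2', 'pkt3']
```

In this model the packets that could not be sent stay queued at the ingress instead of being dropped at a congested egress, which is also why the approach depends on large ingress buffers.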

The DDC solution proved to be only a flash in the pan in DCI scenarios

The idea looks perfect, but putting it into practice has not been smooth sailing. DriveNets' Network Cloud product is the industry's first and only commercial DDC solution, and its entire software stack is compatible with generic white-box routers. So far, however, no clear sales cases have been seen on the market. AT&T, which proposed the DDC architecture, deployed it in a limited (grayscale) rollout on its self-built IP backbone network in 2020, but there has been essentially no follow-up. Why did this splash make so few waves? The answer lies in four major defects of DDC.

Defect 1: Unreliable device control plane

The components of a chassis device are interconnected through a PCIe bus in highly integrated, reliable hardware, and every device uses dual main control boards to keep the control plane highly reliable. DDC instead interconnects its devices with fragile, "replace when broken" modules and cables, and relies on them to build the multi-device cluster and carry the cluster's control plane. This breaks through the scale limit of chassis devices, but the unreliable interconnect puts the control plane at great risk. Even when only two devices are stacked, split-brain and out-of-sync table entries can occur under abnormal conditions. With DDC's unreliable control plane, such problems are even more likely.

Defect 2: Highly complex device NOS

The SONiC community already has a distributed forwarding chassis design based on the VOQ architecture and keeps iterating on, supplementing, and revising it to support DDC. White-box devices do have many deployment cases, but few have taken on a "white-box chassis". Building a remote "white-box chassis" means handling not only the state of multiple devices in the cluster and the synchronization and management of table entries, but also the systematic implementation of real-world operations such as version upgrades, rollbacks, and hot patches across multiple devices. DDC raises the complexity of the cluster's NOS exponentially. There are currently no mature commercial cases in the industry, and the development risk is high.

Defect 3: Lack of operations and maintenance tooling

Networks are unreliable, so Ethernet networks come with many features and tools for maintenance and fault localization, such as the well-known INT (In-band Network Telemetry) and MOD (Mirror on Drop). These tools can monitor specific flows and identify the characteristics of flows that suffer packet loss, which makes it possible to locate and troubleshoot faults. The cells used by DDC, however, are only slices of a packet. They carry no five-tuple information such as IP addresses and cannot be associated with a specific service flow. Once packet loss occurs in a DDC, current O&M tools cannot locate where the loss happened, and maintenance options are seriously lacking.

Defect 4: Increased costs

To break through the chassis size limit, DDC has to interconnect the devices in the cluster with high-speed cables and modules. This costs far more than interconnecting the line cards and fabric cards of a chassis device through PCB traces and high-speed connectors, and the larger the scale, the higher the interconnect cost.

At the same time, the price of spreading out power consumption is that a DDC cluster interconnected by cables and modules consumes more power overall than a chassis device. For the same generation of chips, and assuming the DDC cluster devices are interconnected by modules, the cluster's power consumption is 30% higher than that of the chassis device.

No reheating leftovers: DDC is not suitable for AI networks either

The immaturity and shortcomings of the DDC solution have already seen it fade out of DCI scenarios, yet it has been revived by the current AI trend. The author believes that DDC is not suitable for AI networks either. Let us look at why.

Two core demands of AI networks: high throughput and low latency

The services carried by AI networks are characterized by a small number of flows, each with high bandwidth. The traffic is also uneven, and many-to-one or many-to-many patterns (All-to-All and All-Reduce) occur frequently. As a result, uneven load, low link utilization, and packet loss from frequent congestion are all likely, and computing power cannot be fully utilized.

DDC solves only the hash problem and brings many defects of its own

DDC uses cell switching: it slices packets into cells and sends them round-robin across links according to reachability information. The traffic load is spread evenly over every link, which makes full use of bandwidth and solves the hash imbalance problem. However, DDC still has four major defects in AI scenarios.
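The difference between per-flow hashing and per-cell distribution can be made concrete with a small simulation. The Python sketch below rests on assumptions throughout: the flow names, the 256-byte cell size, the four links, and the use of CRC32 as the hash are illustrative choices, not DDC or switch-ASIC parameters. With only a few large flows, hashing pins each whole flow to one link, while cell spraying divides every flow across all links.

```python
import zlib


def hash_balance(flows, n_links):
    """Per-flow hashing: every byte of a flow lands on a single link."""
    load = [0] * n_links
    for flow_id, size in flows:
        link = zlib.crc32(flow_id.encode()) % n_links
        load[link] += size
    return load


def cell_spray(flows, n_links, cell_size=256):
    """Cell-based balancing: slice flows into cells, send them round-robin."""
    load = [0] * n_links
    next_link = 0
    for _, size in flows:
        remaining = size
        while remaining > 0:
            chunk = min(cell_size, remaining)
            load[next_link] += chunk
            next_link = (next_link + 1) % n_links
            remaining -= chunk
    return load


# Four large flows of equal size (hypothetical names), four equal-cost links.
flows = [(f"gpu{i}->gpu{i + 4}", 256_000) for i in range(4)]
hashed = hash_balance(flows, n_links=4)
sprayed = cell_spray(flows, n_links=4)
print("hash :", hashed)
print("spray:", sprayed)
```

In this setup the sprayed load is identical on every link, whereas the hashed load always comes in whole-flow units, so two flows colliding on one link leave another link idle. The sketch ignores the cost side of cell switching, namely slicing, encapsulation, and reassembly.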

Defect 1: Requires specific hardware and forms a closed, proprietary network that is not universal

Both cell switching and VOQ in the DDC architecture depend on specific hardware chips, so none of the existing DCN equipment can be reused. Ethernet networks have developed rapidly because they are plug-and-play, widely available, and standardized. DDC depends on specific hardware and builds a closed, proprietary network through a proprietary switching protocol, so it is not universal.

Defect 2: Large-buffer design increases network costs and is not suitable for large-scale DCN networking

If DDC enters the DCN, it carries not only the high interconnect cost but also the cost of large on-chip buffers. DCN networks currently use small-buffer devices, with at most 64 MB, whereas DDC solutions derived from DCI scenarios usually come with chip HBM at the gigabyte level or above. Compared with DCI, large-scale DCN networks are more sensitive to network cost.

Defect 3: Increased network static latency, not suitable for AI scenarios

The goal of a high-performance AI network, whose job is to unleash computing power, is to shorten service completion time. DDC's large buffers hold packets, which inevitably increases the static latency of hardware forwarding. Cell switching also adds forwarding latency, because packets must be sliced, encapsulated, and reassembled. According to comparative test data, DDC's forwarding latency is 1.4 times longer than that of a traditional Ethernet network.

Defect 4: As data center scale increases, the DDC unreliability problem will worsen

Compared with replacing chassis devices in DCI scenarios, DDC in the DCN must serve a larger cluster, at least one network POD. This means the components of this remote "chassis" sit farther apart, which places higher demands on the reliability of the cluster's control plane, the synchronized management of the devices' NOS, and operations and maintenance at the network POD level. The defects of DDC described above will only be magnified.

DDC is at most a transitional solution

Of course, no problem is unsolvable. If certain constraints are accepted, this particular scenario can easily become a stage for large companies to "show off their skills". But networks value reliability, simplicity, and efficiency, and abhor complexity. Especially in the current climate of "reducing staff and increasing efficiency", the cost of implementing DDC must be taken into account.

In AI scenarios, many network load-balancing problems have already been solved by global static or dynamic orchestration of forwarding paths. In the future, they can also be solved with packet spraying combined with reordering of out-of-order packets on the end-side network card. DDC is therefore at most a short-term transitional solution.
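To show what reordering on the end-side network card involves, here is a minimal Python sketch of a receive-side reorder buffer. It is an illustration under assumptions, not any NIC's real implementation: the ReorderBuffer class, the per-packet sequence numbers starting at 0, and the unbounded buffer are all hypothetical simplifications. Packets sprayed over different paths may arrive out of order, and the receiver holds them until the gap is filled.

```python
class ReorderBuffer:
    """Hypothetical receiver that delivers sprayed packets in sequence order."""

    def __init__(self):
        self.expected = 0   # next sequence number to deliver
        self.pending = {}   # out-of-order packets waiting for a gap to fill

    def receive(self, seq, payload):
        """Accept one packet and return the payloads now deliverable in order."""
        if seq < self.expected or seq in self.pending:
            return []  # duplicate, already delivered or already buffered
        self.pending[seq] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered


rob = ReorderBuffer()
r1 = rob.receive(2, "p2")   # arrives early, held back
r2 = rob.receive(0, "p0")   # in order, delivered immediately
r3 = rob.receive(1, "p1")   # fills the gap and releases p2 as well
print(r1)  # []
print(r2)  # ['p0']
print(r3)  # ['p1', 'p2']
```

A real design would also need a bounded buffer and a timeout for packets that never arrive, both of which this sketch leaves out.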

A deeper look reveals that the driving force behind DDC may be DNX

Finally, a word about Broadcom, the mainstream network chip vendor. Its two best-known product lines are StrataXGS and StrataDNX. XGS continues down the high-bandwidth, low-cost route, rapidly launching small-buffer, high-bandwidth chips and continuing to dominate DCN market share. StrataDNX, by contrast, carries the cost of large buffers and keeps the myth of VOQ plus cell switching alive, hoping to survive by getting DDC into the data center. There appear to be no such deployments in North America, and domestic DDC may be DNX's last lifeline.

Today, the supply of a large amount of hardware such as GPUs has already been restricted to some extent in our country. Do we really need DDC? Let us leave more opportunities to domestically produced devices!
