The previous section introduced the evolution of queues (Understand Deterministic Networks in Seconds: Playing with Queues (Part 2)). This section analyzes the deterministic enhancement of queues, starting with the token bucket mechanism and then explaining four mechanisms in detail: credit-based shaping (CBS), time-aware shaping (TAS), cyclic queuing and forwarding (CQF), and frame preemption (FP). This article not only introduces what these mechanisms are, but also strives to explain why each mechanism is designed the way it is and how to use it.

Improved determinism of queues

Queue scheduling is divided into three processes: enqueue, scheduling, and dequeue. Deterministic enhancement of a queue mainly affects dequeue. That is, the scheduler can still use strict priority scheduling while traffic is restricted as it leaves the queue for link transmission, which is why these techniques are also called "shaping". Deterministic enhancement cannot be applied arbitrarily: it is effective only in specific scenarios, and the traffic must be known in advance as far as possible. Therefore, when studying a new scheduling mechanism, the most useful entry points are the characteristics of the traffic (flow distribution, flow rate, packet size, number of packets, periodic or aperiodic) and its requirements (bandwidth, latency, jitter, packet loss rate).

Token bucket

A token bucket is a "bucket" placed at the output port of a switch, into which tokens are added at a fixed rate; each token grants permission to send a certain number of bytes. For example, suppose two flows, red and green, pass through the same 10 Gbps port. If the green flow alone bursts to 10 Gbps, the red flow has no transmission capacity left and suffers heavy packet loss. To make the flows coexist, they must be given rules, that is, rate limits or bandwidth allocations. Assuming the green flow is allocated 8 Gbps (i.e., 1 GBps), one one-byte token can be added to the bucket every nanosecond (in theory), or 10^6 tokens every millisecond (a batch rate better matched to real device processing capacity). Packets that obtain a token are allowed to be sent; packets that do not are either dropped immediately or buffered until tokens become available. This guarantees scheduling determinism at the bandwidth level. However, the token bucket only bounds delay at roughly the one-second scale; that is, its worst-case delay is on the order of one second. For web-based Internet traffic, a delay of a few hundred milliseconds does not hurt the user experience, so traditional Internet quality-of-service work mainly pursues route optimization and bandwidth allocation, selecting lightly loaded, uncongested paths and granting traffic as much spare bandwidth as possible to reduce packet loss and retransmission.

Credit-based shaping

On top of guaranteed bandwidth, can we further reduce latency, or even bound the worst-case latency? Audio and video traffic, for example, is continuous and voluminous, and its delay jitter must stay small to avoid audio dropouts and frozen pictures. To guarantee transmission quality for such traffic, the Credit-Based Shaper (CBS) was proposed. CBS places a shaper, containing a credit-calculation component, on a queue of the output port. The shaper follows five rules:
1) If there are no packets in the queue, the queue's credit is set to 0.
2) If the queue's credit is non-negative, packets in the queue may be transmitted; otherwise they may not.
3) When at least one packet is waiting in the queue, the queue's credit increases at the rate idleSlope, the idle rate in bps.
4) When packets in the queue are being transmitted, the queue's credit decreases at the rate sendSlope, the send rate, also in bps.
5) The send rate equals the idle rate minus the link bandwidth. For example, with an idle rate of 200 Mbps and a bandwidth of 1 Gbps, the send rate is -800 Mbps.

Why are the rules designed this way? Because the core idea of CBS is that flows should "give way" to each other. A token bucket directly allocates a fixed quantity of tokens to each flow, so when two equal-priority flows, red and green, arrive, both their transmission order and the transmission time each occupies are uncertain. In CBS, credit starts at 0: a flow must "wait" for its credit to build before it can transmit, and transmitting gradually drains the credit. As a result, shortly after a flow's credit turns negative it stops transmitting, and it becomes the other flow's turn.

Take the following figure as an example. When the audio/video stream f1 is enqueued, a yellow interfering stream is being transmitted; f1 waits, and its credit grows. At T0 the interfering stream finishes, and f1 starts transmitting, continuously consuming credit. After the green and blue packets are sent, the credit is negative at T1, so the third (pink) packet cannot be sent and must be buffered until the credit recovers to 0 at T2, when the pink packet is allowed out. The net effect is that the green, blue, and pink packets are no longer transmitted back to back, but at intervals. Note that in the figure below the horizontal axis is time, so the width of a packet does not represent its size: in the queue-depth row, the width spans from a packet's arrival to the start of its transmission.
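The credit evolution just described can be sketched as a small simulation. This is a minimal discrete-time sketch (1 µs ticks) of the five CBS rules for a single queue that owns the port with no interfering traffic; the packet sizes, slopes, and tick granularity are illustrative assumptions, not standard-mandated values.

```python
# Hypothetical sketch of the five CBS rules with 1 µs ticks. Rates are
# converted to integer bits-per-microsecond so the arithmetic stays exact.

def simulate_cbs(packets_bits, idle_slope_bps, link_bps, ticks):
    """Return the tick at which each queued packet finishes transmitting."""
    idle = idle_slope_bps // 1_000_000     # credit gained per µs while waiting (rule 3)
    send = idle - link_bps // 1_000_000    # rule 5: sendSlope = idleSlope - bandwidth (negative)
    wire = link_bps // 1_000_000           # bits serialized onto the link per µs
    credit = 0                             # rule 1: credit starts at 0
    queue = list(packets_bits)
    in_flight = 0                          # bits of the current packet still to send
    finish_ticks = []
    for t in range(ticks):
        if in_flight > 0:                  # rule 4: transmitting drains credit at sendSlope
            credit += send
            in_flight -= wire
            if in_flight <= 0:
                finish_ticks.append(t + 1)
        elif queue:
            if credit >= 0:                # rule 2: non-negative credit permits sending
                credit += send
                in_flight = queue.pop(0) - wire
                if in_flight <= 0:
                    finish_ticks.append(t + 1)
            else:                          # rule 3: packet waits while credit recovers
                credit += idle
        else:
            credit = 0                     # rule 1: empty queue resets credit
    return finish_ticks

# Two 1500-byte (12000-bit) packets, idleSlope = 200 Mbps, 1 Gbps port:
# the first finishes at 12 µs; the second waits for credit to climb back
# to 0 and finishes at 72 µs, spacing the packets out to the reserved rate.
print(simulate_cbs([12000, 12000], 200_000_000, 1_000_000_000, 200))
```

Note how the 60 µs spacing between finishes corresponds exactly to the reserved 200 Mbps (12000 bits every 60 µs), which is the shaping behavior the figure illustrates.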
In the currently-transmitted row, the width spans from the start to the completion of transmission.

The key issue in CBS is how to configure the idleSlope parameter. IdleSlope is the bandwidth we want to reserve: the larger it is, the easier it is for the queue to send. In practice the parameter is obtained by solving a set of constraints. A typical deployment places a credit shaper behind priority queue 6 (Q6) and priority queue 5 (Q5), assigning Q6's traffic to class A with a one-hop delay budget of 125 us and Q5's traffic to class B with a one-hop delay budget of 250 us. The idle-rate setting can then be derived backwards from the required one-hop delay.

Time-aware shaping

Industrial networks also carry control-command traffic with extremely strict delay and jitter requirements, for example a master robot controlling a slave robot, sending a 100-byte packet every 1 millisecond with a required end-to-end delay under 1 millisecond. For such periodic, time-sensitive small flows, CBS is powerless, so Time-Aware Shaping (TAS) was proposed. TAS places a "gate" behind each output queue: when the gate is open (o), packets may be transmitted; when it is closed (c), they may not. Gate openings and closings are driven by a gate control list. TAS further presumes that all end stations and network devices achieve network-wide nanosecond-level clock synchronization via 802.1AS, so that the gate control lists of all output ports are aligned in time and link delay can be neglected. The key issue in TAS is how to allocate time slots for control traffic so as to generate the global gate control list. Take the following figure as an example.
The red stream has two 1500-byte packets and the green stream has three. With a port bandwidth of 1 Gbps, the red stream needs a 24 us slot and the green stream a 36 us slot. Under the no-wait scheduling model, these two slots are kept completely disjoint hop by hop (i.e., the slots never overlap), yielding the gate control list shown in the figure: at T0, gate Q7 is open and gate Q6 closed; at T1, gate Q7 is closed and gate Q6 open.

Cyclic queuing and forwarding

TAS achieves microsecond-level, per-hop, per-packet scheduling, but it requires configuring a gate control list hop by hop and entry by entry, which makes configuration very complicated. Moreover, a single device generally supports no more than 1024 gate entries, which limits scalability under massive traffic. Hence Cyclic Queuing and Forwarding (CQF) was proposed. CQF places gates at the ingress and egress (labeled Rx-gate and Tx-gate): when a gate is open, packets may be enqueued or transmitted; when it is closed, they may not. CQF divides the output port's transmission time into a series of equal intervals, each called a cycle T. CQF requires: 1) network-wide clock synchronization; 2) negligible link delay; 3) a cycle T at least as large as the one-hop delay (the sum of processing, queuing, transmission, and link delays). By alternating enqueue and dequeue operations between the odd and even queues, CQF guarantees that a packet sent by an upstream node within one cycle is received by the downstream node within the same cycle and forwarded in the next cycle.
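The odd/even alternation just described can be sketched as a toy schedule. The queue labels below follow the figure's convention (Q6 sends in even cycles, Q7 in odd cycles); the per-hop "+1 cycle" rule, rather than the exact labels, is the point.

```python
# Toy sketch of CQF's odd/even queue alternation over several hops.
# A packet received during cycle c is buffered in the queue that is
# currently receiving and transmitted in cycle c + 1, regardless of
# when inside the cycle it arrived -- which makes each hop's delay
# deterministic (between 0 and 2 cycles, depending on arrival phase).

def cqf_schedule(arrival_cycle, hops):
    """For each hop, return (receive cycle, buffering queue, send cycle)."""
    schedule = []
    c = arrival_cycle
    for _ in range(hops):
        rx_queue = "Q7" if c % 2 == 0 else "Q6"  # the non-sending queue receives
        schedule.append((c, rx_queue, c + 1))
        c += 1                                   # forwarded in the next cycle
    return schedule

for hop, (rx, q, tx) in enumerate(cqf_schedule(0, 3), start=1):
    print(f"hop {hop}: received in cycle {rx}, buffered in {q}, sent in cycle {tx}")
```

Because every hop adds exactly one cycle of holding time, the end-to-end behavior depends only on the cycle length and the hop count, which is what yields the delay bounds discussed next.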
Therefore, the end-to-end delay depends only on the cycle size T and the number of path hops H: the maximum delay bound is (H+1)T, the minimum delay bound is (H-1)T, and the maximum end-to-end jitter is 2T. For example, with a link bandwidth of 1 Gbps and a maximum CQF queue depth of 10 packets, the worst-case one-hop queuing-plus-transmission delay for MTU-sized packets is 120 us; adding 5 us of processing delay, the cycle T can be set to 125 us. As shown in the figure below, in the even cycle T0 the even queue Q6 sends and the odd queue Q7 receives, so in the Tx-gate Q6 is open and Q7 closed, while in the Rx-gate Q6 is closed and Q7 open; in the odd cycle T1 the roles reverse: Q7 sends and Q6 receives. CQF caps the queue length and fixes the one-hop time slot to the cycle value T, with the odd and even queues alternating, which is equivalent to having only a single gate entry and thus eliminates TAS's complex gate-entry configuration. The key issue in CQF is determining the cycle size T and the start time of each flow. If T is too small, the queues are too short and many flows become unschedulable; if T is too large, the worst-case end-to-end delay grows, some low-latency traffic cannot be scheduled, and on-chip buffer resources are wasted.

Frame preemption

Another detail of time-aware shaping is the need for a guard band (protection bandwidth).
When a time-sensitive flow (priority 7) shares a link with a best-effort flow (priority 0), and the best-effort flow has already started transmitting when the time-sensitive flow is enqueued, the time-sensitive flow must wait for at least one best-effort packet to finish (12 us for a 1500-byte packet at 1 Gbps). Its time slot then cannot be aligned, i.e., it cannot transmit according to the planned gate control list. Therefore, before a time-sensitive flow arrives, all gates should be closed for the transmission time of one MTU-sized packet, forming a guard band. However, not every time-sensitive flow happens to start ahead of best-effort traffic, and not every best-effort packet is MTU-sized; in fact, the average packet size of Internet traffic is around 256 bytes, so guard bands waste considerable bandwidth when gate switching is dense. To reduce this waste, frame preemption (FP) was proposed. Frame preemption splits the MAC into an eMAC and a pMAC: time-sensitive flows (express, or high-speed, frames) use the eMAC, and best-effort flows (preemptable, or low-speed, frames) use the pMAC. When an express frame arrives while a preemptable frame is being transmitted, the device first checks whether the preemptable frame can be fragmented; if so, it is fragmented, the express frame preempts the link, and the preemptable frame's fragments are reassembled afterwards. Because Ethernet imposes a 64-byte minimum frame size, both slices of a fragmented preemptable frame (including their checksums) must be at least 64 bytes; consequently, a preemptable frame whose data length is below 124 bytes cannot be fragmented.
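The fragmentation-eligibility rule above can be expressed as a few lines of arithmetic. The constants follow the text's accounting: each of the two slices, including a 4-byte checksum, must reach the 64-byte Ethernet minimum, which works out to the 124-byte floor on the preemptable frame's data length.

```python
# Sketch of the fragmentation-eligibility check described in the text.

MIN_FRAME = 64   # minimum Ethernet frame size, in bytes
FCS = 4          # frame check sequence carried by each slice

def can_fragment(data_len_bytes):
    """True if a preemptable frame of this data length can be split into
    two slices that each meet the 64-byte minimum (124 bytes per the text)."""
    min_slice_data = MIN_FRAME - FCS                   # 60 data bytes per slice
    return data_len_bytes >= 2 * min_slice_data + FCS  # 124-byte threshold

print(can_fragment(123), can_fragment(124))   # -> False True
```

A frame just under the threshold is transmitted whole, so an express frame behind it can still be blocked for up to one minimum-fragment time; preemption bounds the blocking, it does not eliminate it.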
In addition, frame preemption shortens the queuing and blocking time of express frames, effectively reducing their latency, at the cost of increasing the latency of preemptable frames. And because the moment at which a preemptable frame gets sliced is not deterministic, the blocking time of express frames fluctuates, introducing some delay jitter.

Summary

This article has analyzed five mechanisms: the token bucket, credit-based shaping (CBS), time-aware shaping (TAS), cyclic queuing and forwarding (CQF), and frame preemption (FP). The five mechanisms can be used individually or partially combined, for example time-aware shaping with credit-based shaping, or time-aware shaping with frame preemption. The prerequisites for each mechanism and its applicable scenarios deserve particular attention: token buckets are widely used on the Internet, while the latter four are currently deployed mainly in LAN scenarios such as automotive Ethernet, factory intranets, and aerospace equipment systems. In terms of the granularity of scheduling-delay guarantees, the five mechanisms are progressively finer. For ease of understanding, this article simplifies many details; for more configuration parameters and implementation specifics, please refer to the relevant standards and product documentation.