Understanding Overlay Network Technology

Introduction

Traditionally, data center networks were built on the three-layer architecture (core, aggregation, access). As the technology developed, vendors implemented it in different ways. Some added virtualization at the core layer, so that the core and aggregation layers are virtually separated on the same physical device and the overall architecture becomes flatter; others added virtualization at the aggregation and access layers to virtually separate those two layers on the physical device. Whatever the variation, the basic principle of Ethernet transmission did not change: forwarding is still controlled by network (IP) addresses and physical (MAC) addresses. With the development of cloud computing, however, most data centers are moving toward very large scale and highly elastic demand. This exposes several problems: virtual machines are hard to migrate and protect across regions, the scale of cluster network isolation is limited, and the overall network resources of the data center are limited. It is against this background that Overlay networks, represented by VXLAN, emerged.

1. What is Overlay Network?

1.1 Basic Architecture of Overlay Network

Overlay network technology is a virtualization model layered on top of the traditional network architecture. It takes the traditional network as a given and binds applications to their virtual networks without regard to the transmission mode or technology of the underlying physical network. To understand the Overlay architecture, we first need to know its components, and the easiest way in is to compare it with the traditional physical network architecture, as shown in Figure 1.1:

Figure 1.1 Overlay & Traditional Physical Network Architecture

As shown in Figure 1.1, the Overlay network is a virtualized network built on top of the underlying physical network. In other words, the traditional physical network is partially adjusted and a set of virtual transmission channels is established on it through logical abstraction. This raises a question: "How is data transmitted in this virtual network?"

Once the virtual channel exists, application data must be sent in a packet format that the virtual network can recognize, and it must follow the virtual network's channel control rules. The physical transmission of each message still relies on the traditional physical network, however, so the process involves encapsulating and decapsulating messages, maintaining the logical channels, and forwarding data both logically and physically. These tasks map onto the three core elements of the Overlay network:

Edge device: A network device directly associated with a virtual network. It is the place where data packets are encapsulated/decapsulated. It is also a physical node that forms a virtual network, such as the physical switch shown in the figure (it must be a switch that supports the Overlay protocol).
Control plane: A virtual entity in the framework, responsible for service discovery, address notification and mapping, virtual network channel establishment and maintenance in virtual network transmission, as shown in the control flow in the virtual layer in the figure.
Data plane: A virtual entity in the framework, mainly responsible for forwarding data messages in the virtual layer, such as the data flow of the virtual layer in the figure.

1.2 Basic Rules of Overlay Network Transmission

Data transmission in traditional networks follows the seven-layer (OSI) network model: data is encapsulated at the source and decapsulated at the destination. On the way out, it is wrapped layer by layer from the application layer down to the physical layer; on the way in, it is unwrapped from the physical layer back up to the application layer. Addressing in the physical network uses IP addresses for routing and MAC addresses for forwarding. An Overlay network follows the same basic rule, so what is different?

Figure 1.2 Overlay & traditional physical network architecture

Taking VXLAN as an example, let's look at the basic transmission rules in conjunction with Figure 1.2. Start with the three Overlay edge devices A, B, and C. They are the core devices supporting the Overlay virtual network, and each is called a VTEP (VXLAN Tunnel Endpoint). When a server's data packet passes through one of these edge devices, it is encapsulated a second time, and the address or identification information of the source VTEP and the destination VTEP is added to it. The packet is then carried between the two VTEPs over the tunnel maintained by the VTEP control plane, decapsulated at the destination VTEP, and finally delivered to the destination server. A few questions may come up here:

If it is L3 transmission, isn't this unnecessary?
If it is L2 transmission, how does the source VTEP know the VTEP information corresponding to the destination MAC and IP?
Doesn't the transmission between VTEPs also rely on the physical network? How is it transmitted from the source to the destination?

On the first question: if it is L3 transmission, the extra encapsulation is indeed unnecessary, so the source VTEP first determines whether the traffic is genuine L3 traffic. If it is, the VTEP information is left out and the packet is forwarded in the traditional way.

On the second question: each VTEP keeps the MAC information of all devices attached to it, and VTEPs synchronize their MAC address mappings with one another. Once the destination address in a packet is known, the VTEP can therefore determine which VTEP the destination sits behind and forward the packet accordingly, as directed by the controller.

Finally, on the third question: transmission from one VTEP to another relies entirely on the IP and MAC addresses of the VTEPs themselves, which the physical network routes and forwards like any other traffic.

It can be seen that both L2 and L3 transmission involve table lookups, encapsulation, and decapsulation. From the perspective of forwarding efficiency and performance, these operations are best implemented in hardware on the Overlay edge devices; traditional devices that lack Overlay support cannot perform them, so new hardware is needed.
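The three answers above can be sketched as a single forwarding decision on the source VTEP. This is a toy model, not how any particular product works: the `Vtep` class, its field names, the `learn_remote` method (standing in for mapping synchronization), and the sample addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vtep:
    """Toy model of an Overlay edge device (hypothetical, for illustration only)."""
    name: str
    ip: str
    local_macs: set = field(default_factory=set)     # hosts attached to this VTEP
    remote_macs: dict = field(default_factory=dict)  # MAC -> IP of the VTEP that owns it

    def learn_remote(self, mac, vtep_ip):
        # Stands in for the MAC mapping synchronization between VTEPs.
        self.remote_macs[mac] = vtep_ip

    def forward(self, dst_mac, frame):
        if dst_mac in self.local_macs:
            # Destination is attached locally: no encapsulation needed.
            return ("local", frame)
        if dst_mac in self.remote_macs:
            # Wrap the original frame; the outer addresses are those of the VTEPs.
            outer = {"src_ip": self.ip, "dst_ip": self.remote_macs[dst_mac], "inner": frame}
            return ("tunnel", outer)
        # Unknown destination: must be flooded to the other VTEPs.
        return ("flood", frame)


vtep_a = Vtep("A", "10.0.0.1", local_macs={"52:54:00:00:00:0a"})
vtep_a.learn_remote("52:54:00:00:00:0b", "10.0.0.2")
action, packet = vtep_a.forward("52:54:00:00:00:0b", b"original frame")
print(action, packet["dst_ip"])
```

The physical network only ever sees the outer source and destination (the VTEP addresses), which is why it needs no knowledge of the hosts behind each VTEP.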

1.3 Technical Standards of Overlay Networks

Currently, there are three major technical routes being discussed in the field of Overlay technology:

(1) VXLAN

VXLAN is a tunneling mode that encapsulates Ethernet frames in UDP. To take full advantage of load balancing in the routed bearer network, VXLAN uses a hash of the original Ethernet frame's headers (MAC, IP, L4 port numbers, etc.) as the UDP source port. It identifies L2 network segments with a 24-bit identifier called the VNI (VXLAN Network Identifier). Unknown-destination, broadcast, and multicast traffic is all encapsulated and forwarded as multicast, which requires the physical network to support any-source multicast (ASM).
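As a rough illustration of the two VXLAN details just described, the sketch below builds the 8-byte VXLAN header that carries the VNI and derives a UDP source port from a hash of the inner headers. The function names are made up for this example, and CRC32 is only a stand-in hash: real devices choose their own hash function.

```python
import struct
import zlib

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte header: 8 flag bits, 24 reserved, 24-bit VNI, 8 reserved."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def source_port(inner_headers: bytes) -> int:
    """Map a hash of the inner headers into the dynamic port range 49152-65535.

    Assumption: CRC32 is used here purely for illustration.
    """
    return 49152 + zlib.crc32(inner_headers) % 16384


header = vxlan_header(5000)
port = source_port(b"dst-mac|src-mac|src-ip|dst-ip|l4-ports")
print(header.hex(), port)
```

Because packets of the same inner flow always hash to the same source port, ordinary equal-cost routing keeps a flow on one path while spreading different flows across links.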

(2) NVGRE

NVGRE is a tunneling mode that uses Generic Routing Encapsulation (GRE) to encapsulate messages. It carries a 24-bit virtual network identifier in the Key field of the GRE header; early drafts called it the Tenant Network Identifier (TNI), and the published specification (RFC 7637) calls it the Virtual Subnet ID (VSID). To give the transport network flow-level granularity for bandwidth utilization, network devices need to inspect the GRE header, but this makes NVGRE incompatible with traditional load balancing, which is its biggest difference from, and biggest shortcoming compared with, VXLAN. NVGRE does not rely on flooding and IP multicast for learning and handles broadcast in a more flexible way, but this depends on hardware support. NVGRE also supports reducing the MTU to limit the size of packets inside the virtual network.

(3) STT

STT is a tunneling mode that encapsulates messages in a TCP-like header. It modifies TCP's transmission mechanism into a newly defined stateless one: it redefines the meaning of the TCP fields and needs no three-way handshake to establish a connection, hence the name stateless TCP. Ethernet frames are encapsulated in stateless TCP; a 64-bit identifier distinguishes L2 network segments; and load balancing is achieved by using a hash of the original Ethernet frame's headers (MAC, IP, L4 port numbers, etc.) as the stateless TCP source port.

What the three Overlay technologies have in common is that they encapsulate Ethernet frames and forward them through logical tunnels. They differ in encapsulation format and tunnel construction, and all of them run over IP forwarding. VXLAN and STT place low demands on the load balancing of existing network devices; that is, they adapt well to link load sharing, because ordinary devices can already balance traffic across aggregated links or equal-cost routes based on L2-L4 header fields. NVGRE requires network devices to recognize the GRE extension header and hash on the flow ID, which needs hardware support. Table 1.1 lists the specific differences between the three standards.

Table 1.1 Comparison of Overlay Technology Standards

Technical standard	Transport encapsulation	Virtual network identifier	Encapsulation overhead	Link load balancing
VXLAN	UDP	24-bit VNI	50 bytes	L2-L4 hash
NVGRE	GRE	24-bit VSID	42 bytes	N/A
STT	Stateless TCP	64-bit Context ID	58-76 bytes	L2-L4 hash
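The VXLAN and NVGRE overhead figures in Table 1.1 can be reproduced by adding up the outer headers. The sketch below assumes an untagged outer Ethernet header, an outer IPv4 header without options, and a 1500-byte underlay IP MTU; the constant and function names are illustrative only.

```python
OUTER_ETHERNET = 14  # outer MAC header, no 802.1Q tag (assumption)
OUTER_IPV4 = 20      # outer IPv4 header, no options (assumption)
UDP = 8
VXLAN = 8
GRE_WITH_KEY = 8     # 4-byte base GRE header plus the 4-byte Key field
INNER_ETHERNET = 14  # header of the encapsulated frame

vxlan_overhead = OUTER_ETHERNET + OUTER_IPV4 + UDP + VXLAN
nvgre_overhead = OUTER_ETHERNET + OUTER_IPV4 + GRE_WITH_KEY


def inner_ip_mtu(underlay_ip_mtu: int, overhead: int) -> int:
    """Largest inner IP packet that fits without fragmenting the outer packet.

    The underlay IP MTU does not count the outer Ethernet header, but it
    must still hold the inner Ethernet header of the encapsulated frame.
    """
    return underlay_ip_mtu - (overhead - OUTER_ETHERNET) - INNER_ETHERNET


print(vxlan_overhead, nvgre_overhead)
print(inner_ip_mtu(1500, vxlan_overhead), inner_ip_mtu(1500, nvgre_overhead))
```

This is why overlay deployments either lower the MTU inside the virtual network or raise it on the physical network.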

2. What problems does the Overlay network solve?

With the analysis above, we now understand the basic architecture of the Overlay network and its basic transmission rules. But why did this technology emerge against the backdrop of cloud computing, and what specific problems does it solve? In summary there are three, all tied to large-scale cloud data center scenarios.

2.1 How Overlay Networks Solve the Spatial Limitations of L2

Although many traditional industries still deploy services on physical machines, more and more computing tasks run on virtual machines and containers, and Kubernetes is now the de facto standard for container orchestration. Because of routine updates, maintenance, and sudden failures, large-scale migration of virtual machines and containers within a cluster is common.

When the host running a virtual machine goes down for maintenance or any other reason, the instance has to be migrated to another host. To keep the business uninterrupted, its IP address must stay the same during migration. Because an Overlay builds an L2 network on top of the network layer, physical machines can form a virtual LAN as long as they are reachable from one another at L3. After migration the virtual machine or container is still in the same L2 network, so its IP address does not need to change. This makes resource scheduling easier in a large cluster of thousands of physical machines: through virtual machine migration we can improve resource utilization, tolerate virtual machine failures, and improve node portability.

Figure 2.1 Virtual machine migration under the overlay network architecture

As the figure above shows, although the migrated virtual machine and the other virtual machines are in different data centers, the two data centers are reachable over IP, so the migrated virtual machine can still form an L2 network with the virtual machines of the original cluster through the Overlay network. For the application, the address it publishes has not changed. For the virtual machine, all it knows is that the remote host and the local host form one interconnected L2 LAN, so vMotion can proceed. Underneath, the migration data actually travels over the L3 network of traditional devices. Whatever conversion happens at the bottom layer does not matter, as long as the upper-layer protocol satisfies the migration conditions the application requires. Cross-regional L2 resource migration is therefore no longer an unsolvable problem. Without this technology, I am afraid even a dark fiber connection could not solve it; after all, the reach of fiber is limited.
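Seen from the Overlay's side, a migration like this changes only which VTEP a virtual machine's MAC address maps to. The following minimal sketch makes that explicit; the `VirtualMachine` type, the `migrate` helper, and all addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    ip: str   # address the application publishes
    mac: str


def migrate(mapping: dict, vm: VirtualMachine, new_vtep_ip: str) -> dict:
    """Return a new MAC -> VTEP table with the VM re-homed; the VM is untouched."""
    updated = dict(mapping)
    updated[vm.mac] = new_vtep_ip
    return updated


vm = VirtualMachine(ip="192.168.10.5", mac="52:54:00:aa:bb:01")
before = {vm.mac: "10.1.0.1"}             # VTEP in the first data center
after = migrate(before, vm, "10.2.0.1")   # VTEP in the second data center
print(vm.ip, before[vm.mac], "->", after[vm.mac])
```

The virtual machine's own IP and MAC never appear as something to update, which is the property that keeps the business uninterrupted.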

2.2 How Overlay Networks Solve the Problem of Limited Network Scale

The largest cluster officially supported by Kubernetes is 5,000 nodes. Each node usually runs many containers, so the whole cluster can contain tens or even hundreds of thousands of endpoints. In a flat L2 network, when a container broadcasts an ARP request, every container in the cluster receives it, which creates an extremely high network load that traditional network technology cannot sustain at this scale. In an Overlay network built with VXLAN, outgoing data is re-encapsulated in IP packets, so the physical network only needs to know the MAC addresses of the VTEPs. This shrinks the MAC address table from hundreds of thousands of entries to thousands, and ARP requests spread only between the VTEPs in the cluster. After a remote VTEP decapsulates the data, it broadcasts it only locally, without affecting other VTEPs. This still places high demands on the cluster's network devices, but it greatly reduces the pressure on the core network devices.

In addition, in an L2 environment each data flow must be explicitly addressed to reach its destination accurately, so the MAC address table of the network devices caps the number of virtual machines in a cloud environment. Because not every table entry is usable in practice, the number of virtual machines actually available is lower still. This is especially true of low-cost access devices, whose tables are generally small and therefore limit the number of virtual machines across the whole data center. With Overlay technology, the burden of storing these MAC addresses moves to the VTEP devices. The MAC and ARP table capacity of core or gateway devices will also be challenged as the number of virtual machines grows, but large tables are an unavoidable business requirement for equipment of that class. One way to reduce the pressure on device specifications is to separate out the gateway function and use multiple gateways to share the termination and carrying of virtual machine traffic.
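A back-of-the-envelope calculation shows the size of this reduction. The figure of 30 containers per node is an assumed value chosen only for illustration, as is the simplification of one VTEP per node.

```python
def mac_entries(nodes: int, containers_per_node: int) -> tuple:
    """Compare MAC entries the physical network must hold, flat L2 versus Overlay."""
    flat_l2 = nodes * containers_per_node  # every container MAC is visible to the fabric
    overlay = nodes                        # the fabric sees one VTEP MAC per node (assumption)
    return flat_l2, overlay


flat, overlay = mac_entries(nodes=5000, containers_per_node=30)
print(flat, overlay, flat // overlay)
```

With these assumed numbers the table shrinks from 150,000 entries to 5,000, a reduction by a factor equal to the number of containers per node.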

2.3 How does Overlay Network solve the network isolation problem?

Large data centers often provide cloud computing services to external customers, and the same physical cluster may be divided into many small partitions assigned to different tenants. Because L2 data frames may be broadcast, tenants must be isolated from one another for security reasons, so that one tenant's traffic cannot affect another's or be used for malicious attacks. The mainstream isolation technology today is VLAN, which has two major limitations in large-scale virtualized environments:

First, the VLAN ID defined in the standard is only 12 bits, so roughly 4,000 VLANs are available (4,096 values, of which 4,094 are usable). That is far too few for public clouds or large-scale virtualized cloud applications. Second, VLAN is currently a statically configured technology. (Only 802.1Qbg EVB/VEPA can deploy VLANs dynamically at the access layer, and even then it is mainly applied on the switch ports facing the hosts, while uplink ports are still configured to pass all VLANs.) As a result, almost every VLAN is allowed across the entire data center network, so unknown-destination and broadcast traffic floods the whole network and consumes switching capacity and bandwidth without control.

If Overlay network technology is used, the above problems can be avoided. Take VXLAN as an example:

First, VXLAN uses a 24-bit VNI to identify virtual networks, giving a total of 16,777,216 of them, far more than the roughly 4,000 VLANs and enough for the large clusters of today's cloud data centers. Second, VXLAN encapsulates L2 traffic at the VTEP, so much L2 broadcast traffic is converted into targeted L3 transmission at the VTEP, avoiding uncontrolled consumption of network resources. This solves the problem of network isolation for large clusters and also improves transmission security in this setting.
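The two identifier spaces compared above follow directly from the field widths, as this short calculation shows (variable names are illustrative):

```python
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS   # all values of the 12-bit VLAN ID field
usable_vlans = vlan_ids - 2    # VLAN IDs 0 and 4095 are reserved
vni_count = 2 ** VNI_BITS      # all values of the 24-bit VNI field
ratio = vni_count // vlan_ids  # how many times larger the VNI space is

print(vlan_ids, usable_vlans, vni_count, ratio)
```

Each extra bit doubles the space, so the 12 additional bits of the VNI multiply the number of isolated networks by 4,096.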

3. Disadvantages of Overlay Network Technology

Nothing is perfect, and Overlay technology is no exception.

From the way Overlay technology works, we can expect performance to be its weak point compared with a traditional network: whichever standard is used, every packet must be encapsulated and decapsulated an extra time, which inevitably adds latency to data transmission. Figure 3.1 shows the test results we captured:

Figure 3.1 VXLAN performance comparison

The figure above compares the transmission metrics of virtual machines under the default network configuration with those of virtual machines under VXLAN in a VMware environment. According to these results, throughput under VXLAN is lower than under the normal network configuration in every virtual machine configuration tested.

An enterprise IT environment contains many types of applications, and some of them have very high network performance requirements. In a financial-industry transaction database cluster, for example, the data exchanged between cluster nodes exceeds that of general applications in both volume and frequency; lock information, cached data blocks, and heartbeat messages in particular directly affect the operation of the database. When applying VXLAN, we therefore need to take its shortcomings into account and choose suitable application scenarios.

4. Summary and Outlook

Sections 1 and 2 of this article covered the basic framework, data transmission principles, and main technical standards of Overlay network technology, as well as the fundamental reasons for its popularity. Section 3 covered some of its inherent shortcomings. I hope this gives everyone a more accurate basis for deciding where to apply it. As the technology develops and enterprises gain experience with it, how to work around its disadvantages and make the most of its advantages through improved methods or architectural adjustments is a question we will need to discuss later, and an outcome we hope to see.
