UDT, a high-speed data transmission protocol based on UDP

Introduction

Simple is beautiful. In the world of network protocols, TCP and UDP are two very common protocols built on top of IP. The HTTP protocol we use so often is based on TCP. In contrast to TCP's reliability, UDP does not guarantee delivery, so it is used in scenarios that can tolerate some data loss, such as live streaming, broadcast messages, and video and audio stream processing.

Compared with TCP, UDP is simple by nature. It drops the various mechanisms that TCP uses to guarantee that messages arrive accurately. The benefit of simplicity is speed! Today I will introduce UDT, a high-speed data transmission protocol based on UDP.

UDT Protocol

Because it is so simple, UDP can do many things that TCP cannot, such as transmitting large amounts of data quickly. This is not to say that one protocol is better than the other. Each is suited to different scenarios, and both are popular because they play an important role in the scenarios they fit. To paraphrase a Chinese proverb: it doesn't matter whether the cat is black or white; as long as it catches mice, it is a good cat.

A protocol that makes good use of UDP can transmit large amounts of data quickly. UDT is such a protocol.

As an aside, these basic protocols were all invented abroad, while China's Internet giants are scrambling to build platforms and traffic businesses. There is really nothing more to say...

The UDT project started in 2001 and was developed by Yunhong Gu while he was a doctoral student at the National Center for Data Mining (NCDM) at the University of Illinois at Chicago. He continued to maintain and upgrade it after graduating.

UDT came into being because fiber optic networks, which were faster and cheaper, had begun to replace the earlier copper and twisted-pair cables and greatly improved the efficiency of information transmission. People then found that TCP ran into serious problems when transmitting large volumes of data over these networks. Thus the UDT protocol, based on UDP, was born.

The first version of UDT, also known as SABUL (Simple Available Bandwidth Utility Library), was designed to support bulk data transfer over private networks.

Note that SABUL used UDP to transmit data and a separate TCP connection to transmit control messages.

The initial version of UDT was developed and tested on ultra-high-speed networks (1 Gbit/s, 10 Gbit/s, etc.), and in October 2003, NCDM achieved an average transmission of 6.8 Gbits per second from Chicago, USA to Amsterdam, Netherlands. In the 30-minute test, they transferred about 1.4TB of data.
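As a quick sanity check on those figures, the arithmetic can be reproduced in a few lines of Python. The variable names are my own, and the snippet assumes the 6.8 Gbit/s average held for the full 30 minutes.

```python
# Sanity check: how much data does 6.8 Gbit/s move in 30 minutes?
RATE_BITS_PER_SEC = 6.8e9   # average rate quoted for the test
DURATION_SEC = 30 * 60      # 30-minute test

total_bytes = RATE_BITS_PER_SEC * DURATION_SEC / 8

decimal_tb = total_bytes / 10**12   # terabytes (powers of 10)
binary_tib = total_bytes / 2**40    # tebibytes (powers of 2)

print(f"{decimal_tb:.2f} TB, {binary_tib:.2f} TiB")  # 1.53 TB, 1.39 TiB
```

The result is roughly 1.53 TB in decimal units, or 1.39 TiB in binary units. The quoted "about 1.4TB" is consistent with the rate if binary units were meant.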

Starting from version 2.0 released in 2004, SABUL was renamed UDT. The full name of UDT is UDP-based Data Transfer Protocol, which is a data transmission protocol based on UDP.

Why the change to UDT? Because UDT 2.0 removed the TCP control connection used in SABUL and carried both data and control information over UDP. In addition, UDT2 introduced a new congestion control algorithm that allowed UDT flows to run fairly alongside concurrent UDT and TCP flows.

In 2006, the UDT protocol was upgraded to version 3. The protocol was no longer limited to private networks and was extended to the commodity Internet. Congestion control in UDT3 was tuned to work in relatively low-bandwidth environments as well, and users could easily define and install their own congestion control algorithms. UDT3 also significantly reduced the use of system resources (CPU and memory).

In 2007, UDT4 brought improvements in high concurrency and firewall traversal. UDT4 allows multiple UDT connections to be bound to the same UDP port, and it also supports rendezvous connection setup, which makes UDP hole punching easier.

What is UDP hole punching?

UDP hole punching is a technique commonly used in Network Address Translation (NAT) applications to keep UDP packet streams flowing across the NAT. It is a way of establishing bidirectional UDP connections between Internet hosts that sit in private networks behind network address translators.

What is NAT?

Everyone knows that IPv4 addresses are limited and are running out. So how do we solve this problem?

The permanent solution, of course, is IPv6, but even many years after its introduction, IPv6 still has not been widely adopted.

What is the solution if we don't use IPv6?

The answer is NAT (Network Address Translation).

The principle of NAT is to map the IP address and port of a host on the LAN to an IP address and port on the NAT device.

The NAT device maintains a translation table internally, so that many LAN hosts can share a single public IP address, each distinguished by a different port.
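To make the translation table concrete, here is a minimal sketch in Python. The class name `NatTable`, the starting port 40000, and the example addresses are all my own assumptions for illustration. A real NAT device also tracks protocol, remote endpoint, and timeouts.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class NatTable:
    """Toy NAT: maps internal endpoints to ports on one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, Endpoint] = {}  # internal -> external
        self.in_map: Dict[Endpoint, Endpoint] = {}   # external -> internal

    def translate_outbound(self, internal: Endpoint) -> Endpoint:
        """Return the external endpoint for an internal one, allocating if new."""
        if internal not in self.out_map:
            external = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out_map[internal] = external
            self.in_map[external] = internal
        return self.out_map[internal]


nat = NatTable("203.0.113.7")
a = nat.translate_outbound(("192.168.1.10", 5000))
b = nat.translate_outbound(("192.168.1.11", 5000))
print(a, b)  # ('203.0.113.7', 40000) ('203.0.113.7', 40001)
```

Two LAN hosts using the same internal port end up behind the same public IP, distinguished only by the external port.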

So what’s the problem with NAT?

The problem with NAT is that an internal client does not know its external IP address. It only knows its internal IP address.

With UDP, the NAT device has to rewrite the source port in every outgoing UDP packet, as well as the source IP address in the enclosing IP packet. Because UDP itself is connectionless, the NAT device must keep this mapping state on its own.

If the client puts its own IP address into the application data it sends and asks the server to connect back to that address, the connection will fail, because the address is a private one and the server cannot find the client's public IP address.

Even if the public IP address is known, every packet that reaches the external IP address of the NAT device carries a destination port, and the NAT translation table must contain an entry that maps that port to the IP address and port of an internal host. Otherwise, the connection fails, as shown in the figure below.
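The inbound lookup described above can be sketched as follows. The table contents and the function name `route_inbound` are hypothetical. The point is only that a packet with no matching entry has nowhere to go.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# Hypothetical translation table: external (IP, port) -> internal (IP, port).
translation_table: Dict[Endpoint, Endpoint] = {
    ("203.0.113.7", 40000): ("192.168.1.10", 5000),
}


def route_inbound(dest: Endpoint) -> Optional[Endpoint]:
    """Return the internal endpoint for an inbound packet, or None if dropped."""
    return translation_table.get(dest)


known = route_inbound(("203.0.113.7", 40000))
unknown = route_inbound(("203.0.113.7", 40001))
print(known)    # ('192.168.1.10', 5000)
print(unknown)  # None
```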

How to solve it?

The first way is to use a STUN (Session Traversal Utilities for NAT) server.

A STUN server is a server with a known public IP address. Before a client starts communicating, it first asks the STUN server which external IP address and port its packets appear to come from, and then uses that external IP address and port for communication.

But sometimes UDP packets are blocked by firewalls or other applications. In that case, the relay technique Traversal Using Relays around NAT (TURN) can be used.
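As a rough sketch of what such a query looks like on the wire, the snippet below builds a STUN Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS value, following the message layout in RFC 5389. The function names are my own. The sketch does not open a socket or parse a full response, and `encode_xor_mapped_address` exists only so the decoder can be exercised locally.

```python
import os
import socket
import struct
from typing import Tuple

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001    # STUN message type for a Binding Request


def build_binding_request() -> bytes:
    """Build a 20-byte STUN Binding Request with no attributes."""
    transaction_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id


def parse_xor_mapped_address(value: bytes) -> Tuple[str, int]:
    """Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute."""
    _reserved, family, xport, xaddr = struct.unpack("!BBHI", value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    port = xport ^ (MAGIC_COOKIE >> 16)
    addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return addr, port


def encode_xor_mapped_address(addr: str, port: int) -> bytes:
    """Test helper: produce the attribute value a server would send."""
    raw = struct.unpack("!I", socket.inet_aton(addr))[0]
    return struct.pack("!BBHI", 0, 0x01,
                       port ^ (MAGIC_COOKIE >> 16), raw ^ MAGIC_COOKIE)


request = build_binding_request()
value = encode_xor_mapped_address("203.0.113.7", 40000)
print(len(request), parse_xor_mapped_address(value))
# 20 ('203.0.113.7', 40000)
```

In a real client, the request would be sent over UDP to a STUN server, and the attribute would be located by walking the attributes in the response.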

Both parties send data to the relay server, which is responsible for forwarding the data. Note that this is no longer P2P.

Finally, we have a comprehensive protocol called ICE (Interactive Connectivity Establishment):

It is actually a combination of direct connection, STUN and TURN. When direct connection is possible, use direct connection. If direct connection is not possible, use STUN. If STUN cannot be used, use TURN.
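The fallback order can be sketched as a simple loop. Note that this only mirrors the simplified ordering described here. Real ICE gathers candidates and runs prioritized connectivity checks, and the probe functions below are hypothetical stand-ins.

```python
from typing import Callable, List, Optional, Tuple

Probe = Tuple[str, Callable[[], bool]]  # (path name, check that returns True on success)


def choose_path(probes: List[Probe]) -> Optional[str]:
    """Return the name of the first path whose probe succeeds, else None."""
    for name, probe in probes:
        if probe():
            return name
    return None


# Hypothetical situation: direct and STUN-assisted connections both fail.
probes: List[Probe] = [
    ("direct", lambda: False),
    ("stun", lambda: False),
    ("turn", lambda: True),
]
selected = choose_path(probes)
print(selected)  # turn
```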

With STUN and ICE, the NAT device sets up port mappings and keeps state for each UDP flow. However, this UDP state usually expires after a short idle period, anywhere from tens of seconds to several minutes. To keep the UDP mapping in the NAT alive, UDP hole punching relies on keep-alive packets, which are sent periodically to refresh the UDP state in the NAT.
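The effect of keep-alive packets can be illustrated with a small simulation. This is a toy model under my own assumptions: the class `NatBinding`, the 30-second idle timeout, and the intervals are illustrative values, not taken from any particular NAT device.

```python
class NatBinding:
    """Toy model of a NAT's UDP mapping with an idle timeout."""

    def __init__(self, idle_timeout: float, now: float = 0.0) -> None:
        self.idle_timeout = idle_timeout
        self.last_seen = now

    def is_alive(self, now: float) -> bool:
        return now - self.last_seen <= self.idle_timeout

    def packet_seen(self, now: float) -> bool:
        """Record traffic; return False if the mapping had already expired."""
        if not self.is_alive(now):
            return False
        self.last_seen = now
        return True


def survives(idle_timeout: float, keepalive_interval: float,
             duration: float) -> bool:
    """Simulate sending one keep-alive every keepalive_interval seconds."""
    binding = NatBinding(idle_timeout)
    t = keepalive_interval
    while t <= duration:
        if not binding.packet_seen(t):
            return False
        t += keepalive_interval
    return binding.is_alive(duration)


print(survives(30.0, 20.0, 300.0))  # True
print(survives(30.0, 45.0, 300.0))  # False
```

As long as the keep-alive interval is shorter than the NAT's idle timeout, the mapping stays open. Once the interval exceeds the timeout, the first keep-alive already arrives too late.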

Disadvantages of UDT

UDT is based on UDP, and UDP, being so simple, has no built-in security features. UDT therefore lacks security features as well, which limits its use in commercial environments.

However, a new version of UDT is already under development, so you can look forward to it.

Summary

UDT is widely used in high-performance computing, such as high-speed data transmission on fiber optic networks. We will show you how to use the UDT protocol in Netty later.
