Our daily lives depend on the Internet. We complain about slow connection speeds all the time, yet we also enjoy the dividends brought by the development of Internet technology. Dig into the technical details and you will find content distribution technologies such as CDN working behind the scenes. Yet just as we begin to understand it, it is already completing the leap from CDN 2.0 to 3.0, a leap toward intelligence.
1.0 to 2.0: 20 years of technological evolution

A CDN (Content Delivery Network) is a content distribution network. It adds a layer of network architecture on top of the existing Internet that publishes a website's content to the network "edge" closest to users, so that users can fetch the content they need nearby and get faster responses. The CDN PoP (Point of Presence) architecture is the software stack that supports the content delivery service. It has evolved from 1.0 through 2.0 to the emerging intelligent 3.0; below we briefly trace that history.

The CDN PoP 1.0 architecture was born 20 years ago and suited the websites of its time: small amounts of information delivered over slow Internet connections. The main challenge for a CDN then was to serve web content from edge nodes (PoPs) deployed at Internet service providers (ISPs). An end user who hit a PoP got the response content quickly, instead of first having to reach the origin server over a network that was still very slow. In this way, a CDN could easily push popular content to a large number of Internet users.

The CDN PoP 2.0 architecture is closer to our lives and remains the most widely deployed CDN technology today. Because the 2.0 architecture consists of a basic software stack, with no deep analysis of data and information and only a handful of intelligent functions, it is by nature passive, reactive, and stateless. The goal of the 2.0 architecture is to cache content at the edge, perform some simple processing there, and improve TCP transmission performance through the principle of proximity. The 2.0 architecture is centered on caching software, complemented by load balancing, log analysis, DNS, and other services.
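The proximity principle at the heart of the 2.0 architecture can be sketched as a simple edge cache: serve a request from the local store on a hit, and fall back to the slow origin server on a miss. A minimal illustration in Python (the `EdgePoP` class, TTL value, and origin callable are all invented for this sketch, not part of any real CDN stack):

```python
import time

class EdgePoP:
    """Toy model of a CDN edge node: serve cached content when possible,
    fall back to the (slow) origin server on a cache miss."""

    def __init__(self, origin, ttl=60):
        self.origin = origin          # callable: url -> content
        self.ttl = ttl                # cache lifetime in seconds
        self.cache = {}               # url -> (content, expiry timestamp)

    def get(self, url, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(url)
        if entry and entry[1] > now:          # cache hit, still fresh
            return entry[0], "HIT"
        content = self.origin(url)            # cache miss: go to origin
        self.cache[url] = (content, now + self.ttl)
        return content, "MISS"

# Usage: the second request for the same URL is served from the edge.
pop = EdgePoP(origin=lambda url: f"<html>{url}</html>")
print(pop.get("/index.html")[1])   # MISS
print(pop.get("/index.html")[1])   # HIT
```

Real 2.0 stacks add eviction policies, cache-control header handling, and request coalescing on top of this basic hit/miss loop, but the proximity benefit comes from exactly this pattern: the origin is consulted once, and every nearby user afterwards is served from the edge.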
CDN 3.0: moving towards intelligence

As network services innovate and mobile terminals grow smarter, CDNs are evolving as well, which brings us to today's protagonist: CDN 3.0. The research behind CDN 3.0 has involved extensive theoretical work and validation, covering concepts such as stream processors, batch processing, message brokers, Hadoop, NoSQL, machine learning, Cassandra, Spark, deep neural networks, recurrent neural networks, convolutional neural networks, and a large number of different kinds of algorithms. Below we extract some typical research content to demonstrate the advantages of CDN 3.0, comparing it with 2.0 along several dimensions.
[Figure: comparison between CDN 2.0 and CDN 3.0]
The 3.0 architecture is completely different: built-in big data and machine learning capabilities give it genuinely intelligent attributes. It will handle larger edge traffic than the 2.0 architecture, and each PoP node becomes part of the Hadoop ecosystem, including HDFS, Apache Spark, Apache Flink, Kafka, Redis, and many open source components created by companies such as Facebook, Google, LinkedIn, and Spotify.

In the world of the CDN 3.0 architecture, cache engineers and network engineers will work alongside mathematicians and data scientists. The feature set will keep changing, reflected in the types and number of algorithms used. Tasks that once required manual intervention, such as cluster performance tuning and network tuning, will be handled by machine learning (ML): tuning shifts from manual adjustment to self-tuning systems that are continuously learning.

New personalization features will use ML algorithms to capture the behavior of visitors to a site, feed that behavior into a classifier, and build patterns from the training data. On each subsequent visit, the ML algorithm predicts which pages will generate the most sales and personalizes the delivered content.

Sales organizations will have to be retrained in a whole new technical vocabulary spanning AI, machine learning, big data, DevOps, data science, statistics, and applied mathematics. Today the term "machine learning" may be little more than a marketing buzzword, but in due course the CDN industry and machine learning will integrate more deeply, as more people learn the finer details: which algorithm does what, the purpose of each class of algorithm, the differences between the various neural networks, and so on. We can also observe this change from another dimension.
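The personalization loop described above, learning from visitor behavior and then predicting which content converts best, can be sketched as a toy frequency model. Everything here is invented for illustration (the visitor segments, page names, and the `PagePredictor` class are assumptions, and a real deployment would use a proper classifier rather than raw counts):

```python
from collections import defaultdict

class PagePredictor:
    """Toy model of the personalization loop: record which pages led to a
    sale for each visitor segment, then recommend the most successful page."""

    def __init__(self):
        # counts[segment][page] = number of observed sales
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, segment, page, sale):
        """Ingest one observed visit; only visits that ended in a sale count."""
        if sale:
            self.counts[segment][page] += 1

    def predict(self, segment):
        """Return the page with the most observed sales for this segment."""
        pages = self.counts.get(segment)
        if not pages:
            return None                      # no training data for this segment
        return max(pages, key=pages.get)

# Usage with invented training data:
model = PagePredictor()
for seg, page, sale in [("mobile", "/deals", True),
                        ("mobile", "/deals", True),
                        ("mobile", "/home", False),
                        ("desktop", "/specs", True)]:
    model.train(seg, page, sale)

print(model.predict("mobile"))   # /deals
```

The point of the sketch is the shape of the loop, observe, train, predict, personalize, not the model itself; a 3.0-style PoP would run this loop continuously on streaming data rather than on a fixed batch.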
Faced with large-scale changes in the network and software stack, what we need may no longer be adding code, extending functionality through APIs, or optimizing BGP routes; we may simply need to subtract, switching to the more practical 3.0 architecture. The important characteristics of CDN 3.0 are that it supports today's mainstream Internet applications well while offering better cost-effectiveness, more dependable service quality, and stronger security. In fact, CDN 3.0 has already been applied in scenarios at Facebook, LinkedIn, and Twitter. The table below offers some analysis comparing CDN 2.0 and 3.0.
Artificial Intelligence: Gimmick or Revolution?
The figure above shows a curve, tracked by a research institution since 2000, of the degree of technological innovation over time. The more new technologies appear in a given period, the higher the level of innovation; this is called the technology split curve. Before 2015, the emergence of innovative technologies followed a linear growth trend. With the birth of big data and machine learning, innovation growth turned explosive, as new technologies built around these two fields progressively drove science and technology forward. If the curve's trend holds, machine learning and big data will have a disruptive impact on the entire technology community within the next 12 months. Today the term machine learning may be just marketing hype, but one day in the future it will surely change the CDN industry.

Reference links:
1. https://www.bizety.com/2017/02/20/cdn-edge-pop-architecture-2-0-end-life-hello-3-0-architecture/
2. https://www.bizety.com/2017/03/07/cdn-pop-architecture-3-0-end-cdn-commoditization-part-1/
3. https://www.bizety.com/2017/03/08/cdn-pop-architecture-3-0-end-cdn-commoditization-part-2/