Issues that need to be resolved before NFV large-scale deployment

NFV is a key technology for network reconstruction. Since ETSI ISG NFV was formally established in early 2013, the NFV framework and a series of specifications have been defined. NFV has passed through the innovation trigger and the peak of inflated expectations on Gartner's hype cycle and has gradually settled into steady development. After nearly two years of laboratory testing and live-network pilots of virtualized network elements in various domains, such as vIMS, vEPC, and vBRAS, there is still no large-scale commercial deployment. Both operators and manufacturers have become more cautious in promoting NFV.

The original goal of NFV was to let multiple network systems share hardware resources by decoupling software from hardware. On the one hand, it introduces large-scale, standardized, general-purpose IT infrastructure to reduce costs; on the other hand, it speeds up the launch and update of new services by deploying them as software. The vision is attractive, but reality has fallen short. During implementation and deployment, it became clear that CT systems and IT systems differ significantly in design methods, scale and complexity, reliability requirements, interoperability requirements, operation and maintenance, and so on. Borrowing IT technology and ways of thinking to solve CT problems does not always fit.

In order to achieve large-scale deployment of NFV, the following issues need to be addressed:

1. Improve NFV forwarding performance and reliability

CT systems have higher performance requirements than IT systems. CT network elements can be roughly divided into control and forwarding types.

Control network elements need to provide high reliability. A typical example is the mobility management entity (MME) of the 4G core network, which handles the signaling for 4G users accessing the mobile core network, including key control functions such as authentication, access control, and mobility management, but does not forward user service data. The traffic passing through the MME is therefore low in volume, high in concurrency, and demanding in reliability. A traditional physical MME can handle more than one million concurrent user sessions. Replacing it with vMME instances on X86 servers will inevitably require multiple servers, together with load sharing and a high-availability (HA) solution.
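The sizing argument above can be made concrete with a small calculation. The following is a minimal sketch, not a vendor sizing method: the per-server session capacity, the utilization ceiling, and the number of spare servers are all hypothetical values chosen for illustration, and only the one-million-session target comes from the text.

```python
def vmme_servers_needed(target_sessions, sessions_per_server,
                        spare_servers=1, max_utilization_pct=70):
    """Estimate how many X86 servers a vMME pool needs.

    All parameters except target_sessions are illustrative assumptions:
    - sessions_per_server: concurrent sessions one server can hold at 100% load
    - max_utilization_pct: load ceiling per server, leaving headroom for bursts
    - spare_servers: extra servers kept for HA (N+M redundancy)
    """
    if sessions_per_server <= 0 or not 0 < max_utilization_pct <= 100:
        raise ValueError("capacity and utilization must be positive")
    # Usable capacity per server after reserving headroom (integer arithmetic).
    usable = sessions_per_server * max_utilization_pct // 100
    # Ceiling division: servers needed to carry the load during normal operation.
    active = -(-target_sessions // usable)
    return active + spare_servers


# One million sessions (from the text) on hypothetical 200,000-session servers.
print(vmme_servers_needed(1_000_000, 200_000))
```

Under these assumed figures, a single physical MME turns into a pool of several servers plus a spare, which is why load sharing and HA become part of the vMME design rather than an optional extra.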

Forwarding network elements need to provide high-throughput, line-rate forwarding. A typical example is the broadband remote access server (BRAS), which is responsible for access control and for aggregating and forwarding broadband service traffic. vBRAS comes in several forms: a vBRAS with separated forwarding and control planes, in which the control plane runs on X86 and the forwarding plane uses proprietary NP-based devices or programmable white-box switches, so that only the control plane is virtualized; an integrated, pure-software vBRAS on standard X86; and a vBRAS with separated forwarding and control planes, in which the control plane runs on standard X86 servers and the forwarding plane runs on X86 servers with acceleration hardware. A vBRAS that virtualizes only the control plane is essentially not much different from a traditional physical BRAS. It only centralizes IP address control and device configuration management, and does nothing about the long R&D, procurement, deployment, and launch cycles of proprietary equipment. The integrated vBRAS on general-purpose X86 servers, even with software acceleration technologies such as DPDK, falls far short of the line-rate forwarding of more than 100G already achieved by a single line card of a traditional physical BRAS. The integrated X86 vBRAS is therefore mainly used for services with low traffic and many sessions, such as ITMS, and cannot carry services such as home broadband Internet access and IPTV. Combining X86 servers with intelligent acceleration hardware has become an important way for vBRAS to carry all services, but hardware acceleration inevitably reduces the generality of the NFVI. Imposing special NFVI requirements for specific CT network elements seems to contradict the original intention of NFV.

In terms of scale, forwarding network elements are deployed in far greater numbers than control network elements. The cost of NFVI is closely tied to the scale of the IT infrastructure, and server shipment volume directly affects the purchase price, so the economic benefits of NFV depend on large-scale deployment. NFV can be promoted on a large scale only once the forwarding network elements have been virtualized. From this perspective, hardware acceleration is currently the only way to solve the forwarding performance problem of NFVI, and it is a requirement that CT network elements must place on NFVI. Using pluggable smart network cards to offload some software functions and reduce the consumption of CPU and PCIe bandwidth is a relatively general and promising hardware acceleration solution. In terms of implementation, smart network cards are divided into ARM-based, FPGA-based, and other products. Because FPGAs are reprogrammable, they can, on the one hand, give an X86 vBRAS forwarding performance comparable to that of a physical BRAS line card and, on the other hand, support the development and deployment of new functions in the future, which fits the NFV concept of general-purpose hardware and rapid software deployment. However, the price of FPGA smart network cards also depends on shipment volume, and they are currently much more expensive than ordinary network cards. If an industry chain and large-scale commercial use take shape in the future, the approach is still feasible once the CPU cost saved is weighed against the added cost of the smart network card.
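The trade-off between CPU savings and smart network card cost can be sketched as a simple break-even calculation. This is an illustrative model only: every throughput figure and price below is a hypothetical placeholder rather than measured or quoted data, and the model ignores power, rack space, and licensing.

```python
def servers_for_throughput(target_gbps, gbps_per_server):
    """Ceiling division: servers needed to forward target_gbps (integer inputs)."""
    return -(-target_gbps // gbps_per_server)


def pool_cost(target_gbps, gbps_per_server, server_price, nic_price):
    """Total hardware cost of a pool in which every server carries one NIC."""
    n = servers_for_throughput(target_gbps, gbps_per_server)
    return n * (server_price + nic_price)


def breakeven_smart_nic_price(target_gbps, soft_gbps, accel_gbps,
                              server_price, plain_nic_price):
    """Highest smart NIC price at which the accelerated pool costs no more
    than the pure-software pool for the same target throughput."""
    soft_cost = pool_cost(target_gbps, soft_gbps, server_price, plain_nic_price)
    n_accel = servers_for_throughput(target_gbps, accel_gbps)
    return soft_cost / n_accel - server_price


# Hypothetical inputs: 1 Tbps target, 20 Gbps per software-only server,
# 100 Gbps per accelerated server, 5000 per server, 300 per ordinary NIC.
limit = breakeven_smart_nic_price(1000, 20, 100, 5000, 300)
print(limit)
```

With these assumed inputs the accelerated pool needs one fifth as many servers, so a smart network card can cost many times more than an ordinary card before the pool as a whole becomes more expensive. The real break-even point depends on actual per-server throughput and on prices that fall with shipment volume.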

2. NFV decoupling and standardization

The advantages that NFV is expected to deliver, such as unified infrastructure, rapid deployment of new services, and a more open ecosystem, all depend on decoupling. Decoupling the software and hardware layers is the most basic goal; without it there is no essential difference from traditional integrated proprietary equipment. In practice, however, based on the NFV framework defined by ETSI and the way the industry has developed, the current NFV industry chain can be divided into more layers:

  • Hardware: server vendors.
  • Platform: hypervisor vendors.
  • Functional software + management: VNF + VNFM + VIM manufacturers. Because these three modules interact frequently and are to some degree bound to one another, VNF manufacturers usually also provide the VNFM and VIM.
  • Orchestration: NFVO. Because network services and resources must be managed globally across manufacturers, many operators choose to develop their own NFVO or to have a relatively neutral third-party manufacturer carry out in-depth customized development.

To avoid being locked in by a few manufacturers and to orchestrate and manage across manufacturers, these four parts must be decoupled. There are two criteria for successful decoupling: first, modules from different manufacturers interoperate correctly and deliver the basic service functions of the NFV network; second, the functional software achieves stable and consistent performance on different hardware and platforms and meets the performance requirements of the NFV network.

Decoupling complex communication systems and connecting them across vendors relies on globally unified technical standards. Two communication devices can interwork only when everyone follows the same bit-level technical standards. 3GPP is undoubtedly the most successful international communication standards organization. The GSM, WCDMA, and LTE/EPC standards it defined, and the upcoming 5G standards, have laid the foundation for the development and large-scale deployment of several generations of global mobile communication systems. Strictly speaking, ETSI ISG NFV does not define international standards, but general industry specifications that guide the NFV industry. ETSI's specification work is generally carried out in phases. From the beginning of 2013 to the present, the NFV specifications have completed two phases and are now in the third. From the initial concept to the definition of architecture, functions, interfaces, and information models, the NFV specifications have been steadily improved. However, the specifications produced in the first two phases are not sufficient as a basis for interworking between modules. The third phase touches on how manufacturers must change their products, so the discussions are more heated. In addition, traditional equipment manufacturers, who are well versed in how communication standardization works, have little incentive to push NFV standards forward, so progress on standards gives little cause for optimism. Meanwhile, open source projects related to NFV have proliferated. Each tries to create the de facto standard and so have the final say, which makes the landscape even more fragmented; even the Linux Foundation has begun to consolidate its major projects. Products built on open source code are also privately optimized by each manufacturer. There are too many uncertainties, and it is difficult for them to become commercial products in the CT field.

For these reasons, only simple network elements such as the integrated vBRAS can currently achieve full NFV decoupling, and even there operators have spent a great deal of effort driving interoperability between manufacturers one pair at a time. More complex systems such as vIMS and vEPC need more work before full decoupling is possible.

3. Make NFV networks procurable and operable

At present, domestic operators usually carry out rolling planning and centralized procurement of traditional network equipment in each domain on an annual basis. Before procurement, the operator generally tests the equipment centrally as a black box. Equipment that passes the test enters bidding. After a bid is won, the equipment is shipped to the operator's provincial and municipal companies, then installed and configured in the equipment room before services are loaded. From testing to procurement, and then to delivery and go-live in different provinces and cities, the cycle is long, and the price or configuration of the delivered products often differs from that of the winning bid. NFV emerged partly to remedy these shortcomings of traditional equipment procurement and operation: the hope is to procure and deploy a unified resource pool and then quickly deploy network functions as software to speed up service launch. However, once the NFV network is decoupled into layers, procurement and operation must be decoupled by layer as well.

First, for centralized procurement testing, layered decoupling means that functional and performance tests must cover combinations of products from different layers. The number of combinations is the product of the number of candidates at each layer, so the testing workload multiplies with every additional bidder, and every update of a software or platform version requires the affected combinations to be tested again. To meet the centralized procurement testing needs of future NFV networks, it is necessary to build a complete test bed covering all domains and to adopt automated integration testing.
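The growth in test combinations can be illustrated by enumerating every stack that a layered procurement would have to verify. This is a minimal sketch: the four layers follow the industry-chain layers listed earlier in the article, while the vendor names and counts are hypothetical.

```python
from itertools import product
from math import prod


def enumerate_stacks(layers):
    """Return every full stack: one candidate product chosen from each layer."""
    names = list(layers)
    return [dict(zip(names, combo))
            for combo in product(*(layers[name] for name in names))]


def count_stacks(layers):
    """Number of stacks without enumerating them: product of layer sizes."""
    return prod(len(candidates) for candidates in layers.values())


# Hypothetical bidders per layer.
layers = {
    "hardware": ["server-A", "server-B"],
    "hypervisor": ["hyp-A", "hyp-B"],
    "vnf_vnfm_vim": ["vnf-A", "vnf-B", "vnf-C"],
    "nfvo": ["nfvo-A"],
}

stacks = enumerate_stacks(layers)
print(len(stacks), stacks[0])

# One more hypervisor bidder adds a full set of hardware x VNF x NFVO stacks.
layers["hypervisor"].append("hyp-C")
print(count_stacks(layers))
```

Each generated stack would still need its own functional and performance test suite, which is why an automated test bed matters more than the raw count suggests.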

Second, traditional network equipment is supplied by a single manufacturer, so that manufacturer is responsible for any problem that arises. A decoupled NFV network involves more manufacturers. When a failure occurs, the first step is to locate the layer at fault; otherwise the manufacturers can easily pass the blame to one another. Software problems are harder to locate than hardware problems, and when software from several manufacturers must work together, it is difficult to reach an objective conclusion about which party should make the fix.

Procuring and operating NFV networks will therefore require operators to have strong system integration capabilities, and to adjust their organizational structure, business processes, and division of responsibilities.

In summary, before large-scale commercial deployment, NFV networks still need to resolve issues of forwarding performance and reliability, decoupling and interoperability standards, and procurement and operation. Operators' network reconstruction affects the entire communications industry. We hope to work with the whole industry to solve these problems and move network reconstruction into deployment.
