A brief discussion on operation and maintenance under SDN architecture

Network operation and maintenance in China is still in its infancy, and staff spend every day putting out fires. "Why is the network down again?", "Oh no, is the server down?", "The network is as slow as a turtle": complaints like these ring in the ears of operation and maintenance personnel every day. All they can do is bury their heads in the system logs, which is time-consuming and labor-intensive and sometimes turns up nothing even after hours of work. As an operation and maintenance engineer, have you been through something similar?

Traditional Network Operation and Maintenance Pain Points

In traditional network operation and maintenance, engineers type different command lines for each vendor's equipment every day, from Cisco and Juniper to Huawei and H3C, and often the only difference is a keyword such as show versus display or no versus undo. Network management is decentralized: the network, the cloud management platform, security, and the IT/business systems are independent of one another and must be maintained separately, which is inefficient. The network structure, configuration, topology, and link status are not visualized, so operation and maintenance personnel rely on experience and memory when changing the network, which leaves many hidden risks. Management is done device by device, based on a single-device or single-machine architecture, which leads to errors and omissions and makes troubleshooting difficult. When the network has a problem, every department blames the operation and maintenance department, even when its staff are not at fault. On top of their complicated daily workload, they can only suffer in silence when something goes wrong, and they end up as the real "scapegoat".
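The show/display and no/undo differences described above can be hidden behind a thin translation layer. The following is a minimal sketch, not a real vendor library: the `VENDOR_KEYWORDS` table and the `render` helper are hypothetical names, and only the keyword pairs mentioned in the text are covered.

```python
# Hypothetical keyword table: only the show/display and no/undo pairs
# mentioned in the text are modeled here.
VENDOR_KEYWORDS = {
    "cisco": {"show": "show", "negate": "no"},
    "huawei": {"show": "display", "negate": "undo"},
    "h3c": {"show": "display", "negate": "undo"},
}


def render(vendor: str, action: str, rest: str) -> str:
    """Build a vendor-specific command line from a vendor-neutral action."""
    keywords = VENDOR_KEYWORDS[vendor.lower()]
    return f"{keywords[action]} {rest}"


if __name__ == "__main__":
    for vendor in ("cisco", "huawei", "h3c"):
        print(vendor, "->", render(vendor, "show", "version"))
```

A layer like this only papers over syntax; it does not remove the need to log in to every device, which is the problem the rest of the article addresses.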

The operation and maintenance department is constantly drawing up new rules and regulations. Larger companies have their own developers who do secondary development on open source software and products. In a traditional network, the operation and maintenance department grows along with the business. A typical department starts with only a dozen or so people. After four or five years the business systems have become complicated, more types of network equipment are involved, and the headcount has roughly doubled. Staff must be on duty around the clock, 7x24, and even those who have gone home after work keep watching their phones. Known faults are handled with fault templates. When a new fault appears, the team not only has to work hard to find a solution but also has to write a new fault template afterward. As a result, the fault template library keeps getting longer and more complicated, and all the staff can do is sigh, "I'm exhausted!"

The development of the network

What kind of changes are happening in the network? Only when we see the changes in the network can we see what corresponding changes need to be made in network operation and maintenance.

From the release of the TCP/IP protocol in 1974 to today's SDN, network technology has developed continuously. Along the way, Fast Ethernet, MPLS, SDN, OpenFlow 1.0 and its later versions, and the release of OpenDaylight have each pushed the network forward.

In the 1960s, many universities and research institutions were working on new communication technologies, and the U.S. Department of Defense was among the organizations involved. Packet switching was born in that period as a way to keep communication working by routing around failed paths. By the second half of the 1960s, a large number of people were researching packet switching and packet communication. Later, in order to provide reliable communication for interconnected computers, a global organization proposed the TCP/IP protocol specification in 1982. Around 1990, both local area networks and wide area networks began converging on TCP/IP.

The Internet was put into commercial use in 1995, when the number of Internet service providers increased dramatically. In 1996, the IPv6 specification was released and published as an RFC.

In 1995, the Fast Ethernet standard was developed, and the IETF established the MPLS working group in 1997. In 2005, the concept of carrier-grade Ethernet appeared in China, and in the same year, the global backbone network infrastructure began to rise on a large scale.

SDN was born in 2006, although there have been few commercial projects in China since then. In 2009, OpenFlow 1.0 was officially released, setting off a wave of enthusiasm around the world, and people began to realize that the network was going to change. The establishment of the ONF in 2011 set off another wave. In 2012, Google's B4 was fully operational, OpenDaylight was released in 2013, and ONOS was released in 2014. Players from all walks of life began to enter the SDN field.

What is SDN?

SDN stands for Software-Defined Networking. It is a network architecture that separates the network's control plane from its forwarding plane and makes the control plane directly programmable through open interfaces. The core idea of SDN is that applications control forwarding behavior, so that the entire network is defined in software.

The SDN architecture is divided into the application layer, control layer and infrastructure layer:

  • The application layer includes various services and applications and is responsible for the orchestration of various network resources;
  • The control layer is the control software of SDN, which is responsible for processing various data forwarding resources, maintaining network topology and status information, and performing global network management;
  • The infrastructure layer includes various network devices, which are responsible for data processing, forwarding and status collection.

SDN is a technology that restructures the existing network architecture. In a traditional network, how traffic flows is decided by the network infrastructure itself, hop by hop, in switches and routers. It is like traffic on city roads. Before GPS navigation, each driver picked what looked like the shortest and best route based on what they could see at the moment, and the result was often a mess during peak hours. SDN, on the other hand, schedules and dispatches each vehicle to its destination based on the dynamic traffic conditions of the entire city and the needs of each vehicle (such as the shortest time, the lowest cost, or avoiding the highway). It dispatches from a global perspective and aims for the optimal route for each vehicle.
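The traffic analogy can be made concrete with a small example. The sketch below assumes a controller that holds the whole topology as a dictionary of link costs and runs Dijkstra's algorithm over it; the `best_path` function and the four-node `topology` are illustrative inventions, not part of any particular controller.

```python
import heapq


def best_path(topology, src, dst):
    """Return (cost, path) for the cheapest path, or (inf, []) if unreachable."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in topology.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical global view: node -> {neighbor: current link cost}
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},
}

if __name__ == "__main__":
    print(best_path(topology, "A", "D"))
    topology["B"]["C"] = 10  # the B-C link becomes congested
    print(best_path(topology, "A", "D"))
```

Because the cost table is global, raising the cost of one congested link is enough to move traffic onto another route, which is the "dispatch from a global perspective" idea in the analogy.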

SDN technology has become the first choice for the next generation of network core technology due to its open architecture and flexible deployment and programming capabilities. Whether it is the SDN transformation completed by Google for its DC (data center) system, or the SDN cloud service experience shared by IT giants Microsoft and Alibaba, they all paint a bright future for the application of this technology. SDN-based network virtualization can decouple the logical network topology of the service from the physical network topology, greatly improve the speed of service delivery, simplify network operation and maintenance, and at the same time meet the demands of operators, government and enterprises to reduce network costs and increase the speed of service innovation.

Advantages of SDN for Operation and Maintenance

Traditional networks consist of devices with integrated control and data forwarding planes, so each box needs to be configured and managed independently. Even simple changes to the network can take weeks or even months to complete because changes must be made to each device. But with the rise of the Internet of Things (IoT), cloud computing, and mobility, the separation of control and data planes in the SDN architecture enables control to be abstracted from devices and centralized so that network administrators can centrally control and manage the underlying complex infrastructure. In theory, all network nodes only need a forwarding or data plane to push data packets. The advantages that SDN brings to operations and maintenance are as follows:

  • Reduce overhead

Since network administrators no longer have to go from device to device to change network configurations, they can make necessary changes more efficiently. Not only can network configurations be efficiently controlled through centralized control, but many configurations can also be automated.

  • Holistic centralized network management

One of the biggest benefits of the SDN approach is that all network components can be controlled through a single device. Both physical and virtual devices can be controlled through a single API, making the life of network administrators much easier.

  • Enable "Network Experiments without Network Impact"

The flexibility that SDN brings to the network allows administrators to “jump over” the limitations imposed by SNMP and experiment with network configurations without the network being impacted.

  • More detailed, granular security

Virtualization, cloud computing, and mobile devices have created significant challenges for information security. SDN controllers provide a single point of control where information security policies and rules can be distributed throughout the organization. In addition, SDN controllers provide an additional point where security policies can be placed to address specific software and application vulnerabilities.

  • Improving the ability to respond to cyber threats

SDN helps IT staff respond to security incidents by giving them real-time visibility into network activity. You can also program the network to respond automatically to certain types of events, reducing reliance on manual intervention. For example, suppose a laptop is detected sending malware or attacking another system. SDN allows you to program the network to selectively block specific traffic based on attributes such as device address or application.
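As an illustration of programming the network to block specific traffic, the sketch below turns a security event into a drop rule. The rule and event dictionaries are a made-up format for this example, not the API of any real SDN controller, and `build_block_rule` and `matches` are hypothetical helper names.

```python
def build_block_rule(event, priority=1000):
    """Turn a security event into a drop rule keyed on address and, optionally, port."""
    match = {"src_ip": event["src_ip"]}
    if event.get("app_port") is not None:
        match["dst_port"] = event["app_port"]
    return {"priority": priority, "match": match, "action": "drop"}


def matches(rule, packet):
    """Return True when every field in the rule's match equals the packet's field."""
    return all(packet.get(field) == value for field, value in rule["match"].items())


if __name__ == "__main__":
    # Hypothetical event: an infected laptop attacking a service on port 445
    rule = build_block_rule({"src_ip": "10.0.0.23", "app_port": 445})
    print(rule)
    print(matches(rule, {"src_ip": "10.0.0.23", "dst_port": 445}))
    print(matches(rule, {"src_ip": "10.0.0.23", "dst_port": 443}))
```

Matching on both address and port keeps the block selective: the laptop's other traffic still flows while the malicious flow is dropped.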

  • Improve visibility into your network

One of the biggest benefits of software-defined networking is better overall visibility into an organization's network. Centralized control makes it possible to identify security and performance problems, and all of this can be analyzed without disrupting network activity. By pinpointing the source of bandwidth or security problems early, outages and downtime can be prevented before the network is disrupted.

  • Scalability

The flexibility of centralization also makes the network easier to extend, because SDN lets programmers write to common interfaces and manage multiple devices without having to understand the intricacies of each device on the network.

  • Use network resources more efficiently

In traditional networks, the control plane that determines how data travels is located in hardware. In an SDN infrastructure, the control plane is a software function that operates independently of the network hardware. This logical separation of the control and data planes enables SDN to support advanced applications and services, including big data analytics, while keeping up with the growing demand for network services.

  • Improve uptime and reliability

The flexibility and redundancy built into SDN programs can eliminate human errors that can occur during network deployment. In addition, SDN supports virtualization of most physical and virtual network devices, allowing you to perform upgrades or replacements on one component of the network without taking the entire system offline. In the event of downtime, SDN supports snapshots of configurations, allowing for quick recovery from disruptions caused by upgrades.
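The snapshot-and-recover idea can be sketched in a few lines. This is a simplified in-memory model under the assumption that the network configuration fits in a dictionary; `ConfigStore` is a hypothetical class, not a feature of any specific SDN product.

```python
import copy


class ConfigStore:
    """Keeps a working configuration plus an ordered list of snapshots."""

    def __init__(self, config):
        self.config = config
        self._snapshots = []

    def snapshot(self):
        """Save a deep copy of the current configuration and return its id."""
        self._snapshots.append(copy.deepcopy(self.config))
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id):
        """Restore the configuration saved under snapshot_id."""
        self.config = copy.deepcopy(self._snapshots[snapshot_id])


if __name__ == "__main__":
    store = ConfigStore({"vlan10": {"ports": ["eth1", "eth2"]}})
    before_upgrade = store.snapshot()
    store.config["vlan10"]["ports"].append("eth3")  # a change that goes wrong
    store.rollback(before_upgrade)
    print(store.config)
```

Deep copies matter here: a shallow copy would share the nested port list, so the "snapshot" would silently change along with the live configuration.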

The future of networking will increasingly rely on software, and SDN is a big step forward in addressing many of the challenges faced by traditional networking approaches. It brings network operations into the modern era by improving visibility and security while simplifying and automating operations.

Changes in SDN Operation and Maintenance Tools

Traditional network operation and maintenance has no shortage of rules and regulations, but there is only so much that operation and maintenance personnel can do. They type different command lines for hardware from different manufacturers, check logs when problems occur, and write fault reports. The main features of an SDN network are clustering, data flows carried over a software-defined virtual network, and simple graphical presentation, which make it easier to launch services and maintain them afterward. So if SDN is so powerful, does that mean operation and maintenance tools are no longer needed? Of course not!

An SDN system has an independent central controller and an application layer above it. The forwarding layer serves only as the lowest level, responsible for forwarding data. Business orchestration is done in the controller, which is a pure software system. That system can integrate with external systems through APIs, and this is where DevOps comes in handy.

DevOps promotes communication and collaboration between developers, operations teams, and infrastructure professionals to achieve unified and automated IT development, implementation, and management. Meanwhile, SDN allows engineers to apply software control to network elements, centrally managing and configuring large amounts of virtual and physical infrastructure.

1. SDN meets NetDevOps

DevOps (a combination of Development and Operations) is a collective term for a set of processes, methods and systems that promote communication, collaboration and integration between development (application/software engineering), technical operations and quality assurance (QA) departments. It is a culture, movement or practice that emphasizes communication and cooperation between "software developers (Dev)" and "IT operations and maintenance technicians (Ops)". By automating the processes of "software delivery" and "architecture changes", it makes building, testing and releasing software faster, more frequent and more reliable. Its emergence has made the software industry increasingly aware that in order to deliver software products and services on time, development and operations must work closely together.

2. DevOps and automated network requirements

DevOps leverages the componentization of applications into small applets (or microservices) that can be distributed across a range of data center resources (i.e., public or private clouds). Containers (e.g., Docker) are becoming a popular way to quickly introduce new microservices.

Microservices and DevOps applications require fast provisioning of compute, storage, and network resources so that they can start quickly, scale as needed, run with high reliability, and keep services secure. Networks need management tools that meet the needs of development and automation - reducing downtime and processing complexity without adding operating expense (OpEx).

The network is responsible for quickly provisioning the appropriate resources for DevOps applications and plays a critical role in securing and managing these rapidly moving applications. However, the agility and rapidly changing requirements of microservices challenge the capabilities of traditional networks. Decomposing applications creates too many moving parts for manual networking, so network automation is critical. The ability to pre-test network resources as part of DevOps is important for reducing application deployment time (for example, by avoiding the need to go back and fix network problems later). The basic ideal is that developers do not have to worry about network resources such as IP addresses or firewall rules.

3. Where SDN, DevOps, and Automation Meet

Software-defined networking optimizes development and automation of networks, enabling IT organizations that deploy complex applications to quickly provide network resources and services (including security policies). SDN supports centralized management of the network and transfers the challenges of (manual) configuration from personnel to technology, reducing operating costs.

SDN-based networks can automatically detect traffic changes and select the path data takes through the network based on parameters such as application type, quality of service, and security rules. The software control plane manages and hides network complexity, making 10,000 switches look like one. SDN can instruct the network to provide services consistent with its associated applications and support the rapid deployment of a large number of new applications and microservices (for example, containers).

SDN provides the ability to automate network processes so that network and security resources can be provisioned quickly for DevOps applications. Many hyperscale cloud providers - including Google, Apple, Facebook, and Microsoft - have deployed SDN technology to help automate the configuration and management of their networks. IT leaders should consider deploying SDN to meet the changing needs of their DevOps teams and related applications.

Let's talk about SDN operation and maintenance. SDN has so many advantages, so will the operation and maintenance work be easy? SDN operation and maintenance work mainly includes two aspects, one is daily operation and maintenance, and the other is engineering projects. Daily operation and maintenance work is similar to traditional network operation and maintenance, including on-duty monitoring, first- and second-line fault resolution, and communication with various departments.

The key is cross-departmental communication. In traditional network operation and maintenance, many devices and functions are bundled together, and the related functions are not open to the outside world; only the equipment suppliers understand them. As a result, operation and maintenance is often a closed department with little overlap with development. In the SDN era, however, operation and maintenance involves many departments, such as testing and R&D. It is no longer closed, and the role needs to be seen from a new perspective: operation and maintenance staff must interact early with network engineers in the development and testing departments. This is in line with DevOps, which holds that development and operations must work closely together in order to deliver software products and services on time.

The tools used in SDN operation and maintenance are similar to those used in traditional network operation and maintenance, mainly Cacti, Smokeping, Nagios, and Zabbix. However, nowadays, more attention is paid to open source, which can promote the development of SDN and network technology. Operation and maintenance engineers can learn more about the network and have more autonomy in network management. Engineers can also do secondary development on open source software according to their own needs, which greatly reduces the cost of network operation and maintenance and improves the efficiency of operation and maintenance compared with traditional closed operation and maintenance.

SDN automated operation and maintenance

Operation and maintenance includes three stages: alarm monitoring, change, and troubleshooting. Before introducing alarms, let's talk about the SLO and SLI that operation and maintenance personnel need to pay attention to. Then we will briefly analyze monitoring, analysis, change, and troubleshooting.

1. Operation and maintenance service quality design

In traditional network operation and maintenance, network engineers focus on the SLA (Service Level Agreement), whereas operators focus on the SLO and SLI. We need to identify the service quality indicators and then set objectives based on them.

An SLI (Service Level Indicator) is a carefully defined measurement. What to measure depends on the characteristics of the system, and choosing SLIs is a complex process. An SLI needs to answer several questions: what is being measured, what state the system is in when it is measured, how the measurements are aggregated and processed, whether they actually describe service quality, and how trustworthy they are. SLIs mainly cover performance, availability, quality, and internal indicators and factors.

An SLO (Service Level Objective) specifies the expected state of the functions a service provides, and it should contain all the information needed to describe what the service should deliver. Service providers use it to specify the expected state of the system, developers write code to meet it, and customers rely on it to make business decisions. An SLO says nothing about what happens if the objective is missed. Network latency, packet loss rate, and end-to-end measurements can all serve as indicators, and SLOs are set on top of them.

SLA is a contract involving two parties, and both parties must agree to and abide by it. When providing services to external parties, SLA is a very important service quality signal, which requires the involvement of both the product and legal departments.
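To make the SLI/SLO relationship concrete, the sketch below computes two indicators named in this section, packet loss rate and latency, and checks them against objectives. The thresholds, sample values, and function names are assumptions chosen for illustration, and the percentile uses the simple nearest-rank method.

```python
import math


def packet_loss_rate(sent, received):
    """SLI: fraction of packets lost (0.0 when nothing was sent)."""
    return 0.0 if sent == 0 else (sent - received) / sent


def percentile(samples, pct):
    """SLI: nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(sli_value, slo_threshold):
    """An SLO here is an upper bound the SLI must not exceed."""
    return sli_value <= slo_threshold


if __name__ == "__main__":
    latencies_ms = [10, 12, 11, 50, 13, 12, 11, 10, 14, 12]  # made-up samples
    p90 = percentile(latencies_ms, 90)
    loss = packet_loss_rate(sent=1000, received=995)
    print("p90 latency ms:", p90, "meets SLO:", meets_slo(p90, 20))
    print("packet loss:", loss, "meets SLO:", meets_slo(loss, 0.001))
```

Using a percentile rather than an average is a deliberate choice in this sketch: the single 50 ms outlier would distort a mean but does not affect the 90th percentile.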

2. Monitoring and alarm

SDN allows more white-box monitoring, that is, understanding the operating status of the system by monitoring performance indicators inside it. On the southbound interface, only a few protocols need to be monitored, so that part is relatively simple, and when the business changes, monitoring can follow the corresponding API changes. Most of the complexity is concentrated in the control plane and business orchestration, so monitoring focuses mainly on the robustness of the control plane, the status of user services, and the consistency between control and forwarding. In large networks, the many path computations and re-optimizations triggered by underlying link failures must be kept under control and completed quickly. The web interface for end users also needs to respond to and analyze requests and configuration changes in real time.

The design of monitoring and alarms in an operation and maintenance system usually starts from data collection at the bottom and works upward through storage, functional module development, alarm channels, and finally the user side. For collection, you need to choose between a centralized and a distributed approach based on the network architecture. If the network has many forwarding nodes, a centralized approach will not work; collection, including storage, has to be distributed by region according to where your business is located. Deploy both central and distributed storage, synchronize the distributed collections to central storage in real time, and back up the data after it is stored locally.

In terms of functional modules, by collecting original data at the bottom layer, according to the rules of the original system, an intermediate layer is created from monitoring alarms to alarm channels. Network administrators can then make customized rules based on their own network conditions.

Once the raw data is available, the next question is how to present it well and how to synchronize the useful information in real time. Real-time alarms in SDN are not limited to bottom-layer forwarding as they are in traditional networks; they can also cover business systems and network elements (including operating system stability) in real time. Alarm information must be classified as it arrives before alarm analysis can begin.

3. Log statistics analysis

For log statistics and analysis, most companies now use ELK (Elasticsearch, Logstash, Kibana), which can be customized to fit your business.

Logs cover the entire SDN system: the control system at the top; the operating system, storage, and service orchestration in the middle; the forwarding network elements below; and finally the underlying transmission layer. In traditional networks, operation and maintenance personnel pay no attention to most of these and look only at the network equipment.
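A per-layer view of the logs can be illustrated without any log platform. The sketch below assumes a made-up line format of timestamp, layer, level, and message, and counts entries of a given level per layer; in practice this kind of aggregation would run inside the log analysis system rather than in a script.

```python
from collections import Counter

# Hypothetical log format: "<timestamp> <layer> <level> <message>"
SAMPLE_LOGS = [
    "2024-01-01T10:00:00 controller ERROR path computation timed out",
    "2024-01-01T10:00:01 forwarding WARN port eth1 flapping",
    "2024-01-01T10:00:02 controller ERROR lost session to switch s1",
    "2024-01-01T10:00:03 orchestration INFO service chain deployed",
]


def count_by_layer(lines, level="ERROR"):
    """Count log lines of the given level, grouped by the layer that emitted them."""
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 4:
            continue  # skip malformed lines
        _, layer, line_level, _ = parts
        if line_level == level:
            counts[layer] += 1
    return counts


if __name__ == "__main__":
    print(count_by_layer(SAMPLE_LOGS))
```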

4. Traffic statistics analysis

Today, network management systems and operation and maintenance personnel focus on device traffic and port traffic. SDN needs to look at the ports along the entire link and, more importantly, at business traffic. The biggest feature of SDN here is that it can be associated with the business system, so that all business-related traffic information can be viewed through the operation and maintenance system.
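The shift from port traffic to business traffic amounts to grouping the same counters by a different key. The sketch below assumes each flow record already carries a business label (how that label is attached is outside the scope of this example); the record fields and `traffic_by_business` are hypothetical.

```python
from collections import defaultdict


def traffic_by_business(flow_records):
    """Sum bytes per business service instead of per port."""
    totals = defaultdict(int)
    for record in flow_records:
        totals[record["business"]] += record["bytes"]
    return dict(totals)


if __name__ == "__main__":
    # Made-up flow records tagged with the business they belong to
    flows = [
        {"port": "s1-eth1", "business": "video", "bytes": 1200},
        {"port": "s1-eth2", "business": "payments", "bytes": 300},
        {"port": "s2-eth1", "business": "video", "bytes": 800},
    ]
    print(traffic_by_business(flows))
```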

5. Changes

In traditional networks, it is difficult to maintain a unified configuration template because requirements and services change over time. Temporary configurations pile up on different devices, and current maintenance personnel dare not delete settings left by their predecessors. Over time, changes in people, equipment, and requirements cause the configuration to drift away from the actual situation. SDN largely eliminates this device configuration problem. Infrastructure data can be set up in the GUI through self-discovery and initial definition, and business data is managed through the GUI and API. When the software is upgraded, the front end, back end, business orchestration, and underlying controller components of the control plane can be upgraded separately or together, with no obvious impact on forwarding.
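Configuration that has drifted away from the intended state can be detected by comparing the two. The sketch below is a minimal illustration that assumes both the intended and the actual configuration are available as flat dictionaries; `config_drift` and the sample keys are invented for this example.

```python
def config_drift(intended, actual):
    """Return every key whose intended and actual values differ."""
    drift = {}
    for key in intended.keys() | actual.keys():
        want, have = intended.get(key), actual.get(key)
        if want != have:
            drift[key] = {"intended": want, "actual": have}
    return drift


if __name__ == "__main__":
    intended = {"eth1.mtu": 1500, "eth1.vlan": 10}
    actual = {"eth1.mtu": 9000, "eth1.vlan": 10, "eth2.vlan": 99}  # leftover temporary config
    for key, diff in sorted(config_drift(intended, actual).items()):
        print(key, diff)
```

Taking the union of keys catches both directions of drift: settings that were changed and leftover settings that should not exist at all.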

6. Automated troubleshooting

SDN troubleshooting is largely a matter of combining with DevOps and solving problems through software. A good fault handling system can perform correlation analysis and heal itself. Correlation means that when multiple warnings appear, they are automatically associated with one another and reduced to a single, genuinely useful alarm. Self-healing means that once the warnings have been correlated, the fault is repaired without human intervention.
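Correlation and self-healing can be sketched as two small steps. The example below groups alarms that hit the same resource within a time window into one incident, then looks up a remediation in a playbook table. The alarm fields, the 60-second window, and the playbook entries are all assumptions made for illustration.

```python
def correlate(alarms, window_seconds=60):
    """Group alarms on the same resource that arrive within window_seconds of each other."""
    incidents = []
    for alarm in sorted(alarms, key=lambda a: a["time"]):
        for incident in incidents:
            same_resource = incident["resource"] == alarm["resource"]
            close_in_time = alarm["time"] - incident["last_time"] <= window_seconds
            if same_resource and close_in_time:
                incident["alarms"].append(alarm["name"])
                incident["last_time"] = alarm["time"]
                break
        else:
            incidents.append({
                "resource": alarm["resource"],
                "last_time": alarm["time"],
                "alarms": [alarm["name"]],
            })
    return incidents


def self_heal(incident, playbooks):
    """Return the remediation for the first known alarm, or None to escalate to a human."""
    for name in incident["alarms"]:
        if name in playbooks:
            return playbooks[name]
    return None


if __name__ == "__main__":
    alarms = [
        {"time": 0, "resource": "link-1", "name": "port_down"},
        {"time": 5, "resource": "link-1", "name": "bgp_session_lost"},
        {"time": 8, "resource": "link-2", "name": "high_latency"},
    ]
    playbooks = {"port_down": "reroute_traffic"}  # hypothetical remediation table
    for incident in correlate(alarms):
        print(incident["resource"], incident["alarms"], "->", self_heal(incident, playbooks))
```

Returning None for unknown alarms is the safe default in this sketch: anything without a tested playbook is escalated rather than "healed" blindly.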

Where will traditional operation and maintenance personnel go in the future?

The evolution of telecommunications network architecture toward SDN has a profound impact on the operation and maintenance process. The convergence of telecommunications and IT technology also places new skill requirements on the operation and maintenance teams involved.

In addition to knowing traditional operation and maintenance skills and tools, SDN operation and maintenance personnel must also understand the SDN operation and maintenance system. From the perspective of the SDN system, the lowest-level resources currently include network devices, forwarding network elements, and servers. The collection layer mainly covers SNMP collection, NETCONF for issuing commands to traditional devices, the OpenFlow protocol for new devices, and CLI management.

SDN operation and maintenance system architecture

Storage sits in the middle and is kept independent and separate; it holds the logs, the configuration library, and the knowledge base. Functions include monitoring, alarms, and data collection; data analysis and statistics; and process and project management. A large part is resource management, which includes documentation and configuration. This part is mainly based on a CMDB and is very powerful. How to use it together with the SDN system should be decided according to the underlying network and how the controller is developed.

SDN is now being adopted by most companies. How can an enterprise develop suitable SDN operation and maintenance experts? Companies generally choose to train existing employees because they find it more economical than finding and recruiting new ones. Investing in existing employees requires a proactive, top-down strategy that provides plenty of training opportunities. From a personal perspective, network professionals should take control of their own future and career. Not every network engineer needs to be a programmer. Rather, SDN requires a broader grasp of network concepts and fundamentals. Understanding how software systems work does not mean you have to write code, but you do need to understand how the entire ecosystem works and where each task is carried out. Beyond these basics, network professionals should take advantage of every learning opportunity. Once you have made a plan, stick to it: plan carefully, focus on your own trajectory, and do not be swayed by external circumstances.
