Imagine a business-critical network that is running smoothly. No critical tickets have been raised and all services are operational. The change control board has met and reviewed the day’s changes. Then the network team performs a small routing change, and everything seems fine. Shortly afterward, however, several high-priority trouble tickets arrive. Is this a coincidence, or is something going on? The team reverts the change, which clears the issue and provides evidence that the routing change caused the outage. Further analysis reveals that the change accidentally isolated a critical portion of the network from the internet. Similar problems occur every day in networks of all sizes. Change control boards are supposed to detect and prevent incorrect changes, but problems still occur. How can network teams improve the quality of network changes?
The Case for Automated Pre- and Post-Change Checks

One option is to use pre- and post-change network validation to assess whether the network operates as expected before and after the change. The goal is to let network teams prevent outages by performing some simple pre-change routing checks. If the pre-change validation does not reveal an issue, the post-change check can detect the incorrect routing state, so the team can immediately pinpoint the cause and revert to the previous configuration. This simple process of validating the network state can reduce network outages or avoid them altogether.

While teams can use manual processes to perform pre- and post-change checks, automation makes more sense. Whether teams use manual or automated processes, they must determine the state of the network before and after the change. Engineers may notice that the post-change state often becomes the basis for the pre-change check in the next change cycle.

Check before change

When teams automate the change process, they can move quickly. Automation also helps teams avoid human errors, such as transposing numbers or working in the wrong window, which are common problems during change windows. The pre-change process should confirm that the correct interface is selected by checking its operational status and assigned address. If the interface is up and running, are the correct neighbors connected? These steps help teams avoid silly mistakes and the resulting outages.

The network team can use the pre-change check as a validation step in the change control board process. The team submits the output of the pre-change validation to the change control board as evidence that documents the desired starting state. The change control board also requires the team to provide a set of post-change checks, which the team will run to verify that the network has reached the desired state after the change.

Post-change inspection

When a post-change check fails, the network is not in the expected state.
This could be because the validation data is incorrect or because the change did not leave the network in the desired state. Automation can save the collected data and quickly revert the change, returning the network to its pre-change state. The team can then analyze the collected data against the desired state, make any needed corrections and re-execute the change.

As teams adopt this process, they may find that many operational status checks are useful when performing changes, even checks they did not initially think were applicable. For example, do you need to check the Network Time Protocol when making a routing change? It helps, because log data is harder to correlate between network devices if device clocks are not synchronized. Automation makes it easy for teams to perform many checks that would be impractical to do manually.

Periodic status verification

Post-change checks can also be a useful tool for periodically verifying that the network is functioning as expected. Let’s say a redundant interface fails and the network management system does not flag it. Regular status verification will highlight the failure, allowing the team to take proactive action.

When to schedule validation runs

How often to schedule validation runs depends on the network and the business functions it supports. Teams should run the checks at the start of the workday and before any change window, regardless of the planned change. Network state validation is a read-only operation, so teams should not hesitate to run it regularly.

Getting started with network verification

Storing the current and desired operational status (in a format that enables automated checks) doesn’t involve much work. The real work is in the data collection and analysis by the automation platform. Fortunately, libraries such as pyATS are available for DIY automation, while commercial products can help simplify deployment.
If teams can’t find a commercial product that meets their needs, consulting firms can help them build a system.

In summary, there is no reason not to use automation to verify the network state during daily operations and change control processes.
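
As a rough illustration of the post-change comparison described above, the following Python sketch checks a collected state snapshot against a stored desired state and reports every mismatch. The data layout, the names `flatten` and `check_state`, and the sample values are all hypothetical and chosen only for this example. The sketch does not collect anything from devices; in practice, a library such as pyATS or a commercial product would handle that step.

```python
from __future__ import annotations

from typing import Any


def flatten(state: dict, prefix: tuple = ()) -> dict:
    """Flatten nested dicts into {path_tuple: value} so values are easy to compare."""
    flat: dict = {}
    for key, value in state.items():
        path = prefix + (str(key),)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def check_state(desired: dict, observed: dict) -> list[str]:
    """Return one message for each desired value that the observed state does not match."""
    want, have = flatten(desired), flatten(observed)
    failures = []
    for path, expected in sorted(want.items()):
        actual: Any = have.get(path, "<missing>")
        if actual != expected:
            failures.append(
                f"{' > '.join(path)}: expected {expected!r}, observed {actual!r}"
            )
    return failures


# Hypothetical desired post-change state (assumed layout, not a real schema).
DESIRED_POST = {
    "interfaces": {"GigabitEthernet0/1": {"oper_status": "up", "ip": "192.0.2.1/30"}},
    "bgp_neighbors": {"192.0.2.2": "Established"},
    "routes": {"0.0.0.0/0": "192.0.2.2"},
    "ntp": {"synchronized": True},
}

# Hypothetical state collected after a change that removed the default route.
OBSERVED_POST = {
    "interfaces": {"GigabitEthernet0/1": {"oper_status": "up", "ip": "192.0.2.1/30"}},
    "bgp_neighbors": {"192.0.2.2": "Established"},
    "routes": {},
    "ntp": {"synchronized": True},
}

failures = check_state(DESIRED_POST, OBSERVED_POST)
if failures:
    print("Post-change check FAILED; revert the change and analyze:")
    for message in failures:
        print("  " + message)
else:
    print("Post-change check passed")
```

The same function can serve as a pre-change check or a periodic check by passing a different desired state. Because it only reads and compares data, it is safe to run as often as needed.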