Three tips for data center network maintenance

The network is the most important component of a data center and also its most technically complex part. Routine maintenance and troubleshooting on it call for a broad set of skills. A data center network is divided into a storage network and a data network: the storage network uses the Fibre Channel protocol, and the data network uses Ethernet. Compared with Ethernet, Fibre Channel is much simpler to operate, since it only needs Layer 2 connectivity to work. Ethernet is more complex, with a confusing variety of protocol standards built around it, and no one person can master them all. So how can we maintain such a complex network well? This article sums up network maintenance as three "axes", that is, three core techniques. Master them and you are well on the way to becoming a data center network expert.

Network maintenance work usually consists of two parts: daily inspection, which removes hidden dangers before they cause trouble, and fault handling, which restores service promptly when a fault occurs and locates the cause so that the same fault does not happen again. Daily inspection is relatively simple, and it is tempting to do it perfunctorily. But as the idiom says, "a thousand-mile embankment is broken by an ant hole": many faults start with everyday negligence toward small hidden dangers that eventually grow into major accidents. Fault handling is the more valuable half of the job, which is why skilled network troubleshooters are in such demand. To become a senior network maintenance expert, you need proper methods of fault analysis and diagnosis. Most maintenance staff check equipment, check cabling, capture packets, and hunt for the fault point. They end up exhausted and frustrated and still cannot solve the problem, because they have not found the right way in. Network maintenance requires monitoring of network equipment, fault location and alarming, and network traffic analysis. These are the three axes of this article, and using them well makes a wide range of network faults tractable.

The first axe: network traffic analysis

The data center network does not care about application-layer content; it only moves data traffic. Analyzing where traffic flows is therefore particularly important. All link traffic across the data center network should be monitored so that, when a fault occurs, its scope and location can be pinned down quickly. Faults tend to show up first on the traffic map, which makes traffic analysis software essential for data center network maintenance. Choose one or two widely used traffic analysis tools and deploy them in the data center network, enabling traffic collection as needed on devices at each layer (access, aggregation, and core) without changing the existing network structure. This provides statistics and analysis of data center traffic and shows, in a timely way, how much bandwidth each network application occupies and how many network resources each service consumes. It helps maintenance staff discover network bottlenecks early, guard against network virus attacks, and produce detailed traffic analysis reports. Learning to find and solve problems from network traffic maps is a basic skill of network maintenance and must be mastered.
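As a rough illustration of how a traffic analysis tool can turn raw collection data into a bottleneck report, the sketch below computes link utilization from two readings of a cumulative byte counter and lists the links above a threshold. It is a minimal sketch, not how any particular product works: the `CounterSample` type, the function names, the 80% threshold, and the assumption that a negative difference means a counter wrap (rather than a device reset) are all choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    timestamp: float  # seconds since some fixed reference
    octets: int       # cumulative byte counter read from the interface


def utilization(prev, curr, link_bps, counter_bits=64):
    """Fraction of link capacity used between two counter samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.octets - prev.octets
    if delta < 0:
        # Assumption: the counter wrapped once. A device reset would
        # also produce a negative delta and is not handled here.
        delta += 2 ** counter_bits
    return delta * 8 / elapsed / link_bps


def find_bottlenecks(links, threshold=0.8):
    """Return (name, utilization) for links at or above the threshold,
    busiest first. `links` maps a link name to (prev, curr, link_bps)."""
    busy = []
    for name, (prev, curr, link_bps) in links.items():
        u = utilization(prev, curr, link_bps)
        if u >= threshold:
            busy.append((name, u))
    return sorted(busy, key=lambda item: item[1], reverse=True)
```

Feeding it periodic samples from every access, aggregation, and core link gives a simple ranked list of candidate bottlenecks, which is the kind of view a traffic map presents graphically.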

The second axe: network monitoring and analysis

Once the data center network has delivered traffic to the destination device, it has done its job. Whether anything went wrong along the way has to be determined through monitoring. In practice, a handful of tools are used to detect whether the network has a problem, such as ping, tracert (traceroute), SNMP, and syslog. Ping and tracert quickly confirm where the fault is and which device is involved, so that the faulty device can then be analyzed in depth. SNMP and syslog provide basic information about how the faulty device is running, and in many cases the syslog messages reported by the device are enough to confirm the cause of the failure. Many data centers have built these tools into their network monitoring software, which can effectively prevent or detect faults. Alarms can be shown in an alarm panel, raised by sound, SMS, or WeChat, and sent to the relevant personnel by email, or they can automatically trigger a program that handles the fault. Together this provides alarm monitoring, statistical analysis, and alarm location.
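To make the syslog part concrete, here is a minimal sketch of how monitoring software might decide whether a received syslog line deserves an alarm. It relies only on the standard syslog priority field, where the number in angle brackets equals facility times 8 plus severity and lower severity numbers are more serious. The function names and the default alarm level are assumptions for this example, and real monitoring products apply much richer rules.

```python
import re

# Standard syslog severity levels, index 0 being the most severe.
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_priority(line):
    """Return (facility, severity_name) from a line starting with <PRI>,
    or None if the line has no valid priority field."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


def should_alarm(line, min_severity="error"):
    """True if the message is at least as severe as min_severity."""
    decoded = decode_priority(line)
    if decoded is None:
        return False
    return SEVERITIES.index(decoded[1]) <= SEVERITIES.index(min_severity)
```

For example, a line beginning with `<187>` decodes to facility 23 and severity "error", so it would raise an alarm at the default level, while a line beginning with `<189>` (severity "notice") would not.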

The third axe: network fault analysis

When the first two axes have been used and the fault still offers no clue, it is time for the third: in-depth analysis of the fault itself. The basic network information gathered with the first two axes is still needed here to characterize the symptoms. In ping tests, a network fault shows up in one of three ways: the destination is unreachable, packets are lost, or the delay is large. Most network problems derive from these three symptoms, so the analysis can proceed from whichever one is observed.

First, if the destination is unreachable, the problem almost certainly lies in a forwarding table or in the data path. Check the Layer 2 and Layer 3 forwarding entries, ports, VLANs, links, and other device information along the path for configuration errors. Unreachability can also mean that a link directly connected to a device is broken. This covers not only the external links and modules between devices but also the connections inside a device. In some chassis-based devices, internal traffic crosses many boards, and a fault in the internal interconnect affects forwarding. In some fixed-configuration (box) devices, a packet can be dropped at the port connector after entering the device, before it is ever processed. All of these make the destination unreachable.

Second, packet loss usually comes from unstable forwarding entries or an unstable data path. The forwarding entries and data path still need to be checked, but with the focus on stability: whether the forwarding entries are changing, and whether the data path shows symptoms such as error packets, MAC address moves, or STP topology changes.

Third, a large delay usually means congestion on the data path, where the data traffic exceeds the maximum bandwidth the network path can provide. Check the forwarding path hop by hop for congestion and packet drops, and check whether port traffic exceeds the line rate or a configured rate limit. These faults show up as large delays at the service level. Large delays are very harmful to application services and make for a poor access experience, especially for video services, where playback stutters and viewing quality suffers.
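The three-symptom triage described above can be sketched as a small function that classifies a series of ping results and points to the checks suggested for each symptom. This is only a sketch under stated assumptions: the input format (one round-trip time in milliseconds per probe, with `None` for a lost probe), the function name, and the 100 ms delay threshold are all invented for this example and should be tuned to the network in question.

```python
from statistics import mean

# Follow-up checks for each symptom, summarized from the text above.
NEXT_CHECKS = {
    "unreachable": "check L2/L3 forwarding entries, ports, VLANs, links, "
                   "and internal device interconnects for errors",
    "packet loss": "check stability of forwarding entries; look for error "
                   "packets, MAC address moves, STP topology changes",
    "high delay": "check the path for congestion; compare port traffic "
                  "with line rate and configured rate limits",
    "healthy": "no action needed",
}


def classify_ping(rtts_ms, max_avg_rtt_ms=100.0):
    """Classify ping results as one of the keys of NEXT_CHECKS.

    rtts_ms holds one entry per probe: the round-trip time in
    milliseconds, or None if the probe got no reply.
    """
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    if not replies:
        return "unreachable"
    if len(replies) < len(rtts_ms):
        return "packet loss"
    if mean(replies) > max_avg_rtt_ms:
        return "high delay"
    return "healthy"
```

Calling `NEXT_CHECKS[classify_ping(results)]` then yields the first things to look at. The function reports packet loss ahead of high delay when both occur, which is a simplification: on a congested path the two symptoms often appear together.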

Data center network maintenance relies on these three axes. They are simple to describe, but there are many ways to use them, and how well a person understands them reflects that person's level of network expertise. Understanding them thoroughly usually takes a great deal of time and accumulated experience. Each axe, examined in detail, involves a large body of network technology that has to be absorbed gradually through day-to-day maintenance work. Those who master all three can become experts in data center network maintenance.