A brief discussion on the organizational structure design of data center operation and maintenance

A data center is a complex organization made up of many systems. Running it efficiently and safely requires a strong operation and maintenance team. In recent years some people have proposed unmanned data centers and automated operation and maintenance systems to reduce labor costs and raise the efficiency of individual operation and maintenance staff, but this is still not fully feasible in practice. A data center with no one involved in its operation and management would descend into disorder and could not function effectively. Reducing labor costs is a long-term goal, but a large number of technicians are still needed so that people and machines work together in the data center. Moreover, the TIA-942 standard for data center construction also explicitly addresses staffing: data centers of different levels require different numbers of personnel, and the higher the level, the more personnel and the higher the skill level required.

As shown in Figure 1, TIA-942 divides data centers into four levels, with T1 the lowest and T4 the highest. The higher the level, the higher the required personnel capabilities and the longer the required on-duty hours. T4 typically requires that the data center suffer no business interruption throughout the year, which places very high demands on operation and maintenance. Professional technical staff must be on duty on site 24 hours a day so that when a problem arises it can be resolved promptly, or the workload can be switched to the backup system immediately, ensuring that the business is not affected.

Figure 1: Staffing requirements for different data center levels

In terms of organizational design, data center staff can be divided into three major parts (the IT system department, the infrastructure department, and the administrative department), each of which can be further subdivided to build a complete operation and maintenance system, generally as shown in Figure 2:

Figure 2: Data center operation and maintenance organizational structure

According to the organizational structure in Figure 2, 13 to 15 people is the most basic configuration for a large data center. If 7×24 rotating shifts are taken into account for the positions that require them, the staff should number at least 25. One or two security and cleaning staff are definitely not enough for a facility of this size: a large data center covers tens of thousands of square meters, and at least ten people are needed to clean a building that large. In addition, within the IT system part, networking, servers, and storage are largely separate specialties. No one person can master them all, so specialists need to be kept on hand in each area.

Many organizations also operate data centers across the country. Building the full organization shown in Figure 2 at every site would make labor costs too high. Many therefore concentrate all professional and technical personnel of the IT system department at the headquarters office and manage the data centers in various places remotely. Each computer room then needs only a small number of on-site personnel, who plug and unplug network cables, restart and install equipment, and carry out most of the daily monitoring. Once a problem is found, they promptly notify the IT system department staff, who locate and analyze it.
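The rotation arithmetic behind a headcount estimate like the one given for Figure 2 can be sketched in a few lines of Python. This is a minimal illustration, not the method used to derive the figures in this article: the 40-hour work week, the function names, and the split between day-shift and round-the-clock positions are all assumptions chosen for the example.

```python
import math


def staff_for_24x7(seats: int, hours_per_week_per_person: float = 40.0) -> int:
    """People needed to keep `seats` positions filled around the clock.

    Assumption: each person works a fixed number of hours per week and
    shifts can be scheduled without gaps or overlap.
    """
    hours_to_cover = seats * 24 * 7  # every seat must be filled 168 hours a week
    return math.ceil(hours_to_cover / hours_per_week_per_person)


def total_headcount(day_only_positions: int, rotating_seats: int,
                    hours_per_week_per_person: float = 40.0) -> int:
    """Day-shift positions plus the staff needed for the 24x7 seats."""
    return day_only_positions + staff_for_24x7(rotating_seats,
                                               hours_per_week_per_person)


if __name__ == "__main__":
    # Hypothetical split: 12 day-shift positions and 3 seats staffed 24x7.
    print(total_headcount(day_only_positions=12, rotating_seats=3))
```

Under these assumptions, 12 day-shift positions plus 3 seats staffed around the clock come to 25 people. Real planning would also need to allow for leave, training, and handover overlap.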

The infrastructure department and the administrative department are tied closely to the physical computer room, so each data center site needs its own. In practice, most data centers now rent computer rooms from telecom operators or professional data center service providers. Power supply, air conditioning, monitoring, security, and cleaning are all handled by the provider. The tenant only pays rent, which saves a great deal of labor cost, and its own operation and maintenance organization needs only an IT system department. Internet giants such as Tencent and Alibaba that build their own data centers do need an infrastructure department and an administrative department. Of course, to save trouble, the operation and maintenance of these two parts can also be outsourced to professional service providers, which is much more economical than maintaining two departments in-house.

In addition to the organizational structure, each department's work content and the requirements for each position should be defined in detail, so that department heads can evaluate staff and adjust salaries based on individual performance. A scientific and reasonable personnel management life cycle is needed, covering selection, employment, training, assessment, and dismissal. With a sound organizational structure and division of labor, individual initiative can be put to full use in support of organizational goals. All of this management must be governed by defined processes, and everyone should work according to them. Process is what guarantees the quality of data center operation and maintenance: its purpose is to ensure that the work is carried out to the required standard and in the required amount.

People are both the foundation and the core of data center operation and maintenance. A good operation and maintenance organization cannot do without the right technical and management personnel. As the idiom goes, "Success is due to Xiao He, and failure is also due to Xiao He": 80% of failures in the data center are caused by human error, yet people are also the key to resolving those failures, and a large amount of manpower is needed to keep the data center running stably. The relationship between people and the data center is therefore delicate. The data center can neither depend entirely on people nor be cut off entirely from human management; a balance must be struck. Too much human intervention invites human error, while too little lets problems go unnoticed: a piece of equipment may burn out without anyone knowing. Operation and maintenance of that kind is a failure.
