It is a commonplace that data centers need backup systems, and the reason is easy to see. A data center is made up of tens of thousands of electronic devices, and some of them are bound to fail during operation, so redundant backups are essential. A data center with tens of thousands of servers, for example, may see a server failure almost every day. To keep such failures from affecting the system, backups must be in place so that when one server fails, others take over automatically and the business is unaffected. However, backing up an entire data center from top to bottom is not easy. It requires a large capital investment, staff to maintain it, and extra energy to run it, which makes most data centers reluctant to do so. In practice, data centers therefore implement redundant backups only for selected equipment and systems, so that services can be switched over smoothly when a failure occurs.
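The claim above that a fleet of tens of thousands of servers sees a failure almost every day can be checked with simple arithmetic. The sketch below is only illustrative: the fleet size of 20,000 and the 3% annual failure rate are assumed figures that do not come from the article, and failures are assumed to be independent and spread evenly over the year.

```python
def expected_daily_failures(server_count: int, annual_failure_rate: float) -> float:
    """Average number of server failures per day.

    Assumes failures are independent and evenly spread over a 365-day year.
    """
    return server_count * annual_failure_rate / 365.0


def prob_at_least_one_failure(server_count: int, annual_failure_rate: float) -> float:
    """Probability that at least one server fails on a given day."""
    daily_rate = annual_failure_rate / 365.0
    # Probability that every server survives the day, then take the complement.
    return 1.0 - (1.0 - daily_rate) ** server_count


# Assumed, illustrative inputs (not from the article).
SERVERS = 20_000
ANNUAL_FAILURE_RATE = 0.03

if __name__ == "__main__":
    per_day = expected_daily_failures(SERVERS, ANNUAL_FAILURE_RATE)
    p_any = prob_at_least_one_failure(SERVERS, ANNUAL_FAILURE_RATE)
    print(f"expected failures per day: {per_day:.2f}")
    print(f"probability of at least one failure today: {p_any:.0%}")
```

Under these assumptions the fleet averages roughly 1.6 failures per day, which is consistent with the "almost every day" estimate. Different failure rates shift the number but not the conclusion that failures at this scale are routine.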
At the end of last month, Alibaba Cloud suffered a large-scale, prolonged failure that quickly sparked heated discussion online. Many cloud users' services were interrupted for an hour, causing heavy losses. Whatever the cause, Alibaba Cloud's own business was evidently not backed up properly: either the business was not switched to the backup system in time once the failure was discovered, or there was no backup system at all. Today's cloud carries business around the clock and cannot stop at any moment, which poses a great challenge to data centers. Keeping the business on thousands of devices running normally, and healing automatically when something goes wrong, takes a great deal of technology; if the system is not well designed, problems occur often. The only way to keep data center services from being interrupted is to build redundant backup into every link of the data center, so that services keep running normally even if any single link fails. An airplane works the same way: its engines, wings, ventilation system, and other parts all have backups, so that when a failure occurs the airplane can fall back on them and continue to fly normally. This design makes the airplane one of the safest machines in the world and a model of redundant backup. Compared with an airplane, a data center is more complex and has more parts and equipment, so making the entire system redundant is harder. Data center backup also requires a great deal of money. The simplest form of redundancy is to build a second, disaster recovery data center, or to keep redundant backups of important equipment. Obviously, this greatly increases the data center's costs: a project that originally required 1 billion yuan may require 2 billion yuan or more once backups for each link, or for the entire data center, are included. Yet doubling the investment does not double the income.
This is also why many data centers are reluctant to invest heavily in backup. Disaster recovery data centers are mostly built by banks, other financial institutions, and similarly well-funded owners. When the main data center fails, the entire business can be switched to the disaster recovery data center. Most of the time the disaster recovery data center merely stands by and carries no business, yet it still has to be maintained like any other, so both construction and ongoing operation and maintenance are expensive. If the whole data center cannot be backed up, the core equipment and core business still can be. When a core device fails, the business switches directly to the backup device and keeps running, so the data center's business is unaffected. This means choosing which equipment and systems to make redundant according to the available budget, and achieving the most complete redundancy possible for the least money. Besides funding, the technology behind redundant backup matters just as much. Whether a failure of the main equipment or system can be detected, and whether the business can then be switched smoothly to the backup, both depend on solid technical safeguards; without them the backup equipment and systems are useless. If automatic detection and switching are not possible, manual switching is also an option. In short, when the main system fails, the business must be able to switch smoothly to the backup system; only then is the redundancy effective. Redundancy can also be achieved through technology rather than spare equipment. For example, equal-cost routing can be deployed at the network level, and server clusters and virtual machines at the compute level. When a route fails, traffic is switched to other network links; when a virtual machine has a problem, it can be migrated automatically to a healthy server.
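The detect-then-switch behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular data center or vendor implements failover: the `FailoverController` class, its probe functions, and the threshold of three consecutive failed checks are all hypothetical. It switches automatically from primary to backup once the primary fails several health checks in a row, and it also exposes a manual switch for cases where automatic detection is not available.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FailoverController:
    """Keeps traffic on the primary until it fails several checks in a row."""

    probes: Dict[str, Callable[[], bool]]  # system name -> health probe (hypothetical)
    primary: str
    backup: str
    failure_threshold: int = 3             # assumed value, tune per system
    active: str = ""
    consecutive_failures: int = 0

    def __post_init__(self) -> None:
        # Start with all business on the primary system.
        self.active = self.primary

    def tick(self) -> str:
        """Run one health check on the active system and fail over if needed."""
        if self.probes[self.active]():
            self.consecutive_failures = 0
            return self.active
        self.consecutive_failures += 1
        threshold_reached = self.consecutive_failures >= self.failure_threshold
        if threshold_reached and self.active == self.primary:
            # Only switch if the backup itself is healthy.
            if self.probes[self.backup]():
                self.switch_to(self.backup)
        return self.active

    def switch_to(self, name: str) -> None:
        """Manual switch, for operators or for switching back after repair."""
        if name not in self.probes:
            raise ValueError(f"unknown system: {name}")
        self.active = name
        self.consecutive_failures = 0


if __name__ == "__main__":
    # Simulated health state standing in for real probes.
    health = {"primary": True, "backup": True}
    controller = FailoverController(
        probes={
            "primary": lambda: health["primary"],
            "backup": lambda: health["backup"],
        },
        primary="primary",
        backup="backup",
    )
    health["primary"] = False  # simulate a primary failure
    for _ in range(3):
        print("active system:", controller.tick())
```

Requiring several consecutive failures before switching is a deliberate design choice: it avoids flapping between systems on a single lost probe, at the cost of a slightly longer detection time.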
Introducing such backup technologies reduces the money spent on equipment. However, too much redundant technology should not be introduced, because it makes the whole system inefficient. An overly complicated design is also hard to operate and maintain, and problems become very difficult to troubleshoot. If the main system fails and the business cannot be switched to the backup system, it may be impossible to locate the fault and restore the business, leading to an even longer interruption, so the loss outweighs the gain. It is therefore necessary to deploy some redundant backup technologies, but they should not be too complicated. Sophisticated high-end technologies are unnecessary; what is needed is something simple and effective that switches automatically between the primary and backup systems. In today's data centers, new technologies such as cloud computing and software-defined infrastructure are prevalent, and they already add greatly to the complexity of the system. If too many backup technologies are layered on top, the complexity rises steeply, which harms the stability of the data center's business. Of course, we cannot stop doing backups just because it is difficult. A data center with no business backup at all cannot carry any important business, especially Internet business; such a data center simply has to be rejected. If the data center has frequent business interruptions, customers' businesses suffer losses. Now that information spreads so quickly, the negative publicity travels fast, and the data center will soon lose many customers and eventually be unable to keep operating. Today's data centers must run 24 hours a day without a moment's pause, and no interruption is acceptable. Alibaba Cloud and Tencent Cloud, for example, have to sign agreements with their customers.
Once the business is interrupted by a fault, the provider must pay corresponding compensation. If interruptions keep happening, the compensation becomes unaffordable. We must therefore take the redundant backup of data center services seriously and consider redundancy at every level, including equipment, network, business, and system, so that failures can be handled calmly when they occur and without users noticing. No one can predict what kind of failure will occur, or when. Perhaps the main data center will never suffer a major failure after a backup data center is built; but it is equally possible that a fatal, unrecoverable failure will strike when no backup data center exists. Who dares to take that bet? It is better simply to build the redundant backup system. It costs more money and manpower, but it is worth it.