Enterprises that rely on high-availability applications should adopt active-active data center designs to ensure reliability and resiliency.

Any enterprise running high-availability applications must answer a fundamental question: how do you build a resilient application architecture when the underlying communications infrastructure is not reliable? Consider one consulting engagement as an example. The customer's main business application has high availability requirements. Each client endpoint sends transactions to the application server in the primary data center and buffers every transaction until it receives confirmation. The customer runs two data centers, configured as the primary and the backup respectively.
In terms of reliability, the customer experienced network-related outages several times a year. In addition, failover from the primary data center to the backup data center was a manual process that took hours to execute, so in practice the network issue had to be resolved before the failover could complete. It was clear that the customer needed a more reliable failover mechanism that would keep its high-availability applications accessible.

Another option is to make the network and data centers highly reliable, so that data center downtime becomes very rare. However, highly reliable infrastructure architectures are often fragile: small changes can cause downtime and outages that are difficult to diagnose and correct.

Resilient Application Architecture

To avoid building a fragile system, a better way to achieve resilient applications is to deploy an active-active data center architecture that does not rely on a single path or function. The term active-active refers to operating at least two data centers, each of which can serve the application at any time, so every data center is an active application site. Customers can execute transactions in any of the data centers, and designing and operating each one is much simpler than building a single super-reliable data center.

Note that resilience should be built into the application, not into the network and IT infrastructure. This means that even if part of the network or a server fails unexpectedly, the application remains accessible. At the heart of this approach is reliable data exchange, which a high-availability application architecture must include. Implicit in this architecture is that the databases in each data center need to update each other as they execute client transactions.
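The mutual database updates described above can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions: `SiteDatabase`, its `execute` and `replicate` methods, and the idea of deduplicating by a client-supplied transaction ID are all hypothetical names and choices made for illustration, not the customer's actual design. It also ignores conflict resolution, ordering, and durability, which a real synchronization mechanism would have to address.

```python
from typing import Dict, List, Tuple


class SiteDatabase:
    """Toy in-memory stand-in for the database at one data center (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.applied: Dict[str, str] = {}         # transaction id -> payload
        self.peers: List["SiteDatabase"] = []     # databases at the other sites
        self.outbox: List[Tuple[str, str]] = []   # updates not yet sent to peers

    def execute(self, txn_id: str, payload: str) -> str:
        """Apply a client transaction locally, queue it for peers, acknowledge it."""
        if txn_id not in self.applied:            # a retransmitted id is not applied twice
            self.applied[txn_id] = payload
            self.outbox.append((txn_id, payload))
        return f"ack:{txn_id}"

    def replicate(self) -> None:
        """Push queued updates to every peer; peers keep ids they already hold."""
        for txn_id, payload in self.outbox:
            for peer in self.peers:
                peer.applied.setdefault(txn_id, payload)
        self.outbox.clear()


# Two active sites that update each other.
east, west = SiteDatabase("east"), SiteDatabase("west")
east.peers.append(west)
west.peers.append(east)

ack = east.execute("t1", "debit 10")   # client transaction handled by one site
west.execute("t2", "credit 5")         # a different transaction handled by the other
west.execute("t1", "debit 10")         # client retransmits t1 to the other site

east.replicate()
west.replicate()
print(east.applied == west.applied)    # both sites now hold the same transactions
```

Keying every update on a transaction ID is the design choice that matters here: it lets either site accept a retransmitted transaction without the update being recorded twice once the sites synchronize.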
The characteristics of the customer's application are well suited to an active-active architecture, in which either data center can execute a full transaction. A customer transaction is sent to the data center application, which updates the central database and then sends a confirmation to the customer endpoint. This confirmation mechanism guarantees delivery of the transaction. Because the high-availability application was developed in-house, any required modifications can also be made in-house.

TCP for Data Transfer?

TCP is a transport protocol designed to provide reliable data transmission. Although TCP retransmits dropped packets, it cannot guarantee delivery when one of the endpoints fails. A TCP session is established between the interfaces of two endpoints; if one of the endpoints (the server or its interface) fails, the session terminates.

Lessons from Unicorns

The IT systems of unicorn companies such as Facebook, Google, Microsoft, Netflix, and Amazon are designed to keep customers connected to their data centers. If part of a data center fails, transactions that would have used the failed component are automatically redistributed to other parts of the IT infrastructure. These industry giants cannot prevent parts of their infrastructure from failing, so they build resilience into the applications themselves.

Resilient Architectures for Other Companies

If your organization is not a unicorn, what can you do? You can learn from the unicorns and modify your IT systems to operate in a similar manner. This works best for high-availability applications built in-house. For example, a client can combine a transaction retransmission timer with a circular list of data center addresses learned through the Domain Name System (DNS), the same mechanism that global server load balancing builds on. The client buffers each transaction until it receives an acknowledgment from a reachable data center.
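The client behavior just described (buffer, retransmit on a timer, rotate through a circular list of data centers) can be sketched roughly as follows. This is an illustration under assumptions, not the customer's implementation: `ResilientClient`, the `send` callback signature, the host names, and the timeout and attempt limits are all hypothetical. In a real client the address list might come from a DNS lookup such as Python's `socket.getaddrinfo`, and `send` would perform an actual network exchange that returns once an acknowledgment arrives or the retransmission timer expires.

```python
import itertools
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple


class ResilientClient:
    """Buffers transactions and retries them across data centers until acknowledged."""

    def __init__(self, addresses: Iterable[str],
                 send: Callable[[str, str, float], bool],
                 retransmit_timeout: float = 2.0,
                 max_attempts: int = 6) -> None:
        self._ring = itertools.cycle(list(addresses))  # circular list of data centers
        self._current = next(self._ring)               # assumes at least one address
        self._send = send                # hypothetical transport: True means acknowledged
        self._timeout = retransmit_timeout
        self._max_attempts = max_attempts
        self.pending: Deque[str] = deque()             # transactions awaiting an ack

    def submit(self, transaction: str) -> None:
        """Buffer a transaction; it stays buffered until a data center acknowledges it."""
        self.pending.append(transaction)

    def flush(self) -> List[Tuple[str, str]]:
        """Deliver buffered transactions in order; return (transaction, address) pairs."""
        delivered = []
        while self.pending:
            transaction = self.pending[0]
            for _ in range(self._max_attempts):
                try:
                    acked = self._send(self._current, transaction, self._timeout)
                except OSError:                        # e.g. connection refused or reset
                    acked = False
                if acked:
                    delivered.append((transaction, self._current))
                    self.pending.popleft()
                    break
                self._current = next(self._ring)       # no ack: try the next data center
            else:
                break  # all attempts failed; keep the buffer for a later flush
        return delivered


def fake_send(address: str, transaction: str, timeout: float) -> bool:
    # Stand-in for a real network call: pretend the primary site is unreachable.
    return address != "dc-primary.example.net"


client = ResilientClient(["dc-primary.example.net", "dc-backup.example.net"], fake_send)
client.submit("txn-001")
client.submit("txn-002")
result = client.flush()
print(result)
```

Because the client stays on the last data center that acknowledged a transaction, later transactions avoid the failed site without waiting on it again. Retransmission can deliver the same transaction more than once, so the server side should treat repeated transactions as duplicates.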
Database synchronization distributes updates to the other instances, so any database can process these transactions. This architecture allows organizations to deploy multiple application database systems, and it can even be extended to database instances in cloud computing infrastructures such as Amazon Web Services and Microsoft Azure.

Adopting this approach for third-party applications, such as electronic health record applications, is more challenging. Software vendors can be asked to design resilient systems that operate across active-active data centers. If the enterprise examines the client side of the application carefully, it may find an opportunity to add a small software module that monitors the data center connection and, if the connection fails, automatically switches the application to another data center.

Another option is to consider technologies such as software-defined WAN (SD-WAN), which increases path diversity by using multiple links from different providers. This approach also works for third-party applications. With the widespread adoption of cloud computing, it is also tempting to design systems that use one on-premises data center and one cloud-based data center.

Lessons Learned from High-Availability Applications

There are existing examples of how to make IT systems and applications highly available. While it may take some innovation to improve applications that an organization does not control, the good news is that many technologies can help organizations improve the resiliency of their applications.