Traditional database upgrade practices under the cloud-native evolution trend

I. Overview of Cloud-Native Databases

(I) Cloud computing is the infrastructure of digitalization

As we all know, cloud computing has become the infrastructure of digitalization, and society as a whole is going digital. Digitalization now reaches every part of daily life: food, clothing, housing, transportation, education, medical care, games, and more.

Take the medical field as an example. In earlier years, whether you went to the hospital for a blood test or a chest X-ray, you received a paper report and a printed X-ray film. In the past year or two, beyond the top-tier hospitals, most other hospitals have also begun providing patients with reports, chest X-rays, and similar materials online. Digitalization in the medical field is very visible.

Once all this data is digitized, a big question arises: which platforms should carry it, and how? Alibaba Cloud plays an important role here. The database carries the entire data lifecycle in the digitization process: production, integration, real-time processing, and analysis. Around the database sit hardware, security, elastic computing, and other capabilities. Together, these pieces large and small make up the Alibaba Cloud platform.

(II) What is cloud-native database technology?

Cloud computing is reshaping database technology and business.

In the context of digitalization, we have a lot to think about.

How are databases different from before? What is a so-called cloud-native database? As a developer using a database, how have your requirements for databases changed? What kind of demands do we generally make when using a database today?

Today, upper-layer business requirements change very quickly, a problem that Alibaba's Taobao also faced in the past. This puts developers under heavy pressure to adapt just as quickly. Before cloud computing became widespread, the process was slow: building the servers, setting up the network, installing the operating system and the database, and so on. The whole process took a long time.

The demands for the database can be summarized as follows.

The first is focus. We want to concentrate on business development rather than spend time configuring the underlying hardware, software, data centers, networks, and other facilities.

The second is out-of-the-box use. We hope the database can be used as soon as it is created, without tedious, time-consuming, and highly specialized work such as configuration and tuning.

The third is security and reliability. When placing data on a third-party platform, security and reliability are very basic requirements.

The fourth is openness and compatibility. We do not want to be locked in by any particular cloud vendor and hope to be able to migrate in and out freely.

The fifth is massive scalability. When business grows explosively, system load can quickly reach several times, or even dozens of times, its original level. Without a database system that scales well both horizontally and vertically, it is very hard to keep the business running normally.

The sixth is globalization. Many Chinese game companies have expanded overseas very successfully, especially in Southeast Asia, and some games have also done very well in Europe, the United States, and Japan. Some developers therefore now need global reach, and as providers of database infrastructure we should think about how to offer global capabilities.

The seventh is continuous availability. When we originally developed our own database system, continuous availability was one of the core considerations.

In addition, there is data reliability: data loss must not occur.

Finally, there is low cost. When the business develops to a more mature stage, we will focus on low cost.

In response to these customer demands, we have been thinking about what features the next-generation database should have, that is, the product capabilities of a cloud-native database. They are as follows.

The first is a fully managed service. Users no longer need to worry about installation, deployment, backup, monitoring, high availability, and so on. They can create an instance with one click, and the instance comes with all of these capabilities.

The second is pay-as-you-go pricing, which keeps the cost of starting a business very low. Otherwise, provisioning a full set of facilities such as a data center, hardware, and network would be very expensive.

The third is on-demand elasticity, which has two sides. One side is scaling up: when the business grows rapidly, the database must be able to grow quickly with it. The other side is scaling down: once the business peak has passed, resource usage needs to shrink quickly to reduce cost.

The fourth is ecosystem compatibility. Whether the user currently runs MySQL, Oracle, or another database, data can be migrated in or out.

These are the product capabilities that we believe a cloud-native database should have.

Beneath these product capabilities, there are still many technologies providing support.

The six core technologies are intelligence, multi-model support, software-hardware integration, security and trustworthiness, HTAP (integration of big data and database), and cloud native plus distributed. These six core technologies support the product capabilities above and meet the demands of developers.

(III) Cloud-native relational database PolarDB

PolarDB is a new-generation cloud-native database developed by Alibaba. Built on an architecture that separates storage from compute, it combines software and hardware to provide database services with extreme elasticity, high performance, massive storage, security, and reliability. It is 100% compatible with MySQL 5.6/5.7/8.0 and PostgreSQL 11, and highly compatible with Oracle.

PolarDB-X is the distributed version of PolarDB. It integrates a distributed SQL engine with the self-developed distributed storage X-DB, and focuses on massive data storage, ultra-high concurrent throughput, and complex computation and analysis.

(IV) Product architecture of the cloud-native relational database PolarDB

PolarDB product architecture diagram

PolarDB has the following features:

Storage and computing separation
1) Flexible upgrade and downgrade within minutes
2) Add/delete read-only nodes within minutes

Intelligent proxy forwarding
1) Transparent database expansion
2) Multiple consistency levels
3) Custom endpoints

Distributed storage
1) Supports 100 TB
2) Quick backup and recovery
3) Higher single-instance I/O capability

libpfs + RDMA + Optane
1) High-performance, transparent implementation of three copies with RPO=0
2) High-performance writing: high-concurrency writes

Redo-based replication
1) Millisecond-level latency for read-only instances
2) Solves the binlog/redo dual-log consistency and performance issues

Parallel execution
1) Query and analysis in some scenarios
2) Freely controllable parallelism to ensure performance and stability

Here we focus on one feature that is closely tied to how developers use the database: intelligent proxy forwarding.

Databases have a difficulty that application servers do not. When an application server tier is under particularly high pressure, it is relatively easy to scale out: add a group of application servers and direct part of the traffic to them.

Databases usually cannot do this, because data is interrelated when it is queried and used, and it cannot simply be split. PolarDB places an intelligent proxy layer, called Proxy, in front of the database to solve this problem for developers. When the database is under heavy load, the proxy can automatically distribute some queries to read-only nodes. For example, a deployment with one primary and one standby can be changed to one primary and three standbys, and the traffic is automatically spread across the three nodes.

You may be thinking, isn’t this the same as adding a few backup databases to the original database?

The critical problem PolarDB solves with the intelligent proxy is this: after read-only nodes are added, the connection configuration on the application servers does not need to change at all. Nodes can be added at any time, and the proxy automatically forwards queries as it receives them.

Take a real business scenario. One day the front-end business team tells us there will be a promotion at 10 a.m. tomorrow and asks us to expand the database.

In the past, adding a read-only node could run into two problems: either the front-end application servers could not reach the new node at all, or they could reach it only after a configuration change that might require restarting the application servers. PolarDB's intelligent proxy solves this problem, so capacity can be expanded quickly and easily.
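The routing idea described above can be sketched in a few lines of Python. This is a toy model, not PolarDB's proxy: the class name ProxySketch, the node names, and the rule that any statement starting with SELECT counts as a read are all assumptions made for illustration. A real proxy also has to deal with transactions, consistency levels, and node health.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ProxySketch:
    """Toy router: writes go to the primary, reads rotate across read-only nodes."""
    primary: str
    read_only: list = field(default_factory=list)

    def __post_init__(self):
        # Monotonic counter used for simple round-robin selection.
        self._counter = itertools.count()

    def add_read_only(self, node: str) -> None:
        # New nodes become routable at once; clients keep using the same proxy.
        self.read_only.append(node)

    def route(self, sql: str) -> str:
        # Assumption: statements that start with SELECT are reads.
        is_read = sql.lstrip().lower().startswith("select")
        if not is_read or not self.read_only:
            return self.primary
        return self.read_only[next(self._counter) % len(self.read_only)]


proxy = ProxySketch(primary="primary-0", read_only=["ro-0"])
before = [proxy.route("SELECT 1") for _ in range(2)]

# Scale out before the promotion; the application side changes nothing.
proxy.add_read_only("ro-1")
proxy.add_read_only("ro-2")
after = {proxy.route("SELECT 1") for _ in range(6)}
write_target = proxy.route("UPDATE orders SET status = 'paid' WHERE id = 1")
```

The application only ever talks to the proxy object, which is the point of the design: adding nodes changes where reads land without touching the caller.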

II. Migrating traditional relational databases to cloud-native environments

(I) Challenges of replacing traditional commercial databases

Migrating from a commercial database such as Oracle to PolarDB generally involves several major challenges.

The first challenge is tight application coupling. The database and the application are usually highly coupled. Any operation on the database requires the front-end application to cooperate, which may affect front-end availability. Because the business carried by the database is usually critical, moving the database often means moving the front-end application too.

The second challenge is the high stability requirement. If the database has a problem, the front-end business has a problem too, which is why database changes are often performed at night.

The third challenge is data volume. Because the business is relatively large, the core database usually holds a large amount of data.

The fourth challenge is the high requirement for syntax compatibility. Although everyone uses SQL, the SQL dialects of different databases still differ. If a migration from Oracle to PolarDB requires too many SQL changes, the rework of the front-end business systems becomes large and complicated.

(II) Use the cloud-native database PolarDB to replace traditional commercial databases

Migration follows a systematic process that has been standardized and turned into products.

Migration Flowchart

On Alibaba Cloud, we provide a set of standardized processes and products to help users migrate from their original database to PolarDB.

First, we provide a tool or script that users run in their own system. It collects characteristics of the user's database, including which SQL statements, functions, and stored procedures are not written in a way the target database accepts, and workload traits of the original database, such as whether it is under particularly high system pressure or has particularly pronounced hot data. Based on these findings, we tell users what to watch for in the subsequent transformation.

The table above was generated by the script during a real business migration.

The table shows that overall compatibility for migrating the original database to PolarDB is fairly high. A total of 6029 objects were detected, which may include stored procedures, tables, indexes, sequences, synonyms, and other related objects. Only two objects are incompatible, which is very few. The report indicates which two tables are incompatible and gives specific modification suggestions, after which the migration can proceed.
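To put the figures quoted above into proportion, the compatibility rate can be worked out directly. The helper name compatibility_rate is hypothetical and is not part of any Alibaba Cloud tool; only the two counts (6029 objects, 2 incompatible) come from the text.

```python
def compatibility_rate(total_objects: int, incompatible_objects: int) -> float:
    """Share of detected objects that need no change before migration."""
    if total_objects <= 0:
        raise ValueError("total_objects must be positive")
    if not 0 <= incompatible_objects <= total_objects:
        raise ValueError("incompatible_objects must be between 0 and total_objects")
    return (total_objects - incompatible_objects) / total_objects


# Counts quoted in the assessment report discussed above.
rate = compatibility_rate(total_objects=6029, incompatible_objects=2)
summary = f"{rate:.2%}"  # formatted as a percentage with two decimals
```

With these inputs the rate comes out at roughly 99.97%, which is why the report can be read as "migrate after fixing two objects".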

The following figure shows a more specific process, which will not be elaborated in detail here.

Currently, Alibaba Cloud has worked with the China Academy of Information and Communications Technology to develop this set of standardized and productized processes into a standard guide for database migration. Developers can check it online and follow the guide to perform database migration.

III. Managing the PolarDB O Engine (Compatible with Oracle Syntax)

(I) PolarDB provides full-stack compatibility with Oracle

The Oracle compatibility provided by PolarDB spans multiple layers: the syntax layer, the physical storage layer, the logic layer, and the interface layer.

(II) Managing the PolarDB O engine (Oracle syntax compatible): Common tools

If users migrate from Oracle, what are the differences when using or managing PolarDB?

For management tools, the first option is DMS, Alibaba Cloud's cloud-based data management platform. Find the entry called "Login to Database" on the console to log in to DMS, as shown below.

The second option is pgAdmin, an open source data management platform. It supports basic data management operations, including viewing basic information, querying data, and inspecting execution plans, tables, and other objects, as shown below.

IV. Development Practice of PolarDB O Engine (Oracle Syntax Compatible): Basic Database Specifications

Managing the PolarDB O engine (Oracle-compatible): Development specifications (1)

In addition, Alibaba Cloud has a set of commonly used development specifications. They were developed internally, are strictly followed within Alibaba, and will be published in the developer community and in Alibaba Cloud's documentation in the future. The specifications cover several areas, some of which are closely related to how developers use PolarDB. The following is a brief explanation.

Some of the specifications are our internal mandatory requirements, while others are recommended. Users can make choices based on their actual situation.

The above is the table creation specification. For example, field names must consist of lowercase letters and digits and must not be keywords. Why? Because renaming a field is costly and usually cannot be "pre-released".

In production, changing a field name is very troublesome. The existing business is already running against the schema, so renaming a field means the business system can no longer run normally. In practice, the usual approach is to add a new field instead. That is why the specification constrains field names up front, for example by allowing only lowercase letters and forbidding keywords.
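A naming rule like the one above is easy to check automatically before a table is created. The following is a minimal sketch, not Alibaba's actual checker: the function name check_field_name is hypothetical, the keyword set is a small illustrative subset rather than a real engine's reserved-word list, and allowing underscores is an assumption inferred from names such as create_time.

```python
import re

# Illustrative subset only; a real check would load the target engine's
# full reserved-word list.
RESERVED_WORDS = {"select", "from", "where", "table", "order", "group", "user", "index"}

# Assumption: lowercase letters, digits, and underscores, starting with a letter.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def check_field_name(name: str) -> list:
    """Return a list of rule violations for a proposed field name (empty if OK)."""
    problems = []
    if not _NAME_PATTERN.fullmatch(name):
        problems.append("use only lowercase letters, digits, and underscores, starting with a letter")
    if name.lower() in RESERVED_WORDS:
        problems.append("name is a reserved keyword")
    return problems


good = check_field_name("create_time")
mixed_case = check_field_name("OrderId")
keyword = check_field_name("order")
```

Running a check like this in code review or CI is cheap compared with renaming a field that production code already depends on.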

The second concerns required fields: every table must include create_time and update_time. This brings several benefits. First, if data turns out to be wrong, you can quickly see its modification status and when it was changed. Second, upstream and downstream systems that need to pull changed data can quickly find which rows have changed and process them accordingly.

In addition, every table must have a primary key, for several reasons. First, query performance is very good. Second, when a downstream system pulls changed data, it can fetch it relatively quickly through the primary key.
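The benefit for downstream systems can be illustrated with a small incremental pull. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, so it shows the pattern rather than PolarDB behavior; the orders table, its rows, and the changed_since helper are all made up for the example, and timestamps are stored as ISO-format text so that string comparison matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,   -- every table has a primary key
        status      TEXT NOT NULL,
        create_time TEXT NOT NULL,         -- when the row was created
        update_time TEXT NOT NULL          -- when the row was last changed
    )
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "paid",    "2024-01-01 10:00:00", "2024-01-01 10:00:00"),
    (2, "shipped", "2024-01-01 11:00:00", "2024-01-02 09:30:00"),
    (3, "paid",    "2024-01-02 08:00:00", "2024-01-02 08:00:00"),
])


def changed_since(connection, checkpoint):
    """Return rows modified after the checkpoint, oldest change first."""
    cursor = connection.execute(
        "SELECT id, status, update_time FROM orders "
        "WHERE update_time > ? ORDER BY update_time, id",
        (checkpoint,),
    )
    return cursor.fetchall()


# A downstream job remembers its last checkpoint and pulls only newer changes.
changed = changed_since(conn, "2024-01-01 23:59:59")
changed_ids = [row[0] for row in changed]
```

The downstream job stores the largest update_time it has seen as its next checkpoint, and uses the primary key to upsert each pulled row.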

There are also a series of index conventions, as shown in the figure above.

The specification states that the columns of an index must be created in a deliberate order. Pay attention to the fields in the WHERE condition and to the order of the fields in the ORDER BY clause, because both affect the order of the fields in the index. Overall performance is good only when the index order matches the query reasonably well.

In addition, use a covering index query whenever you can, which greatly improves efficiency.
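Both index conventions can be observed in a query plan. The sketch below again uses sqlite3 as a stand-in, so the plan text is SQLite's and will look different on PolarDB; the table, index name, and plan helper are assumptions for illustration. The index puts the WHERE column first and the ORDER BY column second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, create_time TEXT, amount REAL, note TEXT)"
)
# Column order follows the query: equality filter first, sort column second.
conn.execute("CREATE INDEX idx_orders_user_time ON orders (user_id, create_time)")


def plan(sql):
    """Return SQLite's query plan details for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The index serves both the filter and the sort, so no separate sort step is needed.
ordered_plan = plan("SELECT * FROM orders WHERE user_id = 42 ORDER BY create_time")

# Only indexed columns are selected, so the table itself never has to be read.
covering_plan = plan(
    "SELECT user_id, create_time FROM orders WHERE user_id = 42 ORDER BY create_time"
)
```

In SQLite a separate sort would show up in the plan as a temporary B-tree, and a covering lookup is reported explicitly; other engines expose the same two facts under different names.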

The specification also recommends using a deferred join or a subquery to optimize deep pagination. This comes from our experience with database index optimization. When a paging query reaches, say, page 500 or page 1000, the recommended approach is to first look up only the primary key IDs of the rows on the requested page, and then go back to the table once to fetch the full rows for those IDs. This is a commonly recommended approach.
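The two query shapes can be compared side by side. This is a sketch on sqlite3 with a made-up articles table, so it demonstrates that the rewrite returns the same page, not how much faster it is on PolarDB. The inner query touches only the index and the primary key; the wide columns are read once, for the final rows only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, create_time INTEGER, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?)",
    [(i, 1000 + i, f"title-{i}", "x" * 200) for i in range(1, 5001)],
)
conn.execute("CREATE INDEX idx_articles_time ON articles (create_time)")

PAGE_SIZE = 10
OFFSET = 4000  # a deep page

# Naive form: the engine walks past OFFSET full rows before returning a page.
naive = conn.execute(
    "SELECT id, title, body FROM articles ORDER BY create_time LIMIT ? OFFSET ?",
    (PAGE_SIZE, OFFSET),
).fetchall()

# Deferred join: find the page's primary keys first, then fetch full rows once.
deferred = conn.execute(
    """
    SELECT a.id, a.title, a.body
    FROM articles AS a
    JOIN (SELECT id FROM articles ORDER BY create_time LIMIT ? OFFSET ?) AS page
      ON page.id = a.id
    ORDER BY a.create_time
    """,
    (PAGE_SIZE, OFFSET),
).fetchall()
```

Both statements return the same ten rows; the difference is how much wide-row data the engine has to read and discard on the way to the requested page.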

The index specification also says to pay attention to field types and to avoid implicit conversions as far as possible, because an implicit conversion will keep the index from being used.
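One way to keep implicit conversions out of queries is to check bound parameters against the declared column types before the statement is sent. The sketch below is an application-side guard under stated assumptions: COLUMN_TYPES is a hypothetical hand-written schema map (order_no as a character column, user_id as an integer column), and mismatched_params is not part of any driver or PolarDB API.

```python
# Hypothetical schema map: the Python type that matches each column's declared type.
COLUMN_TYPES = {
    "order_no": str,   # assumed to be a VARCHAR column
    "user_id": int,    # assumed to be a BIGINT column
}


def mismatched_params(params: dict) -> list:
    """Return the names of parameters whose Python type differs from the column type."""
    return sorted(
        name
        for name, value in params.items()
        if name in COLUMN_TYPES and not isinstance(value, COLUMN_TYPES[name])
    )


# Passing a number for a character column would force the database to convert
# one side of the comparison, which is what the specification warns against.
risky = mismatched_params({"order_no": 20240001, "user_id": 7})
safe = mismatched_params({"order_no": "20240001", "user_id": 7})
```

The fix on the application side is simply to bind the value with the column's own type, for example the string "20240001" rather than the integer.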

Managing the PolarDB O engine (Oracle-compatible): Development Specifications (2)

There are also many specifications for SQL and for operation and maintenance. Here we cover a few points on operation and maintenance.

The first is data correction. Before modifying or deleting data, developers must first query the affected rows and review them, and only then make the change. Otherwise it is easy to change or delete data by mistake.

We also recommend the data management product DMS. One benefit of making data corrections in DMS is the backup option: when it is checked, DMS automatically backs up all the data to be corrected. If the correction turns out to be wrong, you can find the data that DMS backed up and restore it.
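The query, back up, then change sequence can also be done by hand. The following sketch uses sqlite3 with a made-up accounts table; it mirrors the workflow described above and is not how DMS implements its backup option. The table names, the CONDITION predicate, and the restore helper are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [(1, "active"), (2, "frozen"), (3, "frozen")],
)
conn.commit()

# A fixed, reviewed predicate; never build this from user input.
CONDITION = "status = 'frozen'"

# Step 1: query and review exactly which rows the correction will touch.
to_fix = conn.execute(
    f"SELECT id, status FROM accounts WHERE {CONDITION} ORDER BY id"
).fetchall()

# Step 2: copy those rows aside, then apply the correction.
with conn:
    conn.execute(f"CREATE TABLE accounts_backup AS SELECT * FROM accounts WHERE {CONDITION}")
    updated = conn.execute(
        f"UPDATE accounts SET status = 'active' WHERE {CONDITION}"
    ).rowcount


def restore(connection):
    """Undo the correction by copying the backed-up values into place."""
    with connection:
        connection.execute(
            "UPDATE accounts "
            "SET status = (SELECT b.status FROM accounts_backup AS b WHERE b.id = accounts.id) "
            "WHERE id IN (SELECT id FROM accounts_backup)"
        )
```

Comparing the number of updated rows with the number of reviewed rows is a cheap sanity check: if they differ, something changed between the review and the correction.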

The rest are not elaborated here and will be released in the developer community and Alibaba Cloud documentation system in the future.

V. Development Practice of PolarDB O Engine (Oracle Syntax Compatible): Common SQL Optimization

(I) Managing the PolarDB O engine (compatible with Oracle syntax): SQL optimization case 1: parallel query

For queries that involve complex calculations, parallel query can greatly speed up execution.

The above is a simple example with a very simple calculation in a GROUP BY. When the query has a lot of data to scan, enabling parallel query reduces the time from more than 100 seconds to 10 seconds, about 10 times faster. This is a handy trick when using PolarDB.
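The general shape of a parallel GROUP BY, splitting the scan, aggregating each part, and merging the partial results, can be sketched in Python. This is a conceptual illustration, not PolarDB's executor: the function names and sample rows are invented, and because CPython threads share one interpreter lock, the sketch shows the split-and-merge structure rather than a real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """Aggregate one chunk: SUM(value) GROUP BY key."""
    totals = Counter()
    for key, value in rows:
        totals[key] += value
    return totals


def parallel_group_sum(rows, workers=4):
    """Split rows into chunks, aggregate each chunk, then merge the partial sums."""
    chunk = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds values per key
    return dict(merged)


sample_rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
parallel_result = parallel_group_sum(sample_rows, workers=2)
serial_result = dict(partial_sum(sample_rows))
```

The merge step is what makes the result independent of how the rows were split, which is why the degree of parallelism can be tuned freely.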

(II) Managing the PolarDB O engine (compatible with Oracle syntax): SQL optimization case 2: Choosing the appropriate JOIN method

We support hash join, merge join, and nested-loop join. Users can choose the appropriate join method for each scenario.

As you can see, in the case above, nested-loop join is the fastest choice.
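To make the three join methods concrete, here are textbook versions of each in Python, operating on lists of (key, value) pairs. They are simplified teaching sketches, not PolarDB's implementations, and the sample users and orders data is invented. All three produce the same rows; they differ in how much work they do for a given input size and ordering, which is why the best choice depends on the scenario.

```python
from collections import defaultdict


def nested_loop_join(left, right):
    """Compare every left row with every right row; cheap when one side is tiny."""
    return [
        (l_key, l_val, r_val)
        for l_key, l_val in left
        for r_key, r_val in right
        if l_key == r_key
    ]


def hash_join(left, right):
    """Build a hash table on one side, then probe it with the other."""
    table = defaultdict(list)
    for r_key, r_val in right:
        table[r_key].append(r_val)
    return [(l_key, l_val, r_val) for l_key, l_val in left for r_val in table.get(l_key, [])]


def merge_join(left, right):
    """Sort both sides by key, then advance two cursors in step."""
    left, right = sorted(left), sorted(right)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Pair each left row for this key with the whole matching right run.
            while i < len(left) and left[i][0] == key:
                out.extend((key, left[i][1], right[k][1]) for k in range(j, j_end))
                i += 1
            j = j_end
    return out


users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, "o-100"), (1, "o-101"), (3, "o-102")]
expected = [(1, "ann", "o-100"), (1, "ann", "o-101"), (3, "cy", "o-102")]
```

With inputs this small the nested loop does the least setup work, which matches the intuition that it tends to win when one side of the join is very small.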

VI. Cases and Recognition

(I) Complete database ecosystem

Although PolarDB is a separate product, it sits within a very complete product ecosystem, including Data Management (DMS), Database Autonomy Service (DAS), Data Transmission Service (DTS), Database Backup (DBS), and Advanced Database and Application Migration (ADAM). Together they cover a wide range of user scenarios and provide comprehensive services.

(II) Case: PolarDB helps PrestoMall smoothly migrate from Oracle to the cloud

PrestoMall is a Southeast Asian e-commerce company founded in 2014. To cope with its rapid business growth, PrestoMall used Alibaba Cloud's PolarDB database to migrate smoothly from Oracle to the cloud.

Before the migration, the customer faced the following business challenges:

As the business grew rapidly, IT costs rose as well, and Oracle costs were high;
The fast-growing business could not cope with the Double 11 promotion: the application could scale horizontally, but the database lacked elasticity;
Moving off Oracle was too complex, and the customer lacked experience and wanted professional evaluation and guidance;
Keeping migration cost optimal while controlling risk was a difficult problem.

Based on the customer's business needs, we developed a plan to migrate to the PolarDB O engine (compatible with Oracle syntax) for the following reasons:

As a cloud database, the PolarDB O engine (compatible with Oracle syntax) has no expensive license fee;
The PolarDB O engine (compatible with Oracle syntax) has cloud-native elasticity, which solves the customer database's lack of elasticity;
ADAM provides professional database and application compatibility assessment reports and formulates a comprehensive migration plan. Combined with the high Oracle compatibility of the PolarDB O engine (compatible with Oracle syntax), this greatly improves transformation efficiency;
The real-time migration and reflow functions of DTS, combined with expert services, significantly shorten the cutover time and reduce risk.

After migrating to the PolarDB O engine (compatible with Oracle syntax), the customer ultimately achieved the following value:

The PolarDB O engine (compatible with Oracle syntax) successfully supports the customer's business while reducing the company's overall IT costs by 40%;
The PolarDB O engine (compatible with Oracle syntax) was flexibly upgraded to cope with the Double 12 promotion;
ADAM plus the PolarDB O engine (compatible with Oracle syntax) helped the customer reduce code transformation costs by 93%;
The cutover was completed smoothly and steadily as planned, and the business is running stably.

(III) PolarDB, a widely recognized cloud-native relational database

PolarDB is widely recognized in the industry, with more than 10 papers published at top academic conferences. This year it won the first prize for scientific and technological progress from the Chinese Institute of Electronics, along with other authoritative honors.

