As digitalization deepens, more and more companies recognize the value of data applications. Data-driven decision analysis is one of the most important scenarios for realizing that value. Companies of all industries and sizes rely on BI products to build reports, dashboards, and data portals for decision analysis. When BI products are used for data analysis, slow data processing causes a lot of trouble for the business. Imagine this: the report I showed my boss loaded very slowly and sometimes crashed. I wanted to make a good impression, but it gave him a bad experience instead.

Quick BI: Cloud BI on Alibaba Cloud FeiTian OS

Quick BI is a cloud BI product built on Alibaba Cloud's FeiTian operating system. It supports both SaaS mode and private deployment, and is positioned as a consumer-oriented BI for multiple scenarios, multiple terminals, and multiple industries. This article introduces the product's core, the Quick engine, in detail.

Quick BI is based on Alibaba Cloud's horizontally scalable architecture. In addition to traditional BI capabilities such as visual analysis, Chinese-style reporting, and self-service analysis, it provides an enterprise-level security foundation, mobile support, and open integration with third-party systems. Quick BI has built its own computing kernel, the Quick engine. The SaaS service hosted on Alibaba Cloud can complete aggregation analysis of billions of records within 0.5 seconds. In addition, because it runs on Alibaba Cloud, computing resources scale horizontally, so more data analysis and computing capacity can be provided by adding servers.
Quick Engine: Multi-mode BI calculation engine

As the computing foundation of Quick BI, the Quick engine is a multi-mode BI computing engine. It supports multiple computing modes, including direct database connection, extraction acceleration, real-time acceleration, query caching, and dimension value acceleration, so that different users get the efficient computing solution that best suits their scenario.

The figure above shows the Quick engine architecture. In terms of the Quick BI usage flow, the product is divided into three parts: data sources, data sets, and data works. A data source is the underlying database connection. A data set models the tables in the data source (table association, field type modeling, etc.), turning one or more tables into a data object that upper-level data works (dashboards, spreadsheets, ad hoc analysis) can use. The Quick engine sits between the data source and the data set. It processes the queries that upper-level data works send to the data set and that are ultimately passed on to the data source.

In terms of technical implementation, the Quick engine is divided into three paths: direct database connection, real-time database acceleration, and database extraction, each abstracted at the technical layer. From the user's perspective, it provides the following five computing modes:

(1) Direct connection mode: The computing load runs directly on the database or data warehouse connected to the BI product. It supports dozens of data sources and is available to users of all versions. It is well suited to scenarios where the underlying computing resources can handle the query load.
(2) Extraction acceleration: Data is extracted from the customer's database or data warehouse into the high-performance columnar storage engine of the Quick engine. It supports full and incremental modes.
The analysis and computing load runs directly in the Quick engine, making full use of its performance while reducing the burden on the customer's data warehouse. This feature is available to professional version customers and is well suited to enterprises that do not have an independent data warehouse or whose data warehouse is overloaded.
(3) Real-time acceleration: Based on the Alibaba Cloud DLA (Data Lake Analysis) in-memory computing engine. At query time, data is retrieved from the customer database in real time and the DLA in-memory engine accelerates the calculation in between. It is available to professional version customers and currently supports the Alibaba Cloud MaxCompute data warehouse, making it well suited to real-time analysis on MaxCompute. Support for more databases is being rolled out.
(4) Query cache: Available to users of all versions. The temporary query results of reports and dashboards are cached when they are accessed. Within the configured cache validity period, subsequent identical queries from other users retrieve the cached results directly, which shortens response time and avoids the resource consumption of repeated calculation. This is well suited to scenarios with a large number of repeated queries, such as visualization.
(5) Dimension value acceleration: Available to users of all versions, it is implemented on top of direct connection mode and dimension table configuration. With dimension value acceleration configured, high-frequency and time-consuming queries on dimension fields run directly on the database dimension table instead of on the original detail table. For example, dimension value queries from ad hoc analysis and query controls return results quickly and use fewer computing resources than they would without dimension value acceleration.
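To make the query cache mode (4) more concrete, the following is a minimal sketch of a result cache with a validity period. It is not Quick BI's implementation: the `QueryCache` class, its `fetch` method, and the injectable `clock` parameter are hypothetical names introduced only for illustration, and "the same query" is simplified here to identical SQL text.

```python
import time
from typing import Any, Callable, Dict, Tuple


class QueryCache:
    """Illustrative result cache with a validity period (not Quick BI's code)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, sql: str, run_on_database: Callable[[str], Any]) -> Any:
        """Return a cached result if still valid, otherwise query the database."""
        key = sql.strip()  # assumption: identical text means "the same query"
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            self.hits += 1
            return entry[1]  # cache hit: no load on the underlying database
        self.misses += 1
        result = run_on_database(sql)  # cache miss: fall through to the database
        self._entries[key] = (now, result)  # keep the result for later queries
        return result
```

With a 12-hour validity period, `QueryCache(ttl_seconds=12 * 3600)` would send the first dashboard query to the database and serve identical queries from memory until the entry expires. A real cache would also need to account for the data set, user permissions, and query parameters in the key, which this sketch leaves out.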
Quick Engine - User Guide

Before introducing how to use each mode, we first provide a scenario guide based on the characteristics of each mode to help users choose the most suitable one. The Quick engine uses different computing modes depending on how the data set is configured. The following suggestions apply, depending on the data set:

(1) The data set uses direct connection mode by default. If query performance is good, no additional configuration is required. If it cannot meet requirements, work through the following checks.
(2) The data set is mainly used in dashboards and reports, tends toward fixed data display, is not driven by many query controls, and does not have a high timeliness requirement. It is well suited to caching, which basically solves the problem (perhaps in more than 80% of cases).
(3) For other cases, the recommendation depends on the underlying database:
The underlying database is not OLAP, such as MySQL, and runs very slowly. Use extraction acceleration first, then optimize data modeling.
The underlying database is OLAP, such as ADB, and runs very slowly. First optimize data modeling (for example, check whether large tables are being joined), then use extraction acceleration to take over part of the load.
The underlying database is ODPS (MaxCompute) and runs very slowly. If the timeliness requirement is high, DLA real-time acceleration is recommended. If the timeliness requirement is not high, extraction acceleration is recommended.
(4) A dimension field of the data set is frequently used in query controls or ad hoc analysis. Configure dimension value acceleration for this field.

Quick Engine - Direct Connection Mode

Direct connection mode is the default mode for Quick engine queries. All queries are sent to the underlying database or data warehouse for execution.
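Because direct connection is the default and every other mode is opt-in, the selection guidance given earlier can be summarized as a small decision function. This is only a sketch of that guidance: `DatasetProfile`, its fields, and `recommend_modes` are hypothetical names, not part of Quick BI. It also assumes that the database-specific suggestions apply when caching is not a fit, which the guide does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetProfile:
    """Hypothetical description of a data set, used only for this sketch."""
    direct_query_is_fast: bool
    mostly_fixed_display: bool        # dashboards/reports with few query controls
    needs_fresh_data: bool            # high timeliness requirement
    database_kind: str                # "non_olap" (e.g. MySQL), "olap" (e.g. ADB), "maxcompute"
    dimension_fields_queried_often: bool


def recommend_modes(p: DatasetProfile) -> List[str]:
    """Return suggested actions in the order the guide recommends them."""
    if p.direct_query_is_fast:
        return ["direct connection"]  # default mode, nothing else to configure

    steps: List[str] = []
    if p.mostly_fixed_display and not p.needs_fresh_data:
        steps.append("query cache")
    elif p.database_kind == "non_olap":
        steps += ["extraction acceleration", "optimize data modeling"]
    elif p.database_kind == "olap":
        steps += ["optimize data modeling", "extraction acceleration"]
    elif p.database_kind == "maxcompute":
        steps.append("real-time acceleration" if p.needs_fresh_data
                     else "extraction acceleration")
    else:
        raise ValueError(f"unknown database kind: {p.database_kind!r}")

    if p.dimension_fields_queried_often:
        steps.append("dimension value acceleration")
    return steps
```

For example, a slow MaxCompute data set that needs near-real-time data and whose dimension fields feed many query controls would get real-time acceleration followed by dimension value acceleration.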
Quick BI direct connection mode supports dozens of cloud and self-built databases. On the dataset page, click "New Dataset" and select a configured data source. The left panel displays all tables in the data source. Drag one or more tables into the panel and configure the fields in the data preview area. When the configuration is complete, save the dataset so that it can be used for subsequent analysis. Once the dataset is saved, subsequent analysis queries use direct connection mode by default.

Quick Engine - Extraction Acceleration

When there are too many queries in direct connection mode or the data volume is too large, the underlying database becomes overloaded and queries slow down. The display and analysis of upper-level dashboards slow down as well, leading to the problem described at the beginning of this article. In this case, consider using Quick engine extraction acceleration.

Extraction acceleration is exclusive to the professional version. It currently covers three data sources: MySQL, ADB for MySQL, and MaxCompute. It supports full and incremental extraction of data into the high-performance columnar analytical database of the Quick engine. Queries on the extracted data are completed directly in the columnar analytical database without being sent to the customer database, which improves query performance and reduces the load on the customer database.

To enable it, open the dataset menu, select "Acceleration Configuration", click the first tab, "Quick Engine", turn the engine on, and select Extraction Acceleration.
In the performance test of Quick engine extraction acceleration, aggregations such as sum, count, avg, and median over 1 billion records all return within 0.5 seconds, which demonstrates sub-second analysis on billions of records. The following table shows the performance test results. In addition, because Quick BI is built on Alibaba Cloud's FeiTian foundation and scales horizontally, the data processing capacity of the Quick engine keeps growing as machines are added, so in theory it can scale without limit.

Quick Engine - Real-time acceleration

Sometimes performance problems occur in direct connection mode and the data timeliness requirement is high: daily updates are not enough, and data must be updated hourly or by the minute. Extraction acceleration is not suitable in this case because it updates data at daily granularity. Another option is real-time acceleration, which accelerates queries on highly time-sensitive data.

Like extraction acceleration, real-time acceleration is exclusive to the professional version. It currently supports MaxCompute data sources and is based on the Alibaba Cloud DLA (Data Lake Analysis) in-memory computing engine. At query time, data is loaded into DLA in real time for calculation, which improves query performance. With real-time acceleration, the offline data warehouse MaxCompute can serve as an online analytical data warehouse.

On the dataset acceleration configuration page, enable the Quick engine, switch to real-time acceleration, and save to turn on real-time acceleration for the dataset.

Quick Engine - Query Cache

The principle of query caching is that the temporary query results of application-side reports and dashboards are cached when they are accessed.
Within the configured cache validity period, subsequent identical queries from other users obtain the cached results directly. Queries that hit the cache return results immediately; queries that miss the cache are sent to the underlying database. When such a query returns, its result is also cached for subsequent use.

Result caching is a widely used and very effective way to accelerate data queries. It applies to all data sources and is available to users of all versions. Query caching can be configured for data sets that receive repeated queries within a certain period of time. It is especially effective in scenarios with a large number of repeated queries, such as dashboard displays, where it can greatly improve query performance.

On the acceleration configuration page, enable query result caching and choose a cache time, which is the validity period of the cache. If the data is not time-sensitive, 12 hours is recommended.

Quick Engine - Dimension Value Acceleration

In direct connection mode, querying dimension values such as product names, customer names, or city names is relatively time-consuming, because the query has to run a de-duplicating aggregation in the underlying database and scan the entire table. In some scenarios such queries occur very frequently, for example dimension value analysis in ad hoc analysis and dimension value lookups in query controls. In these scenarios, configuring dimension value acceleration improves query performance.

On the acceleration configuration page, enable dimension value acceleration. Suppose the data set is an order detail table, and the front-end dashboard page often needs to query transaction status by customer name and product name.
Therefore, configure dimension value acceleration for these two fields and map them to the corresponding fields of the user and product dimension tables in the underlying database. After that, dimension value queries read directly from these two dimension tables without aggregating over the detail table, which improves query speed.

The above is an introduction to the functions and usage scenarios of the Quick engine, the computing kernel of Quick BI. Relying on Alibaba Cloud's computing foundation, the Quick engine delivers sub-second analysis on billions of records, allowing upper-level analytical and visualization applications to truly take off in the big data era.
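The order detail example from the dimension value acceleration section can be sketched with an in-memory SQLite database standing in for the customer database. Everything here is an assumption made for illustration: the table names `order_detail`, `user_dim`, and `product_dim`, the column names, and the helper functions are hypothetical, and the generated SQL is not what Quick BI actually emits.

```python
import sqlite3
from typing import Dict, List, Tuple

DETAIL_TABLE = "order_detail"  # hypothetical detail table name


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in database with one detail table and two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user_dim (customer_name TEXT PRIMARY KEY);
        CREATE TABLE product_dim (product_name TEXT PRIMARY KEY);
        CREATE TABLE order_detail (
            order_id INTEGER PRIMARY KEY,
            customer_name TEXT,
            product_name TEXT,
            status TEXT
        );
        INSERT INTO user_dim VALUES ('Alice'), ('Bob');
        INSERT INTO product_dim VALUES ('Keyboard'), ('Mouse');
        INSERT INTO order_detail (customer_name, product_name, status) VALUES
            ('Alice', 'Keyboard', 'paid'),
            ('Alice', 'Mouse', 'paid'),
            ('Bob', 'Mouse', 'refunded'),
            ('Bob', 'Mouse', 'paid');
        """
    )
    return conn


def dimension_value_sql(field: str, accelerated: Dict[str, Tuple[str, str]]) -> str:
    """Build the query that lists the values of a dimension field.

    `accelerated` maps a data set field to (dimension table, column). Names come
    from trusted configuration here, never from end-user input.
    """
    if field in accelerated:
        table, column = accelerated[field]
        # Accelerated path: the dimension table already holds one row per value.
        return f"SELECT {column} FROM {table} ORDER BY {column}"
    # Direct path: de-duplicate by scanning the whole detail table.
    return f"SELECT DISTINCT {field} FROM {DETAIL_TABLE} ORDER BY {field}"


def dimension_values(conn: sqlite3.Connection, field: str,
                     accelerated: Dict[str, Tuple[str, str]]) -> List[str]:
    """Run the dimension value query and return the values as a list."""
    return [row[0] for row in conn.execute(dimension_value_sql(field, accelerated))]
```

Passing an empty mapping produces the full-scan `SELECT DISTINCT` on the detail table, while mapping `customer_name` to `("user_dim", "customer_name")` reads the much smaller dimension table instead. One design point worth noting is that the two paths return the same values only if the dimension table is kept consistent with the detail table.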