Best Practices for Stream Computing with Flink on Zeppelin

Content framework:

Big Data Overview
Flink Learning Framework
Demonstration of best practices for stream computing on EMR Studio

1. Overview of Big Data

Big Data Processing ETL (Data → Data)
Big Data Analysis BI (Data → Dashboard)
Machine Learning AI (Data → Model)

2. Flink Learning Framework

Flink Essentials

Stateful
Time
Flink Architecture
Flink API
Flink Configuration
Flink Log

Stateful:

Why

Timeliness of stream computing

Unbounded Stream Computing

When

Window

Join

Pattern

How

State backend
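
The role of state can be illustrated with a minimal sketch in plain Python. It does not use the Flink API; the `KeyedCounter` class and its `snapshot`/`restore` methods are hypothetical names chosen for illustration. The idea it shows: on an unbounded stream, a per-key result such as a running count can only be produced if the operator remembers what it has seen, and that memory must be saved somewhere (the job of a state backend) so a restarted job can continue where it left off.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


class KeyedCounter:
    """Toy stand-in for keyed state: one running count per key."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = defaultdict(int)

    def process(self, key: str) -> Tuple[str, int]:
        # Update the state for this key and emit the new running count.
        self.state[key] += 1
        return key, self.state[key]

    def snapshot(self) -> Dict[str, int]:
        # Copy of the state, playing the role of a checkpoint.
        return dict(self.state)

    def restore(self, snap: Dict[str, int]) -> None:
        # Rebuild the state from a previously taken snapshot.
        self.state = defaultdict(int, snap)


def run(events: Iterable[str], counter: KeyedCounter) -> Iterator[Tuple[str, int]]:
    # Events are handled one at a time, as they would arrive on a stream.
    for key in events:
        yield counter.process(key)


counter = KeyedCounter()
first = list(run(["a", "b", "a"], counter))

# Simulate a failure and recovery: a new operator instance restores the snapshot.
checkpoint = counter.snapshot()
recovered = KeyedCounter()
recovered.restore(checkpoint)
second = list(run(["a"], recovered))
```

After recovery, the count for key `a` continues from the restored value instead of starting again from zero, which is the behavior a state backend with checkpointing is meant to provide.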

Time

Event time
Processing time
Watermark
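
How event time and watermarks interact can be sketched in plain Python, without the Flink API. The function name `tumbling_event_time_counts` and its parameters are assumptions made for this illustration, and the firing rule is simplified compared with Flink's exact semantics: the watermark here is the largest event timestamp seen so far minus an allowed out-of-orderness, a window fires once the watermark reaches its end, and events for already fired windows are treated as late and dropped.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def tumbling_event_time_counts(
    events: Iterable[int],
    window_size: int,
    max_out_of_orderness: int,
) -> Tuple[List[Tuple[int, int]], List[int], Dict[int, int]]:
    """Count events per tumbling event-time window.

    events: event timestamps in arrival order (arrival order is processing time).
    Returns (fired windows as (start, count), dropped late timestamps,
    windows still open when the input ends).
    """
    open_windows: Dict[int, int] = defaultdict(int)
    fired: List[Tuple[int, int]] = []
    dropped: List[int] = []
    watermark = float("-inf")

    for ts in events:
        window_start = ts - ts % window_size
        window_end = window_start + window_size

        if window_end <= watermark:
            # The window for this event has already fired: the event is late.
            dropped.append(ts)
        else:
            open_windows[window_start] += 1

        # The watermark only moves forward.
        watermark = max(watermark, ts - max_out_of_orderness)

        # Fire every window whose end is covered by the watermark.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                fired.append((start, open_windows.pop(start)))

    return fired, dropped, dict(open_windows)


fired, dropped, still_open = tumbling_event_time_counts(
    events=[1, 4, 12, 7, 15, 3, 21],
    window_size=10,
    max_out_of_orderness=5,
)
```

In this run the event with timestamp 7 arrives after 12 but is still counted in window [0, 10), because the watermark has not yet passed 10. The event with timestamp 3 arrives after the watermark has reached 10 and is dropped.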

Flink Architecture

Flink API

Flink Configuration

Cluster Configuration
Job Configuration
State backend
Resource Manager
SQL/Python
Reference documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/
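
As a rough illustration of the configuration groups listed above, the following Python sketch renders a few options as `key: value` lines in the style of `flink-conf.yaml`. The option keys are taken from the Flink configuration reference; the values, the checkpoint path, and the helper name `render_flink_conf` are assumptions for illustration only and should be adapted to the actual cluster.

```python
from typing import Dict


def render_flink_conf(settings: Dict[str, str]) -> str:
    """Render settings as 'key: value' lines, one option per line."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


settings = {
    # Job configuration
    "parallelism.default": "2",
    "execution.checkpointing.interval": "60s",
    # Cluster configuration
    "taskmanager.numberOfTaskSlots": "2",
    # State backend
    "state.backend": "rocksdb",
    "state.checkpoints.dir": "hdfs:///flink/checkpoints",  # hypothetical path
}

conf_text = render_flink_conf(settings)
print(conf_text)
```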

Flink Log

3. Best Practices for Stream Computing on EMR Studio

EMR Studio features:

Compatible with open source components

EMR Studio has been optimized and enhanced based on the open source software Apache Zeppelin, Jupyter Notebook, and Apache Airflow.


Supports connecting to multiple clusters and adapting to multiple computing engines

Interactive development plus seamless job scheduling

Applicable to a variety of big data application scenarios

Separation of computing and storage

Flink Clients

Flink on Zeppelin (Phase 1) - Interactive Flink Client

Flink on Zeppelin (Phase 2) - Interactive JobManager

Flink on Zeppelin Main Features

Original link: http://click.aliyun.com/m/1000286010/
