Description
In this course, you will learn how to explore large datasets with SQL on Apache Flink:
- Exploratory data analysis is a critical stage in data science: you investigate the data in order to extract insights.
- Exploring massive datasets is difficult because it requires scalable, fast, and feature-rich technologies. Apache Flink, a popular stream-processing platform, is well suited to this task.
- Kumaran Ponnambalam begins by going over the relational APIs that Flink provides for big data analytics, then delves into the Table API and SQL functions.
- He covers the SQL capabilities available for data exploration, such as filtering, aggregations, and joins. Finally, he provides a use case project for you to practise your new skills.
Syllabus:
1. Flink Relational APIs
- What is Apache Flink?
- Flink relational APIs
- Integrations and connectors
- Course prerequisites
- Setting up the exercise files
2. Basic Batch Analytics
- Creating a table environment
- Creating tables from a CSV
- Selecting table data
- Filtering data in tables
- Writing tables to files
3. Advanced Batch Analytics
- Aggregations on tables
- Ordering and limiting data
- Adding new columns
- Joining tables
- Working with datasets
4. Streaming SQL
- Challenges with streaming SQL
- Dynamic tables
- Appending and retracting data
- Consuming Kafka sources
- Running continuous queries
5. Advanced Streaming Analytics
- Windowing on streams
- Using tumbling and sliding windows
- Writing tables to Kafka
- Working with data streams
- Using event time
6. Use Case Project
- Use case problem definition
- Read source data into a Flink table
- Compute total scores
- Compute aggregations
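
To give a feel for the kind of SQL the course covers (filtering, aggregations, and joins), here is a minimal sketch. It makes several assumptions: it uses SQLite from the Python standard library as a stand-in so it runs without a Flink installation, and the `students` and `scores` tables and their rows are hypothetical, not taken from the course's exercise files. It does not show Flink's table environment, connectors, or streaming behaviour.

```python
import sqlite3

# SQLite is only a stand-in here; the course itself runs SQL on Apache Flink.
# The tables and rows below are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores (student_id INTEGER, subject TEXT, score INTEGER);
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO scores VALUES
        (1, 'math', 80), (1, 'physics', 70),
        (2, 'math', 55), (2, 'physics', 90);
""")

# Filtering: keep only the rows at or above a threshold.
high_scores = conn.execute("""
    SELECT student_id, subject, score
    FROM scores
    WHERE score >= 75
    ORDER BY student_id, subject
""").fetchall()

# Join plus aggregation: total score per student.
totals = conn.execute("""
    SELECT s.name, SUM(sc.score) AS total_score
    FROM students AS s
    JOIN scores AS sc ON sc.student_id = s.id
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

print(high_scores)  # [(1, 'math', 80), (2, 'physics', 90)]
print(totals)       # [('Ana', 150), ('Ben', 145)]
```

The `WHERE`, `JOIN`, and `GROUP BY` clauses are standard SQL, so queries of this shape should carry over to Flink SQL in spirit, though table creation and execution differ there.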