Description
In this course, you will learn:
- PySpark, Apache Spark, Big Data Analytics, Big Data Processing, Python
Syllabus:
1. A Scenario To Get Us Started
- Introduction to our development environment
- Introduction to our dataset & dataframes
- Environment configuration code snippet
- Ingesting & Cleaning Data
- Answering our scenario questions
2. Core Concepts
- Bringing data into dataframes
- Inspecting a dataframe
- Handling null & duplicate values
- Selecting & filtering data
- Applying multiple filters
- Running SQL on dataframes
- Adding calculated columns
- Group by and aggregation
- Writing a dataframe to files