Description
In this course, you will learn about:
- Exploratory data analysis vs. hypothesis-driven statistical analysis
- Performing data quality checks
- Calculating quartiles
- Using box plots to understand the distribution of values
- Using histograms to understand the frequency of values
- Using the chi-square test to understand the correlation between values
Syllabus:
- Introduction
- What you should know
1. Introduction to Exploratory Data Analysis
- Why explore data?
- Exploring data with statistics
- Testing hypotheses with statistics
2. Data Quality Checks
- Why check data?
- Types of quality checks
- Imputing missing values
- Identifying business logic checks
3. Calculating Quartiles
- Why learn about the distribution of data?
- Minimum, maximum, and median values
- Ordering and counting
- Calculating quartiles
- Introduction to box plots
4. Histograms
- Introduction to histograms
- Partitioning data
- Calculating histograms
5. Checking Correlation between Attributes
- Introduction to correlation
- Calculating correlation with SQL
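
As a rough illustration of the kind of calculation covered in section 3, the sketch below computes the minimum, quartiles, median, and maximum that a box plot summarizes. It is not taken from the course materials: the syllabus points to SQL, while this sketch uses Python's standard library, and the sample values and the "inclusive" quartile method are assumptions chosen for the example.

```python
import statistics

# Hypothetical sample values, invented for illustration only.
values = [2, 4, 4, 5, 7, 9, 10, 12, 15, 21]

# With n=4, statistics.quantiles returns the three cut points Q1, Q2 (median), Q3.
# method="inclusive" treats the sample as the whole population (an assumption here);
# other quartile conventions can give slightly different cut points.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# The five values a box plot draws: whisker ends, box edges, and the median line.
five_number_summary = {
    "min": min(values),
    "q1": q1,
    "median": q2,
    "q3": q3,
    "max": max(values),
}

# Interquartile range: the width of the box, covering the middle half of the data.
iqr = q3 - q1

print(five_number_summary, iqr)
```

Different tools use different quartile conventions, so results from a SQL engine or a spreadsheet may not match these cut points exactly.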