Description
In this course, you will build classifiers that perform well on a variety of tasks. You will become acquainted with the most successful and widely used techniques in practice, such as logistic regression, decision trees, and boosting. You will also learn to design and implement the underlying algorithms for learning these models at scale using stochastic gradient ascent, and you will apply these techniques to real-world, large-scale machine learning tasks. Along the way, the course covers important issues that arise in real-world ML applications, such as dealing with missing data and measuring precision and recall to evaluate a classifier. The course is hands-on and action-packed, with visualizations and illustrations of how these techniques perform on real-world data. We've also included optional content in each module that covers advanced topics for those who want to dig even deeper!
Syllabus:
1. Welcome!
- Welcome to the classification course, a part of the Machine Learning Specialization
- What is this course about?
- Impact of classification
- Course overview
- Outline of first half of course
- Outline of second half of course
- Assumed background
- Let's get started!
2. Linear Classifiers & Logistic Regression
- Linear classifiers: A motivating example
- Intuition behind linear classifiers
- Decision boundaries
- Linear classifier model
- Effect of coefficient values on decision boundary
- Using features of the inputs
- Predicting class probabilities
- Review of basics of probabilities
- Review of basics of conditional probabilities
- Using probabilities in classification
- Predicting class probabilities with (generalized) linear models
- The sigmoid (or logistic) link function
- Logistic regression model
- Effect of coefficient values on predicted probabilities
- Overview of learning logistic regression models
- Encoding categorical inputs
- Multiclass classification with 1 versus all
- Recap of logistic regression classifier
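
As a concrete companion to the module above, here is a minimal sketch of how a logistic regression classifier turns a score into a class probability via the sigmoid (logistic) link function. The feature matrix, coefficient values, and function names are illustrative, not taken from the course materials:

```python
import numpy as np

def sigmoid(score):
    """Logistic (sigmoid) link function: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-score))

def predict_probability(features, coefficients):
    """P(y = +1 | x, w) = sigmoid(w . h(x)) for each row of `features`."""
    scores = features.dot(coefficients)
    return sigmoid(scores)

# Hypothetical two-feature example; the column of 1s is the intercept term.
X = np.array([[1.0,  2.0, 0.5],
              [1.0, -1.0, 3.0]])
w = np.array([0.1, 1.5, -0.5])   # illustrative coefficient values
print(predict_probability(X, w))
```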
3. (A) Learning Linear Classifiers
- Goal: Learning parameters of logistic regression
- Intuition behind maximum likelihood estimation
- Data likelihood
- Finding best linear classifier with gradient ascent
- Review of gradient ascent
- Learning algorithm for logistic regression
- Example of computing derivative for logistic regression
- Interpreting derivative for logistic regression
- Summary of gradient ascent for logistic regression
- Choosing step size
- Careful with step sizes that are too large
- Rule of thumb for choosing step size
- (VERY OPTIONAL) Deriving gradient of logistic regression: Log trick
- (VERY OPTIONAL) Expressing the log-likelihood
- (VERY OPTIONAL) Deriving probability y=-1 given x
- (VERY OPTIONAL) Rewriting the log likelihood into a simpler form
- (VERY OPTIONAL) Deriving gradient of log likelihood
- Recap of learning logistic regression classifiers
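
The learning algorithm outlined above fits in a few lines. This is a simplified batch gradient ascent sketch, assuming +1/-1 labels and a fixed step size; the function name and loop structure are our own:

```python
import numpy as np

def logistic_regression(features, labels, step_size, max_iter):
    """Maximize the data log-likelihood with batch gradient ascent.

    With +1/-1 labels, the derivative of the log-likelihood for each
    coefficient is sum_i h_j(x_i) * (1[y_i = +1] - P(y = +1 | x_i, w)).
    """
    coefficients = np.zeros(features.shape[1])
    indicator = (labels == +1)
    for _ in range(max_iter):
        predictions = 1.0 / (1.0 + np.exp(-features.dot(coefficients)))
        errors = indicator - predictions
        gradient = features.T.dot(errors)
        coefficients += step_size * gradient   # ascent: step *up* the gradient
    return coefficients
```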
(B) Overfitting & Regularization in Logistic Regression
- Evaluating a classifier
- Review of overfitting in regression
- Overfitting in classification
- Visualizing overfitting with high-degree polynomial features
- Overfitting in classifiers leads to overconfident predictions
- Visualizing overconfident predictions
- (OPTIONAL) Another perspective on overfitting in logistic regression
- Penalizing large coefficients to mitigate overfitting
- L2 regularized logistic regression
- Visualizing effect of L2 regularization in logistic regression
- Learning L2 regularized logistic regression with gradient ascent
- Sparse logistic regression with L1 regularization
- Recap of overfitting & regularization in logistic regression
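
For L2 regularized logistic regression, the only change to the gradient ascent update is a penalty term in each derivative. A minimal sketch, following the common convention (used in this course's treatment) of not penalizing the intercept; the helper name is hypothetical:

```python
def feature_derivative_l2(errors, feature_column, coefficient, l2_penalty, is_intercept):
    """Derivative of the L2-penalized log-likelihood for one coefficient.

    Identical to the unregularized derivative except for the
    -2 * l2_penalty * coefficient term, which shrinks large coefficients.
    """
    derivative = feature_column.dot(errors)
    if not is_intercept:
        derivative -= 2.0 * l2_penalty * coefficient
    return derivative
```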
4. Decision Trees
- Predicting loan defaults with decision trees
- Intuition behind decision trees
- Task of learning decision trees from data
- Recursive greedy algorithm
- Learning a decision stump
- Selecting best feature to split on
- When to stop recursing
- Making predictions with decision trees
- Multiclass classification with decision trees
- Threshold splits for continuous inputs
- (OPTIONAL) Picking the best threshold to split on
- Visualizing decision boundaries
- Recap of decision trees
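
To make the recursive greedy algorithm concrete, here is a sketch of its inner step: selecting the feature whose split yields the lowest weighted classification error. It assumes binary (0/1) features and +1/-1 labels; the names and structure are illustrative:

```python
import numpy as np

def node_error(labels):
    """Classification error if we predict the majority class at this node."""
    if len(labels) == 0:
        return 0.0
    majority = max(np.sum(labels == +1), np.sum(labels == -1))
    return 1.0 - majority / len(labels)

def best_split(X, y):
    """Greedy step: pick the binary feature whose split minimizes weighted error."""
    best_feature, best_error = None, np.inf
    for j in range(X.shape[1]):
        left, right = y[X[:, j] == 0], y[X[:, j] == 1]
        error = (len(left) * node_error(left) + len(right) * node_error(right)) / len(y)
        if error < best_error:
            best_feature, best_error = j, error
    return best_feature, best_error
```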
5. (A) Preventing Overfitting in Decision Trees
- A review of overfitting
- Overfitting in decision trees
- Principle of Occam's razor: Learning simpler decision trees
- Early stopping in learning decision trees
- (OPTIONAL) Motivating pruning
- (OPTIONAL) Pruning decision trees to avoid overfitting
- (OPTIONAL) Tree pruning algorithm
- Recap of overfitting and regularization in decision trees
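
A sketch of the early-stopping checks discussed above, with illustrative threshold values; in practice these would be tuned on a validation set:

```python
def should_stop(labels, depth, best_error_reduction,
                max_depth=10, min_node_size=10, min_error_reduction=0.0):
    """Early-stopping conditions for growing a decision tree (thresholds illustrative)."""
    if len(set(labels)) <= 1:        # node is pure: nothing left to split
        return True
    if depth >= max_depth:           # condition 1: maximum depth reached
        return True
    if len(labels) <= min_node_size: # condition 2: too few data points at this node
        return True
    if best_error_reduction <= min_error_reduction:
        return True                  # condition 3: best split barely reduces error
    return False
```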
(B) Handling Missing Data
- Challenge of missing data
- Strategy 1: Purification by skipping missing data
- Strategy 2: Purification by imputing missing data
- Modifying decision trees to handle missing data
- Feature split selection with missing data
- Recap of handling missing data
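
The two purification strategies can be illustrated in a couple of lines of pandas; the tiny DataFrame and the choice of median imputation are purely illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"income": [50.0, np.nan, 80.0], "default": [0, 1, 0]})

# Strategy 1: purification by skipping rows that contain missing values.
skipped = df.dropna()

# Strategy 2: purification by imputing missing values (here, the column median).
imputed = df.fillna({"income": df["income"].median()})
```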
6. Boosting
- The boosting question
- Ensemble classifiers
- Boosting
- AdaBoost overview
- Weighted error
- Computing coefficient of each ensemble component
- Reweighting data to focus on mistakes
- Normalizing weights
- Example of AdaBoost in action
- Learning boosted decision stumps with AdaBoost
- The Boosting Theorem
- Overfitting in boosting
- Ensemble methods, impact of boosting & quick recap
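
The core AdaBoost bookkeeping from this module (weighted error, component coefficient, reweighting, normalization) fits in one short function. This sketch assumes +1/-1 labels and a weighted error strictly between 0 and 1; the function name is our own:

```python
import numpy as np

def adaboost_update(alpha, y, y_hat):
    """One AdaBoost round given data weights alpha and predictions y_hat.

    Returns the coefficient w_t of the new ensemble component and the
    renormalized data weights, increased on mistakes to focus on them.
    """
    weighted_error = alpha[y != y_hat].sum() / alpha.sum()
    w_t = 0.5 * np.log((1.0 - weighted_error) / weighted_error)
    alpha = np.where(y == y_hat, alpha * np.exp(-w_t), alpha * np.exp(w_t))
    return w_t, alpha / alpha.sum()   # normalize so the weights sum to 1
```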
7. Precision-Recall
- Case-study where accuracy is not best metric for classification
- What is good performance for a classifier?
- Precision: Fraction of positive predictions that are actually positive
- Recall: Fraction of positive data predicted to be positive
- Precision-recall extremes
- Trading off precision and recall
- Precision-recall curve
- Recap of precision-recall
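
Precision and recall, as defined above, reduce to counting true positives, false positives, and false negatives. A minimal sketch assuming +1/-1 labels:

```python
def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == +1 and p == +1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == +1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == +1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```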
8. Scaling to Huge Datasets & Online Learning
- Gradient ascent won't scale to today's huge datasets
- Timeline of scalable machine learning & stochastic gradient
- Why gradient ascent won't scale
- Stochastic gradient: Learning one data point at a time
- Comparing gradient to stochastic gradient
- Why would stochastic gradient ever work?
- Convergence paths
- Shuffle data before running stochastic gradient
- Choosing step size
- Don't trust last coefficients
- (OPTIONAL) Learning from batches of data
- (OPTIONAL) Measuring convergence
- (OPTIONAL) Adding regularization
- The online learning task
- Using stochastic gradient for online learning
- Scaling to huge datasets through parallelization & module recap
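
Finally, a sketch of stochastic gradient ascent for logistic regression, updating on one shuffled data point at a time; the step size, number of passes, and seed are illustrative:

```python
import numpy as np

def stochastic_gradient_ascent(features, labels, step_size, num_passes, seed=0):
    """SGD for logistic regression: one update per data point per pass."""
    rng = np.random.default_rng(seed)
    w = np.zeros(features.shape[1])
    n = len(labels)
    for _ in range(num_passes):
        order = rng.permutation(n)   # shuffle data before each pass
        for i in order:
            p = 1.0 / (1.0 + np.exp(-features[i].dot(w)))
            w += step_size * features[i] * ((labels[i] == +1) - p)
    return w
```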