Description
In this course, you will:
- Learn how to use numerical and graphical summaries to check whether data contains outliers.
- Use Grubbs' test to determine whether a point is an outlier, and learn about the Seasonal-Hybrid ESD technique, which helps find outliers in time series data.
- Learn how to compute the k-nearest neighbors distance and the local outlier factor, which produce a continuous anomaly score for each data point when the data has multiple features.
- Discover the difference between local and global anomalies and how each of the two methods handles them.
- Explore the isolation forest, a fast and robust method for finding anomalies that measures how easily points can be isolated by randomly splitting the data into smaller and smaller regions.
- Learn to compare the detection performance of the algorithms when labeled anomalies are available.
- Learn how to compute and interpret precision and recall for an anomaly score, and how to adapt the algorithms to data containing categorical features.
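To give a feel for the k-nearest neighbors distance score mentioned above, here is a minimal Python sketch. It is an illustration only, not course material: the function name `knn_distance_scores`, the sample points, and the choice of k = 2 are assumptions, and the course itself may use different tools. The score for each point is the mean Euclidean distance to its k nearest neighbors, so isolated points receive larger scores.

```python
import math

def knn_distance_scores(points, k=2):
    """Mean Euclidean distance from each point to its k nearest neighbors.

    Hypothetical helper for illustration; larger scores suggest more
    anomalous points.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Made-up data: four points on a unit square plus one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = knn_distance_scores(points, k=2)
most_anomalous = scores.index(max(scores))
```

In this toy data set the far-away point gets the highest score, while the four corner points all share the same low score.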
Syllabus:
- Statistical outlier detection
- Distance- and density-based anomaly detection
- Isolation forest
- Comparing performance
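As a rough illustration of the precision and recall statistics covered under comparing performance, the following Python sketch thresholds an anomaly score and compares the flagged points with known labels. The function name `precision_recall`, the scores, the labels, and the thresholds are all made up for this example and are not taken from the course.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when points with score >= threshold are flagged.

    Hypothetical helper for illustration; labels are True for known anomalies.
    """
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))        # true positives
    fp = sum(f and not y for f, y in zip(flagged, labels))    # false positives
    fn = sum((not f) and y for f, y in zip(flagged, labels))  # missed anomalies
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up anomaly scores and labels for six points.
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
labels = [False, False, True, True, True, False]

strict = precision_recall(scores, labels, threshold=0.5)
lenient = precision_recall(scores, labels, threshold=0.3)
```

Lowering the threshold flags more points, which here raises recall at the cost of precision; comparing algorithms usually means looking at this trade-off across several thresholds.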