Description
In this course, you will learn:
- The fundamental concepts and libraries required to solve problems in data science.
- How to work on real-world Kaggle projects while honing the mathematical skills used in most of the problems you will face.
- A systematic approach to the full workflow, from data acquisition to data wrangling and everything in between. This is your one-stop shop for becoming a self-assured data scientist.
Syllabus:
1. What is Data Science?
- Data Science vs. Data Analysis vs. Data Engineering
- Descriptive and Predictive Analytics
- Data Science Life Cycle
- Structured vs. Semi-Structured vs. Unstructured Data
- The Good Traits of a Data Scientist
2. Applications of Data Science
- Applications in Healthcare and Recommender Systems
- Image Analysis
3. Overview of Libraries
- Beautiful Soup (Scraping Data from Simple HTML)
- Beautiful Soup (Scraping Data from HTML Tables)
- Scrapy
- NumPy Basics
- NumPy Array Creation
- NumPy Array Manipulation
- Sorting NumPy Arrays
- Basic Statistics on NumPy Arrays
- Broadcasting in NumPy Arrays
- Pandas
- spaCy
- Seaborn
4. Probability and Statistics
- Probability
- Statistics
- Joint Probability
- Conditional Probability and Bayes' Theorem
- Measures of Location
- Measures of Variability
- Probability Distributions (Binomial and Bernoulli Distributions)
- Gaussian Distribution
- Poisson Distribution
- Skewness and Kurtosis
- Sampling Methods
- Key Concepts in Statistics
- Statistical Hypothesis Testing
5. Machine Learning
- Machine Learning and its Types
- Deep Learning and Recommender Systems
- What is Regression?
- Univariate Linear Regression
- Multivariate Linear Regression
- Feature Scaling
- Linear Regression in scikit-learn
- Regularization (Lasso, Ridge, and ElasticNet Regression)
- Support Vector Regression
- Nearest Neighbors Regression
- Decision Tree Regression
- Feature Engineering and Categorical Variables Encoding
- Numerical Variables Transformation
- Feature Selection (Filter Methods)
- Feature Selection (Wrapper Methods)
- Feature Selection (Intrinsic Methods)
- Model Evaluation Measures (Explained Variance Score, MAE, MSE)
- Model Evaluation Measures (Median Absolute Error, R^2 Score)
- Dummy Regressors
- Cross Validation
- Types of Classification Problems
- Logistic Regression
- Support Vector Machines
- Decision Trees
- Naive Bayes
- K-Nearest Neighbors
- Ensemble Learning
- XGBoost, LightGBM, and CatBoost
- Learning Curves
- Model Evaluation
- Dummy Estimators and Handling the Class Imbalance Problem
- Hyper-Parameter Optimization and Kaggle Competition
- Unsupervised Learning
- K-Means Clustering
- Hierarchical Clustering
- DBSCAN Clustering and Customer Segmentation
- Apriori Algorithm and Association Rules
- Principal Component Analysis for Dimensionality Reduction
- Semi-Supervised Learning Techniques
6. Deep Learning
- What is Deep Learning?
- Neural Networks
- Feedforward Neural Networks
- Backpropagation
- Convolutional Neural Network
- Recurrent Neural Network
- LSTM Networks
7. Machine Learning Tools and Libraries
- Automated Machine Learning
- Pandas Profiling and PyCaret
- RAPIDS (Using GPU for Fast Computations)
8. Big Data Tools and Technologies
- What is Big Data?
- Hadoop Ecosystem
- MapReduce Framework
- Apache Spark and Its Components
9. Where to Go Next?
- Starting a Career on Kaggle (Tips)
- Recommended Courses from Educative
- References and Acknowledgements