Description
In this course, you will learn:
- What ETL pipelines are and how to use them to process and combine data from CSV files, JSON, logs, APIs, and databases.
- How to tokenize, lemmatize, and remove stop words from text data before analysing it, then transform and vectorize the text with bag of words and tf-idf to generate features with scikit-learn.
- How machine learning pipelines speed up data preparation and modelling, and how to use feature unions to run steps in parallel and create more complex workflows, building up to a complete machine learning pipeline that prepares data and trains a model for a dataset.
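
To give a flavour of the text-processing topic, here is a minimal sketch of tokenization, stop-word removal, bag of words, and tf-idf using only the Python standard library. It is not course material: the stop-word list, the function names, and the sample sentences are all assumptions made for illustration, and lemmatization is left out because it needs a dictionary-based tool. The weighting uses the textbook form idf = ln(N / df); scikit-learn's `TfidfVectorizer` applies a smoothed idf and L2 normalization by default, so its numbers would differ.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list for illustration; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return one Counter of token counts per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    """Weight each raw count by idf = ln(N / df) (unsmoothed textbook form)."""
    bags = bag_of_words(docs)
    n_docs = len(bags)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for bag in bags for term in bag)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in bag.items()}
        for bag in bags
    ]


# Made-up sample documents.
docs = [
    "The pipeline loads the data",
    "The model is trained in a pipeline",
]
weights = tf_idf(docs)
```

In this sketch a term that appears in every document, such as "pipeline", receives a weight of zero, while terms unique to one document keep a positive weight. That is the behaviour tf-idf is meant to provide: down-weighting words that do not help tell documents apart.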
Syllabus:
- ETL Pipelines
- Natural Language Processing
- Machine Learning Pipelines