Description
In this course, you will learn:
- Why data management is critical for AI applications.
- What type of data these applications require.
- How to collect data for AI applications.
- How to extract and query data from existing databases with SQL.
- How to set up your Python Notebooks.
- How to use the pandas library to manipulate tabular data.
- How to visualize data with the Seaborn library.
Syllabus:
Week 1:
- We investigate why data management is important for AI and Machine Learning (ML) systems.
- We look at which data are required in the ML lifecycle and what features they should have.
- We talk about the effort and time required for data management operations and look at potential data sources.
Week 2:
- We cover fundamental data management topics, including databases, data models, and data schemas.
- We define the Relational Data Model and contrast it with the Single-Table Model (such as CSV and Excel) and the Document Model.
Week 3:
- We show how to extract data from existing relational databases using SQL queries and how to convert the query results into CSV files for further processing with pandas in Python notebooks.
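
As a rough illustration of the Week 3 workflow, the sketch below runs a SQL query and converts the result into CSV text. It is not course material: the in-memory SQLite database, the `customers` table, and its rows are all hypothetical stand-ins for an existing relational database, and only the Python standard library is used.

```python
import csv
import io
import sqlite3

# Hypothetical in-memory database standing in for an existing relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, country) VALUES (?, ?, ?)",
    [(1, "Ada", "NL"), (2, "Grace", "US"), (3, "Linus", "NL")],
)

# Extract only the rows and columns of interest with a parameterized SQL query.
cursor = conn.execute(
    "SELECT id, name FROM customers WHERE country = ? ORDER BY id", ("NL",)
)
header = [col[0] for col in cursor.description]  # column names of the result
rows = cursor.fetchall()
conn.close()

# Convert the query result into CSV text (header row first, then data rows).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing `csv_text` to a file would give a CSV that can then be loaded in a notebook with `pandas.read_csv`.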
Week 4:
- We discuss the different ways to set up and run Python notebooks, including cloud-based and local notebooks.
- We walk you through configuring your conda environment and installing Jupyter and pandas.
- You'll learn how to run notebooks in Visual Studio Code.
Week 5:
- Become a pandas expert.
- Explore the essential functionality of pandas and, most importantly, write elegant and efficient pandas code to process and engineer tabular data.
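
To give a flavor of what processing and engineering tabular data with pandas can look like, here is a minimal sketch. The `orders` table and its column names are invented for illustration and do not come from the course. It derives a new column and aggregates it per customer in a single method chain.

```python
import pandas as pd

# Hypothetical order data; the columns are assumptions made for this example.
orders = pd.DataFrame(
    {
        "customer": ["Ada", "Grace", "Ada", "Linus"],
        "quantity": [2, 1, 5, 3],
        "unit_price": [10.0, 20.0, 4.0, 5.0],
    }
)

summary = (
    orders
    # Engineer a new feature: the total value of each order line.
    .assign(total=lambda df: df["quantity"] * df["unit_price"])
    # Aggregate the engineered feature per customer.
    .groupby("customer", as_index=False)["total"]
    .sum()
    # Largest spenders first, with a clean 0..n-1 index.
    .sort_values("total", ascending=False, ignore_index=True)
)
print(summary)
```

Chaining `assign`, `groupby`, and `sort_values` keeps each transformation on its own line and avoids intermediate variables, which is one common way to keep pandas code readable.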
Week 6:
- You will learn how to create simple and straightforward scientific figures in Python using the Seaborn package.
- Use Seaborn's fundamental capabilities to create gorgeous statistical charts.
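
As a small, hedged example of the kind of statistical chart Seaborn produces, the sketch below draws a scatter plot from a DataFrame. The `hours`, `score`, and `group` columns and their values are made up for illustration, and the non-interactive `Agg` backend is chosen only so the snippet also runs outside a notebook.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import pandas as pd
import seaborn as sns

# Hypothetical study data; the values are invented for this example.
df = pd.DataFrame(
    {
        "hours": [1, 2, 3, 4, 5, 6],
        "score": [52, 58, 65, 70, 78, 85],
        "group": ["A", "B", "A", "B", "A", "B"],
    }
)

# One call maps DataFrame columns to the x-axis, the y-axis, and point color.
ax = sns.scatterplot(data=df, x="hours", y="score", hue="group")
ax.set_title("Score versus hours studied")

# Render the figure to PNG bytes in memory; pass a file path instead to save it.
png = io.BytesIO()
ax.get_figure().savefig(png, format="png")
```

In a notebook the figure is displayed inline after the plotting cell, so the explicit `savefig` step is only needed when the chart should be exported.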