Description
In this course, you will:
- Discover how to use the Python scientific stack to perform common data science tasks.
- Learn the tools and concepts needed to process data effectively with the Python scientific stack, such as Pandas for data crunching, matplotlib for data visualisation, and NumPy for numeric computation.
Syllabus:
1. Scientific Python Overview
- Ramp up with Scientific Python
2. The Jupyter Notebook
- Start the notebook server
- Use code cells
- Extensions to the Python language
- Understand markdown cells
- Edit notebooks
3. NumPy Basics
- NumPy arrays
- Slicing
- Learn Boolean indexing
- Understand broadcasting
- Understand array operations
- Understand ufuncs
4. Pandas
- Load CSV files
- Parse time
- Access rows and columns
- Use pure Python packages
- Calculate speed
- Display a speed box plot
5. Conda
- Introduction to Python packages
- Manage environments
6. Folium and Geo
- Create an initial map
- Draw a track on the map
- Use geo data with Shapely
- Generate a report
7. NY Taxi Data
- Examine data
- Load data from CSV files
- Work with categorical data
- Work with data: Hourly trip rides
- Work with data: Rides per hour
- Work with data: Weather data
8. scikit-learn
- Learn regression on Boston dataset
- Understand train/test splits
- Preprocess data
- Compose pipelines
- Save and load models
9. Plotting
- Use styles
- Customize Pandas output
- Use matplotlib
- Tips and tricks
- Understand Bokeh
10. Other Packages
- Go faster with Numba and Cython
- Understand deep learning
- Work with image processing
- Understand NLP: NLTK
- Understand NLP: spaCy
- Bigger data with HDF5 and Dask
11. Development Process
- Understand source control
- Learn code review
- Testing overview
- Testing example