Description
In this course, you will:
- Learn the fundamentals of Dask and lazy evaluation, so that you can use parallel processing or multi-threading to speed up almost any Python code.
- Learn the differences between multi-threading and parallel processing as task scheduling methods, and which one works better in which situations.
- Learn how to use Dask arrays and DataFrames to analyse large amounts of structured data.
- Learn how to easily apply everything you know about NumPy and pandas to data that is too large to fit in memory.
- Discover how Dask bags can be used to process unstructured text data, semi-structured JSON data, and even recorded audio in an efficient manner.
- Learn how to use the Dask-ML package to train machine learning models on big data, as well as how to distribute Dask calculations across multiple processes and threads for increased computing speed.
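As a taste of the lazy evaluation idea covered in the course, here is a minimal sketch using `dask.delayed`. The helper functions `square` and `add` are made up for illustration and are not taken from the course material. Wrapping a call in `delayed` only records a task; the work runs when `compute()` is called, here on the threaded scheduler.

```python
from dask import delayed


def square(x):
    # Hypothetical stand-in for an expensive computation.
    return x * x


def add(a, b):
    return a + b


# Each delayed call records a task in a graph; nothing is executed yet.
lazy_total = delayed(add)(delayed(square)(3), delayed(square)(4))

# compute() runs the task graph. scheduler="threads" uses a thread pool;
# scheduler="processes" would use separate processes instead.
total = lazy_total.compute(scheduler="threads")
print(total)
```

Threads tend to suit code that releases the Python GIL (such as NumPy or I/O-heavy work), while processes tend to suit pure-Python, CPU-bound code, at the cost of moving data between processes.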
Syllabus:
- Lazy Evaluation and Parallel Computing
- Parallel Processing of Big, Structured Data
- Dask Bags for Unstructured Data
- Dask Machine Learning and Final Pieces