Description
In this course, you will:
- Gain a hands-on understanding of Apache Spark and use it to solve machine learning problems involving both small and large amounts of data.
- Understand how to write parallel code capable of running on thousands of CPUs.
- Apply machine learning algorithms to petabytes of data using Apache SparkML Pipelines on large-scale compute clusters.
- Avoid the out-of-memory errors that traditional machine learning frameworks run into when data does not fit in a single computer's main memory.
- Test thousands of different ML models in parallel to find the best performing one, as many successful Kagglers do.
- Use Apache SparkSQL and the Apache Spark DataFrame API to run SQL statements on very large data sets.
Syllabus:
1. Introduction
- Introduction to Apache Spark for Machine Learning on Big Data
- What is Big Data?
- Data storage solutions
- Parallel data processing strategies of Apache Spark
- Functional programming basics
- Resilient Distributed Datasets (RDDs) and DataFrames - Apache SparkSQL
2. Scaling Math for Statistics on Apache Spark
- Averages
- Standard deviation
- Skewness
- Kurtosis
- Covariance, Covariance matrices, correlation
- Plotting with Apache Spark and Python's matplotlib
- Dimensionality reduction
- PCA
3. Introduction to Apache SparkML
- How ML Pipelines work
- Introduction to SparkML
- Extract, Transform, Load (ETL)
- Introduction to Clustering: k-Means
- Using k-Means in Apache SparkML
4. Supervised and Unsupervised learning with SparkML
- Linear Regression
- LinearRegression with Apache SparkML
- Logistic Regression
- LogisticRegression with Apache SparkML