Introduction to PySpark

Learn to implement distributed data management and machine learning in Spark using the PySpark package.

84.4K total enrollments

Description

In this course, you will learn:

How to use the PySpark package to implement distributed data management and machine learning in Spark.
How Spark manages data, and how to read and write tables from Python.
How to use the pyspark.sql module to run optimised data queries in your Spark session.
How to use PySpark's machine learning routines and its utilities for building complete machine learning pipelines.
How to apply what you've learned to build a model that predicts which flights will be delayed.

Syllabus:

Getting to know PySpark
Manipulating data
Getting started with machine learning pipelines
Model tuning and selection

Course Features

Enrollment options

Standard

7-day free trial
Unlimited access to 350+ Courses
Unlimited access to 50+ Skill tracks
Practice Challenges
Certificate on completion
Peer Support
Live coding
Skill Assessments
$25/month - Annual Plan (13% saving)
$29/month - Monthly Plan

Premium

7-day free trial
Unlimited access to 350+ Courses
Unlimited access to 80+ Projects
Unlimited access to 50+ Skill tracks
Practice Challenges
Certificate on completion
Peer Support
Live coding
Skill Assessments
Priority Support
$33/month - Annual Plan (32% saving)
$49/month - Monthly Plan