Description
In this course, you will:
- Understand the Apache Spark framework, its execution model, and its programming model for developing Big Data systems.
- Learn how to set up and configure Spark in a free cloud-based environment and on a desktop machine.
- Using real-world case studies, build simple to advanced Big Data applications for data of varying volume, variety, and veracity.
- Learn how to use the RDD, DataFrame, and SQL APIs in step-by-step, hands-on PySpark exercises on structured, unstructured, and semi-structured data (see the sketch after this list).
- Investigate and implement optimization and performance-tuning methods for managing data skew and preventing spill.
- Examine and implement Adaptive Query Execution (AQE) to optimize Spark SQL query execution at runtime (a configuration sketch follows the syllabus below).
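As a taste of the hands-on work, here is a minimal sketch of the three APIs applied to a semi-structured JSON file. The file name events.json and the columns status and country are hypothetical placeholders, not materials provided by the course.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a local Spark session (works in a free cloud notebook or on a desktop machine).
spark = SparkSession.builder.appName("pyspark-intro").master("local[*]").getOrCreate()

# DataFrame API: read a semi-structured JSON file (path and columns are placeholders).
df = spark.read.json("events.json")
ok_by_country = df.filter(F.col("status") == "ok").groupBy("country").count()

# SQL API: query the same data through a temporary view.
df.createOrReplaceTempView("events")
top_countries = spark.sql(
    "SELECT country, COUNT(*) AS n FROM events GROUP BY country ORDER BY n DESC"
)

# RDD API: drop to the underlying RDD for row-level transformations.
distinct_statuses = df.rdd.map(lambda row: row["status"]).distinct().collect()

ok_by_country.show()
top_countries.show()
print(distinct_statuses)
```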
Syllabus:
- PySpark for a large Semi-Structured (JSON) File
- PySpark for a large Structured File
- PySpark for a large Unstructured (LOG) File
- Distributed Processing Challenges and Spark Performance Tuning
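As an illustration of the tuning topics above, the following is a minimal sketch of the Spark SQL settings involved in AQE and skew handling; the property names are standard Spark 3.x configuration keys, and the application name is a placeholder.

```python
from pyspark.sql import SparkSession

# A minimal session with Adaptive Query Execution (AQE) switched on, so Spark can
# re-optimize SQL plans at runtime using statistics gathered at shuffle boundaries.
spark = (
    SparkSession.builder
    .appName("aqe-tuning")  # placeholder application name
    .master("local[*]")
    # Core AQE switch (enabled by default in recent Spark 3.x releases).
    .config("spark.sql.adaptive.enabled", "true")
    # Coalesce many small shuffle partitions into fewer, larger ones after a shuffle.
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    # Split skewed partitions in sort-merge joins to reduce data skew and spill.
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)
```

With these settings, Spark can split heavily skewed join partitions automatically; severe skew may still call for manual techniques such as key salting or broadcast joins.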