Description
In this course, you will:
- become acquainted with some of the most well-known data science tools in cloud computing, distributed file storage, distributed processing, and machine learning.
- learn how to install Proxmox, Hadoop, Spark, and Weka and how to use each tool in your data science workflow.
- learn how Hadoop, Spark, and Weka can work together to achieve the best results.
Syllabus:
1. Introduction to Data Science
- Data science
- Fundamental skills
- Tools of the trade
- Enabling technologies
2. Cloud Computing
- Cloud computing and virtualization
- Cloud fundamentals
- Types of cloud
- Solution providers
- Private cloud hands-on with Proxmox
- Proxmox: Bootable installation disk
- Proxmox: Installation
- Proxmox: Managing virtual machines
- Proxmox: Creating and configuring virtual machines
3. Distributed File Systems
- Distributed file systems
- Fundamentals
- Distributed systems and distributed processing
- Hadoop: Preparation
- Hadoop: Installation
- Hadoop: MapReduce hands-on
4. Distributed Processing
- Distributed processing with MapReduce
- Distributed processing with Spark
- Spark architecture and features
- Spark: Installation
- Spark: Spark shell
- Spark: PySpark
- Spark: Application
5. Machine Learning
- Machine learning
- Fundamentals
- Types of machine learning
- Weka: Installation
- Weka: GUI
- Weka: Training vs. testing
- Weka: Clustering