Description
In this course, you will:
- Design distributed systems that manage "big data" using Hadoop and related technologies.
- Use HDFS and MapReduce to store and analyze data at scale.
- Use Pig and Spark to write scripts that process data on a Hadoop cluster in more complex ways.
- Analyze relational data using Hive and MySQL.
- Analyze non-relational data using HBase, Cassandra, and MongoDB.
- Query data interactively with Drill, Phoenix, and Presto.
- Choose an appropriate data storage technology for your application.
- Understand how Hadoop clusters are managed by YARN, Tez, Mesos, ZooKeeper, Zeppelin, Hue, and Oozie.
- Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume.
- Consume streaming data using Spark Streaming, Flink, and Storm.