Description
In this course, you will:
- Distinguish operational from analytic databases, and understand how each is applied in big data.
- Understand how database and table design provides structures for working with data.
- Appreciate how differences in the volume and variety of data affect your choice of an appropriate database system.
- Recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis.
Syllabus:
1. Data and Databases
- What Is Data?
- Why Organize Data?
- What Does a DBMS Do?
- Relational Databases and SQL
- The Success of RDBMSs and SQL
- Operational and Analytic Databases
- Comparing Operational and Analytic DBs: SELECT Statements
- Comparing Operational and Analytic DBs: DML Activity
- Operational and Analytic Databases: Further Comparisons
2. Relational Databases and SQL
- Introducing Table Schemas
- NULL Values
- Data Types
- Primary Keys
- Foreign Keys
- Two Strategies for Database Design
- Database Normalization
- Denormalization
- Differences
- Trade-offs
- Database Transactions
- ACID
- Enforcing Business Rules: Constraints and Triggers
- Business Rules and ACID for Analytics?
3. Big Data
- How Big Is Big Data?
- Distributed Storage
- Distributed Processing
- Structured Data
- Unstructured Data
- Semi-Structured Data
- Strengths of Traditional RDBMSs
- Limitations of Traditional RDBMSs
- SQL and Structured Data
- SQL and Semi-structured Data
- SQL and Unstructured Data
4. SQL Tools for Big Data Analysis
- Big Data Analytic Databases (Data Warehouses)
- NoSQL: Operational, Unstructured and Semi-structured
- Non-transactional, Structured Systems
- Big Data ACID-Compliant RDBMSs
- Search Engines
- Challenges
- What We Keep
- What We Give Up
- What We Add
- Where to Store Big Data
- Coupling of Data and Metadata
5. Introduction to the Hands-On Environment
- Apache Hive
- Apache Impala
- Exploring Structured Data in Hue
- Welcome to the Honors Track
- Honors Track Conclusion