
Big Data Analytics

Module 1 :
Duration: 1 week
Understanding Databases and SQL:
I. Database architecture
II. What is SQL (Structured Query Language)
III. SQL commands
SQL overview
SQL SELECT statements
SQL functions and expressions
SQL updating
SQL joins
SQL with multiple tables
SQL summarization
SQL: preparing for the real world

Module 2 :
Duration: 2 weeks

Understanding Big Data and Hadoop:

Topics :
I. Big Data
II. Limitations and Solutions of existing Data Analytics Architecture
III. Hadoop
IV. Hadoop Architecture and HDFS
Hadoop Cluster Architecture
Important Configuration files in a Hadoop Cluster
Data Loading Techniques

Module 3 :
Duration: 3 weeks

Understanding SQL:
I. SQL Overview
Relational database concepts, specific products
SQL syntax rules
Data definition, data manipulation, and data control statements
Getting acquainted with the course database and editor
II. SQL SELECT statements
Clauses
The SELECT clause: columns and aliases
WHERE expressions
ORDER BY expressions
How null values behave
III. SQL Functions and Expressions
Eliminating duplicates with DISTINCT
Arithmetic expressions
Replacing null values
Literals, concatenation, other string functions
Numeric operations, including rounding
Date and time functions
Nested table expressions
CASE logic
Other expressions in specific DBMS products
IV. SQL Updating
The INSERT, UPDATE and DELETE statements
Column constraints and defaults
Referential integrity constraints
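
Illustrative sketch for the updating topics above. It is not taken from the course material: it assumes Python's built-in sqlite3 module and two hypothetical tables, `dept` and `emp`, chosen only to show INSERT, UPDATE, DELETE, a column default, and a referential integrity check. The course's own database and editor may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: NOT NULL and DEFAULT are column constraints,
# REFERENCES declares the referential integrity rule.
conn.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE emp (
           id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           salary REAL DEFAULT 1000,
           dept_id INTEGER REFERENCES dept(id)
       )"""
)

conn.execute("INSERT INTO dept (id, name) VALUES (1, 'Analytics')")
# No salary given, so the column default applies.
conn.execute("INSERT INTO emp (id, name, dept_id) VALUES (1, 'Asha', 1)")
conn.execute("INSERT INTO emp (id, name, salary, dept_id) VALUES (2, 'Ben', 1500, 1)")

conn.execute("UPDATE emp SET salary = salary + 200 WHERE name = 'Asha'")
conn.execute("DELETE FROM emp WHERE name = 'Ben'")

rows = conn.execute("SELECT name, salary FROM emp").fetchall()

# Department 99 does not exist, so the foreign key constraint rejects the row.
try:
    conn.execute("INSERT INTO emp (id, name, dept_id) VALUES (3, 'Cy', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(rows, fk_rejected)
```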

V. SQL Joins
Inner joins with original and SQL-92 syntax
Table aliases
Left, right and full outer joins
Self-joins
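
Illustrative sketch for the join topics above, assuming Python's built-in sqlite3 module and hypothetical `dept` and `emp` tables (not part of the course material). It shows the same inner join in the original comma-and-WHERE form and in SQL-92 JOIN ... ON form, with table aliases, followed by a left outer join and a self-join. Right and full outer joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT,
                      dept_id INTEGER, manager_id INTEGER);
    INSERT INTO dept VALUES (1, 'Analytics'), (2, 'Platform');
    INSERT INTO emp VALUES (1, 'Asha', 1, NULL), (2, 'Ben', 1, 1), (3, 'Cy', NULL, 1);
""")

# Inner join, original syntax: tables listed with commas, join condition in WHERE.
old_style = conn.execute(
    "SELECT e.name, d.name FROM emp e, dept d "
    "WHERE e.dept_id = d.id ORDER BY e.id"
).fetchall()

# Inner join, SQL-92 syntax: explicit JOIN ... ON; e and d are table aliases.
sql92 = conn.execute(
    "SELECT e.name, d.name FROM emp e INNER JOIN dept d "
    "ON e.dept_id = d.id ORDER BY e.id"
).fetchall()

# Left outer join: employees without a department are kept, with NULL (None).
left_outer = conn.execute(
    "SELECT e.name, d.name FROM emp e LEFT OUTER JOIN dept d "
    "ON e.dept_id = d.id ORDER BY e.id"
).fetchall()

# Self-join: emp is joined to itself to pair each employee with a manager.
self_join = conn.execute(
    "SELECT e.name, m.name FROM emp e JOIN emp m "
    "ON e.manager_id = m.id ORDER BY e.id"
).fetchall()

print(old_style, sql92, left_outer, self_join, sep="\n")
```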

VI. SQL Subqueries and Unions
Intersection with IN and EXISTS subqueries
Difference with NOT IN and NOT EXISTS subqueries
The purpose and usage of UNION and UNION ALL
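
Illustrative sketch for the subquery and union topics above, assuming Python's built-in sqlite3 module. The tables `course_a` and `course_b` and the helper `names` are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course_a (student TEXT);
    CREATE TABLE course_b (student TEXT);
    INSERT INTO course_a VALUES ('Asha'), ('Ben'), ('Cy');
    INSERT INTO course_b VALUES ('Ben'), ('Cy'), ('Dee');
""")


def names(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


# Intersection: students enrolled in both courses.
both_in = names(
    "SELECT student FROM course_a "
    "WHERE student IN (SELECT student FROM course_b) ORDER BY student"
)
both_exists = names(
    "SELECT a.student FROM course_a a WHERE EXISTS "
    "(SELECT 1 FROM course_b b WHERE b.student = a.student) ORDER BY a.student"
)

# Difference: students in course_a who are not in course_b.
only_a_not_in = names(
    "SELECT student FROM course_a "
    "WHERE student NOT IN (SELECT student FROM course_b) ORDER BY student"
)
only_a_not_exists = names(
    "SELECT a.student FROM course_a a WHERE NOT EXISTS "
    "(SELECT 1 FROM course_b b WHERE b.student = a.student) ORDER BY a.student"
)

# UNION removes duplicate rows; UNION ALL keeps every row from both queries.
union = names(
    "SELECT student FROM course_a UNION SELECT student FROM course_b ORDER BY student"
)
union_all = names(
    "SELECT student FROM course_a UNION ALL SELECT student FROM course_b"
)

print(both_in, only_a_not_in, union, len(union_all))
```

One point worth noting when comparing the two difference forms: NOT IN returns no rows if the subquery yields a NULL, while NOT EXISTS is unaffected, so the two are interchangeable only when the compared column has no NULLs.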

VII. SQL Summarization
The column functions MIN, MAX, AVG, SUM and COUNT
The GROUP BY and HAVING clauses
Grouping in combination with joining
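
Illustrative sketch for the summarization topics above, assuming Python's built-in sqlite3 module and hypothetical `dept` and `emp` tables (not part of the course material). It applies the column functions to a whole table, then combines GROUP BY and HAVING with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (name TEXT, salary INTEGER, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Analytics'), (2, 'Platform');
    INSERT INTO emp VALUES ('Asha', 1200, 1), ('Ben', 1500, 1), ('Cy', 900, 2);
""")

# Column functions over the whole table produce a single summary row.
overall = conn.execute(
    "SELECT MIN(salary), MAX(salary), AVG(salary), SUM(salary), COUNT(*) FROM emp"
).fetchone()

# Grouping combined with a join: one row per department name.
# HAVING filters groups after aggregation (here: more than one employee).
per_dept = conn.execute("""
    SELECT d.name, COUNT(*), SUM(e.salary)
    FROM emp e JOIN dept d ON e.dept_id = d.id
    GROUP BY d.name
    HAVING COUNT(*) > 1
    ORDER BY d.name
""").fetchall()

print(overall)
print(per_dept)
```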

Module 4 :
Duration: 3 weeks

Hive:
Understanding Hive concepts, loading and querying data in Hive, and Hive UDFs.

Topics :
Hive Background
Hive Use Case
About Hive
Hive vs. Pig
Hive Architecture and Components
Metastore in Hive
Limitations of Hive
Comparison with Traditional Database
Hive Data Types and Data Models
Partitions and Buckets
Hive Tables (Managed Tables and External Tables)
Importing Data
Querying Data
Managing Outputs
Hive Script
Hive UDF
Hive Demo on a Healthcare Dataset

Module 5 :
Duration: 5-6 weeks

Other technologies associated with Hadoop:


Hadoop MapReduce framework
Advanced MapReduce
Pig
Advanced Hive and data file partitioning
Apache Flume and HBase
Processing distributed data with Apache Spark
RDDs in Apache Spark
Spark SQL
