This course provides a well-grounded introduction to Hadoop and its building blocks – HDFS, MapReduce, Pig, Hive, HBase, Sqoop and Flume. Anyone with a basic understanding of programming and databases (SQL) can benefit from this course. Prior knowledge of Java is useful but not essential. As part of the lab work, students build their own development cluster and implement a wide range of technical use cases.
The environment can also be used for further learning and experimentation after the course. On completion, students should be able to perform basic tasks on a Hadoop project, building on their prior background and experience.

BENEFITS FOR INDIVIDUALS

BENEFITS FOR ORGANIZATIONS
Advanced Hive
Hive Formats & SerDes
Exercise: Working with ORC, XML & RegEx SerDes
Exercise: Optimizing Hive Queries using Partitions & Clusters
Overview of Hive Functions
Exercise: Creating User-Defined Functions (UDFs)
Introduction to HBase
Need for Low Latency Queries
Introduction to HBase & NoSQL Databases
HBase Data Model & Architecture
Exercise: Working with HBase Shell
Role of ZooKeeper
Sqoop
Using Sqoop to Extract data from MySQL
Exercises: Loading Data into HDFS, Hive & HBase in Various Formats
Flume
Flume Architecture & Data Model
Configuring Flume Agents to Build Custom Data Flows
Exercise: Ingesting Sensor Data into HDFS
Exercise: Aggregating Weblogs into HDFS
Building an End-to-End Hadoop Application
Exercise: Running HQL Queries through a JDBC Client
Exercise: Reading and Writing to HBase
Exercise: Hive-HBase Integration
Exercise: Displaying the Results on a Dashboard