About the Course
This course covers concepts such as HDFS, Hadoop clusters, and Hadoop architecture.
Who Should Take This Course?
Systems administrators, Linux administrators, Windows administrators, infrastructure engineers, big data architects, database administrators, IT managers, and mainframe professionals.
Prerequisites for This Training
This course requires no prior knowledge of Java, Hadoop cluster administration, or Apache Hadoop. Basic knowledge of Linux is necessary, as Hadoop runs on Linux.
After completing the course, you should be able to understand:
- The core technologies of Hadoop
- How to populate HDFS from external sources
- How to plan your Hadoop cluster hardware and software
- How to deploy a Hadoop cluster
- What issues to consider when installing Pig, Hive, and Impala
- What issues to consider when deploying Hadoop clients
- How Cloudera Manager can simplify Hadoop administration
- How to configure HDFS for high availability
- What issues to consider when implementing Hadoop security
- How to schedule jobs on the cluster
- How to maintain your cluster
- How to monitor, troubleshoot, and optimize the cluster
- Management and monitoring tools
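As a first taste of the "populate HDFS from external sources" outcome, the standard file-system shell copies local data into the cluster. This is a minimal sketch, assuming a running HDFS instance; the paths are illustrative:

```shell
# create a target directory in HDFS (path is illustrative)
hdfs dfs -mkdir -p /data/raw

# copy a local file into HDFS; -put reads from the local file system
hdfs dfs -put /tmp/events.log /data/raw/

# verify the file landed, along with its replication factor and size
hdfs dfs -ls /data/raw
```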
Big Data & Hadoop Introduction
Learning Objectives : In this chapter you will gain a clear understanding of what big data is and the challenges it presents, what Hadoop is and how it solves big data problems, Hadoop's components and architecture, and the history of Hadoop.
Learning Topics : What is big data?, Challenges of big data, Hadoop definition and its components, Hadoop's solutions to big data problems, The Hadoop ecosystem and its components, History of Hadoop.
Hadoop Distributed File System
Learning Objectives : In this chapter you will understand the complete HDFS architecture and how data is stored in HDFS, how reads from and writes to HDFS happen, the purpose of the Secondary NameNode, how to achieve cluster balancing and high availability, and HDFS Federation.
Learning Topics : HDFS architecture, NameNode and DataNode roles, Replication, Block placement strategy, Rack awareness, Anatomy of a file read from HDFS, Anatomy of a file write to HDFS, Secondary NameNode, Balancer, Safe mode, High Availability, HDFS Federation.
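Several of the topics above (replication, block placement, safe mode, balancing) surface directly in the HDFS admin tools. A sketch, assuming a running cluster and an existing file at the illustrative path /data/raw/events.log:

```shell
# show block locations and replication status for one file
hdfs fsck /data/raw/events.log -files -blocks -locations

# check whether the NameNode is in safe mode (its read-only startup state)
hdfs dfsadmin -safemode get

# rebalance until no DataNode deviates more than 5% from average disk usage
hdfs balancer -threshold 5
```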
MapReduce – I
Learning Objectives : In this module you will understand MapReduce framework fundamentals and data flow. You will also learn important concepts such as input splits and the RecordReader. You will get a clear picture of the life cycle of MapReduce and YARN jobs, the differences between YARN and MRv1, and the different job-scheduling algorithms.
Learning Topics : MapReduce framework fundamentals, Data flow in MapReduce, Input splits, RecordReader, Life cycle of MapReduce and YARN jobs, Differences between YARN and MRv1, Job-scheduling techniques.
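The map → shuffle/sort → reduce flow described above can be imitated locally with Unix pipes, which is also the mental model behind Hadoop Streaming: `tr` plays the mapper (one key per line), `sort` plays the shuffle, and `uniq -c` plays the reducer. A toy word count, no cluster required:

```shell
# mapper: emit one word per line; shuffle: sort groups equal keys together;
# reducer: uniq -c counts each run of identical keys
printf 'hadoop yarn hadoop\nyarn hdfs\n' | tr ' ' '\n' | sort | uniq -c
```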
MapReduce – II
Learning Objectives : In this module you will understand the different Hadoop installation modes and their configurations, and the MapReduce API used to write MapReduce programs in Java. You will also learn about the Combiner, the Partitioner, side-data distribution, map-side and reduce-side joins, counters, and Hadoop Streaming.
Learning Topics : Hadoop installation modes and cluster configurations, MapReduce data types, Functions of the Mapper and Reducer, Combiner, Partitioner, Side-data distribution using the Distributed Cache and job configurations, Map-side vs. reduce-side joins, Counters, Hadoop Streaming.
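Hadoop Streaming, mentioned above, lets any executable act as mapper and reducer over stdin/stdout. A hedged sketch of the launch command; the jar path varies by distribution and version, and mapper.sh and reducer.sh are hypothetical user-supplied scripts:

```shell
# run a streaming job with shell scripts as the mapper and reducer
hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -input  /data/raw/events.log \
  -output /data/wordcount \
  -mapper  mapper.sh \
  -reducer reducer.sh \
  -file mapper.sh -file reducer.sh   # ship the scripts to the cluster nodes
```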
Hive
Learning Objectives : This module will give you an understanding of Hive and its advantages, the Hive shell, the Hive Query Language, and the Hive metastore and its types. You will also learn the Hive API and concepts such as dynamic partitioning, bucketing, user-defined functions, and setting Hive configurations.
Learning Topics : Hive introduction, Hive shell, Hive Query Language and DDL commands, Hive metastore and its types, Hive data types, Operators, Functions, Dynamic partitioning, Bucketing, User-defined functions, Setting Hive configurations.
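Dynamic partitioning and bucketing from the topic list can be sketched as HiveQL run through the shell. The table is illustrative; `hive.exec.dynamic.partition.mode=nonstrict` allows all partition columns to be filled in dynamically at load time:

```shell
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- partitioned by date, bucketed by user id into 4 files per partition
CREATE TABLE page_views (user_id INT, url STRING)
PARTITIONED BY (view_date STRING)
CLUSTERED BY (user_id) INTO 4 BUCKETS
STORED AS ORC;
"
```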
PIG
Learning Objectives : This chapter covers an introduction to PIG and the PIG Latin language and the different ways of executing PIG scripts. You will also learn about the GRUNT shell and its commands, data processing operators, user-defined functions, macros, and some practical techniques.
Learning Topics : Introduction to PIG and PIG Latin, PIG data types, Execution types and modes of PIG scripts, The GRUNT shell and its commands, Loading, storing, and filtering data, Joins, Groups and CoGroups, Combining and splitting data, User-defined functions, Macros and practical techniques.
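The loading, filtering, and grouping operators above fit in a short PIG Latin script; running with `-x local` uses the local file system instead of HDFS. The field names and input file are hypothetical:

```shell
pig -x local <<'EOF'
-- load tab-separated records, keep errors, count occurrences per message
logs   = LOAD 'server.log' AS (level:chararray, msg:chararray);
errors = FILTER logs BY level == 'ERROR';
by_msg = GROUP errors BY msg;
counts = FOREACH by_msg GENERATE group, COUNT(errors);
DUMP counts;
EOF
```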
HCatalog
Learning Objectives : In this chapter you will learn what HCatalog is and its advantages, how to use the HCatalog API, and how to access tables from other tools, such as PIG, using HCatalog.
Learning Topics : What HCatalog is and why to use it, The HCatalog API, Accessing Hive tables from PIG using HCatalog.
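Accessing a Hive table from PIG via HCatalog, as described above, looks like the sketch below. The loader class name shown is the one used in newer Hive releases (older releases used org.apache.hcatalog.pig.HCatLoader); the table name is illustrative:

```shell
pig -useHCatalog <<'EOF'
-- read the Hive table's schema and data through HCatalog
views = LOAD 'page_views' USING org.apache.hive.hcatalog.pig.HCatLoader();
DUMP views;
EOF
```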
SQOOP and FLUME
Learning Objectives : In this module you will learn what SQOOP is and the scenarios in which to use it, and how to import and export data in Hadoop using SQOOP, with architecture and examples. You will also learn about FLUME and its configuration, the FLUME agent, the different types of channels, and fan-out, with an example of using FLUME to collect Twitter data.
Learning Topics : SQOOP: What is SQOOP?, SQOOP commands and connectors, SQOOP import and export flow with examples. FLUME: What is FLUME?, FLUME configuration, The FLUME agent, Different types of channels and fan-out, Getting Twitter data using FLUME.
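The import flow above reduces to one SQOOP command, and a FLUME agent is started against a properties file. A sketch; host names, credentials, and file names are placeholders:

```shell
# pull a relational table into HDFS with 4 parallel map tasks
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl -P \
  --table orders \
  --target-dir /data/orders \
  --num-mappers 4

# start the Flume agent named a1 that is defined in twitter.conf
flume-ng agent --conf conf --conf-file twitter.conf --name a1
```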
NoSQL and HBase
Learning Objectives : This module will give you a clear understanding of when to use NoSQL databases, the different categories of NoSQL databases, and the CAP theorem. You will learn what HBase is, its storage mechanism, its architecture, how the first read or write happens in HBase, HBase's in-depth architecture, and sharding and compaction. You will also learn to execute HBase shell commands and HBase operations using the MapReduce API.
Learning Topics : Introduction to NoSQL databases and their types, CAP theorem. HBase introduction, Storage mechanism in HBase, HBase architecture, The first read or write in HBase, HBase in-depth architecture, Sharding and compaction. Executing HBase shell commands and HBase operations through the API.
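The HBase shell commands mentioned above can be run non-interactively by piping them into `hbase shell`. Table, row, and column-family names here are illustrative:

```shell
hbase shell <<'EOF'
create 'users', 'info'                      # table with one column family
put 'users', 'row1', 'info:name', 'Ada'     # write one cell
get 'users', 'row1'                         # read a single row
scan 'users'                                # full-table scan
EOF
```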
Zookeeper
Learning Objectives : In this module you will learn Zookeeper's fundamental concepts and workflow, and how a leader is elected among ZNodes. It also covers the Zookeeper command-line interface and API, and applications where Zookeeper is used.
Learning Topics : Zookeeper fundamental concepts, Workflow, Leader-election algorithm, Zookeeper command-line interface and API, Applications using Zookeeper.
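The Zookeeper CLI from the topic list manipulates znodes directly. A sketch assuming a local ensemble on the default port; the znode path and data are illustrative:

```shell
zkCli.sh -server localhost:2181 <<'EOF'
create /app "config-v1"
get /app
ls /
EOF
```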
Oozie
Learning Objectives : In this module you will learn how to use Oozie, the different types of Oozie jobs, Oozie workflow and coordinator jobs, and the property file and bundle system. You will also learn how to combine and execute MapReduce, Hive, and PIG jobs.
Learning Topics : Introduction to Oozie, Different types of Oozie jobs, Oozie workflows as Directed Acyclic Graphs and Oozie coordinator jobs, Property file and bundle system, An example demonstrating Oozie job scheduling and combining MapReduce, Hive, and PIG jobs.
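Submitting the workflow described above usually pairs a job.properties file with the Oozie CLI. A sketch; the host names and HDFS application path are placeholders:

```shell
# job.properties (config fragment; ${nameNode} is resolved by Oozie, not the shell)
cat > job.properties <<'EOF'
nameNode=hdfs://nn-host:8020
jobTracker=rm-host:8032
oozie.wf.application.path=${nameNode}/user/etl/wordcount-wf
EOF

# submit and start the workflow against the Oozie server
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
```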