Big Data and Hadoop

Course Information

  • Course Price: $250
  • Total Students: 800+
  • Course Duration: 4 Weeks

Description

The amount of data produced on the internet rises every day, and enterprises need highly skilled Hadoop professionals to handle it. Hadoop has become one of the major forces in Big Data processing: the platform stores, manages, and retrieves huge volumes of data across diverse applications, and it also supports deep analytics. As more organizations adopt Hadoop, the demand for Hadoop Developers keeps rising.

Lay the Foundation for an Excellent Career with the Leading Big Data Hadoop Technology

Technology changes at a swift rate, and so do the demands of the job market. Understanding how Big Data Hadoop works keeps you up to date and in step with changing trends.

Benefits

  • Managing data is a challenging task, and companies need skilled people to tackle it. There is great demand for Big Data Hadoop experts in large companies
  • People who can grasp and master the components of the Hadoop ecosystem are in high demand. The sooner you learn the skill, the greater your chance of being placed in a top organization
  • Big Data Hadoop-trained professionals command high salaries; with six months to a year of experience, you can earn a lucrative package
  • The scope for growth and earnings is substantial. Learn Big Data Hadoop from LETFIX and step into your desired job

Syllabus

Session 1-Big Data Introduction
  • What is Big Data
  • Evolution of Big Data
  • Benefits of Big Data
  • Operational vs Analytical Big Data
  • Need for Big Data Analytics
  • Big Data Challenges
Session 2-Hadoop Cluster
  • Master Nodes
    • Name Node
    • Secondary Name Node
    • Job Tracker
  • Client Nodes
  • Slaves
  • Hadoop configuration
  • Setting up a Hadoop cluster
Session 3-HDFS
  • Introduction to HDFS
  • HDFS Features
  • HDFS Architecture
  • Blocks
  • Goals of HDFS
  • The Name Node & Data Node
  • Secondary Name Node
  • The Job Tracker
  • The Process of a File Read
  • The Process of a File Write
  • Data Replication
  • Rack Awareness
  • HDFS Federation
  • Configuring HDFS
  • HDFS Web Interface
  • Fault tolerance
  • Name Node failure management
  • Access HDFS from Java (see the sketch below)
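
The "Access HDFS from Java" topic above boils down to the Hadoop FileSystem API. Below is a minimal sketch of a file write followed by a file read; the Name Node address and the /user/demo path are placeholder assumptions, not course-specific values.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder address; point this at your own Name Node.
            conf.set("fs.defaultFS", "hdfs://localhost:9000");
            FileSystem fs = FileSystem.get(conf);

            Path file = new Path("/user/demo/hello.txt");

            // Write: the client asks the Name Node for target Data Nodes,
            // then streams the block data to them directly.
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("Hello, HDFS!");
            }

            // Read: the Name Node returns block locations; the client
            // reads each block from the nearest Data Node.
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }
            fs.close();
        }
    }
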
Session 4-Yarn
  • Introduction to Yarn
  • Why Yarn
  • Classic MapReduce v/s Yarn
  • Advantages of Yarn
  • Yarn Architecture
    • Resource Manager
    • Node Manager
    • Application Master
  • Application submission in YARN
  • Node Manager containers
  • Resource Manager components
  • Yarn applications
  • Scheduling in Yarn
    • Fair Scheduler
    • Capacity Scheduler
  • Fault tolerance
Session 5-MapReduce
  • What is MapReduce
  • Why MapReduce
  • How MapReduce works
  • Difference between Hadoop 1 & Hadoop 2
  • Identity mapper & reducer
  • Data flow in MapReduce
  • Input Splits
  • Relation Between Input Splits and HDFS Blocks
  • Flow of Job Submission in MapReduce
  • Job submission & Monitoring
  • MapReduce algorithms
    • Sorting
    • Searching
    • Indexing
    • TF-IDF
Session 6-Hadoop Fundamentals
  • What is Hadoop
  • History of Hadoop
  • Hadoop Architecture
  • Hadoop Ecosystem Components
  • How does Hadoop work
  • Why Hadoop & Big Data
  • Hadoop Cluster introduction
  • Cluster Modes
    • Standalone
    • Pseudo-distributed
    • Fully-distributed
  • HDFS Overview
  • Introduction to MapReduce
  • Hadoop in demand
Session 7-HDFS Operations
  • Starting HDFS
  • Listing files in HDFS (see the sketch after this list)
  • Writing a file into HDFS
  • Reading data from HDFS
  • Shutting down HDFS
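
Each operation above has a shell equivalent (start-dfs.sh and stop-dfs.sh for starting and shutting down, and hdfs dfs -ls, -put, and -cat for listing, writing, and reading). The same steps can also be done in Java; a minimal sketch of the listing step, with /user/demo as an assumed example directory:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsList {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // List the contents of an example directory (placeholder path).
            for (FileStatus status : fs.listStatus(new Path("/user/demo"))) {
                System.out.printf("%s\t%d bytes%n", status.getPath(), status.getLen());
            }
        }
    }
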
Session 8-HDFS Command Reference
  • Listing contents of directory
  • Displaying and printing disk usage
  • Moving files & directories
  • Copying files and directories
  • Displaying file contents
Session 9-Java Overview for Hadoop
  • Object oriented concepts
  • Variables and Data types
  • The static keyword
  • Primitive data types
  • Objects & Classes
  • Java Operators
  • Method and its types
  • Constructors
  • Conditional statements
  • Looping in Java
  • Access Modifiers
  • Inheritance
  • Polymorphism
  • Method overloading & overriding
  • Interfaces
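
As a compact refresher tying several of these topics together (classes, constructors, access modifiers, inheritance, interfaces, overriding, and polymorphism), here is a small illustrative sketch; all class and method names are invented for the example.

    // A tiny demonstration of the OOP concepts listed above.
    interface Greeter {                       // interface
        String greet();                       // abstract method
    }

    class Person implements Greeter {         // class implementing an interface
        protected final String name;          // field with an access modifier

        Person(String name) {                 // constructor
            this.name = name;
        }

        @Override
        public String greet() {               // method overriding
            return "Hello, " + name;
        }
    }

    class Student extends Person {            // inheritance
        Student(String name) { super(name); }

        @Override
        public String greet() {
            return super.greet() + ", welcome to class";
        }
    }

    public class OopDemo {
        public static void main(String[] args) {
            Greeter g = new Student("Asha");  // polymorphic reference
            System.out.println(g.greet());
        }
    }
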
Session 10-MapReduce Programming
  • Hadoop data types
  • The Mapper Class
    • Map method
  • The Reducer Class
    • Shuffle Phase
    • Sort Phase
    • Secondary Sort
    • Reduce Phase
  • The Job classes
    • Job class constructor
  • Job Context interface
  • Combiner Class
    • How Combiner works
    • Record Reader
    • Map Phase
    • Combiner Phase
    • Reducer Phase
    • Record Writer
  • Partitioners
    • Input Data
    • Map Tasks
    • Partitioner Task
    • Reduce Task
    • Compilation & Execution
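
The classic word-count program shows how the pieces of this session fit together: the Mapper emits (word, 1) pairs, a Combiner pre-aggregates them on the map side, and the Reducer sums the final counts, all wired up through the Job class. A minimal sketch (input and output paths come from the command line):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // Map phase: emit (word, 1) for every token in the input line.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce phase: sum the counts per word (also reused as the combiner).
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) sum += val.get();
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);  // combiner phase
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
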

Hadoop Ecosystem

Session 11-Pig
  • What is Apache Pig?
  • Why Apache Pig?
  • Pig features
  • Where should Pig be used
  • Where not to use Pig
  • The Pig Architecture
  • Pig components
  • Pig v/s MapReduce
  • Pig v/s SQL
  • Pig v/s Hive
  • Pig Installation
  • Pig Execution Modes & Mechanisms
  • Grunt Shell Commands
  • Pig Latin – Data Model
  • Pig Latin Statements
  • Pig data types
  • Pig Latin operators
  • Case Sensitivity
  • Grouping & Co-Grouping in Pig Latin
  • Sorting & Filtering
  • Joins in Pig Latin
  • Built-in Functions
  • Writing UDFs
  • Macros in Pig
Session 12-HBase
  • What is HBase
  • History of HBase
  • The NoSQL Scenario
  • HBase & HDFS
  • Physical Storage
  • HBase v/s RDBMS
  • Features of HBase
  • HBase Data model
  • Master server
  • Region servers & Regions
  • HBase Shell
  • Create table and column family
  • The HBase Client API (see the sketch below)
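
A minimal sketch of the HBase client API, writing and then reading back a single cell. The 'users' table and its 'info' column family are assumed to exist already (e.g. created beforehand in the HBase shell):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("users"))) {

                // Put: write one cell into the 'info' column family.
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Asha"));
                table.put(put);

                // Get: read the cell back by row key.
                Result result = table.get(new Get(Bytes.toBytes("row1")));
                byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
                System.out.println(Bytes.toString(value));
            }
        }
    }
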
Session 13-Spark
  • Introduction to Apache Spark
  • Features of Spark
  • Spark built on Hadoop
  • Components of Spark
  • Resilient Distributed Datasets
  • Data Sharing using Spark RDD
  • Iterative Operations on Spark RDD
  • Interactive Operations on Spark RDD
  • Spark shell
  • RDD transformations
  • Actions
  • Programming with RDD (see the sketch below)
    • Start Shell
    • Create RDD
    • Execute Transformations
    • Caching Transformations
    • Applying Action
    • Checking output
  • GraphX overview
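
The "Programming with RDD" steps above can be previewed with Spark's Java API (Scala, covered in Session 22, is equally common). A minimal sketch; the local[*] master and the sample numbers are assumptions for a standalone run:

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class RddDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("RddDemo").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Create an RDD from an in-memory collection.
            JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));

            // Transformations are lazy: nothing executes yet.
            JavaRDD<Integer> squares = numbers.map(x -> x * x).filter(x -> x > 4);

            // Cache the transformed RDD so both actions below reuse it.
            squares.cache();

            // Actions trigger execution.
            System.out.println("count = " + squares.count());
            System.out.println("sum   = " + squares.reduce(Integer::sum));

            sc.stop();
        }
    }
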
Session 14-Impala
  • Introducing Cloudera Impala
  • Impala Benefits
  • Features of Impala
  • Relational databases vs Impala
  • How Impala works
  • Architecture of Impala
  • Components of Impala
    • The Impala Daemon
    • The Impala Statestore
    • The Impala Catalog Service
  • Query Processing Interfaces
  • Impala Shell Command Reference
  • Impala Data Types
  • Creating & deleting databases and tables
  • Inserting & overwriting table data
  • Record Fetching and ordering
  • Grouping records
  • Using the Union clause
  • Working of Impala with Hive
  • Impala v/s Hive v/s HBase
Session 15-MongoDB Overview
  • Introduction to MongoDB
  • MongoDB v/s RDBMS
  • Why & Where to use MongoDB
  • Databases & Collections
  • Inserting & querying documents
  • Schema Design
  • CRUD Operations (see the sketch below)
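
A minimal sketch of the CRUD operations using the MongoDB Java driver; the connection string, the 'school' database, and the 'students' collection are placeholders for illustration:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.Filters;
    import com.mongodb.client.model.Updates;
    import org.bson.Document;

    public class MongoCrudDemo {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> students =
                        client.getDatabase("school").getCollection("students");

                // Create: insert one document.
                students.insertOne(new Document("name", "Asha").append("course", "Hadoop"));

                // Read: query by field value.
                Document found = students.find(Filters.eq("name", "Asha")).first();
                System.out.println(found);

                // Update: change one field.
                students.updateOne(Filters.eq("name", "Asha"), Updates.set("course", "Spark"));

                // Delete: remove the document.
                students.deleteOne(Filters.eq("name", "Asha"));
            }
        }
    }
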
Session 16-Oozie & Hue Overview
  • Introduction to Apache Oozie
  • Oozie Workflow
  • Oozie Coordinators
  • Property File
  • Oozie Bundle system
  • CLI and extensions
  • Overview of Hue
Session 17-Hive
  • What is Hive?
  • Features of Hive
  • The Hive Architecture
  • Components of Hive
  • Installation & configuration
  • Primitive types
  • Complex types
  • Built in functions
  • Hive UDFs
  • Views & Indexes
  • Hive Data Models
  • Hive vs Pig
  • Co-groups
  • Importing data
  • Hive DDL statements
  • Hive Query Language
  • Data types & Operators
  • Type conversions
  • Joins
  • Sorting & controlling data flow
  • local vs MapReduce mode
  • Partitions
  • Buckets
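
HiveQL itself is covered hands-on in this session; from Java, the same statements can be issued through Hive's JDBC driver. A minimal sketch, assuming a HiveServer2 at a placeholder address and an example 'pageviews' table:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcDemo {
        public static void main(String[] args) throws Exception {
            // Register the Hive JDBC driver; the URL is a placeholder.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://localhost:10000/default";

            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement()) {

                // DDL: create an example managed table.
                stmt.execute("CREATE TABLE IF NOT EXISTS pageviews (url STRING, hits INT)");

                // HiveQL query: executed as a distributed job on the cluster.
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT url, SUM(hits) FROM pageviews GROUP BY url")) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                    }
                }
            }
        }
    }
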
Session 18-Sqoop
  • Introducing Sqoop
  • Sqoop installation
  • Working of Sqoop
  • Understanding connectors
  • Importing data from MySQL to Hadoop HDFS
  • Selective imports
  • Importing data to Hive
  • Importing to HBase
  • Exporting data to MySQL from Hadoop
  • Controlling import process
Session 19-Flume
  • What is Flume?
  • Applications of Flume
  • Advantages of Flume
  • Flume architecture
  • Data flow in Flume
  • Flume features
  • Flume Event
  • Flume Agent
    • Sources
    • Channels
    • Sinks
  • Log Data in Flume
Session 20-Zookeeper Overview
  • Zookeeper Introduction
  • Distributed Application
  • Benefits of Distributed Applications
  • Why use Zookeeper
  • Zookeeper Architecture
  • Hierarchical Namespace
  • Znodes
  • Stat structure of a Znode
  • Electing a leader
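
A minimal sketch of creating a znode and reading back its data and Stat structure with the ZooKeeper Java client; the ensemble address and the /demo path are placeholders:

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class ZkDemo {
        public static void main(String[] args) throws Exception {
            CountDownLatch connected = new CountDownLatch(1);
            ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {
                if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            });
            connected.await();  // wait until the session is established

            // Create a persistent znode in the hierarchical namespace.
            zk.create("/demo", "hello".getBytes(StandardCharsets.UTF_8),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

            // Read it back along with its Stat structure.
            Stat stat = new Stat();
            byte[] data = zk.getData("/demo", false, stat);
            System.out.println(new String(data, StandardCharsets.UTF_8)
                    + " (version " + stat.getVersion() + ")");

            zk.close();
        }
    }
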
Session 21-Kafka Basics
  • Messaging Systems
    • Point-to-Point
    • Publish-Subscribe
  • What is Kafka
  • Kafka Benefits
  • Kafka Topics & Logs
  • Partitions in Kafka
  • Brokers
  • Producers & Consumers
  • What are Followers
  • Kafka Cluster Architecture
  • Kafka as a Pub-Sub Messaging System
  • Kafka as a Queue Messaging System
  • Role of Zookeeper
  • Basic Kafka Operations
    • Creating a Kafka Topic
    • Listing out topics
    • Starting Producer
    • Starting Consumer
    • Modifying a Topic
    • Deleting a Topic
  • Integration with Spark
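
A minimal producer sketch with Kafka's Java client; the broker address and the 'demo-topic' name are placeholders (the topic itself would be created first, as in the "Creating a Kafka Topic" item above):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Messages with the same key always land in the same partition.
                producer.send(new ProducerRecord<>("demo-topic", "key1", "hello, kafka"));
                producer.flush();
            }
        }
    }
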
Session 22-Scala Basics
  • Introduction to Scala
  • Spark & Scala interdependence
  • Objects & Classes
  • Class definition in Scala
  • Creating Objects
  • Scala Traits
  • Basic Data Types
  • Operators in Scala
  • Control structures
  • Fields in Scala
  • Functions in Scala
  • Collections in Scala
    • Mutable collection
    • Immutable collection
LETFIX Technologies