Course: Big Data Using Hadoop Ecosystem | Timmins

Course Content

Day 1 – Big Data Concepts, HDFS, Sqoop

Overview of Hadoop and Big Data use cases

HDFS Architecture and File Operations

Sqoop for RDBMS-Hadoop data transfer

Labs: HDFS commands, Sqoop import/export jobs

Day 2 – MapReduce, Oozie, YARN, Cluster Planning
MapReduce algorithms and Java-based job execution

Job orchestration with Oozie

YARN Architecture and Performance Tuning

Cluster sizing and planning

Labs: MapReduce via Eclipse, Oozie workflows, YARN tuning

Day 3 – ETL with Pig and Hive
Data Transformation and Analytics using Pig

Hive SQL: Partitions, SerDe, Table formats

Labs: Yahoo Finance data processing with Hive/Pig

Day 4 – Data Format & Compression, HBase
Choosing optimal file formats and compression codecs

HBase architecture and REST API

Hive-HBase integration and bulk loading

Labs: Data format transformations, HBase table operations

Day 5 – Kafka and Apache Spark
Kafka Architecture, Producers/Consumers

Multi-node Kafka setup and integration

Spark Architecture: RDDs, DAG, Streaming

Labs: Kafka ingestion, Spark processing of SFPD crime data
