HRDC Reg. No: 10001548281
Duration: 3 Days (24 hours)
Course Overview
This hands-on course is designed for Data Engineers and Big Data Architects who want to adopt Apache Ozone, a high-performance, scalable object store positioned as a next-generation replacement for HDFS. Participants explore Ozone's architecture, cluster setup, security, and integrations with big data tools (Hadoop, Hive, Spark, Kafka, Flink), then deploy real-world data pipelines. Labs and a capstone project reinforce each learning objective.
Who Should Attend
Targeted Industries
Why Choose This Course
HRDC Claimable – [TBD]
Future-proof your data architecture with hands-on training in Apache Ozone, a cloud-native object store compatible with HDFS and S3, and optimized for next-gen data lakes and hybrid big data platforms.
Learning Outcomes
By the end of this course, participants will be able to:
- Understand the evolution and advantages of Apache Ozone over HDFS and S3-compatible object stores
- Deploy and manage multi-node Ozone clusters
- Perform volume and object-level operations using CLI and APIs
- Tune performance and enable high availability configurations
- Secure Ozone using Kerberos, Ranger, and TLS
- Integrate Ozone with Spark, Hive, Flink, Kafka, and Presto
- Implement enterprise-grade data lake pipelines using Ozone
Prerequisites
- Basic knowledge of Big Data and distributed storage systems
- Familiarity with HDFS, YARN, and Linux file operations
- Understanding of Hive, SQL, and object stores such as AWS S3 and MinIO
- Basic CLI skills and networking concepts (SSH, ports)
- Experience with Spark or Flink (recommended)
Lab Setup
Pre-configured Lab Environment (provided):
- 3-node Ozone + Hadoop cluster
- Hive, Spark, Flink, Kafka pre-installed
- Configured with Kerberos, Ranger, Prometheus, Grafana
- Sample datasets for log, financial, and clickstream analysis
Manual Deployment Requirements:
- RAM: 16 GB per node
- vCPUs: 4
- OS: Linux (Ubuntu/CentOS)
- Java: 8 or 11
- Access: SSH, sudo, open ports for Web UI/S3
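Participants preparing their own nodes may find it useful to check them against the requirements above before the course. The following is a minimal sketch, not part of the official course material: the `NodeSpec` class and `check_node` function are hypothetical names introduced here for illustration, the values are entered by hand rather than detected from the machine, and treating Ubuntu and CentOS as the only accepted distributions is an assumption based on the list above.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the Manual Deployment Requirements list.
MIN_RAM_GB = 16
MIN_VCPUS = 4
SUPPORTED_JAVA = {8, 11}
# Assumption: only the two distributions named in the list are accepted.
SUPPORTED_OS = {"ubuntu", "centos"}


@dataclass
class NodeSpec:
    """Hypothetical description of one node, filled in by hand."""
    ram_gb: float
    vcpus: int
    java_major: int
    os_family: str


def check_node(spec: NodeSpec) -> List[str]:
    """Return a list of unmet requirements; an empty list means the node passes."""
    problems = []
    if spec.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {spec.ram_gb} GB is below {MIN_RAM_GB} GB")
    if spec.vcpus < MIN_VCPUS:
        problems.append(f"{spec.vcpus} vCPUs is below {MIN_VCPUS}")
    if spec.java_major not in SUPPORTED_JAVA:
        problems.append(f"Java {spec.java_major} is not 8 or 11")
    if spec.os_family.lower() not in SUPPORTED_OS:
        problems.append(f"OS '{spec.os_family}' is not Ubuntu or CentOS")
    return problems


if __name__ == "__main__":
    node = NodeSpec(ram_gb=16, vcpus=4, java_major=11, os_family="Ubuntu")
    issues = check_node(node)
    print("Node OK" if not issues else "\n".join(issues))
```

The sketch does not cover SSH, sudo, or port access, which need to be verified on the machine itself.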
Teaching Methodology
- Instructor-led conceptual and practical sessions
- Scenario-based labs and integrations
- Real-world use cases
- End-of-course capstone project and certification quiz