Section outline

    • Iceberg vs Hive, Delta, Hudi: benefits and architecture

    • Table structure: snapshots, manifests, schema evolution

    • Setup: Iceberg with Spark, Flink, Trino, Hive Metastore, Glue, Nessie

    • Lab: CRUD on Iceberg tables, metadata inspection

    • Partition evolution, time travel, snapshot expiration

    • Performance tuning: file pruning, compaction, predicate pushdown

    • Merge-on-read vs copy-on-write

    • Lab: Optimize Iceberg tables, snapshot rollback, S3/GCS/Azure storage integration

    • Flink & Spark streaming ingestion, upserts, CDC

    • Apache Ranger: row-level security, encryption, masking

    • Lab: Kafka → Flink → Iceberg → Trino pipeline

    • Capstone: Migrating from Hive to Iceberg; deployment best practices

    • Quiz and certification