Section outline

    • Spark DAG, Shuffle, Stages, Job Metrics

    • Performance Tuning: Memory, Executors, Caching

    • Setting up Spark on YARN, Kubernetes

    • Intro to DataFrames, Catalyst, Tungsten

    • Lab: Metrics, Caching, DataFrame Operations