junior · Apache Spark
What is a DAG in Spark?
Updated May 5, 2026
Short answer
A DAG (Directed Acyclic Graph) is Spark's representation of the logical execution plan of a job: the chain of transformations needed to compute a result.
Deep explanation
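Each node in the graph represents an RDD, and each edge represents a transformation that derives one RDD from another. 'Directed' means every edge points one way, from a parent RDD to the RDD computed from it, and 'Acyclic' means there are no loops: no RDD can depend on itself, directly or indirectly.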
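The node-and-edge picture can be sketched in plain Python. This is a toy model, not Spark's API: the dataset names are made up for illustration, and the standard-library `graphlib` module stands in for the ordering and cycle checks. Each key is a node, and its set of parents gives the incoming edges.

```python
from graphlib import TopologicalSorter, CycleError

# Toy lineage (hypothetical names, not Spark objects): each key is a
# dataset (node); its value is the set of parent datasets it is derived
# from (incoming edges).
lineage = {
    "users_raw": set(),
    "orders_raw": set(),
    "active_users": {"users_raw"},                # produced by a filter
    "recent_orders": {"orders_raw"},              # produced by a filter
    "joined": {"active_users", "recent_orders"},  # produced by a join
}


def execution_order(graph):
    """Return one valid order in which the nodes could be computed."""
    return list(TopologicalSorter(graph).static_order())


def has_cycle(graph):
    """True if the graph contains a loop, so no valid order exists."""
    try:
        execution_order(graph)
        return False
    except CycleError:
        return True


order = execution_order(lineage)
```

Because the edges are directed, every parent appears before its child in `order`. Because the graph is acyclic, such an order always exists; a graph like `{"a": {"b"}, "b": {"a"}}` has no valid order, and `has_cycle` reports it.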
Real-world example
In the Spark UI, the DAG visualization for a job shows how a join operation depends on two different filter operations, one on each input.
Common mistakes
- Assuming Spark executes code line-by-line. It actually builds a graph and executes it when an action is called.
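The difference between building the graph and executing it can be shown with a small pure-Python sketch. `LazyDataset` is a hypothetical stand-in for an RDD, written only for illustration. It borrows the method names `map`, `filter`, and `collect` but does not use Spark itself, and its `log` attribute is an assumption added to make the execution visible.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical class, not part of Spark)."""

    def __init__(self, source, parent=None, op=None, log=None):
        self._source = source   # only set on the root node
        self._parent = parent   # edge back to the upstream node
        self._op = op           # function applied to the parent's rows
        # Shared list recording which steps have actually executed.
        self.log = log if log is not None else []

    def map(self, fn):
        # Transformation: records a new node in the graph, runs nothing.
        return LazyDataset(None, self, lambda rows: [fn(r) for r in rows], self.log)

    def filter(self, pred):
        # Transformation: records a new node in the graph, runs nothing.
        return LazyDataset(None, self, lambda rows: [r for r in rows if pred(r)], self.log)

    def collect(self):
        # Action: walk up the lineage and run the recorded steps.
        if self._parent is None:
            self.log.append("read source")
            return list(self._source)
        rows = self._parent.collect()
        self.log.append("run transformation")
        return self._op(rows)


numbers = LazyDataset(range(10))
evens_squared = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)
steps_before_action = list(numbers.log)  # still empty: nothing has run yet
result = evens_squared.collect()         # the action triggers execution
```

After the `filter` and `map` lines the log is still empty, which mirrors what happens in Spark: those lines only extend the graph. The work happens inside `collect`, the action.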
Follow-up questions
- What triggers the DAG execution?