What is a DAG in Spark?

Updated May 5, 2026

Short answer

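A DAG (Directed Acyclic Graph) is Spark's representation of the logical execution plan of a job: the graph of RDDs and the transformations connecting them, which Spark builds before it runs anything.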

Deep explanation

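Each node in the graph represents an RDD, and each edge represents a transformation that derives one RDD from another. 'Directed' means every edge points one way, from a parent RDD to the RDD computed from it. 'Acyclic' means there are no loops: no RDD can depend, directly or indirectly, on itself.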

Real-world example

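Opening the DAG visualization in the Spark UI for a job that joins two filtered datasets. The join has two incoming branches, one from each filter operation, which shows that both filters must be computed before the join can run.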
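
The dependency shape described above can be sketched without Spark at all. The block below is a toy model in plain Python, not Spark's API: the dataset names (`orders_raw`, `users_filtered`, and so on) and the helper functions are hypothetical, chosen only to mirror a join that depends on two filters. It uses the standard-library `graphlib` module (Python 3.9+) to derive a valid computation order and to show why a cycle would make the plan impossible to run.

```python
from graphlib import TopologicalSorter, CycleError

# Toy lineage, not Spark's API: each key is a hypothetical RDD name and
# each value is the set of parent RDDs it is derived from.
lineage = {
    "orders_raw": set(),
    "users_raw": set(),
    "orders_filtered": {"orders_raw"},      # filter on orders
    "users_filtered": {"users_raw"},        # filter on users
    "joined": {"orders_filtered", "users_filtered"},  # join of both filters
}


def execution_order(graph):
    """Return one valid order in which the nodes could be computed."""
    return list(TopologicalSorter(graph).static_order())


def has_cycle(graph):
    """Return True if the graph contains a loop and so cannot be ordered."""
    try:
        execution_order(graph)
        return False
    except CycleError:
        return True


order = execution_order(lineage)
print(order)  # parents always appear before the nodes that depend on them
print(has_cycle({"a": {"b"}, "b": {"a"}}))  # a graph with a loop
```

In this sketch both filter nodes always come before `joined`, which is the same constraint the Spark UI diagram makes visible. The two filter branches do not depend on each other, so their relative order is not fixed.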

Common mistakes

  • Assuming Spark executes code line-by-line. It actually builds a graph as transformations are declared and executes it only when an action is called.
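
The difference between declaring transformations and running them can be illustrated with a small stand-in. This is a minimal sketch in plain Python, assuming a hypothetical `LazyDataset` class that is not part of Spark: `map` and `filter` only record a step, and `collect` plays the role of an action that finally runs the recorded plan. It models a linear chain only, which is a simplification of a full DAG.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)  # recorded transformations, not yet run

    def map(self, fn):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self):
        # Action: only now is the recorded plan executed.
        rows = list(self._source)
        for kind, fn in self._steps:
            if kind == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows


calls = []  # records every time the mapped function actually runs


def double(x):
    calls.append(x)
    return x * 2


plan = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: nothing has executed yet
result = plan.collect()           # the action triggers execution
print(calls_before_action, result, len(calls))
```

Building `plan` never calls `double`. The function runs only once `collect` is invoked, which is the behavior the line-by-line assumption gets wrong.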

Follow-up questions

  • What triggers the DAG execution?
