Explain the difference between Transformations and Actions.

Updated May 5, 2026

Short answer

Transformations lazily define a new dataset from an existing one; nothing runs when they are called. Actions trigger execution of the recorded transformations and either return a result to the driver or write output to storage.

Deep explanation

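Transformations (like map, filter) are lazy: Spark only records them in the dataset's lineage, building up a DAG of operations. Actions (like collect, count, saveAsTextFile) submit a job; the DAG scheduler then splits the recorded operations into stages and tasks and executes them.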
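The record-then-execute behaviour can be sketched in plain Python. This is a toy model, not Spark's API: the LazyDataset class, its internals, and the double helper are hypothetical names introduced only for illustration.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, source, steps=()):
        self._source = source        # the input records
        self._steps = tuple(steps)   # recorded (kind, function) pairs: the "lineage"

    # Transformations: return a new dataset and do no work.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    # Shared by the actions: replay the recorded steps over every record.
    def _run(self):
        for record in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    record = fn(record)
                elif not fn(record):   # a filter step rejected the record
                    keep = False
                    break
            if keep:
                yield record

    # Actions: trigger execution and return a result.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # every value the mapped function actually processed


def double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)   # still 0: only the plan exists
result = ds.collect()              # the action replays the plan: [6, 8]
```

In this sketch every action replays the recorded steps from the source, which loosely mirrors how Spark recomputes an uncached dataset each time an action is called on it.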

Real-world example

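Filtering a 1 TB dataset: without lazy evaluation, Spark would have to load the whole dataset before filtering it. Because the filter is only recorded until an action runs, Spark sees the full plan first and, where the data source and layout support it (for example, a table partitioned on the filter column), can read only the relevant partitions.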
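A rough illustration of why knowing the filter before reading matters, again as a plain-Python toy rather than Spark code. The PARTITIONS layout (one "file" per day), the read_partition helper, and both filter functions are assumptions made up for this sketch; whether real Spark skips data this way depends on the source format and how the data is partitioned.

```python
# Hypothetical layout: one partition per day, each holding (day, value) rows.
PARTITIONS = {
    "2026-05-01": [("2026-05-01", 10), ("2026-05-01", 20)],
    "2026-05-02": [("2026-05-02", 30)],
    "2026-05-03": [("2026-05-03", 40), ("2026-05-03", 50)],
}

loaded = []  # which partitions were actually read


def read_partition(day):
    loaded.append(day)
    return PARTITIONS[day]


def eager_filter(day_wanted):
    """Load everything first, then filter."""
    rows = [row for day in PARTITIONS for row in read_partition(day)]
    return [row for row in rows if row[0] == day_wanted]


def planned_filter(day_wanted):
    """Inspect the filter before reading, so non-matching partitions are skipped."""
    days = [day for day in PARTITIONS if day == day_wanted]
    return [row for day in days for row in read_partition(day)]


eager_rows = eager_filter("2026-05-02")
eager_loads = list(loaded)       # all three partitions were read

loaded.clear()
planned_rows = planned_filter("2026-05-02")
planned_loads = list(loaded)     # only the matching partition was read
```

Both functions return the same rows; the difference is how much data had to be read to produce them.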

Common mistakes

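  • Thinking a transformation has finished (or succeeded) just because the line of code ran without error. Nothing executes until an action is called, so failures in a transformation surface at the action.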
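Python's built-in map is also lazy, so it gives a small analogy for this mistake without needing a cluster. This is only an analogy, not Spark code; the parse function and the sample data are made up for illustration.

```python
def parse(value):
    return int(value)  # raises ValueError on non-numeric input


raw = ["1", "2", "oops", "4"]

parsed = map(parse, raw)    # "transformation": returns immediately, parses nothing
ran_without_error = True    # reaching this line says nothing about the data

try:
    total = sum(parsed)     # "action": forces evaluation, so the bad record fails here
except ValueError as exc:
    total = None
    error_message = str(exc)
```

The line that defines parsed runs cleanly even though the input contains a bad record; the error only appears once something consumes the result.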

Follow-up questions

  • What is a Narrow Transformation?
  • What is a Wide Transformation?
