Junior | Apache Spark
Explain the difference between Transformations and Actions.
Updated May 5, 2026
Short answer
Transformations define a new dataset from an existing one and are evaluated lazily, while Actions trigger execution and either return a result to the driver or write output to storage.
Deep explanation
Transformations (like map, filter) are lazy: Spark only records them in the lineage graph (DAG) and returns immediately. Actions (like collect, count, saveAsTextFile) submit a job; the DAG scheduler then breaks the recorded operations into stages and executes them.
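The record-then-execute behavior can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not Spark's API: the class LazyDataset and the helper double are hypothetical names invented for this example, and a real Spark job also involves partitioning, stages, and a cluster.

```python
class LazyDataset:
    """Toy stand-in for a Spark dataset: transformations are recorded, not run."""

    def __init__(self, source, steps=()):
        self._source = source        # any re-iterable collection of records
        self._steps = tuple(steps)   # recorded (kind, function) pairs

    # Transformations: return a new LazyDataset and do no work.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    # Shared execution path used by every action.
    def _run(self):
        for record in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    record = fn(record)
                elif not fn(record):   # kind == "filter" and predicate failed
                    keep = False
                    break
            if keep:
                yield record

    # Actions: run the recorded steps and return a result.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every time the mapped function actually runs


def double(x):
    calls.append(x)
    return x * 2


pipeline = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)   # 0: building the pipeline ran nothing
result = pipeline.collect()        # the action executes the recorded steps
```

After collect, result is [6, 8] and double has been called once per input record. Before the action, it had not been called at all.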
Real-world example
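Filtering a 1 TB dataset: if every step ran eagerly, Spark would have to load the whole dataset before applying the filter. Because evaluation is lazy, Spark sees the filter before it reads anything, and with a source that supports it (for example, partitioned Parquet) it can skip partitions that cannot match.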
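A small plain-Python sketch of why knowing the filter up front helps. The partition layout, the read_partition helper, and the two filter functions are all hypothetical, and the data is made up for illustration. In Spark itself, whether partitions are skipped depends on the data source and how the data is laid out.

```python
# Hypothetical dataset stored as one partition per day.
partitions = {
    "2024-01-01": [{"day": "2024-01-01", "amount": 10},
                   {"day": "2024-01-01", "amount": 25}],
    "2024-01-02": [{"day": "2024-01-02", "amount": 7}],
    "2024-01-03": [{"day": "2024-01-03", "amount": 40}],
}

partitions_read = []  # log of which partitions were actually loaded


def read_partition(key):
    partitions_read.append(key)
    return partitions[key]


def eager_filter(day):
    """Load every partition first, then filter the rows."""
    rows = [row for key in partitions for row in read_partition(key)]
    return [row for row in rows if row["day"] == day]


def planned_filter(day):
    """Filter is known before reading, so non-matching partitions are skipped."""
    return [row for key in partitions if key == day
            for row in read_partition(key)]


eager_rows = eager_filter("2024-01-02")
eager_reads = list(partitions_read)

partitions_read.clear()
planned_rows = planned_filter("2024-01-02")
planned_reads = list(partitions_read)
```

Both approaches return the same rows, but the eager version reads all three partitions while the planned version reads only one.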
Common mistakes
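- Thinking a transformation has finished just because the line of code ran without error. Nothing executes until an action runs, so bad data or a failing function only surfaces at that point.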
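The mistake above can be reproduced without Spark, using Python's built-in map, which is also lazy. This is an analogy only: the variable names are illustrative, and Spark would report the failure through a failed job rather than a plain ValueError.

```python
raw = ["1", "2", "oops"]

# "Transformation": this line succeeds even though "oops" cannot be parsed,
# because the built-in map is lazy and has not touched the data yet.
parsed = map(int, raw)

# "Action": materializing the result is what actually runs int() on each item,
# so this is where the bad record finally fails.
try:
    values = list(parsed)
    error = None
except ValueError as exc:
    values = None
    error = exc
```

The line that defines parsed gives no hint of the problem. The error appears only when the result is materialized, which is the same place to look when a Spark job fails at an action.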
Follow-up questions
- What is a Narrow Transformation?
- What is a Wide Transformation?