What are Accumulators and Broadcast Variables?

Updated May 5, 2026

Short answer

Broadcast variables are read-only shared variables cached on each executor; accumulators are shared variables that tasks can only add to and only the driver can read, which makes them suited to counters and sums.

Deep explanation

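Broadcast variables let the driver ship a large read-only value to each executor once, where it is cached and shared by every task that runs there, instead of sending a copy with every task. Accumulators let tasks 'add' to a variable whose total is visible on the driver (e.g., counting bad records).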

Real-world example

Broadcasting a lookup table so that every task can resolve keys locally, and using an accumulator to count how many records in a file failed parsing.
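The example above can be sketched in PySpark roughly as follows. This is a minimal illustration, not a reference implementation: the 'user,country_code' line format, the country table, the sample lines, and the names parse_line and main are all assumptions made up for the sketch. Only sc.broadcast, sc.accumulator, .value, and .add are Spark's own API.

```python
def parse_line(line, countries, bad_records):
    """Parse a hypothetical 'user,country_code' line.

    Returns (user, country_name), or None after counting the line as bad.
    `countries` is a plain dict; `bad_records` is anything with add().
    """
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 2 or parts[1] not in countries:
        bad_records.add(1)  # tasks only add; they never read the total
        return None
    return parts[0], countries[parts[1]]


def main():
    # Imported here so parse_line stays usable without a Spark install.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[2]")
        .appName("broadcast-accumulator-sketch")
        .getOrCreate()
    )
    sc = spark.sparkContext

    # Lookup table (made-up data), shipped to each executor once.
    countries = sc.broadcast({"US": "United States", "DE": "Germany"})
    # Counter that tasks add to and the driver reads.
    bad_records = sc.accumulator(0)

    lines = sc.parallelize(["ann,US", "bob,DE", "garbage", "cy,XX"])
    parsed = lines.map(
        # The closure captures the Broadcast handle; .value is read in the task.
        lambda line: parse_line(line, countries.value, bad_records)
    ).filter(lambda row: row is not None)

    rows = parsed.collect()  # the action runs the tasks; until then the count is 0
    print(rows)
    print("bad records:", bad_records.value)  # read on the driver only

    spark.stop()
```

Calling main() on a machine with PySpark installed runs the job locally; with the sample lines above, two of the four lines should be counted as bad. Because the update happens inside map (a transformation), a retried task or a recomputed stage may add to the counter more than once, so a count gathered this way is best treated as a diagnostic rather than an exact figure.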

Common mistakes

  • Reading an accumulator value inside a task (only the driver can read it).

Follow-up questions

  • What happens to an accumulator during a task retry?
