mid | Hadoop
Explain the Hadoop job execution lifecycle in detail
Updated May 16, 2026
Short answer
A Hadoop job goes through submission, scheduling, execution, and completion phases managed by YARN.
Deep explanation
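The client submits the job to the YARN ResourceManager, which allocates a first container on a NodeManager and launches the job's ApplicationMaster in it. The ApplicationMaster registers with the ResourceManager and requests containers for the map and reduce tasks from it, not from the NodeManagers. Once containers are granted, the ApplicationMaster asks the NodeManagers that host them to launch the tasks. Tasks run in these containers and report progress to the ApplicationMaster, and the reduce tasks write the final output to HDFS. When the job completes, the ApplicationMaster unregisters from the ResourceManager and its containers are released. A failed task attempt is retried, by default up to 4 attempts per task (mapreduce.map.maxattempts, mapreduce.reduce.maxattempts). A failed ApplicationMaster is restarted by the ResourceManager, by default up to 2 attempts (yarn.resourcemanager.am.max-attempts).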
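The lifecycle above can be sketched as a small state machine. This is a toy simulation, not Hadoop code: the state names are a subset of YARN's application states, while `run_task`, `run_job`, and the `task_failures` input are hypothetical names made up for illustration. It assumes the default limit of 4 attempts per task and that a task exhausting its attempts fails the whole job.

```python
from enum import Enum


class JobState(Enum):
    # Simplified subset of YARN application states (assumption: NEW_SAVING and KILLED omitted).
    NEW = "NEW"
    SUBMITTED = "SUBMITTED"
    ACCEPTED = "ACCEPTED"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


# Mirrors the default of mapreduce.map.maxattempts / mapreduce.reduce.maxattempts.
MAX_TASK_ATTEMPTS = 4


def run_task(task_id, fails_before_success):
    """Return the attempt number that succeeded, or None if all attempts failed."""
    for attempt in range(1, MAX_TASK_ATTEMPTS + 1):
        if attempt > fails_before_success:
            return attempt  # this attempt succeeds
    return None  # attempts exhausted


def run_job(task_failures):
    """Simulate one job.

    task_failures is a hypothetical input: task id -> number of attempts
    that fail before one succeeds.
    """
    history = [JobState.NEW]
    history.append(JobState.SUBMITTED)  # client submits to the ResourceManager
    history.append(JobState.ACCEPTED)   # scheduler accepts; ApplicationMaster container allocated
    history.append(JobState.RUNNING)    # ApplicationMaster registered; task containers requested
    for task_id, failures in task_failures.items():
        if run_task(task_id, failures) is None:
            history.append(JobState.FAILED)
            return history
    history.append(JobState.FINISHED)   # output committed; ApplicationMaster unregisters
    return history


if __name__ == "__main__":
    print([s.value for s in run_job({"map_0": 0, "map_1": 2, "reduce_0": 0})])
    print([s.value for s in run_job({"map_0": 4})])
```

The first call ends in FINISHED because every task succeeds within 4 attempts. The second ends in FAILED because `map_0` fails all 4 attempts.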
Real-world example
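Processing daily billing records in a telecom system. For example, a nightly job is submitted to YARN, map tasks parse the day's records from HDFS in parallel, and reduce tasks aggregate the charges per subscriber and write the totals back to HDFS.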
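To show what the tasks of such a billing job might compute, here is a minimal in-process sketch of the map, shuffle, and reduce steps. It does not use Hadoop. The record layout (`subscriber_id,charge`), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def map_record(line):
    """Map step: parse one CSV billing line into a (subscriber_id, charge) pair."""
    subscriber_id, charge = line.split(",")
    return subscriber_id, float(charge)


def shuffle(pairs):
    """Shuffle step: group all charges by subscriber id."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_charges(subscriber_id, charges):
    """Reduce step: total the charges for one subscriber."""
    return subscriber_id, round(sum(charges), 2)


def run_billing_job(lines):
    """Run map, shuffle, and reduce sequentially in a single process."""
    grouped = shuffle(map_record(line) for line in lines)
    return dict(reduce_charges(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample records, not real billing data.
    records = ["alice,1.50", "bob,2.00", "alice,0.25"]
    print(run_billing_job(records))
```

In a real cluster each map and reduce call would run inside a YARN container on a different node. Here they run in sequence only to make the data flow visible.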
Common mistakes
- Assuming MapReduce executes directly without YARN involvement. In Hadoop 2 and later, every map and reduce task runs in a container allocated by YARN.
Follow-up questions
- What is the ApplicationMaster?
- What happens on task failure?