seniorHadoop
What is Hadoop block placement policy?
Updated May 16, 2026
Short answer
HDFS block placement policy decides how replicas of a block are distributed across DataNodes and racks.
Deep explanation
HDFS uses a rack-aware placement strategy to balance fault tolerance, bandwidth usage, and reliability. The default policy places the first replica on the local node (if available), the second replica on a different rack, and the third replica on the same rack as the second but a different node. This ensures resilience against both node-level and rack-level failures while minimizing cross-rack traffic.
Real-world example
In a data center, replicas of a large log file are distributed across different racks to survive rack power failure.
Common mistakes
- Assuming replicas are randomly distributed without rack awareness.
Follow-up questions
- Why is rack awareness important?
- Can block placement policy be customized?