juniorHadoop
What is replication factor in HDFS?
Updated May 16, 2026
Short answer
It defines how many copies of a block are stored in HDFS.
Deep explanation
Higher replication improves fault tolerance but increases storage cost.
Real-world example
Storing critical logs with replication factor 3.
Common mistakes
- Setting replication too low causing data loss risk.
Follow-up questions
- Default replication factor?
- Can replication be changed?