juniorHadoop

What is replication factor in HDFS?

Updated May 16, 2026

Short answer

It defines how many copies of a block are stored in HDFS.

Deep explanation

Higher replication improves fault tolerance but increases storage cost.

Real-world example

Storing critical logs with replication factor 3.

Common mistakes

  • Setting replication too low causing data loss risk.

Follow-up questions

  • Default replication factor?
  • Can replication be changed?

More Hadoop interview questions

View all →