seniorHadoop

What is Hadoop data block corruption handling mechanism?

Updated May 16, 2026

Short answer

Hadoop detects corrupted blocks using checksums and recovers them using replicas.

Deep explanation

Each HDFS block has a checksum stored separately. During read/write, checksum validation ensures data integrity. If corruption is detected, NameNode marks the block as corrupt and retrieves a healthy replica from another DataNode to replace it.

Real-world example

Recovering corrupted log files in distributed storage systems.

Common mistakes

  • Assuming corrupted blocks are automatically deleted without replication.

Follow-up questions

  • How is corruption detected?
  • What happens to corrupt replicas?

More Hadoop interview questions

View all →