juniorHadoop
What is Hadoop?
Updated May 16, 2026
Short answer
Hadoop is an open-source framework for distributed storage and processing of large datasets using HDFS and MapReduce.
Deep explanation
Hadoop enables scalable, fault-tolerant processing across clusters of commodity hardware. It consists of HDFS (storage), YARN (resource management), and MapReduce (processing). It distributes data and computation to achieve high throughput.
Real-world example
Used by companies like Yahoo and Facebook to process petabytes of logs and user data.
Common mistakes
- Assuming Hadoop is only MapReduce or only a database.
Follow-up questions
- What are Hadoop core components?
- Is Hadoop still relevant?