- Apache HDFS (Hadoop Distributed File System) is a distributed file system that stores very large data sets while running on standard or low-end commodity hardware (a minimal client sketch follows this list).
- HDFS provides higher data throughput than traditional file systems, along with high fault tolerance (achieved by replicating data blocks across nodes) and native support for very large datasets.
- It allows a single Apache Hadoop cluster to scale to hundreds, thousands, or even tens of thousands of nodes.
- HDFS accommodates applications whose data sets typically range from gigabytes to terabytes, and even petabytes, in size.
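
To make the above concrete, here is a minimal sketch of writing and reading a file through Hadoop's standard `FileSystem` Java API. The NameNode address (`hdfs://namenode:9000`) and the file path are hypothetical placeholders; a real deployment would normally pick up `fs.defaultFS` from `core-site.xml`, and the `hadoop-client` dependency must be on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.nio.charset.StandardCharsets;

public class HdfsHello {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; usually configured in
        // core-site.xml rather than hard-coded like this.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        try (FileSystem fs = FileSystem.get(conf)) {
            Path path = new Path("/tmp/hello.txt");

            // Write: the client streams data to DataNodes in blocks;
            // replication and block placement are handled by HDFS.
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
            }

            // Read the file back through the same API.
            try (FSDataInputStream in = fs.open(path)) {
                byte[] buf = new byte[32];
                int n = in.read(buf);
                System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
            }
        }
    }
}
```

Note that the application code never deals with individual nodes or replicas; the same few calls work whether the cluster has three DataNodes or thousands, which is what makes the scaling described above transparent to clients.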