1. Introduction and summary
The Hadoop open-source project consists of three parts: Common, HDFS, and MapReduce.
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
http://hadoop.apache.org/common/docs/stable/hdfs_design.html
- Simple Coherency Model
HDFS applications need a write-once-read-many access model for files. A file once created, written, and closed need not be changed. *
This assumption simplifies data coherency issues and enables high throughput data access. A MapReduce application or a web crawler application fits perfectly with this model.
There is a plan to support appending-writes to files in the future. *
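The write-once-read-many model described above can be illustrated with a small toy class. This is only a sketch of the access rule, not the HDFS client API: the name `WriteOnceFile` and its methods are hypothetical, and the data is held in memory rather than in a distributed store.

```python
class WriteOnceFile:
    """Toy model of a write-once-read-many file (hypothetical, not the HDFS API)."""

    def __init__(self) -> None:
        self._chunks: list[bytes] = []
        self._closed = False

    def write(self, data: bytes) -> None:
        # Writes are allowed only until the file is closed.
        if self._closed:
            raise PermissionError("file is closed; its contents are immutable")
        self._chunks.append(data)

    def close(self) -> None:
        # After close() the contents never change again.
        self._closed = True

    def read(self) -> bytes:
        # Any number of readers can call this; no locking is needed
        # once the file is closed, because the data cannot change.
        return b"".join(self._chunks)


crawl = WriteOnceFile()
crawl.write(b"page-1\n")
crawl.write(b"page-2\n")
crawl.close()
snapshot = crawl.read()  # every later read returns the same bytes
```

Because a closed file cannot change, readers never have to coordinate with a writer, which is the simplification the design document refers to.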
- The File System Namespace
HDFS supports a traditional hierarchical file organization.
A user or an application can create directories and store files inside these directories. The file system namespace hierarchy is similar to most other existing file systems;
one can create and remove files, move a file from one directory to another, or rename a file. *
HDFS does not yet implement user quotas. HDFS does not support hard links or soft links. However, the HDFS architecture does not preclude implementing these features.
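The namespace operations listed above (create directories, create and remove files, move or rename a file) can be sketched with an in-memory model. This is an illustration under stated assumptions, not how HDFS stores its namespace: the `Namespace` class and its method names are hypothetical, and links are deliberately absent, matching the note that HDFS does not support them.

```python
from pathlib import PurePosixPath


class Namespace:
    """Toy hierarchical namespace (hypothetical, not the HDFS implementation)."""

    def __init__(self) -> None:
        self._dirs = {PurePosixPath("/")}   # the root directory always exists
        self._files: set[PurePosixPath] = set()

    def _require_parent(self, path: PurePosixPath) -> None:
        if path.parent not in self._dirs:
            raise FileNotFoundError(f"no such directory: {path.parent}")

    def mkdir(self, path: str) -> None:
        p = PurePosixPath(path)
        self._require_parent(p)
        self._dirs.add(p)

    def create(self, path: str) -> None:
        p = PurePosixPath(path)
        self._require_parent(p)
        self._files.add(p)

    def rename(self, src: str, dst: str) -> None:
        # Covers both renaming in place and moving to another directory.
        s, d = PurePosixPath(src), PurePosixPath(dst)
        if s not in self._files:
            raise FileNotFoundError(f"no such file: {s}")
        self._require_parent(d)
        self._files.remove(s)
        self._files.add(d)

    def remove(self, path: str) -> None:
        p = PurePosixPath(path)
        if p not in self._files:
            raise FileNotFoundError(f"no such file: {p}")
        self._files.remove(p)

    def files(self) -> list[str]:
        return sorted(str(f) for f in self._files)


ns = Namespace()
ns.mkdir("/logs")
ns.mkdir("/archive")
ns.create("/logs/app.log")
ns.rename("/logs/app.log", "/archive/app.log")  # move to another directory
```

In this sketch a move and a rename are the same operation on the path, which mirrors how the text treats them as namespace changes rather than data copies.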
2. BUILDING