HDFS Summary

1、HDFS Motivation
(1)Based on Google’s GFS
(2)Redundant storage of massive amounts of data on cheap and unreliable computers
(3)Why not use an existing file system? 
        – Different workload and design priorities;
        – Handles much bigger dataset sizes than other filesystems
2、HDFS Design Decisions
(1)Files stored as blocks
           – Much larger than the blocks of most filesystems (default is 64MB; 128MB since Hadoop 2.x)
(2)Reliability through replication
           – Each block replicated across 3+ DataNodes
(3)Single master (NameNode) coordinates access, metadata
           – Simple centralized management
(4)No data caching
           – Little benefit due to large data sets, streaming reads
(5)Familiar interface, but a customized API
          – Simplify the problem; focus on distributed apps
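The block and replication decisions above can be made concrete with a little arithmetic. The following is a minimal sketch, not HDFS code: the function names are hypothetical, and the constants simply reuse the 64MB block size and 3-way replication quoted in the list.

```python
# Illustrative arithmetic only; these helpers are hypothetical, not part of any HDFS API.
BLOCK_SIZE = 64 * 1024 * 1024   # 64MB block size, as quoted in the notes
REPLICATION = 3                 # replication factor, as quoted in the notes


def block_count(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed to hold file_size bytes (ceiling division)."""
    if file_size <= 0:
        return 0
    return (file_size + block_size - 1) // block_size


def raw_storage(file_size: int, replication: int = REPLICATION) -> int:
    """Bytes of block data stored across the cluster, counting every replica."""
    return file_size * replication


if __name__ == "__main__":
    one_gb = 1024 ** 3
    print(block_count(one_gb))            # 16 blocks for a 1GB file
    print(raw_storage(one_gb) // one_gb)  # 3 (GB of raw storage)
```

With large blocks, a 1GB file needs only 16 metadata entries on the master, which is part of why a single NameNode can track a very large dataset.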
3、HDFS Client Block Diagram
4、Based on GFS Architecture
5、Metadata
(1)Single NameNode stores all metadata
          – Filenames and the DataNode locations of each file's blocks
(2)Maintained entirely in RAM for fast lookup
(3)DataNodes store opaque file contents in “block” objects on underlying local filesystem
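To illustrate the kind of in-memory metadata described above, here is a toy model, assuming a simple two-level mapping (file to blocks, block to DataNodes). The class name `ToyNameNode`, its methods, and the block and node names are all hypothetical; the real NameNode is far more involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ToyNameNode:
    """Hypothetical in-RAM metadata store; not the real NameNode implementation."""
    # filename -> ordered list of block ids
    files: Dict[str, List[str]] = field(default_factory=dict)
    # block id -> set of DataNode names holding a replica
    locations: Dict[str, Set[str]] = field(default_factory=dict)

    def add_file(self, name: str, placements: List[Tuple[str, List[str]]]) -> None:
        """Record a file as a sequence of (block_id, datanodes) placements."""
        self.files[name] = [block_id for block_id, _ in placements]
        for block_id, nodes in placements:
            self.locations[block_id] = set(nodes)

    def lookup(self, name: str) -> List[Tuple[str, List[str]]]:
        """Return (block_id, sorted datanodes) pairs for a file, in block order."""
        return [(b, sorted(self.locations[b])) for b in self.files[name]]

    def datanode_failed(self, node: str, target: int = 3) -> List[str]:
        """Forget a failed node; return blocks now below the target replica count."""
        under_replicated = []
        for block_id, nodes in self.locations.items():
            nodes.discard(node)
            if len(nodes) < target:
                under_replicated.append(block_id)
        return sorted(under_replicated)


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.add_file("/logs/a", [("blk_1", ["dn1", "dn2", "dn3"]),
                            ("blk_2", ["dn2", "dn3", "dn4"])])
    print(nn.lookup("/logs/a"))
    print(nn.datanode_failed("dn1"))  # ['blk_1']
```

The client asks the master only for this mapping and then reads block contents directly from the listed DataNodes, so plain dictionary lookups in RAM are enough to keep the master off the data path.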
6、HDFS Conclusions
(1)HDFS supports large-scale processing workloads on commodity hardware
           – designed to tolerate frequent component failures
           – optimized for huge files that are mostly appended to and read
           – filesystem interface is customized for the job, but still retains familiarity for developers
           – simple solutions can work (e.g., single master)
(2)Reliably stores several TB in individual clusters
