HDFS Illustrated, and Stream-Based File Operations

wang_da_king

Category: Technical Summaries

Article link: https://blog.csdn.net/wang_da_king/article/details/81258652

HDFS: Hadoop Distributed File System

Introduction: The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is part of the Apache Hadoop Core project.

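In short: HDFS is a distributed file system built to run on ordinary commodity hardware. It resembles existing distributed file systems in many ways, but the differences are significant. It is highly fault-tolerant and meant for low-cost machines, it gives applications high-throughput access to their data, and it suits applications that work with large data sets. It relaxes a few POSIX requirements so that file system data can be accessed as a stream. It began as infrastructure for the Apache Nutch web search engine project and is part of the Apache Hadoop Core project.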

1. HDFS read and write process (illustrated)