A Summary of the Subprojects Currently Included in Hadoop

DerekJiang

Category: Hadoop. Tags: hadoop, serialization, database, system, mapreduce, processing

Article link: https://blog.csdn.net/derekjiang/article/details/6834657

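The Hadoop project now contains many subprojects. Some were split out of the original Hadoop project, and others evolved on top of Hadoop. This post simply quotes the descriptions of these subprojects from the Hadoop documentation, for reference.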

The project includes these subprojects:

Hadoop Common: The common utilities that support the other Hadoop subprojects.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.
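
As a rough illustration of the programming model behind Hadoop MapReduce, the sketch below runs a word count inside a single Python process. It is not the Hadoop API: the names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for this example, and a real job would spread these phases across the machines of a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (the step between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical helper)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

The point of the split is that each `map_phase` call depends only on its own input line and each `reduce_phase` call only on one key, so a framework is free to run them in parallel on different nodes.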

Other Hadoop-related projects at Apache include:

Avro™: A data serialization system.
Cassandra™: A scalable multi-master database with no single points of failure.
Chukwa™: A data collection system for managing large distributed systems.
HBase™: A scalable, distributed database that supports structured data storage for large tables.
Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
Mahout™: A scalable machine learning and data mining library.
Pig™: A high-level data-flow language and execution framework for parallel computation.
ZooKeeper™: A high-performance coordination service for distributed applications.