Apache Hadoop delivers reliable, scalable, distributed computing.
The Apache Hadoop library allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from a single server to thousands of machines, each offering local storage and computation. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer.
The library consists of the following modules:
1. Hadoop Common: the common utilities that support the other Hadoop modules.
2. Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data (see the read sketch after this list).
3. Hadoop YARN: a framework for job scheduling and cluster resource management.
4. Hadoop MapReduce: a YARN-based system for parallel processing of large data sets (see the word-count sketch after this list).
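
As a rough illustration of HDFS's role, the sketch below reads a file through Hadoop's FileSystem client API in Java. This is a minimal sketch, not a complete program: the class name HdfsRead and the path /user/alice/data.txt are hypothetical, and the Configuration object is assumed to pick up the cluster address (fs.defaultFS) from a core-site.xml on the classpath.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical example class: reads a text file stored in HDFS line by line.
    public class HdfsRead {
      public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS (the NameNode address) from core-site.xml.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/user/alice/data.txt"); // hypothetical path
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(fs.open(path)))) {
          String line;
          while ((line = reader.readLine()) != null) {
            System.out.println(line);
          }
        }
      }
    }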
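
To make the "simple programming models" concrete, below is a minimal word-count sketch in the MapReduce style: the mapper emits a (word, 1) pair for every token in its input split, and the reducer sums the counts for each word. It uses the standard org.apache.hadoop.mapreduce API; the class name WordCount and the command-line input/output paths are illustrative.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every token in the input split.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sums the counts emitted for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // pre-aggregates map output
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Packaged into a jar, such a job would typically be submitted with a command along the lines of "hadoop jar wordcount.jar WordCount <input> <output>" (the jar name and paths are hypothetical), with YARN scheduling the map and reduce tasks across the cluster's nodes.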