Class InputFormat<K,V>
The Map-Reduce framework relies on the job's InputFormat to split up the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper.
The InputFormat also provides the RecordReader implementation used to glean input records from the logical InputSplit for processing by the Mapper.
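To make the relationship between a logical split and its records concrete, here is a minimal, self-contained sketch of a line-oriented reader over a byte range. It is not Hadoop's RecordReader API: the class LineRecordSketch, its Record type, and the method names are hypothetical. It assumes the convention used by line-based readers, where a reader whose split does not start at offset 0 discards everything up to the first newline, and every reader finishes any line that begins at or before the end of its split, so each line is read exactly once.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this does not use or mirror Hadoop classes.
final class LineRecordSketch {

    /** One record: the byte offset of the line (key) and its text (value). */
    static final class Record {
        final long offset;
        final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Reads the records that belong to the split [start, start + length). */
    static List<Record> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // The reader of the previous split owns the line that crosses
            // the boundary, so skip ahead to the next line start.
            pos = nextLineStart(data, pos);
        }
        List<Record> records = new ArrayList<>();
        // A line that starts at or before 'end' still belongs to this split.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            boolean endsWithNewline = next > pos && data[next - 1] == '\n';
            int lineEnd = endsWithNewline ? next - 1 : next;
            String text = new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8);
            records.add(new Record(pos, text));
            pos = next;
        }
        return records;
    }

    /** Returns the index just past the next newline, or data.length if none. */
    static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        // Two splits whose boundary falls in the middle of "beta".
        for (int[] split : new int[][] {{0, 8}, {8, 9}}) {
            for (Record r : readSplit(data, split[0], split[1])) {
                System.out.println("split@" + split[0] + " offset=" + r.offset + " line=" + r.line);
            }
        }
    }
}
```

With the sample input, the first split yields "alpha" (offset 0) and "beta" (offset 6) even though "beta" crosses the boundary, and the second split yields only "gamma" (offset 11).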
For file-based InputFormats, the FileSystem block size of the input files is treated as an upper bound for input splits. A lower bound on the split size can be set via mapreduce.input.fileinputformat.split.minsize.
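The interaction of these bounds can be sketched as a simple clamp. The code below is a standalone illustration, not Hadoop code: the class SplitSizeSketch and the splits helper are hypothetical, and the maximum split size (which I believe corresponds to mapreduce.input.fileinputformat.split.maxsize, a property not mentioned in these notes) is included as an assumption. The real implementation handles further details omitted here, such as tolerating a slightly oversized final split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how the split-size bounds interact.
final class SplitSizeSketch {

    /** Clamps the block size between the minimum and maximum split sizes. */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Enumerates {offset, length} pairs that cover a file of the given length. */
    static List<long[]> splits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            long length = Math.min(splitSize, fileLength - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed defaults: minimum of 1 byte and no effective maximum,
        // so the block size alone determines the split size.
        long splitSize = computeSplitSize(128 * mb, 1L, Long.MAX_VALUE);
        for (long[] s : splits(300 * mb, splitSize)) {
            System.out.println("offset=" + s[0] + " length=" + s[1]);
        }
    }
}
```

Under these assumptions a 300 MB file with 128 MB blocks produces three splits of 128 MB, 128 MB, and 44 MB. Note that the block size acts as an upper bound only while the minimum is smaller: if the minimum split size is raised above the block size, the clamp returns the minimum.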
Class Job
Allows the user to configure the job, submit it, control its execution, and query its state.
The Java abstract class org.apache.hadoop.fs.FileSystem defines Hadoop's filesystem interface.
FileCopyWithProgress: copies a local file to a Hadoop filesystem.
FileSystemCat / FileSystemDoubleCat: display files from a Hadoop filesystem on standard output by using the FileSystem directly.
URLCat: displays files from a Hadoop filesystem on standard output using a URLStreamHandler.
Hadoop's FileStatus class can be used to inspect the metadata of a file or directory in HDFS.
FileStatus[] status = fs.listStatus(paths);