StreamJob.java
run() method:
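init(); creates the Environment env_ object
preProcessArgs();
parseArgv(); parses the Hadoop Streaming command-line arguments and assigns them to StreamJob member variables
postProcessArgs(); checks that the input arguments are complete, valid, and sufficient
setJobConf(); configures the MapReduce job parameters from the parsed command-line arguments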
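The parse, validate, configure sequence above can be sketched in plain Java. This is only an illustration of the control flow, not the real StreamJob code: the class StreamJobSketch, the "sketch." property prefix, and the rule that only -input and -output are required are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the run() sequence; not the real StreamJob.
public class StreamJobSketch {
    // Plays the role of the StreamJob member variables filled in by parseArgv().
    private final Map<String, String> options = new LinkedHashMap<>();

    /** Parse "-name value" pairs into member state (the role of parseArgv()). */
    void parseArgv(String[] argv) {
        if (argv.length % 2 != 0) {
            throw new IllegalArgumentException("every option needs a value");
        }
        for (int i = 0; i < argv.length; i += 2) {
            if (!argv[i].startsWith("-")) {
                throw new IllegalArgumentException("expected an option, got: " + argv[i]);
            }
            options.put(argv[i].substring(1), argv[i + 1]);
        }
    }

    /** Check that required options are present (the role of postProcessArgs()). */
    void postProcessArgs() {
        // Assumption for this sketch: only -input and -output are mandatory.
        for (String required : new String[] {"input", "output"}) {
            if (!options.containsKey(required)) {
                throw new IllegalArgumentException("missing required option: -" + required);
            }
        }
    }

    /** Turn the parsed options into job properties (the role of setJobConf()). */
    Map<String, String> setJobConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            // "sketch." is a made-up prefix, not a real Hadoop property namespace.
            conf.put("sketch." + e.getKey(), e.getValue());
        }
        return conf;
    }

    /** Run the steps in the same order as the notes above. */
    Map<String, String> run(String[] argv) {
        parseArgv(argv);
        postProcessArgs();
        return setJobConf();
    }

    public static void main(String[] args) {
        String[] argv = {"-input", "in", "-output", "out", "-mapper", "cat"};
        System.out.println(new StreamJobSketch().run(argv));
    }
}
```

Keeping validation in its own step after parsing means a missing option is reported before any job property is set.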
JobConf: jobConf_ : general MapRed job properties
Configuration: config_ : passed as a parameter when creating the JobConf object.
Class fmt=TextInputFormat.class
TextInputFormat implements InputFormat interface:
public interface InputFormat<K,V>
InputFormat describes the input-specification for a Map-Reduce job. The Map-Reduce framework relies on the InputFormat of the job to:
- Validate the input-specification of the job.
- Split-up the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper.
- Provide the RecordReader implementation to be used to glean input records from the logical InputSplit for processing by the Mapper.
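The three responsibilities listed above can be illustrated with a small, self-contained sketch. It deliberately does not use the Hadoop classes: InputFormatSketch, Split, and the in-memory list of lines are hypothetical stand-ins, and the real InputFormat works on files and job configuration rather than a List.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of the InputFormat contract; not the Hadoop API.
public class InputFormatSketch {

    /** A logical split: the half-open range [start, end) of line indices. */
    static final class Split {
        final int start;
        final int end;

        Split(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Responsibility 1: validate the input specification. */
    static void validate(List<String> input, int numSplits) {
        if (input == null || input.isEmpty()) {
            throw new IllegalArgumentException("input must not be empty");
        }
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
    }

    /** Responsibility 2: cut the input into at most numSplits contiguous logical splits. */
    static List<Split> getSplits(List<String> input, int numSplits) {
        validate(input, numSplits);
        int size = (input.size() + numSplits - 1) / numSplits; // ceiling division
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < input.size(); start += size) {
            splits.add(new Split(start, Math.min(start + size, input.size())));
        }
        return splits;
    }

    /** Responsibility 3: provide a reader that yields the records of one split. */
    static Iterator<String> getRecordReader(List<String> input, Split split) {
        return input.subList(split.start, split.end).iterator();
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a", "b", "c", "d", "e");
        // Each split would be handed to its own mapper; here we just print its records.
        for (Split split : getSplits(lines, 2)) {
            Iterator<String> reader = getRecordReader(lines, split);
            StringBuilder records = new StringBuilder();
            while (reader.hasNext()) {
                records.append(reader.next()).append(' ');
            }
            System.out.println("[" + split.start + ", " + split.end + "): " + records.toString().trim());
        }
    }
}
```

The splits are logical: they only record index ranges, and the data is read when a reader is opened on a split.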