MapReduce parameters:
return Math.max(minSize, Math.min(maxSize, blockSize));
mapreduce.input.fileinputformat.split.minsize (default 0)
mapred.min.split.size (legacy name)
The minimum size chunk that map input should be split into. Note that some file formats may have minimum split sizes that take priority over this setting.
mapreduce.input.fileinputformat.split.maxsize
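mapred.max.split.size (legacy name)
The maximum split size used when launching map tasks.
The upper bound on the size of each split. If mapreduce.input.fileinputformat.split.maxsize is set, that value is used; otherwise the bound is Long.MAX_VALUE. (If it is not set, then when small files are merged, all of the small files are combined into a single file.)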
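To make the interaction between the minimum and maximum settings concrete, here is a minimal sketch that evaluates the formula quoted at the top of this section for a few values. The class name SplitSizeSketch and the 128 MB block size are assumptions chosen for illustration; only the max/min expression comes from the notes above.

```java
public class SplitSizeSketch {
    // Same expression as the formula quoted above:
    // clamp blockSize to at most maxSize, then raise it to at least minSize.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed block size, for illustration only

        // minsize 0, maxsize Long.MAX_VALUE: the split size equals the block size.
        System.out.println(computeSplitSize(blockSize, 0L, Long.MAX_VALUE) / mb);

        // maxsize below the block size: smaller splits, hence more map tasks.
        System.out.println(computeSplitSize(blockSize, 0L, 64 * mb) / mb);

        // minsize above the block size: larger splits, hence fewer map tasks.
        System.out.println(computeSplitSize(blockSize, 256 * mb, Long.MAX_VALUE) / mb);
    }
}
```

Note that when minSize is larger than maxSize, the outer Math.max means the minimum wins.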
mapreduce.input.fileinputformat.split.minsize.per.node
mapred.min.split.size.per.node (legacy name)
mapreduce.input.fileinputformat.split.minsize.per.rack
mapred.min.split.size.per.rack (legacy name)
Computing the splits
The splitting logic is as follows:
1) Iterate over each file in the input directory and obtain that file
2)