hadoop_mapreduce03 - InputFormat data input: splits, MapTask parallelism, CombineTextInputFormat for small files, etc.

Note: these are study notes and excerpts only; see the source code and the accompanying Word document for details.

1. How splits determine MapTask parallelism

Data block: a Block is how HDFS physically divides data into chunks. Default block size: 32 MB in local mode, 64 MB in Hadoop 1.x, 128 MB in Hadoop 2.x.

Data split: a logical division of the input data. It is recommended to set the split size equal to the block size; otherwise a split can straddle blocks and incur a lot of extra I/O.

2. Job submission flow and split computation (source walkthrough)

waitForCompletion()

submit();

(Omitted.)

3. FileInputFormat split mechanism

(Omitted.)

input.getSplits(job)

Split size calculation: computeSplitSize() returns Math.max(minSize, Math.min(maxSize, blockSize)); with the defaults this equals blockSize = 128 MB.
(Each time a split is carved off, check whether the remaining data is larger than 1.1x the split size; if it is not, the whole remainder becomes a single split.)

// Split-size formula in the source
Math.max(minSize, Math.min(maxSize, blockSize));
mapreduce.input.fileinputformat.split.minsize = 1               // default 1; raise it (e.g. to 256 MB) on powerful machines to get larger splits
mapreduce.input.fileinputformat.split.maxsize = Long.MAX_VALUE  // default Long.MAX_VALUE; lower it (e.g. to 32 MB) on weak machines to get smaller splits
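The formula and the 1.1x slop rule can be checked with a small standalone simulation. The class and method names below are invented for illustration; only the max/min formula and the 1.1 slop factor mirror what FileInputFormat does.

```java
public class SplitSizeDemo {
    // Same slop factor FileInputFormat uses when deciding whether to cut another split
    static final double SPLIT_SLOP = 1.1;

    // The formula from the source: clamp blockSize between minSize and maxSize
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Keep cutting full splits while the remainder is more than 1.1x the split size;
    // whatever is left at the end becomes one final split.
    static int numSplits(long fileSize, long splitSize) {
        int splits = 0;
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }
}
```

With the defaults (minsize = 1, maxsize = Long.MAX_VALUE) the split size is exactly the block size, so a 129 MB file on 128 MB blocks yields one split (ratio 1.008 &lt;= 1.1), while a 150 MB file yields two.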

(Omitted.)

The split plan file is submitted to the YARN cluster, and MrAppMaster uses it to decide how many MapTasks to launch.

4. CombineTextInputFormat split mechanism

The framework default, TextInputFormat, plans splits per file: no matter how small a file is, it becomes its own split and is handed to its own MapTask. With a large number of small files this launches a large number of MapTasks and processing efficiency is poor.

4.1 Use case

Scenarios with many small files: multiple small files are logically grouped into one split, so a single MapTask can process them together.

4.2 Usage in MapReduce

Set in the Driver:

job.setInputFormatClass(CombineTextInputFormat.class);

Set the maximum virtual-storage split size:

CombineTextInputFormat.setMaxInputSplitSize(job, 4194304); // 4 MB
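Put together, the two settings slot into a WordCount-style Driver as sketched below. This is a minimal sketch: the Mapper/Reducer class names are assumptions carried over from the earlier WordCount example, not part of this note.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CombineWordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(CombineWordCountDriver.class);

        // WordCountMapper / WordCountReducer assumed from the earlier WordCount example
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Replace the default TextInputFormat so many small files share splits
        job.setInputFormatClass(CombineTextInputFormat.class);
        // Maximum virtual-storage split size: 4 MB
        CombineTextInputFormat.setMaxInputSplitSize(job, 4194304);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```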

4.3 Split mechanism

Two phases: a virtual-storage pass followed by a split pass.
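The two phases can be simulated in plain Java. This is a simplified sketch of the commonly described rule, under these assumptions: in the virtual-storage pass a file no larger than the max size becomes one virtual block, a file between 1x and 2x the max is halved, and anything larger first has max-size chunks carved off; in the split pass consecutive virtual blocks are accumulated until they reach the max size. The real CombineTextInputFormat logic (in CombineFileInputFormat) also accounts for node/rack locality, which is ignored here. The class name is invented.

```java
import java.util.ArrayList;
import java.util.List;

public class CombineSplitSim {
    // Phase 1: virtual storage — carve each file into virtual blocks.
    static List<Long> virtualBlocks(long[] fileSizes, long maxSize) {
        List<Long> blocks = new ArrayList<>();
        for (long size : fileSizes) {
            long remaining = size;
            // More than 2x max left: carve off a full max-size block and repeat
            while (remaining > 2 * maxSize) {
                blocks.add(maxSize);
                remaining -= maxSize;
            }
            if (remaining > maxSize) {
                // Between 1x and 2x max: split into two (near-)equal halves
                blocks.add(remaining / 2);
                blocks.add(remaining - remaining / 2);
            } else {
                blocks.add(remaining);
            }
        }
        return blocks;
    }

    // Phase 2: merge consecutive virtual blocks until each split reaches maxSize.
    static int splitCount(long[] fileSizes, long maxSize) {
        int splits = 0;
        long acc = 0;
        for (long b : virtualBlocks(fileSizes, maxSize)) {
            acc += b;
            if (acc >= maxSize) {
                splits++;
                acc = 0;
            }
        }
        if (acc > 0) {
            splits++; // leftover blocks form the final split
        }
        return splits;
    }
}
```

With the 2 MB max used in the run below and files processed in the order c.txt (30360 B), b.txt (2580600 B), a.txt (10120 B), b.txt is halved into two 1290300 B blocks, c.txt plus both halves reach 2 MB and form one split, and a.txt forms the second, matching the number of splits:2 in the log.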

4.4 Hands-on example

4.4.1 Requirement

Combine the 3 small input files into a single split for processing.

a.txt 10kb

b.txt 2.6mb

c.txt 30kb

4.4.2 Execution

1) With no changes, run the WordCount example from mapreduce01 and observe that number of splits is 3.

2) Add the settings below in the Driver and observe the number of splits.

job.setInputFormatClass(CombineTextInputFormat.class);
CombineTextInputFormat.setMaxInputSplitSize(job, 2097152); // 2097152 B = 2 MB; 134217728 B = 128 MB

3) Local run logs

// TextInputFormat    number of splits:3
0000-00-00 00:29:37,704 WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0000-00-00 00:29:37,777 INFO [org.apache.hadoop.conf.Configuration.deprecation] - session.id is deprecated. Instead, use dfs.metrics.session-id
0000-00-00 00:29:37,778 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with processName=JobTracker, sessionId=
0000-00-00 00:29:38,152 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
0000-00-00 00:29:38,156 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
0000-00-00 00:29:38,180 INFO [org.apache.hadoop.mapreduce.lib.input.FileInputFormat] - Total input paths to process : 3
0000-00-00 00:29:38,219 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - number of splits:3
0000-00-00 00:29:38,278 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Submitting tokens for job: job_local929576627_0001
0000-00-00 00:29:38,379 INFO [org.apache.hadoop.mapreduce.Job] - The url to track the job: http://localhost:8080/
0000-00-00 00:29:38,389 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_local929576627_0001
0000-00-00 00:29:38,390 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter set in config null
0000-00-00 00:29:38,394 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:29:38,395 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
0000-00-00 00:29:38,424 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for map tasks
0000-00-00 00:29:38,424 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local929576627_0001_m_000000_0
0000-00-00 00:29:38,438 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:29:38,442 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:29:38,443 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:29:38,446 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: file:/Users/art/Documents/demo_datas/wordcount_combine_inputs/b.txt:0+2580600
0000-00-00 00:29:38,466 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:29:38,466 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:29:38,466 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:29:38,466 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:29:38,466 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:29:38,468 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:29:38,761 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:29:38,761 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:29:38,761 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:29:38,761 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 4600200; bufvoid = 104857600
0000-00-00 00:29:38,761 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 24194800(96779200); length = 2019597/6553600
0000-00-00 00:29:38,978 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:29:38,981 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local929576627_0001_m_000000_0 is done. And is in the process of committing
0000-00-00 00:29:38,986 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:29:38,986 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local929576627_0001_m_000000_0' done.
0000-00-00 00:29:38,986 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local929576627_0001_m_000000_0
0000-00-00 00:29:38,986 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local929576627_0001_m_000001_0
0000-00-00 00:29:38,987 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:29:38,987 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:29:38,987 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:29:38,988 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: file:/Users/art/Documents/demo_datas/wordcount_combine_inputs/c.txt:0+30360
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:29:38,996 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:29:39,001 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:29:39,001 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:29:39,001 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:29:39,001 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 54120; bufvoid = 104857600
0000-00-00 00:29:39,001 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 26190640(104762560); length = 23757/6553600
0000-00-00 00:29:39,004 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:29:39,082 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local929576627_0001_m_000001_0 is done. And is in the process of committing
0000-00-00 00:29:39,083 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:29:39,083 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local929576627_0001_m_000001_0' done.
0000-00-00 00:29:39,083 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local929576627_0001_m_000001_0
0000-00-00 00:29:39,083 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local929576627_0001_m_000002_0
0000-00-00 00:29:39,084 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:29:39,084 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:29:39,084 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:29:39,085 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: file:/Users/art/Documents/demo_datas/wordcount_combine_inputs/a.txt:0+10120
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:29:39,093 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:29:39,096 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:29:39,096 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:29:39,096 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:29:39,096 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 18040; bufvoid = 104857600
0000-00-00 00:29:39,096 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 26206480(104825920); length = 7917/6553600
0000-00-00 00:29:39,098 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:29:39,099 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local929576627_0001_m_000002_0 is done. And is in the process of committing
0000-00-00 00:29:39,100 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:29:39,100 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local929576627_0001_m_000002_0' done.
0000-00-00 00:29:39,100 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local929576627_0001_m_000002_0
0000-00-00 00:29:39,100 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
0000-00-00 00:29:39,102 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for reduce tasks
0000-00-00 00:29:39,102 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local929576627_0001_r_000000_0
0000-00-00 00:29:39,106 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:29:39,106 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:29:39,106 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:29:39,107 INFO [org.apache.hadoop.mapred.ReduceTask] - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@439e6258
0000-00-00 00:29:39,114 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - MergerManager: memoryLimit=5345011200, maxSingleShuffleLimit=1336252800, mergeThreshold=3527707648, ioSortFactor=10, memToMemMergeOutputsThreshold=10
0000-00-00 00:29:39,115 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - attempt_local929576627_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
0000-00-00 00:29:39,132 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local929576627_0001_m_000002_0 decomp: 22002 len: 22006 to MEMORY
0000-00-00 00:29:39,140 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 22002 bytes from map-output for attempt_local929576627_0001_m_000002_0
0000-00-00 00:29:39,141 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 22002, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->22002
0000-00-00 00:29:39,142 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local929576627_0001_m_000001_0 decomp: 66002 len: 66006 to MEMORY
0000-00-00 00:29:39,143 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 66002 bytes from map-output for attempt_local929576627_0001_m_000001_0
0000-00-00 00:29:39,143 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 66002, inMemoryMapOutputs.size() -> 2, commitMemory -> 22002, usedMemory ->88004
0000-00-00 00:29:39,144 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local929576627_0001_m_000000_0 decomp: 5610002 len: 5610006 to MEMORY
0000-00-00 00:29:39,148 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 5610002 bytes from map-output for attempt_local929576627_0001_m_000000_0
0000-00-00 00:29:39,148 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 5610002, inMemoryMapOutputs.size() -> 3, commitMemory -> 88004, usedMemory ->5698006
0000-00-00 00:29:39,148 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - EventFetcher is interrupted.. Returning
0000-00-00 00:29:39,149 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 3 / 3 copied.
0000-00-00 00:29:39,149 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - finalMerge called with 3 in-memory map-outputs and 0 on-disk map-outputs
0000-00-00 00:29:39,152 INFO [org.apache.hadoop.mapred.Merger] - Merging 3 sorted segments
0000-00-00 00:29:39,152 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 3 segments left of total size: 5697991 bytes
0000-00-00 00:29:39,365 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merged 3 segments, 5698006 bytes to disk to satisfy reduce memory limit
0000-00-00 00:29:39,365 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 1 files, 5698006 bytes from disk
0000-00-00 00:29:39,366 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 0 segments, 0 bytes from memory into reduce
0000-00-00 00:29:39,366 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
0000-00-00 00:29:39,366 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 5697997 bytes
0000-00-00 00:29:39,366 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 3 / 3 copied.
0000-00-00 00:29:39,373 INFO [org.apache.hadoop.conf.Configuration.deprecation] - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
0000-00-00 00:29:39,396 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local929576627_0001 running in uber mode : false
0000-00-00 00:29:39,397 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 0%
0000-00-00 00:29:39,603 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local929576627_0001_r_000000_0 is done. And is in the process of committing
0000-00-00 00:29:39,603 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 3 / 3 copied.
0000-00-00 00:29:39,603 INFO [org.apache.hadoop.mapred.Task] - Task attempt_local929576627_0001_r_000000_0 is allowed to commit now
0000-00-00 00:29:39,604 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - Saved output of task 'attempt_local929576627_0001_r_000000_0' to file:/Users/art/Documents/demo_datas/wordcount_combine_outputs/_temporary/0/task_local929576627_0001_r_000000
0000-00-00 00:29:39,605 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
0000-00-00 00:29:39,605 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local929576627_0001_r_000000_0' done.
0000-00-00 00:29:39,605 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local929576627_0001_r_000000_0
0000-00-00 00:29:39,605 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce task executor complete.
0000-00-00 00:29:40,402 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 100%
0000-00-00 00:29:40,402 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local929576627_0001 completed successfully
0000-00-00 00:29:40,410 INFO [org.apache.hadoop.mapreduce.Job] - Counters: 30
	File System Counters
		FILE: Number of bytes read=21833967
		FILE: Number of bytes written=29504483
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=170940
		Map output records=512820
		Map output bytes=4672360
		Map output materialized bytes=5698018
		Input split bytes=408
		Combine input records=0
		Combine output records=0
		Reduce input groups=5
		Reduce shuffle bytes=5698018
		Reduce input records=512820
		Reduce output records=5
		Spilled Records=1025640
		Shuffled Maps =3
		Failed Shuffles=0
		Merged Map outputs=3
		GC time elapsed (ms)=165
		Total committed heap usage (bytes)=3301441536
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=2621080
	File Output Format Counters 
		Bytes Written=71

Process finished with exit code 0


// CombineTextInputFormat, setMaxInputSplitSize = 2 MB    number of splits:2
0000-00-00 00:34:33,662 WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0000-00-00 00:34:33,738 INFO [org.apache.hadoop.conf.Configuration.deprecation] - session.id is deprecated. Instead, use dfs.metrics.session-id
0000-00-00 00:34:33,738 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with processName=JobTracker, sessionId=
0000-00-00 00:34:34,036 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
0000-00-00 00:34:34,039 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
0000-00-00 00:34:34,059 INFO [org.apache.hadoop.mapreduce.lib.input.FileInputFormat] - Total input paths to process : 3
0000-00-00 00:34:34,072 INFO [org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat] - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 10120
0000-00-00 00:34:34,099 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - number of splits:2
0000-00-00 00:34:34,159 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Submitting tokens for job: job_local1887118463_0001
0000-00-00 00:34:34,249 INFO [org.apache.hadoop.mapreduce.Job] - The url to track the job: http://localhost:8080/
0000-00-00 00:34:34,260 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_local1887118463_0001
0000-00-00 00:34:34,261 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter set in config null
0000-00-00 00:34:34,264 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:34:34,265 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
0000-00-00 00:34:34,294 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for map tasks
0000-00-00 00:34:34,294 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1887118463_0001_m_000000_0
0000-00-00 00:34:34,310 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:34:34,314 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:34:34,314 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:34:34,316 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: Paths:/Users/art/Documents/demo_datas/wordcount_combine_inputs/c.txt:0+30360,/Users/art/Documents/demo_datas/wordcount_combine_inputs/b.txt:0+1290300,/Users/art/Documents/demo_datas/wordcount_combine_inputs/b.txt:1290300+1290300
0000-00-00 00:34:34,335 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:34:34,335 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:34:34,335 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:34:34,335 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:34:34,335 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:34:34,337 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:34:34,633 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:34:34,633 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:34:34,633 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:34:34,633 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 4654320; bufvoid = 104857600
0000-00-00 00:34:34,633 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 24171040(96684160); length = 2043357/6553600
0000-00-00 00:34:34,866 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:34:34,868 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local1887118463_0001_m_000000_0 is done. And is in the process of committing
0000-00-00 00:34:34,872 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:34:34,872 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local1887118463_0001_m_000000_0' done.
0000-00-00 00:34:34,872 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local1887118463_0001_m_000000_0
0000-00-00 00:34:34,873 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1887118463_0001_m_000001_0
0000-00-00 00:34:34,873 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:34:34,874 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:34:34,874 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:34:34,874 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: Paths:/Users/art/Documents/demo_datas/wordcount_combine_inputs/a.txt:0+10120
0000-00-00 00:34:34,883 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:34:34,883 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:34:34,884 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:34:34,884 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:34:34,884 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:34:34,884 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:34:34,888 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:34:34,888 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:34:34,888 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:34:34,888 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 18040; bufvoid = 104857600
0000-00-00 00:34:34,888 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 26206480(104825920); length = 7917/6553600
0000-00-00 00:34:34,891 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:34:34,893 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local1887118463_0001_m_000001_0 is done. And is in the process of committing
0000-00-00 00:34:34,894 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:34:34,894 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local1887118463_0001_m_000001_0' done.
0000-00-00 00:34:34,894 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local1887118463_0001_m_000001_0
0000-00-00 00:34:34,894 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
0000-00-00 00:34:34,896 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for reduce tasks
0000-00-00 00:34:34,898 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1887118463_0001_r_000000_0
0000-00-00 00:34:34,902 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:34:34,903 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:34:34,903 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:34:34,904 INFO [org.apache.hadoop.mapred.ReduceTask] - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@ce80871
0000-00-00 00:34:34,912 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - MergerManager: memoryLimit=5345011200, maxSingleShuffleLimit=1336252800, mergeThreshold=3527707648, ioSortFactor=10, memToMemMergeOutputsThreshold=10
0000-00-00 00:34:34,913 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - attempt_local1887118463_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
0000-00-00 00:34:34,934 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local1887118463_0001_m_000000_0 decomp: 5676002 len: 5676006 to MEMORY
0000-00-00 00:34:34,938 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 5676002 bytes from map-output for attempt_local1887118463_0001_m_000000_0
0000-00-00 00:34:34,939 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 5676002, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->5676002
0000-00-00 00:34:34,941 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local1887118463_0001_m_000001_0 decomp: 22002 len: 22006 to MEMORY
0000-00-00 00:34:34,941 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 22002 bytes from map-output for attempt_local1887118463_0001_m_000001_0
0000-00-00 00:34:34,941 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 22002, inMemoryMapOutputs.size() -> 2, commitMemory -> 5676002, usedMemory ->5698004
0000-00-00 00:34:34,941 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - EventFetcher is interrupted.. Returning
0000-00-00 00:34:34,942 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 2 / 2 copied.
0000-00-00 00:34:34,942 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
0000-00-00 00:34:34,945 INFO [org.apache.hadoop.mapred.Merger] - Merging 2 sorted segments
0000-00-00 00:34:34,945 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 2 segments left of total size: 5697994 bytes
0000-00-00 00:34:35,181 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merged 2 segments, 5698004 bytes to disk to satisfy reduce memory limit
0000-00-00 00:34:35,182 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 1 files, 5698006 bytes from disk
0000-00-00 00:34:35,182 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 0 segments, 0 bytes from memory into reduce
0000-00-00 00:34:35,182 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
0000-00-00 00:34:35,182 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 5697997 bytes
0000-00-00 00:34:35,183 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 2 / 2 copied.
0000-00-00 00:34:35,193 INFO [org.apache.hadoop.conf.Configuration.deprecation] - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
0000-00-00 00:34:35,267 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1887118463_0001 running in uber mode : false
0000-00-00 00:34:35,268 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 0%
0000-00-00 00:34:35,434 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local1887118463_0001_r_000000_0 is done. And is in the process of committing
0000-00-00 00:34:35,435 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 2 / 2 copied.
0000-00-00 00:34:35,435 INFO [org.apache.hadoop.mapred.Task] - Task attempt_local1887118463_0001_r_000000_0 is allowed to commit now
0000-00-00 00:34:35,435 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - Saved output of task 'attempt_local1887118463_0001_r_000000_0' to file:/Users/art/Documents/demo_datas/wordcount_combine_outputs/_temporary/0/task_local1887118463_0001_r_000000
0000-00-00 00:34:35,436 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
0000-00-00 00:34:35,436 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local1887118463_0001_r_000000_0' done.
0000-00-00 00:34:35,436 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local1887118463_0001_r_000000_0
0000-00-00 00:34:35,436 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce task executor complete.
0000-00-00 00:34:36,270 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 100%
0000-00-00 00:34:36,270 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1887118463_0001 completed successfully
0000-00-00 00:34:36,278 INFO [org.apache.hadoop.mapreduce.Job] - Counters: 30
	File System Counters
		FILE: Number of bytes read=19276338
		FILE: Number of bytes written=23619435
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=170940
		Map output records=512820
		Map output bytes=4672360
		Map output materialized bytes=5698012
		Input split bytes=502
		Combine input records=0
		Combine output records=0
		Reduce input groups=5
		Reduce shuffle bytes=5698012
		Reduce input records=512820
		Reduce output records=5
		Spilled Records=1025640
		Shuffled Maps =2
		Failed Shuffles=0
		Merged Map outputs=2
		GC time elapsed (ms)=83
		Total committed heap usage (bytes)=2139619328
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=71

Process finished with exit code 0


// CombineTextInputFormat MaxInputSplitSize = 128M    number of splits:1
0000-00-00 00:35:57,739 WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0000-00-00 00:35:57,813 INFO [org.apache.hadoop.conf.Configuration.deprecation] - session.id is deprecated. Instead, use dfs.metrics.session-id
0000-00-00 00:35:57,813 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with processName=JobTracker, sessionId=
0000-00-00 00:35:57,968 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
0000-00-00 00:35:57,972 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
0000-00-00 00:35:57,993 INFO [org.apache.hadoop.mapreduce.lib.input.FileInputFormat] - Total input paths to process : 3
0000-00-00 00:35:58,004 INFO [org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat] - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 2621080
0000-00-00 00:35:58,032 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - number of splits:1
0000-00-00 00:35:58,088 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Submitting tokens for job: job_local1308594653_0001
0000-00-00 00:35:58,175 INFO [org.apache.hadoop.mapreduce.Job] - The url to track the job: http://localhost:8080/
0000-00-00 00:35:58,176 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_local1308594653_0001
0000-00-00 00:35:58,176 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter set in config null
0000-00-00 00:35:58,179 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:35:58,180 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
0000-00-00 00:35:58,222 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for map tasks
0000-00-00 00:35:58,222 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1308594653_0001_m_000000_0
0000-00-00 00:35:58,236 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:35:58,240 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:35:58,240 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:35:58,243 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: Paths:/Users/art/Documents/demo_datas/wordcount_combine_inputs/c.txt:0+30360,/Users/art/Documents/demo_datas/wordcount_combine_inputs/b.txt:0+2580600,/Users/art/Documents/demo_datas/wordcount_combine_inputs/a.txt:0+10120
0000-00-00 00:35:58,262 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
0000-00-00 00:35:58,262 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
0000-00-00 00:35:58,262 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
0000-00-00 00:35:58,262 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
0000-00-00 00:35:58,262 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
0000-00-00 00:35:58,264 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
0000-00-00 00:35:58,531 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 
0000-00-00 00:35:58,531 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
0000-00-00 00:35:58,531 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
0000-00-00 00:35:58,531 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 4672360; bufvoid = 104857600
0000-00-00 00:35:58,531 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 24163120(96652480); length = 2051277/6553600
0000-00-00 00:35:58,784 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
0000-00-00 00:35:58,790 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local1308594653_0001_m_000000_0 is done. And is in the process of committing
0000-00-00 00:35:58,797 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
0000-00-00 00:35:58,797 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local1308594653_0001_m_000000_0' done.
0000-00-00 00:35:58,797 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local1308594653_0001_m_000000_0
0000-00-00 00:35:58,797 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
0000-00-00 00:35:58,799 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for reduce tasks
0000-00-00 00:35:58,799 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1308594653_0001_r_000000_0
0000-00-00 00:35:58,803 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - File Output Committer Algorithm version is 1
0000-00-00 00:35:58,803 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
0000-00-00 00:35:58,803 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : null
0000-00-00 00:35:58,805 INFO [org.apache.hadoop.mapred.ReduceTask] - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6f730bcb
0000-00-00 00:35:58,812 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - MergerManager: memoryLimit=5345011200, maxSingleShuffleLimit=1336252800, mergeThreshold=3527707648, ioSortFactor=10, memToMemMergeOutputsThreshold=10
0000-00-00 00:35:58,814 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - attempt_local1308594653_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
0000-00-00 00:35:58,835 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local1308594653_0001_m_000000_0 decomp: 5698002 len: 5698006 to MEMORY
0000-00-00 00:35:58,848 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 5698002 bytes from map-output for attempt_local1308594653_0001_m_000000_0
0000-00-00 00:35:58,848 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 5698002, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->5698002
0000-00-00 00:35:58,849 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - EventFetcher is interrupted.. Returning
0000-00-00 00:35:58,849 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
0000-00-00 00:35:58,850 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
0000-00-00 00:35:58,854 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
0000-00-00 00:35:58,854 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 5697997 bytes
0000-00-00 00:35:59,070 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merged 1 segments, 5698002 bytes to disk to satisfy reduce memory limit
0000-00-00 00:35:59,071 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 1 files, 5698006 bytes from disk
0000-00-00 00:35:59,071 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 0 segments, 0 bytes from memory into reduce
0000-00-00 00:35:59,071 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
0000-00-00 00:35:59,071 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 5697997 bytes
0000-00-00 00:35:59,072 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
0000-00-00 00:35:59,084 INFO [org.apache.hadoop.conf.Configuration.deprecation] - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
0000-00-00 00:35:59,178 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1308594653_0001 running in uber mode : false
0000-00-00 00:35:59,179 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 0%
0000-00-00 00:35:59,326 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local1308594653_0001_r_000000_0 is done. And is in the process of committing
0000-00-00 00:35:59,327 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
0000-00-00 00:35:59,327 INFO [org.apache.hadoop.mapred.Task] - Task attempt_local1308594653_0001_r_000000_0 is allowed to commit now
0000-00-00 00:35:59,328 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - Saved output of task 'attempt_local1308594653_0001_r_000000_0' to file:/Users/art/Documents/demo_datas/wordcount_combine_outputs/_temporary/0/task_local1308594653_0001_r_000000
0000-00-00 00:35:59,328 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
0000-00-00 00:35:59,328 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local1308594653_0001_r_000000_0' done.
0000-00-00 00:35:59,328 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local1308594653_0001_r_000000_0
0000-00-00 00:35:59,328 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce task executor complete.
0000-00-00 00:36:00,184 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 100%
0000-00-00 00:36:00,184 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1308594653_0001 completed successfully
0000-00-00 00:36:00,191 INFO [org.apache.hadoop.mapreduce.Job] - Counters: 30
	File System Counters
		FILE: Number of bytes read=16638976
		FILE: Number of bytes written=17659913
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=170940
		Map output records=512820
		Map output bytes=4672360
		Map output materialized bytes=5698006
		Input split bytes=339
		Combine input records=0
		Combine output records=0
		Reduce input groups=5
		Reduce shuffle bytes=5698006
		Reduce input records=512820
		Reduce output records=5
		Spilled Records=1025640
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=79
		Total committed heap usage (bytes)=1350565888
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=71

Process finished with exit code 0
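The two logs above show the same three input files producing 2 splits (maxsize = 2MB: c.txt + both halves of b.txt in one split, a.txt in another) versus 1 split (maxsize = 128M). That difference follows from CombineTextInputFormat's two-phase mechanism: virtual storage first, then combining. Below is a minimal, hypothetical Java sketch of that logic (class and method names are illustrative, not Hadoop source):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of CombineTextInputFormat's split planning,
// for reasoning about split counts from local file sizes only.
public class CombineSplitSketch {

    // Phase 1: virtual storage. A file <= max stays whole; a file between
    // max and 2*max is halved; a larger file sheds max-sized blocks until
    // the remainder is <= 2*max (then the halving rule applies).
    static List<Long> virtualBlocks(long[] fileSizes, long max) {
        List<Long> blocks = new ArrayList<>();
        for (long size : fileSizes) {
            long left = size;
            while (left > 2 * max) {
                blocks.add(max);
                left -= max;
            }
            if (left > max) {               // halve the remainder
                blocks.add(left / 2);
                blocks.add(left - left / 2);
            } else if (left > 0) {
                blocks.add(left);
            }
        }
        return blocks;
    }

    // Phase 2: combining. Accumulate virtual blocks in order; once the
    // running total reaches max, close the current split and start a new one.
    static int numSplits(long[] fileSizes, long max) {
        int splits = 0;
        long acc = 0;
        for (long b : virtualBlocks(fileSizes, max)) {
            acc += b;
            if (acc >= max) { splits++; acc = 0; }
        }
        if (acc > 0) splits++;              // leftover blocks form the last split
        return splits;
    }
}
```

With the file sizes from the logs ({30360, 2580600, 10120} bytes), maxsize = 2097152 yields 2 splits (b.txt is halved into two 1290300-byte virtual blocks, matching `b.txt:0+1290300` and `b.txt:1290300+1290300` in the first log), while maxsize = 134217728 yields 1 split whose accumulated size, 2621080 bytes, matches the `size left: 2621080` line in the second log.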


done