SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/12/28 10:37:46 INFO client.RMProxy: Connecting to ResourceManager at datanode3/192.168.1.103:8032
18/12/28 10:37:48 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
18/12/28 10:37:50 INFO input.FileInputFormat: Total input paths to process : 2
18/12/28 10:37:50 INFO mapreduce.JobSubmitter: number of splits:2
18/12/28 10:37:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1545964109134_0001
18/12/28 10:37:53 INFO impl.YarnClientImpl: Submitted application application_1545964109134_0001
18/12/28 10:37:54 INFO mapreduce.Job: The url to track the job: http://datanode3:8088/proxy/application_1545964109134_0001/
18/12/28 10:37:54 INFO mapreduce.Job: Running job: job_1545964109134_0001
18/12/28 10:38:50 INFO mapreduce.Job: Job job_1545964109134_0001 running in uber mode : false
18/12/28 10:38:50 INFO mapreduce.Job:  map 0% reduce 0%
18/12/28 10:39:28 INFO mapreduce.Job:  map 100% reduce 0%
18/12/28 10:39:48 INFO mapreduce.Job:  map 100% reduce 100%
18/12/28 10:39:50 INFO mapreduce.Job: Job job_1545964109134_0001 completed successfully
18/12/28 10:39:51 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=78
		FILE: Number of bytes written=353015
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=258
		HDFS: Number of bytes written=31
		HDFS: Number of read operations=9
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=2
		Launched reduce tasks=1
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=67297
		Total time spent by all reduces in occupied slots (ms)=16699
		Total time spent by all map tasks (ms)=67297
		Total time spent by all reduce tasks (ms)=16699
		Total vcore-milliseconds taken by all map tasks=67297
		Total vcore-milliseconds taken by all reduce tasks=16699
		Total megabyte-milliseconds taken by all map tasks=68912128
		Total megabyte-milliseconds taken by all reduce tasks=17099776
	Map-Reduce Framework
		Map input records=8
		Map output records=8
		Map output bytes=78
		Map output materialized bytes=84
		Input split bytes=212
		Combine input records=8
		Combine output records=6
		Reduce input groups=4
		Reduce shuffle bytes=84
		Reduce input records=6
		Reduce output records=4
		Spilled Records=12
		Shuffled Maps =2
		Failed Shuffles=0
		Merged Map outputs=2
		GC time elapsed (ms)=3303
		CPU time spent (ms)=8060
		Physical memory (bytes) snapshot=470183936
		Virtual memory (bytes) snapshot=6182424576
		Total committed heap usage (bytes)=261361664
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=46
	File Output Format Counters
		Bytes Written=31
for (String word : words) {
    if (word.equals("GoodWord")) {
        context.setStatus("GoodWord is coming");
        context.getCounter(ReportTest.GoodWord).increment(1);
    } else if (word.equals("ErroWord")) {
        context.setStatus("BadWord is coming!");
        context.getCounter(ReportTest.ErroWord).increment(1);
    } else {
        context.write(new Text(word), new IntWritable(1));
    }
}
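The ReportTest counter group used by the mapper (and by the reducer below) is not shown in the excerpt; a minimal definition, assuming exactly the three counter names referenced in the code, could look like:

```java
// Hypothetical definition of the counter enum referenced as ReportTest.
// Each enum constant becomes a named counter under the "ReportTest" group
// in the job's counter output and in the web UI.
enum ReportTest {
    GoodWord,     // incremented when the mapper sees "GoodWord"
    ErroWord,     // incremented when the mapper sees "ErroWord"
    ReduceReport  // incremented when the reducer sees the key "hello"
}
```

Declaring counters as an enum lets the framework group and name them automatically; `context.getCounter(ReportTest.GoodWord)` resolves to a counter named after the enum constant.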
public class TxtReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        Iterator<IntWritable> it = values.iterator();
        while (it.hasNext()) {
            IntWritable value = it.next();
            sum += value.get();
        }
        if (key.toString().equals("hello")) {
            context.setStatus("BadKey is coming!");
            context.getCounter(ReportTest.ReduceReport).increment(1);
        }
        context.write(key, new IntWritable(sum));
    }
}
FileInputFormat.addInputPath(job, new Path("/input/counter/*"));         // set the job's input path
FileOutputFormat.setOutputPath(job, new Path("/output/counter_result")); // set the job's output path
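The two path calls above live inside a job driver; a minimal sketch of the surrounding setup follows (only the two path lines come from the original text — the driver class name `CounterDemo` and the mapper class name `TxtMapper` are assumptions, since the excerpt shows only the reducer):

```java
// Hypothetical driver configuration; only the path lines are from the text.
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "counter-demo");
job.setJarByClass(CounterDemo.class);           // assumed driver class name
job.setMapperClass(TxtMapper.class);            // assumed mapper class name
job.setCombinerClass(TxtReducer.class);         // log shows combine records, so a combiner is set
job.setReducerClass(TxtReducer.class);          // reducer shown above
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path("/input/counter/*"));         // set the job's input path
FileOutputFormat.setOutputPath(job, new Path("/output/counter_result")); // set the job's output path
System.exit(job.waitForCompletion(true) ? 0 : 1);
```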
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>datanode1:10020</value>
    <description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>datanode1:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port</description>
</property>
Start the service
mr-jobhistory-daemon.sh start historyserver
View the history in the web UI (http://datanode1:19888, as configured above)
Extreme values
Maximum, minimum, mean, standard deviation, mode, and median are classic numerical statistics and commonly computed summary fields. Finding, say, the 10 largest or 10 smallest values is the Top N / Bottom N problem.
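The core of the Top N idea can be sketched in plain Java (independent of MapReduce): keep a size-bounded min-heap while scanning the data, so only the N largest values survive. This is the same pattern a Top N mapper would apply to its split, with the reducer merging the per-split candidates; the class and method names here are illustrative, not from the original text.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class TopN {
    // Return the n largest values in descending order using a bounded min-heap.
    static int[] topN(int[] data, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap: smallest on top
        for (int v : data) {
            heap.offer(v);
            if (heap.size() > n) {
                heap.poll(); // evict the current smallest, keeping the n largest
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll(); // heap drains ascending, so fill back-to-front
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topN(new int[]{5, 1, 9, 3, 7, 2}, 3)));
        // prints [9, 7, 5]
    }
}
```

Bottom N is symmetric: use a max-heap (or negate the values) and evict the largest.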