[zkpk@master Desktop]$ hadoop jar JGe1.jar TianmaoSJ_07.Tianmao /TM/TMSJ_7.txt /TmSJ_14
16/05/17 08:59:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/17 08:59:53 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.25.111:18040
16/05/17 08:59:55 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/05/17 08:59:56 INFO input.FileInputFormat: Total input paths to process : 1
16/05/17 08:59:56 INFO mapreduce.JobSubmitter: number of splits:1
16/05/17 08:59:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463497530325_0011
16/05/17 08:59:58 INFO impl.YarnClientImpl: Submitted application application_1463497530325_0011
16/05/17 08:59:58 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1463497530325_0011/
16/05/17 08:59:58 INFO mapreduce.Job: Running job: job_1463497530325_0011
16/05/17 09:00:25 INFO mapreduce.Job: Job job_1463497530325_0011 running in uber mode : false
16/05/17 09:00:25 INFO mapreduce.Job: map 0% reduce 0%
16/05/17 09:00:42 INFO mapreduce.Job: Task Id : attempt_1463497530325_0011_m_000000_0, Status : FAILED
Error: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:504)
at java.lang.Integer.parseInt(Integer.java:527)
at TianmaoSJ_07.Mymapper.map(Mymapper.java:23)
at TianmaoSJ_07.Mymapper.map(Mymapper.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Core code:
    String[] arr = value.toString().split("\t");
    IntWritable sales = new IntWritable();
    if (null != arr && arr.length == 7) {
        String salesStr = arr[4];
        sales.set(Integer.parseInt(salesStr));
        context.write(value, sales);
    }
}
A sample record with its seven fields (tab-separated in the input file) looks like this:
data:image/gif;base64,R0lGODlhAQABAIAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==
2999
https://detail.tmall.com/item.htm?id=527355597126&skuId=3164706238377&areaId=410100&cat_id=2&rn=b2ab33e07e635b52d9bea3f8073faa39
当天发小米5【钢膜耳机壳】Xiaomi/小米 小米手机5 全网通尊享版
518
19
D:\Picture\
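Problem analysis: in some records the sales field (518 in the sample above) is empty or blank. Passing such a value to Integer.parseInt throws the NumberFormatException for the input string "" shown in the log. The field therefore has to be validated before it is converted to an Integer, as in the solution code below.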
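The failure can be reproduced outside Hadoop. The following is a minimal sketch, not the original mapper: the class name EmptyFieldDemo, the helper failsToParse, and the shortened record are made up for illustration. It shows that split("\t") keeps an empty string for an empty field in the middle of a line, so the arr.length == 7 check still passes and Integer.parseInt receives "".

```java
public class EmptyFieldDemo {
    // Returns true if Integer.parseInt rejects the given string.
    static boolean failsToParse(String s) {
        try {
            Integer.parseInt(s);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Hypothetical record: seven tab-separated fields, the fifth one (index 4) is empty.
        String line = "img\t2999\turl\ttitle\t\t19\tD:\\Picture\\";
        String[] arr = line.split("\t");

        System.out.println("fields: " + arr.length);
        System.out.println("sales field empty: " + arr[4].isEmpty());
        System.out.println("parseInt fails: " + failsToParse(arr[4]));
    }
}
```

Because the empty field sits between two non-empty ones, the array still has seven elements, which is why the length check alone does not protect the parseInt call.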
Solution:
    String[] arr = value.toString().split("\t");
    IntWritable sales = new IntWritable();
    if (null != arr && arr.length == 7) {
        String salesStr = arr[4];
        if (salesStr != null && !salesStr.equals("")) { // skip null and empty strings
            if (salesStr.matches("^[0-9]+$")) { // skip non-numeric strings
                sales.set(Integer.parseInt(salesStr));
                context.write(value, sales);
            }
        }
    }
}
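The validation can also be pulled into a small helper so it is testable without a cluster. This is only a sketch under assumptions: the class SalesParser and the method parseSales are hypothetical names, not part of the original job. In addition to the checks above, it trims surrounding whitespace and catches digit strings that are too large for an int, which the digits-only regex does not rule out.

```java
public class SalesParser {
    // Returns the parsed value, or null when the field is missing, blank,
    // non-numeric, or outside the int range.
    static Integer parseSales(String field) {
        if (field == null) {
            return null;
        }
        String s = field.trim();
        if (!s.matches("[0-9]+")) {
            return null; // empty after trimming, or contains non-digit characters
        }
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // digits only, but too large for an int
        }
    }

    public static void main(String[] args) {
        String[] samples = {"518", " 518 ", "", "abc", "99999999999"};
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> " + parseSales(sample));
        }
    }
}
```

In the mapper, a null result would mean skipping the record instead of calling context.write.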