If a Sqoop export fails with an error like this:
11/08/05 10:51:22 INFO mapred.JobClient: Running job: job_201108051007_0010
11/08/05 10:51:23 INFO mapred.JobClient: map 0% reduce 0%
11/08/05 10:51:36 INFO mapred.JobClient: Task Id : attempt_201108051007_0010_m_000000_0, Status : FAILED
java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at uv_info.__loadFromFields(uv_info.java:194)
at uv_info.parse(uv_info.java:143)
at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79)
at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Fixing this error requires attention to two points:
1) Make sure the table you create specifies a field delimiter, for example:
create table if not exists apachelogsummary(
host STRING,
size STRING,
sumsize STRING)
partitioned by (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
If you do not specify one, the data files Hive produces in HDFS will look like this:
221.204.248.109^A162612.7783203125^A1.8820923416702835
61.164.153.130^A24674.1982421875^A0.28558099817346644
222.73.95.121^A20020.7822265625^A0.23172201651114005
202.108.251.23^A14764.1708984375^A0.1708816076208044
61.164.153.171^A14522.759765625^A0.1680874972873264
122.227.222.116^A12595.080078125^A0.14577638979311341
114.80.142.87^A9494.8564453125^A0.10989417182074653
125.39.39.84^A7066.0595703125^A0.0817830968786169
218.25.106.248^A5449.583984375^A0.06307388870804398
123.126.50.68^A5217.9619140625^A0.06039307770905671
With files like this, Sqoop has no way to tell MySQL which field corresponds to which column.
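The mismatch can be reproduced locally. Splitting one of those ^A-delimited lines on a tab yields a single oversized field, which is why Sqoop's generated parser runs out of tokens and throws the NoSuchElementException above. A minimal sketch (assumes an awk that processes escape sequences in -F, e.g. gawk):

```shell
# One line from the Hive warehouse file; \001 is Hive's default field
# delimiter, which terminals render as ^A.
line="$(printf '221.204.248.109\001162612.7783203125\0011.8820923416702835')"

# Splitting on TAB, as an export with --input-fields-terminated-by '\t'
# would, collapses everything into one field:
echo "$line" | awk -F'\t' '{print NF}'    # prints 1

# Splitting on \001 recovers the three expected fields:
echo "$line" | awk -F'\001' '{print NF}'  # prints 3
```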
2) When running the Sqoop export command, specify the matching data-file delimiter with --input-fields-terminated-by '\t' so that Sqoop can parse the file's fields correctly:
./sqoop export --connect jdbc:mysql://localhost:3306/datacenter --username root --password admin --table uv_info --export-dir /user/hive/warehouse/uv/dt=2011-08-03 --input-fields-terminated-by '\t'
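If recreating the Hive table is not an option, an alternative is to keep the ^A-delimited files and point Sqoop at Hive's default delimiter instead, since Sqoop's delimiter arguments accept octal escape sequences such as '\001'. A sketch using the same placeholder connection details as above (not runnable without a live cluster):

```shell
# Alternative: leave the table as-is and tell Sqoop to split on Hive's
# default field delimiter (\001, written as an octal escape):
./sqoop export --connect jdbc:mysql://localhost:3306/datacenter \
  --username root --password admin \
  --table uv_info \
  --export-dir /user/hive/warehouse/uv/dt=2011-08-03 \
  --input-fields-terminated-by '\001'
```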