Sqoop同步MySQL数据时遇到Timestamp列为0000-00-00 00:00:00.000000报错
CONSOLE# 20/12/15 03:09:43 INFO mapreduce.Job: Task Id : attempt_1595581261973_624830_m_000000_0, Status : FAILED
CONSOLE# Error: java.io.IOException: SQLException in nextKeyValue
CONSOLE# at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:275)
CONSOLE# at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:568)
CONSOLE# at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
CONSOLE# at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
CONSOLE# at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
CONSOLE# at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
CONSOLE# at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
CONSOLE# at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
CONSOLE# at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
CONSOLE# at java.security.AccessController.doPrivileged(Native Method)
CONSOLE# at javax.security.auth.Subject.doAs(Subject.java:422)
CONSOLE# at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
CONSOLE# at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
CONSOLE# Caused by: java.sql.SQLException: Value '0000-00-00 00:00:00.0000002020121420201214' can not be represented as java.sql.Timestamp
CONSOLE# at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
CONSOLE# at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898)
CONSOLE# at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887)
CONSOLE# at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861)
CONSOLE# at com.mysql.jdbc.ResultSetRow.getTimestampFast(ResultSetRow.java:947)
CONSOLE# at com.mysql.jdbc.BufferRow.getTimestampFast(BufferRow.java:559)
CONSOLE# at com.mysql.jdbc.ResultSetImpl.getTimestampInternal(ResultSetImpl.java:5913)
CONSOLE# at com.mysql.jdbc.ResultSetImpl.getTimestamp(ResultSetImpl.java:5581)
CONSOLE# at org.apache.sqoop.lib.JdbcWritableBridge.readTimestamp(JdbcWritableBridge.java:108)
CONSOLE# at QueryResult.readFields(QueryResult.java:540)
CONSOLE# at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:242)
CONSOLE# ... 12 more
解决办法:
(1)将格式不对的数据转化为 null 处理
在JDBC连接后加上?zeroDateTimeBehavior=convertToNull
例如:
sqoop import \
--connect jdbc:mysql://localhost:3306/Test?zeroDateTimeBehavior=convertToNull \
--username root \
--password 123456 \
--target-dir /tmp/ds=${etl_day1} \
--delete-target-dir \
--fields-terminated-by '\t' \
--m 1 \
--query "select * from Test.test_sqoop where \$CONDITIONS and day1=${etl_day1}"
(2)也可以改成 round------->替换成最近的日期即yyyy-MM-dd
?zeroDateTimeBehavior=round