Issue 1: Importing data from the DB into HDFS with Sqoop fails with: Error: java.io.IOException: SQLException in nextKeyValue
Cause: Sqoop hits this error when a MySQL TIMESTAMP column contains the value 0000-00-00 00:00:00.
Solution: append ?zeroDateTimeBehavior=convertToNull to the JDBC connection URL, e.g.:
jdbc:mysql://14.21.71.212:25722/$db_name2?zeroDateTimeBehavior=convertToNull
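A minimal sketch of how the option looks on a full Sqoop command line (host, port, credentials, table, and target directory below are placeholders, not values from this environment):

# Hypothetical example; replace the placeholders with real connection details.
sqoop import \
  --connect "jdbc:mysql://db-host:3306/my_db?zeroDateTimeBehavior=convertToNull" \
  --username my_user \
  --password my_pass \
  --table my_table \
  --target-dir /origin_data/my_db/db/my_table \
  --num-mappers 1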
Issue 2: A workflow built in Hue + Oozie around sqoop_import.sh (the script above) fails when executing the script:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: java.io.IOException: Cannot run program "sqoop_import.sh" (in directory "/pdata/yarn/nm/usercache/ptxdw/appcache/application_1582889750670_0349/container_1582889750670_0349_01_000001"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.oozie.action.hadoop.ShellMain.execute(ShellMain.java:114)
at org.apache.oozie.action.hadoop.ShellMain.run(ShellMain.java:73)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
at org.apache.oozie.action.hadoop.ShellMain.main(ShellMain.java:63)
... 16 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 20 more
Cause: sqoop_import.sh lacks read/write/execute permission, so it cannot be launched; the permissions need to be changed.
Solution: grant the script execute permission.
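A minimal sketch of the fix, assuming the script sits in the current working directory before being (re)uploaded to the workflow directory through Hue:

chmod +x sqoop_import.sh     # grant execute permission
ls -l sqoop_import.sh        # verify the x bit is now set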
Issue 3: user permission problem
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/origin_data/ptx_db/db/s_map_sale_target":ptxbd:ptxbd:drwxr-xr-x
20/03/10 09:46:25 ERROR tool.ImportTool: Import failed: org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=WRITE, inode="/origin_data/ptx_db/db/s_map_sale_target":ptxbd:ptxbd:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
at org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2962)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1092)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:681)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1568)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:879)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:876)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:886)
at org.apache.sqoop.tool.ImportTool.deleteTargetDir(ImportTool.java:564)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:633)
at org.apache.sqoop.Sqoop.run(Sqoop.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:182)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:233)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:242)
at org.apache.sqoop.Sqoop.main(Sqoop.java:251)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/origin_data/ptx_db/db/s_map_sale_target":ptxbd:ptxbd:drwxr-xr-x
Four possible approaches are described here: https://blog.csdn.net/wanbf123/article/details/82148633
Solution: apply one of the approaches from the reference; a hedged sketch of the most direct one follows below.
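A minimal sketch, assuming it is acceptable either to hand the target directory to the user the job actually runs as, or to open it up entirely (the path is taken from the error message; run the commands as a user with HDFS superuser rights, e.g. hdfs):

sudo -u hdfs hdfs dfs -chown -R yarn:yarn /origin_data/ptx_db/db/s_map_sale_target
# or, more coarsely:
sudo -u hdfs hdfs dfs -chmod -R 777 /origin_data/ptx_db/db/s_map_sale_target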
Issue 4: A shell script scheduled with Oozie from Hue keeps failing with the launcher exiting abnormally.
Main Class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
Oozie Launcher ends
Logs not available for container_1582889750670_0353_01_000001. Aggregation may not be complete, Check back later or try the nodemanager at ptx-bigdata4:8041
Or see application log at http://ptx-bigdata4:8041/node/application/application_1582889750670_0353
That is the only error reported. The Hue logs show no error details, neither do the YARN logs, and the shell script runs fine from the command line on the cluster.
Analysis: the task logs on the Linux server ptx-bigdata4 show the failure is caused by a permission problem.
Solution: fix the read/write permissions:
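A hedged sketch of the change (the exact file is the one reported in the ptx-bigdata4 log; the path below is only a placeholder):

chmod a+rw /path/reported/in/the/node/log    # placeholder path; add a+x as well if the file is a script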
Issue 5: The Sqoop import script fails when executed through Oozie:
java.sql.SQLException: The connection property 'zeroDateTimeBehavior' only accepts values of the form: 'exception', 'round' or 'convertToNull'. The value 'convertToNull??' is not in this set.
Analysis: in the connection string url=jdbc:mysql://xxx.aliyuncs.com:3306/xxxxx?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull \ there are extra whitespace characters after the string; they are invisible, get appended to the property value (hence the 'convertToNull??' in the error), and break the script.
The error message already states that zeroDateTimeBehavior only accepts one of 'exception', 'round', or 'convertToNull':
The default is exception, which throws an exception;
round returns 0001-01-01 00:00:00.0;
convertToNull converts the value to NULL.
import_data_from_db_single_product(){
# $1: table name (also used as the target sub-directory), $2: query passed to --query
/usr/bin/sqoop import \
--connect jdbc:mysql://14.21.71.xxx:xxx/$db_name2?zeroDateTimeBehavior=convertToNull \
--username cc \
--password mysql123 \
--target-dir /origin_data/$db_name2/db/$1/$db_date \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query "$2"' and $CONDITIONS;'
}
Solution: extract the connection string into a variable and reference it via ${}.
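A minimal sketch of the refactor (db_url is a hypothetical variable name; the point is that the URL is assembled once, with no trailing invisible characters, and then referenced with ${}):

db_url="jdbc:mysql://xxx.aliyuncs.com:3306/xxxxx?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull"

/usr/bin/sqoop import \
  --connect "${db_url}" \
  --username cc \
  --password mysql123 \
  --target-dir /origin_data/$db_name2/db/$1/$db_date \
  --delete-target-dir \
  --num-mappers 1 \
  --fields-terminated-by "\t" \
  --query "$2"' and $CONDITIONS;'

Quoting "${db_url}" also keeps the & characters in the URL from being interpreted by the shell.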
Issue 6: Main Class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
20/03/12 09:24:01 ERROR tool.ImportTool: Import failed: org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=WRITE, inode="/origin_data/ptx_db/db/s_map_sale_target":ptxdw:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
at org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2962)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1092)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:681)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1568)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:879)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:876)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:886)
at org.apache.sqoop.tool.ImportTool.deleteTargetDir(ImportTool.java:564)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:633)
at org.apache.sqoop.Sqoop.run(Sqoop.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:182)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:233)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:242)
at org.apache.sqoop.Sqoop.main(Sqoop.java:251)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/origin_data/ptx_db/db/s_map_sale_target":ptxdw:hdfs:drwxr-xr-x
Analysis: the YARN job log (above) again points to a permission problem.
Solution: add the environment variable HADOOP_USER_NAME=${wf:user()} to the workflow.
Reference: https://blog.csdn.net/weixin_30835649/article/details/95001851
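A minimal sketch of what the setting amounts to (in the Hue shell action the variable is added in the "Environment variables" field; the concrete value in the export below is only an assumed illustration of what ${wf:user()} resolves to):

# In the workflow's shell action, add the environment variable:
#   HADOOP_USER_NAME=${wf:user()}
# At runtime the launched script then effectively sees something like:
export HADOOP_USER_NAME=ptxdw   # ${wf:user()} resolves to the user who submitted the workflow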
Issue 7: output.properties data exceeds its limit [2048]
The job's final result is fine; the exception indicates that the captured output exceeds the size limit.
Analysis:
Add the property below in the shell action, or put it in /etc/oozie/conf/oozie-site.xml and restart the Oozie server. This increases the amount of console output that can be captured as part of the shell action's <capture-output/> tag; the default is 2048, the maximum size in characters of the output data. <capture-output/> is not required if the console output of the shell action is not used for decisions in a later action, so try simply removing the <capture-output/> tag if it is not needed.
<configuration>
  <property>
    <name>oozie.action.max.output.data</name>
    <value>204800</value>
  </property>
</configuration>
Solution: add the property shown above to oozie-site.xml, then restart Oozie:
1) Modify the oozie-site.xml configuration file.
2) Restart the Oozie service.
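A hedged sketch of the two steps, assuming Oozie runs as a plain system service rather than under Cloudera Manager (in a CM-managed cluster, edit the property on the Oozie service's configuration page and restart it from the CM UI instead):

# 1) add the oozie.action.max.output.data property shown above
sudo vi /etc/oozie/conf/oozie-site.xml
# 2) restart the Oozie service so the new limit takes effect
sudo service oozie restart      # or: sudo systemctl restart oozie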