前言
其实这个错误是执行到该行出现了问题,原因有很多(函数、集群、数据),这里是集群的问题,下面说一下我的排错过程
运行环境
hadoop 2.7.2
电脑虚拟机 hadoop102 hadoop103 hadoop104
执行脚本
sql="
set hive.exec.dynamic.partition =true;
set hive.exec.dynamic.partition.mode=nonstrict;
use $APP;
insert overwrite table t_dwd_edata_day partition(pt_day)
select
room_id,
meter_num,
if(value is null,0,value),
if(abs_value is null,0,abs_value),
collect_time,
day as pt_day
from
t_ods_edata_day edata_day
where edata_day.pt_day='$do_date';
"
$hive -e "$sql"
错误如下
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
{"id":38985,"room_id":7849,"meter_num":"D001752","value":13265,"abs_value":1743,"day":"2020-09-28","collect_time":"2020-09-28 00:00:00.0","hour_values":"[-1, -1, -1, -1, -1, -1, 12137, 12145, 12149, 12151, 12154, 12157, 12162, 12274, 12376, 12473, 12573, 12690, 12807, 12905, 13000, 13103, 13213, 13265]",
"hour_abs_values":"[0, 0, 0, 0, 0, 0, 0, 8, 4, 2, 3, 3, 5, 112, 102, 97, 100, 117, 117, 98, 95, 103, 110, 52]","pt_day":"2021-04-10"}
解决过程
1、发现添加limit 的时候可以插入成功
2、在hive端select 语句可以成功查询
== 在这里可以确认应该不是语法或者配置的问题了==
3、jps发现hadoop102 和 hadoop104 的 datanode 节点没有起来
单独执行/opt/module/hadoop-2.7.2/bin/Hadoop-daemon.sh start DataNode
发现还是没有启动
vim /opt/module/hadoop-2.7.2/logs/hadoop-Sixkd-datanode-hadoop102.log
查看log日志发现权限不够
2021-04-12 10:57:19,124 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.FileNotFoundException: /opt/module/hadoop-2.7.2/data/tmp/dfs/data/in_use.lock (权限不够)
2021-04-12 10:57:19,125 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to hadoop102/192.168.85.102:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1358)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
at java.lang.Thread.run(Thread.java:748)
在对应路径查看,发现是之前root用户启动占用了datanode
在确保root用户没有在使用的情况下修改lock 改为Sixkd 或者删掉,再次启动datanode成功执行
hive 导入脚本顺利执行
总结
由于root用户启动了hadoop集群,导致Sixkd用户无法启动其中两个节点,导致了hive执行失败,虽然是本地虚拟机操作太随意导致的,但注意出了问题得先看log