大数据问题排查系列 - 因HIVE 中元数据与HDFS中实际的数据不一致引起的问题的修复
前言
大家好,我是明哥!
本片博文是“大数据问题排查系列”之一,讲述某HIVE SQL 作业因为 HIVE 中的元数据与 HDFS中实际的数据不一致引起的一个问题的排查和修复。
以下是正文。
问题现象
客户端报错如下:
Unable to move source xxx to destination xxx
问题分析
客户端的报错信息,并没有完全展现问题背后的全貌。我们进入 hiveserver2 所在节点查看hiveserver2的日志,可以看到如下相关信息:
2021-09-01 11:47:46,795 INFO org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-1105]: Loading data to table hs_ods.ods_ses_acct_assure_scale partition (part_date=20210118) from hdfs://hs01:8020/user/hundsun/dap/hive/hs_ods/ods_ses_acct_assure_scale/part_date=20210118/.hive-staging_hive_2021-09-01_11-47-31_216_694180642957006705-35/-ext-10000
2021-09-01 11:47:46,795 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-1105]: HMS client filtering is enabled.
2021-09-01 11:47:46,795 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-1105]: Trying to connect to metastore with URI thrift://hs01:9083
2021-09-01 11:47:46,795 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-1105]: Opened a connection to metastore, current connections: 54
2021-09-01 11:47:46,796 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-1105]: Connected to metastore.
2021-09-01 11:47:46,928 INFO org.apache.hadoop.hive.ql.exec.MoveTask: [HiveServer2-Background-Pool: Thread-1105]: Partition is: {
part_date=20210118}
2021-09-01 11:47:46,945 INFO org.apache.hadoop.hive.common.FileUtils: [HiveServer2-Background-Pool: Thread-1105]: Creating directory if it doesn't exist: hdfs://hs01:8020/user/hundsun/dap/hive/hs_ods/ods_ses_acct_assure_scale/part_date=20210118
2021-09-01 11:47:46,947 ERROR hive.ql.metadata.Hive: [HiveServer2-Background-Pool: Thread-1105]: Failed to move: {
}
2021-09-01 11:47:46,947 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-1105]: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://hs01:8020/user/hundsun/dap/hive/hs_ods/ods_ses_acct_assure_scale/part_date=20210118/.hive-staging_hive_2021-09-01_11-47-31_216_694180642957006705-35/-ext-10000 to destination hdfs://hs01:8020/user/hundsun/dap/hive/hs_ods/ods_ses_acct_assure_scale/part_date=20210118
2021-09-01 11:47:46,948 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-1105]: Completed executing command(queryId=hive_20210901114731_d7a78302-fb2a-4b45-9472-db6a9787f710); Time taken: 15.489 seconds
2021-09-01 11:47:46,957 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-1105]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://hs01:8020/user/hundsun/dap/hive/hs_ods/ods_ses_acct_assure_scale/part_date=20210118/.hive-staging_hive_2021-09-01_11-47-31_216_694180642957006705-35