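Overview
Upgrading CDH to 5.7.1 introduced a Hive bug. The details are as follows:
It affects a partitioned table stored as ORC to which new columns were added after data had already been loaded.
The scenario is as follows: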
create table foobar ( foo string, bar string ) partitioned by (dt string) stored as orc;
alter table foobar add partition( dt='20160620' ) ;
alter table foobar add columns(goo string );
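The preconditions of this scenario can be recreated and inspected in one pass. The following is a minimal sketch and is not taken from the original report: the table name foobar_repro and the inserted values are hypothetical, and it assumes a Hive version that supports INSERT ... VALUES. In Hive, ALTER TABLE ... ADD COLUMNS without CASCADE changes only the table-level metadata, so the two DESCRIBE statements are one way to compare the table schema with the schema recorded for the partition that existed before the change.

```sql
-- Hypothetical table name; mirrors the foobar scenario above.
create table foobar_repro ( foo string, bar string )
  partitioned by (dt string)
  stored as orc;

-- Load a row BEFORE the schema change, so the ORC file holds only foo and bar.
insert into table foobar_repro partition (dt='20160620') values ('a', 'b');

-- Add the column afterwards (no CASCADE, as in the scenario).
alter table foobar_repro add columns (goo string);

-- Compare the table-level and partition-level column lists.
describe foobar_repro;
describe foobar_repro partition (dt='20160620');

-- A filtered query over the old partition that also reads the new column.
select foo, bar, goo
from foobar_repro
where dt='20160620'
and foo='a';
```

Whether the final query fails will depend on the cluster and Hive build; the sketch only sets up the table state described here.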
Once a table is in this state, running SQL such as the following triggers the failure:
-- Exact-match query
select create_time,
real_app_id,
channel_id,
plugin_ver,
network_type,
plugin_package_name,
users
from dim.test a
where day_key='20160601'
and plugin_package_name='yahu';
-- Aggregation
select count(1)
from dim.test a
where day_key='20160601'
and plugin_package_name='yahu';
Operations like the ones above all fail with the following error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
-- The log shows the following error information:
Query ID = dbs_20160704162222_91d0eceb-c25b-4c68-a182-a7a5580bed2c
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1467533369404_5863, Tracking URL = http://master:8088/proxy/application_1467533369404_5863/
Kill Command = /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hadoop/bin/hadoop job -kill job_1467533369404_5863
Hadoop job information for Stage-1: number of mappers: 16; number of reducers: 0
2016-07-04 16:23:01,939 Stage-1 map = 0%, reduce = 0%
2016-07-04 16:23:30,006 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1467533369404_5863 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1467533369404_5863_m_000009 (and more) from job job_1467533369404_5863
Examining task ID: task_1467533369404_5863_m_000012 (and more) from job job_1467533369404_5863
Examining task ID: task_1467533369404_5863_m_000006 (and more) from job job_1467533369404_586