Hive ClassCastException
问题描述
创建表后插入数据 报错
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.ParquetHiveRecord
at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:124)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:753)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
… 8 more
问题分析
关键词
- ClassCastException
- ParquetHiveRecord
结论
表结构
desc formatted dws_radar_web_js_click_d;
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
存储格式 和 压缩方式 对不上
解决方案
alter table dws_radar_web_js_click_d set fileformat parquet;
parquet 存储就要用 parquet压缩