Recording a Little Every Day

Anla Likes Sunshine

Last modified 2023-02-16 14:16:26

Category: Big Data. Tags: hive, postgresql

First published 2021-09-02 10:55:43

Article link: https://blog.csdn.net/AnlaGodness/article/details/119611634

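1. If the row count differs after syncing Hive to Holo, a likely cause is that the Holo primary key is not unique in the source data, i.e. the Hive table's grain is not unique on that key.
2. Comments in a SQL file must not contain semicolons. Otherwise the job fails with a "field not found" error, and the real cause is very hard to spot.
3. Check the size of an HDFS path: hdfs dfs -du -h -s <hdfs_path>
4. No field of the Holo primary key may be null. Otherwise the Hive-to-Holo sync fails and the row counts will not match. The Airflow log shows: first err:Put primary key cannot be null:field_name
5. A single Kafka consumer group can subscribe to multiple different topics.
6. The SQL for a Hive-to-Holo sync must use column names that match the Holo table's column names. If they do not match, very odd errors appear: NullPointerException / Interger2Int. If the column names differ between the two tables, alias them with as holo_field_name.
7. How do you recover a Hive table partition whose data was accidentally deleted?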

hdfs dfs -mkdir -p table_hdfs_path/partition_name=xxx
hdfs dfs -mv /user/hadoop/.Trash/Current/table_hdfs_path/partition_name=xxx/* table_hdfs_path/partition_name=xxx/
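
Notes 1 and 4 above both come down to the state of the key columns before the sync, so a quick check of the source data can save a failed run. The following is a minimal sketch, not the author's procedure: the table `src`, the key column `id`, and the sample rows are hypothetical, and SQLite is used only so the example runs locally. Against Hive, the same two queries would be run on the real table and its real key columns.

```python
import sqlite3

# Hypothetical sample rows standing in for a Hive table.
# `id` plays the role of the Holo primary key (an assumption for illustration).
rows = [(1, "a"), (1, "b"), (2, "c"), (None, "d")]

conn = sqlite3.connect(":memory:")
conn.execute("create table src (id integer, name text)")
conn.executemany("insert into src values (?, ?)", rows)

# Note 1: key values that appear more than once mean the grain is not unique,
# which can make the row counts differ after the sync.
duplicate_keys = conn.execute(
    "select id, count(*) from src "
    "where id is not null group by id having count(*) > 1"
).fetchall()

# Note 4: rows with a null key field trigger
# "Put primary key cannot be null" during the sync.
null_key_rows = conn.execute(
    "select count(*) from src where id is null"
).fetchone()[0]

print(duplicate_keys)  # [(1, 2)]
print(null_key_rows)   # 1
```

If both results are empty or zero, the key columns are at least unique and non-null; other causes of a mismatch would still need to be ruled out separately.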

8. The columns in a group by can contain null values

with t as (
select 1 as id,null as name,5 as score
union all
select 1 as id,null as name,15 as score
)
select id,name,sum(score)
from t 
group by id,name
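
To see what the query above returns, it can be run as-is in a local engine. This sketch assumes SQLite as a stand-in for Hive, which is enough to show the behavior described in note 8: the two rows share the same null `name`, so they fall into one group and their scores are summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The same query as in the note, unchanged apart from whitespace.
query = """
with t as (
    select 1 as id, null as name, 5 as score
    union all
    select 1 as id, null as name, 15 as score
)
select id, name, sum(score)
from t
group by id, name
"""

result = conn.execute(query).fetchall()
print(result)  # [(1, None, 20)]
```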

9. Remove the spaces on the left, replacing with 0

select replace(ltrim(replace(field,'0'
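
Returning to note 2: because the error it causes is so hard to trace, it may help to scan SQL files for semicolons inside comments before submitting them. The function below is a rough sketch under stated assumptions: `comment_semicolons` is a hypothetical helper, it only looks at `--` line comments, and it treats quotes as closing on the same line, so block comments and multi-line strings are not handled.

```python
def comment_semicolons(sql_text):
    """Return (line_number, line) pairs where a `--` comment contains a semicolon."""
    hits = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        in_quote = None      # quote character currently open, if any
        comment_start = -1   # index where the `--` comment begins
        for i, ch in enumerate(line):
            if in_quote:
                # Inside a string literal: only look for the closing quote.
                if ch == in_quote:
                    in_quote = None
            elif ch in ("'", '"'):
                in_quote = ch
            elif line.startswith("--", i):
                comment_start = i
                break
        if comment_start >= 0 and ";" in line[comment_start:]:
            hits.append((number, line))
    return hits


# Hypothetical sample: line 1 has a semicolon in its comment, line 2 has one
# only inside a string literal, and line 3 is the statement terminator.
sample = "select a -- first col; second col\nfrom t where b = ';--' -- fine\n;"
print(comment_semicolons(sample))
# [(1, 'select a -- first col; second col')]
```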
