Several ways to fix misaligned fields when exporting Hive query results

Latest recommended article published on 2024-08-19 19:15:45

rubysxl

Tags: hive, database, research

Article link: https://blog.csdn.net/rubysxl/article/details/97246669

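At work we often use Hive to explore data with SQL queries. When the results are exported, or copied and pasted, into Excel, the columns frequently end up misaligned. Based on my own experience, here are several ways to deal with this: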

1. Use the concat function to join all fields into one large field

For example:

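-- Join all fields with concat(), using a vertical bar (|) as the separator.
-- nvl() replaces NULL with '', because concat() returns NULL if any argument is NULL.
select
concat(nvl(cast(a.ucs_cust_num as string),''),'|',
       nvl(cast(a.cust_name as string),''),'|',
       nvl(cast(a.cust_belongcity as string),''),'|',
       nvl(cast(a.cust_tel as string),''),'|',
       nvl(cast(a.gender_cd as string),''),'|',
       nvl(cast(a.repay_dt as string),''),'|',
       nvl(cast(a.loan_cate as string),''),'|',
       nvl(cast(a.loan_amt as string),''),'|',
       nvl(cast(a.cust_src as string),''),'|',
       nvl(cast(a.cert_num as string),''),'|',
       nvl(cast(a.birth_dt as string),''),'|',
       nvl(cast(a.ovdue_dt as string),''),'|')
from a;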

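Export the query result to a .txt file, then import it in Excel via Data > From Text. In the column-splitting step, choose "Other" and enter "|" as the delimiter, then click Finish to load the data into Excel.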
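Before importing, it can help to check that every row of the exported file really has the same number of "|"-separated fields, since a value that itself contains "|" would still shift the columns. The following is only a rough sketch: the function name `check_field_count` is made up for illustration, and it assumes the export has one record per line.

```shell
# Print the line number and field count of every row in the file ($1)
# whose number of "|"-separated fields differs from the expected count ($2).
# The exit status is 1 if any such row is found, 0 otherwise.
check_field_count() {
  awk -F'|' -v expected="$2" '
    NF != expected { printf "line %d: %d fields\n", NR, NF; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}
```

The expected count is whatever each row should split into. Note that a trailing "|" at the end of a row counts as one extra, empty field.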

2. Use a Linux pipe to replace the output delimiter

# Method 1: sed
hive -e "select * from pms.pms_algorithm_desc" | sed 's/\t/,/g' > ./aaa.txt

# Method 2: tr
hive -e "select * from pms.pms_tp_config" | tr "\t" ","
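A plain tab-to-comma replacement still misaligns columns when a field value itself contains a comma. One possible workaround, sketched here with a hypothetical helper named `tsv_to_csv`, is to quote such fields in the usual CSV way so that Excel keeps them in a single cell. It assumes the values contain no embedded tabs or newlines, which cannot be recovered at this stage.

```shell
# Convert tab-separated rows on stdin to CSV on stdout.
# Fields containing a comma or a double quote are wrapped in double quotes,
# and embedded double quotes are doubled.
tsv_to_csv() {
  awk 'BEGIN { FS = "\t"; OFS = "," }
  {
    for (i = 1; i <= NF; i++) {
      if ($i ~ /[",]/) {
        gsub(/"/, "\"\"", $i)
        $i = "\"" $i "\""
      }
    }
    $1 = $1   # force awk to rebuild the row with OFS even if no field changed
    print
  }'
}
```

It would be used in place of sed or tr, for example: hive -e "select * from pms.pms_algorithm_desc" | tsv_to_csv > ./aaa.csv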

3. Use Hive's insert syntax

  insert overwrite local directory '/home/pms/workspace/ouyangyewei/data/bi_lost'
    row format delimited
    fields terminated by ','
    select xxxx 
    from xxxx;

The SQL above writes the query result to the /home/pms/workspace/ouyangyewei/data/bi_lost directory, with fields separated by commas.
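The export directory typically ends up holding one or more data files (with names such as 000000_0) rather than a single file. A minimal sketch for merging them into one file that Excel can open follows. The function name `merge_hive_export` is hypothetical, and the output file should live outside the export directory.

```shell
# Concatenate every non-hidden file in the export directory ($1)
# into a single output file ($2). The shell expands the glob in sorted order.
merge_hive_export() {
  cat "$1"/* > "$2"
}
```

For example: merge_hive_export /home/pms/workspace/ouyangyewei/data/bi_lost ./bi_lost.csv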