[Database Notes] Commands for running Hive on VM15

1 Not logged into the Hive client
"Not logged in" here means only Hadoop has been started; the Hive CLI itself is not running.
These methods cannot specify the path of the generated CSV file; it is created in the directory from which the hive command was launched.
Method 1: bin/hive -e
bin/hive -e "
set hive.cli.print.header=true;
use kaikeba;
SELECT a.user_name,
sum(a.pay_amount),
sum(a.refund_amount)
FROM
(
SELECT user_name,
sum(pay_amount) as pay_amount,
0 as refund_amount
FROM user_trade
WHERE dt>'0'
GROUP BY user_name
UNION ALL
SELECT user_name,
0 as pay_amount,
sum(refund_amount) as refund_amount
FROM user_refund
WHERE dt>'0'
GROUP BY user_name
)a
GROUP BY a.user_name;" | sed 's/\t/,/g' >> res.csv
The output file res.csv ends up in the directory where the command was run (here, the Hadoop folder).
bin/hive -e: launches the hive executable under /opt/module/apache-hive-3.1.1-bin/bin
set hive.cli.print.header=true;: makes the output CSV include the field names as a header row
use kaikeba;: specifies which database the tables in the SELECT statement come from
| sed 's/\t/,/g': replaces the tab separator with commas; without it, all the data lands in a single CSV column
res.csv: the name of the output file
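The effect of the sed step can be checked without Hive at all: feed it a tab-separated line like the ones hive -e prints. A minimal sketch (the sample rows below are made up for illustration):

```shell
# Hive prints query results tab-separated; sed rewrites the tabs to commas.
# The printf lines stand in for real `hive -e` output (hypothetical sample rows).
printf 'user_name\tpay_amount\trefund_amount\n' >  /tmp/hive_out.tsv
printf 'alice\t100\t0\n'                        >> /tmp/hive_out.tsv

sed 's/\t/,/g' /tmp/hive_out.tsv > /tmp/res.csv
cat /tmp/res.csv
```

Note that the command in the notes uses >>, which appends to res.csv across repeated runs; use > instead if you want each run to overwrite the file.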
Method 2: bin/hive -f
bin/hive -f outputs/test.hql | sed 's/\t/,/g' >> test.csv
Contents of test.hql:
use kaikeba;
SELECT a.user_name,
sum(a.pay_amount),
sum(a.refund_amount)
FROM
(
SELECT user_name,
sum(pay_amount) as pay_amount,
0 as refund_amount
FROM user_trade
WHERE dt>'0'
GROUP BY user_name
UNION ALL
SELECT user_name,
0 as pay_amount,
sum(refund_amount) as refund_amount
FROM user_refund
WHERE dt>'0'
GROUP BY user_name
)a
GROUP BY a.user_name;
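The method-2 workflow above can be scripted end to end. A minimal sketch, where the heredoc query is a shortened stand-in for the full UNION ALL statement (the hive invocation itself needs a live cluster, so it is left commented out):

```shell
# Write the HQL script that bin/hive -f will execute.
mkdir -p outputs
cat > outputs/test.hql <<'EOF'
use kaikeba;
SELECT user_name FROM user_trade WHERE dt > '0' GROUP BY user_name;
EOF

# On a machine with Hive installed, run the script and convert tabs to commas.
# Note that >> appends to test.csv across runs; use > to overwrite instead.
# bin/hive -f outputs/test.hql | sed 's/\t/,/g' >> test.csv

# Sanity-check the generated script:
cat outputs/test.hql
```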
2 Logged into the Hive client
Method 1: insert overwrite local directory (note: with the local keyword this writes to the local filesystem; drop local to export to HDFS instead)
# run inside the Hive client
insert overwrite local directory '/home/pms/workspace/ouyangyewei/data/bi_lost'
row format delimited
fields terminated by ','
select xxxx
from xxxx;
The HQL above writes the query result into the /home/pms/workspace/ouyangyewei/data/bi_lost
directory, with fields separated by commas. By default Hive generates part files named 000000_0 inside that directory.
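Since the export lands as one or more part files (000000_0, 000001_0, ...) rather than a single CSV, a final collection step is usually needed. A sketch using a mocked-up output directory (the paths and sample rows are invented for illustration):

```shell
# Mock the directory that `insert overwrite local directory` would produce,
# with two hypothetical part files of comma-separated rows.
mkdir -p /tmp/bi_lost
printf 'alice,100\n' > /tmp/bi_lost/000000_0
printf 'bob,50\n'    > /tmp/bi_lost/000001_0

# Concatenate every part file into a single CSV.
cat /tmp/bi_lost/0*_0 > /tmp/bi_lost.csv
wc -l < /tmp/bi_lost.csv   # 2 rows
```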