04 - Hive Data Export

Hello everyone, we meet again! Today let's play around with exporting data from Hive.
There are several ways to export:
1) hadoop commands:
get
text
2) insert … directory, with syntax like:

insert overwrite [local] directory '/tmp/ca_employees'
[row format delimited fields terminated by '\t']
select name,salary,address
from employees

3) Shell command plus a pipe: hive -f/-e | sed/grep/awk > file
4) Third-party tools, such as Sqoop (method 4 isn't demonstrated below; see the sketch right after this list)
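Since method 4 doesn't appear in the experiments, here is a minimal sketch of a Sqoop export. Everything here is an assumption for illustration: a MySQL database testdb on host hadoop1, credentials root/123456, and a target table testtext that already exists with columns matching the Hive table:

sqoop export \
  --connect jdbc:mysql://hadoop1:3306/testdb \
  --username root --password 123456 \
  --table testtext \
  --export-dir /user/hive/warehouse/testtext \
  --input-fields-terminated-by '\t'

The --export-dir points at the table's directory in the Hive warehouse, and --input-fields-terminated-by must match the delimiter the files were written with.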

OK, let's start experimenting!
1) hadoop commands

hive> select * from testtext;
OK
wer	46
wer	89
weree	78
rr	89
Time taken: 0.212 seconds
hive> 
[root@hadoop1 host]# hadoop fs -get /user/hive/warehouse/testtext /usr/host/data2/ 
[root@hadoop1 host]# cd data2
[root@hadoop1 data2]# ll
total 4
drwxr-xr-x. 2 root root 4096 Jun  2 02:23 testtext
[root@hadoop1 data2]# 
[root@hadoop1 data2]# hadoop fs -text /user/hive/warehouse/testtext/*
wer	46
wer	89
weree	78
rr	89
[root@hadoop1 data2]# 

Note: my Hive data is stored under /user/hive/warehouse on HDFS; this path was specified when configuring hive-site.xml.
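For reference, the property that controls this is hive.metastore.warehouse.dir; a snippet like the following in hive-site.xml (the value shown matches my setup) sets it:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>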
Of course, you can also redirect the output to a local file:

[root@hadoop1 data2]# hadoop fs -text /user/hive/warehouse/testtext/* > newdata2
[root@hadoop1 data2]# ll
total 8
-rw-r--r--. 1 root root   29 Jun  2 02:26 newdata2
drwxr-xr-x. 2 root root 4096 Jun  2 02:23 testtext
[root@hadoop1 data2]# cat newdata2 
wer	46
wer	89
weree	78
rr	89
[root@hadoop1 data2]#

(Two greater-than signs >> append to the file; a single > overwrites it.)
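As an aside, hadoop fs -getmerge collapses the -get and the concatenation into one step, merging every file under the table's directory into a single local file (newdata3 is a made-up name for this sketch):

hadoop fs -getmerge /user/hive/warehouse/testtext /usr/host/data2/newdata3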
2) insert … directory

hive> insert overwrite local directory '/usr/host/data3'
    > select name,addr
    > from testtext;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 0
2016-06-02 02:32:44,084 null map = 0%,  reduce = 0%
2016-06-02 02:32:56,469 null map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
2016-06-02 02:32:57,543 null map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
2016-06-02 02:32:58,658 null map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
MapReduce Total cumulative CPU time: 830 msec
Ended Job = job_1464828076391_0014
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Copying data to local directory /usr/host/data3
Copying data to local directory /usr/host/data3
OK
Time taken: 29.525 seconds
hive> 

Take a look at the data3 directory:

[root@hadoop1 host]# cd data3
[root@hadoop1 data3]# ll
total 4
-rw-r--r--. 1 root root 29 Jun  2 02:32 000000_0
[root@hadoop1 data3]# cat 000000_0 
wer46
wer89
weree78
rr89
[root@hadoop1 data3]# 
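The columns in 000000_0 look glued together (wer46 and so on) because no row format was given, so Hive writes its default field delimiter \001 (Ctrl-A), which cat does not render visibly. To get a readable separator, add the optional row format clause from the syntax sample above (supported for directory exports in Hive 0.11 and later); a minimal sketch reusing the same table and path:

hive> insert overwrite local directory '/usr/host/data3'
    > row format delimited fields terminated by '\t'
    > select name,addr
    > from testtext;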

To write to HDFS instead, drop the local keyword (the row format … clause is not used here):

hive> insert overwrite directory '/data3' 
    > select name,addr
    > from testtext;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 0
2016-06-02 02:37:27,137 null map = 0%,  reduce = 0%
2016-06-02 02:37:34,847 null map = 100%,  reduce = 0%
Ended Job = job_1464828076391_0015
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Moving data to: /data3
OK
Time taken: 27.902 seconds
hive> 

It works every time, like a charm!
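Incidentally, Hive's multi-insert form can fan a single scan of the table out to several export directories at once. A minimal sketch, where the paths /usr/host/data4 and /data4 are hypothetical:

hive> from testtext
    > insert overwrite local directory '/usr/host/data4'
    > select name,addr
    > insert overwrite directory '/data4'
    > select name,addr;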

3) Shell command plus a pipe:
hive -f/-e | sed/grep/awk > file

[root@hadoop1 data3]# hive -e "select * from testtext"
OK
wer	46
wer	89
weree	78
rr	89
Time taken: 5.879 seconds
[root@hadoop1 data3]# hive -S -e "select * from testtext" | grep wer
wer	89
weree	78
[root@hadoop1 data3]# 

The benefit of -S (silent mode) is that the console prints far less noise.
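The -f variant works the same way, reading the query from a script file instead of the command line; a quick sketch, where q.hql and names.txt are made-up names:

echo "select * from testtext" > q.hql
hive -S -f q.hql | awk -F '\t' '{print $1}' > names.txt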

I'm a bit tired, so time for a break. If you've read this far and want to learn more or get in touch, follow my WeChat official account: 谢华东.
