Several Ways to Export Data from Hive

Latest recommended article published on 2022-04-22 20:02:25

等待鱼鱼

Tags: exporting Hive data with Java

Article link: https://blog.csdn.net/weixin_42494890/article/details/114724079

Reference: http://blog.csdn.net/qianshangding0708/article/details/50394789

Thanks to the original author for sharing.

(1) Export to the local file system

hive> INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/output' ROW FORMAT DELIMITED FIELDS TERMINATED by ',' select * from testA;

Total jobs = 1

Launching Job 1 out of 1

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1451024007879_0001, Tracking URL = http://hadoopcluster79:8088/proxy/application_1451024007879_0001/

Kill Command = /home/hadoop/apache/hadoop-2.4.1/bin/hadoop job -kill job_1451024007879_0001

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0

2015-12-25 17:04:30,447 Stage-1 map = 0%, reduce = 0%

2015-12-25 17:04:35,616 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec

MapReduce Total cumulative CPU time: 1 seconds 160 msec

Ended Job = job_1451024007879_0001

Copying data to local directory /home/hadoop/output

MapReduce Jobs Launched:

Job 0: Map: 1 Cumulative CPU: 1.16 sec HDFS Read: 305 HDFS Write: 110 SUCCESS

Total MapReduce CPU Time Spent: 1 seconds 160 msec

Time taken: 16.701 seconds

Check the exported data:

[hadoop@hadoopcluster78 output]$ cat /home/hadoop/output/000000_0

1,fish1,SZ,2015-07-08

2,fish2,SH,2015-07-08

3,fish3,HZ,2015-07-08

4,fish4,QD,2015-07-08

5,fish5,SR,2015-07-08

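INSERT OVERWRITE LOCAL DIRECTORY exports the data of Hive table testA to the local directory /home/hadoop/output. Hive runs the HQL statement as a MapReduce job, so /home/hadoop/output is in effect the job's output path, and the result is written to a file named 000000_0. Note that OVERWRITE replaces whatever the target directory already contains, so point it at a dedicated directory.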
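The run above used a single mapper and therefore produced a single file. A job with several tasks writes several part files (000000_0, 000001_0, and so on) into the same directory. The following is a minimal sketch, not part of the original walkthrough, of how those parts could be concatenated into one file. The function name merge_parts and the output file name are hypothetical.

```shell
# merge_parts DIR OUT: concatenate every Hive part file in DIR into OUT.
# Part files are matched by the glob [0-9]*_[0-9]* (000000_0, 000001_0, ...),
# so hidden files in the directory and OUT itself are left out.
merge_parts() {
    mp_dir="$1"
    mp_out="$2"
    # The shell expands the glob in sorted order, so parts are appended in sequence.
    cat "$mp_dir"/[0-9]*_[0-9]* > "$mp_out"
}

# Example call (paths taken from the walkthrough, output name assumed):
# merge_parts /home/hadoop/output /home/hadoop/testA.csv
```

Writing the merged file outside the export directory keeps it from being wiped by the next INSERT OVERWRITE.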

(2) Export to HDFS

Exporting to HDFS works much like exporting to the local file system: just drop LOCAL from the HQL statement.

hive> INSERT OVERWRITE DIRECTORY '/home/hadoop/output' select * from testA;

Total jobs = 3

Launching Job 1 out of 3

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1451024007879_0002, Tracking URL = http://hadoopcluster79:8088/proxy/application_1451024007879_0002/

Kill Command = /home/hadoop/apache/hadoop-2.4.1/bin/hadoop job -kill job_1451024007879_0002

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0

2015-12-25 17:08:51,034 Stage-1 map = 0%, reduce = 0%

2015-12-25 17:08:59,313 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.4 sec

MapReduce Total cumulative CPU time: 1 seconds 400 msec

Ended Job = job_1451024007879_0002

Stage-3 is selected by condition resolver.

Stage-2 is filtered out by condition resolver.

Stage-4 is filtered out by condition resolver.

Moving data to: hdfs://hadoop2cluster/home/hadoop/hivedata/hive-hadoop/hive_2015-12-25_17-08-43_733_1768532778392261937-1/-ext-10000

Moving data to: /home/hadoop/output

MapReduce Jobs Launched:

Job 0: Map: 1 Cumulative CPU: 1.4 sec HDFS Read: 305 HDFS Write: 110 SUCCESS

Total MapReduce CPU Time Spent: 1 seconds 400 msec

Time taken: 16.667 seconds

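Check the HDFS output file. No ROW FORMAT clause was given this time, so the fields are separated by Hive's default delimiter, Ctrl-A (\001). It is a non-printing character, which is why the columns appear to run together: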

[hadoop@hadoopcluster78 bin]$ ./hadoop fs -cat /home/hadoop/output/000000_0

1fish1SZ2015-07-08

2fish2SH2015-07-08

3fish3HZ2015-07-08

4fish4QD2015-07-08

5fish5SR2015-07-08
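If the default delimiter gets in the way, the file can be converted while it is being read. Below is a minimal sketch that swaps every Ctrl-A for a comma with tr. The function name ctrl_a_to_csv is hypothetical, and the demo row is a hand-built stand-in for the testA data.

```shell
# ctrl_a_to_csv: read Ctrl-A (\001) delimited rows on stdin, write comma-delimited rows.
# Caveat: this is a plain character swap; values that contain commas are not quoted.
ctrl_a_to_csv() {
    tr '\001' ','
}

# Demo on one row shaped like the testA data.
printf '1\001fish1\001SZ\0012015-07-08\n' | ctrl_a_to_csv
# prints: 1,fish1,SZ,2015-07-08
```

Against the real export it would be used as ./hadoop fs -cat /home/hadoop/output/000000_0 | ctrl_a_to_csv.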

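Other methods

You can also export data with the -e and -f options of the hive command.

Using -e: the option is followed by the SQL statement, and >> is followed by the output file path. Note that >> appends to the file; use > to overwrite it instead.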

[hadoop@hadoopcluster78 bin]$ ./hive -e "select * from testA" >> /home/hadoop/output/testA.txt

15/12/25 17:15:07 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead

Logging initialized using configuration in file:/home/hadoop/apache/hive-0.13.1/conf/hive-log4j.properties

Time taken: 1.128 seconds, Fetched: 5 row(s)

[hadoop@hadoopcluster78 bin]$ cat /home/hadoop/output/testA.txt

1 fish1 SZ 2015-07-08

2 fish2 SH 2015-07-08

3 fish3 HZ 2015-07-08

4 fish4 QD 2015-07-08

5 fish5 SR 2015-07-08
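Because >> appends, running the same export twice leaves two copies of every row in testA.txt. One way around this, sketched here under the assumption that the whole file should be replaced on each run, is to write to a temporary file and move it into place only when the command succeeds. The helper name export_to_file is hypothetical.

```shell
# export_to_file OUT CMD [ARGS...]
# Run CMD, capture its stdout, and replace OUT only if CMD exits successfully.
# A failed run leaves any existing OUT untouched.
export_to_file() {
    etf_out="$1"
    shift
    # Create the temporary file next to OUT so the final mv stays on one file system.
    etf_tmp="$(mktemp "${etf_out}.XXXXXX")" || return 1
    if "$@" > "$etf_tmp"; then
        mv "$etf_tmp" "$etf_out"
    else
        rm -f "$etf_tmp"
        return 1
    fi
}

# Example call with the query from the walkthrough:
# export_to_file /home/hadoop/output/testA.txt ./hive -e "select * from testA"
```

Only stdout is captured, so the log lines that hive writes to the terminal stay out of the data file. The file created by mktemp is readable by its owner only, so adjust the permissions afterwards if others need to read it.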

Using -f: the option is followed by a file containing the SQL statement, and >> is followed by the output file path.

The SQL file:

[hadoop@hadoopcluster78 bin]$ cat /home/hadoop/output/sql.sql

select * from testA

Run it with -f:

[hadoop@hadoopcluster78 bin]$ ./hive -f /home/hadoop/output/sql.sql >> /home/hadoop/output/testB.txt

15/12/25 17:20:52 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
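The -f form is convenient when the same kind of export has to be repeated for several tables. The sketch below generates one SQL file per table and hands each to hive -f. The function name export_tables and the HIVE_BIN variable are hypothetical. HIVE_BIN only selects which executable to call and falls back to hive on the PATH.

```shell
# export_tables OUTDIR TABLE [TABLE...]
# For each table, write OUTDIR/TABLE.sql and save the query result to OUTDIR/TABLE.txt.
# Table names are pasted into the SQL unchecked, so pass trusted names only.
export_tables() {
    et_outdir="$1"
    shift
    for et_table in "$@"; do
        et_sql="$et_outdir/$et_table.sql"
        printf 'select * from %s;\n' "$et_table" > "$et_sql"
        # > truncates the result file, so a rerun does not duplicate rows.
        "${HIVE_BIN:-hive}" -f "$et_sql" > "$et_outdir/$et_table.txt" || return 1
    done
}

# Example call (testB is an assumed second table):
# HIVE_BIN=./hive export_tables /home/hadoop/output testA testB
```

Keeping the generated .sql files next to the results makes it easy to see later which query produced which output.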