Hive: Data Export

冰河

When reposting, please credit the source: https://blog.csdn.net/l1028386804/article/details/80550840

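I. Export methods

1. Hadoop commands

get: copy the data file from HDFS to the current local directory
hadoop fs -get hdfs://liuyazhuang121:9000/user/hive/warehouse/lyz.db/test_p/st=20180602/data
text: print the data file to standard output as text
hadoop fs -text hdfs://liuyazhuang121:9000/user/hive/warehouse/lyz.db/test_p/st=20180602/data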
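
The two commands above can be wrapped in a small script. The following is a minimal sketch, assuming the warehouse layout used in this article (/user/hive/warehouse/<db>.db/<table>/<partition>) and a data file named data. The function names partition_path and export_partition are hypothetical helpers, not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: builds the HDFS path of one partition directory.
# Arguments: namenode URI, database, table, partition spec (e.g. st=20180602).
# Assumes the warehouse root /user/hive/warehouse shown in this article.
partition_path() {
  printf '%s/user/hive/warehouse/%s.db/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: copies the partition's "data" file to a local directory.
# The fifth argument is the local destination. Requires a Hadoop client on PATH.
export_partition() {
  src="$(partition_path "$1" "$2" "$3" "$4")/data"
  hadoop fs -get "$src" "$5"
}
```

For example, export_partition hdfs://liuyazhuang121:9000 lyz test_p st=20180602 /tmp would run the same get command as above with /tmp as the local target.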

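2. insert ... directory

The parts in square brackets are optional. With local, the result is written to the local file system instead of HDFS; the row format clause sets the field delimiter of the output files. Because of overwrite, existing contents of the target directory are replaced.

insert overwrite [local] directory '/tmp/ca_employees'
[row format delimited fields terminated by '\t']
select name, salary, address
from employees;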
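
As an illustration, the statement can be generated and run from a shell script, and the output files merged into one file afterwards. This is a sketch under assumptions: build_export_sql and run_export are hypothetical helper names, and the script expects the hive client on PATH and the employees table from the statement above.

```shell
#!/bin/sh
# Hypothetical helper: prints the export statement for a local target directory.
# The here-document keeps '\t' literal, so Hive receives it unchanged.
build_export_sql() {
  cat <<EOF
insert overwrite local directory '$1'
row format delimited fields terminated by '\t'
select name, salary, address
from employees;
EOF
}

# Hypothetical helper: runs the export, then concatenates the output files
# in the target directory ($1) into a single file ($2).
run_export() {
  hive -e "$(build_export_sql "$1")" && cat "$1"/* > "$2"
}
```

A call such as run_export /tmp/ca_employees /tmp/employees.tsv would export the rows and collect them into one tab-separated file.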

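3. Shell commands with pipes

Run the query with hive -e (inline query) or hive -f (query file), pipe the output through sed, grep or awk, and redirect it to a file:

hive -e 'select ...' | sed/grep/awk ... > file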
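
A small sketch of the pipe approach, assuming the Hive CLI prints result columns separated by tabs and that the employees table has name and salary columns as in the earlier statement. The function filter_by_salary is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# Hypothetical filter: reads tab-separated "name<TAB>salary" rows on stdin and
# keeps rows whose salary is greater than the threshold given as $1.
# Non-numeric values such as NULL evaluate to 0 in awk.
filter_by_salary() {
  awk -F '\t' -v min="$1" '$2 + 0 > min + 0 { print $1 "\t" $2 }'
}

# Usage sketch (requires the hive client; -S suppresses log output):
#   hive -S -e 'select name, salary from employees' | filter_by_salary 5000 > high_paid.tsv
```

Keeping the filter in its own function makes it possible to check it with sample input before running it against real query output.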

4. Third-party tools such as Sqoop

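II. Dynamic partitioning

1. There is no need to write a separate insert statement for each partition

2. The partition values are not known in advance and are taken from the data

3. Relevant parameters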

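-- Enable dynamic partitioning
set hive.exec.dynamic.partition=true;
-- nonstrict: every partition column may be dynamic. In strict mode at least one static partition is required, and it must come first
set hive.exec.dynamic.partition.mode=nonstrict;
-- Maximum number of dynamic partitions that each node may create
set hive.exec.max.dynamic.partitions.pernode=10000;
-- Maximum number of dynamic partitions that may be created in total
set hive.exec.max.dynamic.partitions=100000;
-- Maximum number of files that one job may create
-- set hive.exec.max.created.files=150000;
-- Limit on the number of files open at the same time (an HDFS DataNode setting, normally configured in hdfs-site.xml)
-- set dfs.datanode.max.xcievers=8192;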

Example:

-- Create the dynamically partitioned table
create table d_part(
name string
)
partitioned by (value string)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile;

-- Load data using dynamic partitioning; the partition column must be the last column of the select
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table d_part partition(value)
select name, st as value
from test_p;
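
To make the effect of the insert above easier to picture, the following sketch imitates it on the local file system: each input row is routed to a directory named value=<partition value>, which mirrors how Hive lays out partition directories. This is only a local simulation, not how Hive executes the statement; simulate_dynamic_partition and the file name part-local are hypothetical.

```shell
#!/bin/sh
# Local simulation of dynamic partitioning (not Hive itself).
# Reads tab-separated "name<TAB>value" rows on stdin and appends each name
# to <outdir>/value=<value>/part-local, creating directories as needed.
simulate_dynamic_partition() {
  outdir=$1
  while IFS="$(printf '\t')" read -r name value; do
    mkdir -p "$outdir/value=$value"
    printf '%s\n' "$name" >> "$outdir/value=$value/part-local"
  done
}

# Usage sketch:
#   printf 'tom\t20180601\nann\t20180602\n' | simulate_dynamic_partition /tmp/d_part_demo
```

In Hive itself, show partitions d_part; lists the partitions that the insert created.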
