DML Data Operations in Hive

This article walks through Hive's data operations in detail: importing data (via load data, insert statements, create table ... as select, and import), exporting data (via insert, Hadoop commands, the hive shell, export, and Sqoop), and clearing a table's data (Truncate). The focus is on the syntax of each operation, hands-on examples, and points to watch out for.

1. Data Import

1.1 Loading Data into a Table (Load)

1. Syntax

hive> load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student
[partition (partcol1=val1,…)];

(1) load data: loads data into the table
(2) local: load from the local file system; without it, the data is loaded from HDFS
(3) inpath: the path of the data to load
(4) overwrite: overwrite the table's existing data; otherwise the data is appended
(5) into table: specifies which table to load into
(6) student: the target table
(7) partition: load into the specified partition
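The partition clause from the syntax above is not exercised in the examples that follow; as a sketch, loading into one partition of a partitioned table looks like this (`stu_par` is a hypothetical table, not one created in this article):

```sql
-- Hypothetical: stu_par is assumed to be partitioned by month.
-- Load a local file into its month='201709' partition only:
load data local inpath '/opt/module/datas/student.txt'
into table stu_par partition(month='201709');
```

If the target partition does not exist yet, the load creates it.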

2. Hands-on examples
(1) Create a table

hive (db_hive1)>  create table student(id string, name string) row format delimited fields terminated by '\t'; 
OK
Time taken: 0.215 seconds

(2) Load a local file into Hive

hive (db_hive1)> load data local inpath '/opt/module/datas/student.txt' into table db_hive1.student;
Loading data to table db_hive1.student
Table db_hive1.student stats: [numFiles=1, totalSize=24]
OK
Time taken: 0.31 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.083 seconds, Fetched: 6 row(s)

(3) Load an HDFS file into Hive
A、Upload the file to HDFS

hive (db_hive1)> dfs -put /opt/module/datas/student.txt /user/hive/student.txt;

B、Load the data from HDFS

hive (db_hive1)> load data inpath '/user/hive/student.txt' into table db_hive1.student; 
Loading data to table db_hive1.student
Table db_hive1.student stats: [numFiles=2, totalSize=48]
OK
Time taken: 0.379 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.074 seconds, Fetched: 12 row(s)

(4) Load data, overwriting the table's existing data
A、Upload the file to HDFS

hive (db_hive1)> dfs -put /opt/module/datas/student.txt /user/hive/student1.txt;

B、Load the data, overwriting the table's existing data

hive (db_hive1)> load data inpath '/user/hive/student1.txt' overwrite into table db_hive1.student;
Loading data to table db_hive1.student
Moved: 'hdfs://hadoop151:9000/user/hive/warehouse/db_hive1.db/student/student.txt' to trash at: hdfs://hadoop151:9000/user/test/.Trash/Current
Moved: 'hdfs://hadoop151:9000/user/hive/warehouse/db_hive1.db/student/student_copy_1.txt' to trash at: hdfs://hadoop151:9000/user/test/.Trash/Current
Table db_hive1.student stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
OK
Time taken: 0.36 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.068 seconds, Fetched: 6 row(s)
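One behavior worth noting from the examples above: load data inpath (without local) moves the source file from its HDFS location into the table's warehouse directory, whereas load data local inpath copies the local file and leaves the original in place. Assuming the paths used above, this can be checked with:

```sql
-- The HDFS source file was moved, so it is gone from its original path;
-- it now lives under the table's warehouse directory:
hive (db_hive1)> dfs -ls /user/hive/;
hive (db_hive1)> dfs -ls /user/hive/warehouse/db_hive1.db/student/;
```

The local file /opt/module/datas/student.txt, by contrast, is still on disk after a local load.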

1.2 Inserting Data into a Table via Queries (Insert)

1. Create a partitioned table

hive (db_hive1)> create table student1(id int, name string) partitioned by (month string) row format delimited fields terminated by '\t';
OK
Time taken: 0.918 seconds

2. Basic insert

hive (db_hive1)> insert into table  student1 partition(month='201709') values(1,'wangwu'); 
Query ID = test_20200217212020_42f330b2-1238-4e8a-bb82-6853bb21feda
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1581943453985_0001, Tracking URL = http://hadoop152:8088/proxy/application_1581943453985_0001/
Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job  -kill job_1581943453985_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-02-17 21:20:40,968 Stage-1 map = 0
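Besides inserting literal values as above, the insert-via-query form that this section's title refers to reads as follows; this is a sketch reusing the student1 table created above, with month='201708' as an assumed target partition:

```sql
-- Basic mode: append the query results into another partition
insert into table student1 partition(month='201708')
select id, name from student1 where month='201709';

-- Overwrite mode: replace the target partition's data with the query results
insert overwrite table student1 partition(month='201708')
select id, name from student1 where month='201709';
```

Like the values form, both statements run as MapReduce jobs; overwrite replaces only the data in the target partition, not the whole table.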