DML Data Operations in Hive

This article walks through Hive's data operations in detail: importing data (via load data, insert statements, create table ... as select, and import), exporting data (via insert, Hadoop commands, the hive shell, export, and Sqoop), and clearing a table's data (Truncate). The focus is on the syntax of each operation, hands-on examples, and points to watch out for.

1. Data Import

1.1 Loading Data into a Table (Load)

1. Syntax

hive> load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student
[partition (partcol1=val1,…)];

(1) load data: loads data into the table
(2) local: load from the local file system; without it, the data is loaded from HDFS
(3) inpath: the path of the data to load
(4) overwrite: overwrite the table's existing data; otherwise the data is appended
(5) into table: specifies which table to load into
(6) student: the target table
(7) partition: load into the specified partition
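The partition clause from the syntax above is not exercised in the examples that follow; as a sketch, loading into one partition of a partitioned table looks like this (`stu_par` is a hypothetical table, not one created in this article):

```sql
-- Hypothetical: stu_par is assumed to be partitioned by month.
-- Load a local file into its month='201709' partition only:
load data local inpath '/opt/module/datas/student.txt'
into table stu_par partition(month='201709');
```

If the target partition does not exist yet, the load creates it.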

2. Hands-on examples
(1) Create a table

hive (db_hive1)>  create table student(id string, name string) row format delimited fields terminated by '\t'; 
OK
Time taken: 0.215 seconds

(2) Load a local file into Hive

hive (db_hive1)> load data local inpath '/opt/module/datas/student.txt' into table db_hive1.student;
Loading data to table db_hive1.student
Table db_hive1.student stats: [numFiles=1, totalSize=24]
OK
Time taken: 0.31 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.083 seconds, Fetched: 6 row(s)

(3) Load an HDFS file into Hive
A、Upload the file to HDFS

hive (db_hive1)> dfs -put /opt/module/datas/student.txt /user/hive/student.txt;

B、Load the data from HDFS

hive (db_hive1)> load data inpath '/user/hive/student.txt' into table db_hive1.student; 
Loading data to table db_hive1.student
Table db_hive1.student stats: [numFiles=2, totalSize=48]
OK
Time taken: 0.379 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.074 seconds, Fetched: 12 row(s)

(4) Load data, overwriting the table's existing data
A、Upload the file to HDFS

hive (db_hive1)> dfs -put /opt/module/datas/student.txt /user/hive/student1.txt;

B、Load the data, overwriting the table's existing data

hive (db_hive1)> load data inpath '/user/hive/student1.txt' overwrite into table db_hive1.student;
Loading data to table db_hive1.student
Moved: 'hdfs://hadoop151:9000/user/hive/warehouse/db_hive1.db/student/student.txt' to trash at: hdfs://hadoop151:9000/user/test/.Trash/Current
Moved: 'hdfs://hadoop151:9000/user/hive/warehouse/db_hive1.db/student/student_copy_1.txt' to trash at: hdfs://hadoop151:9000/user/test/.Trash/Current
Table db_hive1.student stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
OK
Time taken: 0.36 seconds
hive (db_hive1)> select * from student;
OK
student.id	student.name
1	a
2	b
3	c
4	d
5	e
6	f
Time taken: 0.068 seconds, Fetched: 6 row(s)
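One behavior worth noting from the examples above: load data inpath (without local) moves the source file from its HDFS location into the table's warehouse directory, whereas load data local inpath copies the local file and leaves the original in place. Assuming the paths used above, this can be checked with:

```sql
-- The HDFS source file was moved, so it is gone from its original path;
-- it now lives under the table's warehouse directory:
hive (db_hive1)> dfs -ls /user/hive/;
hive (db_hive1)> dfs -ls /user/hive/warehouse/db_hive1.db/student/;
```

The local file /opt/module/datas/student.txt, by contrast, is still on disk after a local load.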

1.2 Inserting Data into a Table via Queries (Insert)

1. Create a partitioned table

hive (db_hive1)> create table student1(id int, name string) partitioned by (month string) row format delimited fields terminated by '\t';
OK
Time taken: 0.918 seconds

2. Basic insert

hive (db_hive1)> insert into table  student1 partition(month='201709') values(1,'wangwu'); 
Query ID = test_20200217212020_42f330b2-1238-4e8a-bb82-6853bb21feda
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1581943453985_0001, Tracking URL = http://hadoop152:8088/proxy/application_1581943453985_0001/
Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job  -kill job_1581943453985_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-02-17 21:20:40,968 Stage-1 map = 0
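Besides inserting literal values as above, the insert-via-query form that this section's title refers to reads as follows; this is a sketch reusing the student1 table created above, with month='201708' as an assumed target partition:

```sql
-- Basic mode: append the query results into another partition
insert into table student1 partition(month='201708')
select id, name from student1 where month='201709';

-- Overwrite mode: replace the target partition's data with the query results
insert overwrite table student1 partition(month='201708')
select id, name from student1 where month='201709';
```

Like the values form, both statements run as MapReduce jobs; overwrite replaces only the data in the target partition, not the whole table.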