The Complete Guide to Hive Commands: Read This!

Latest recommended article published on 2024-05-03 11:47:20

病妖

Tags: database, hive, big data

Article link: https://blog.csdn.net/weixin_42507474/article/details/107228484

I. Creating a Database

create database database_name;
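
As a minimal sketch of the statement above: demo_db is a hypothetical database name chosen only for illustration, and if not exists avoids an error when the database is already there.

```sql
-- demo_db is a placeholder name, not one used elsewhere in this article
create database if not exists demo_db;

-- make it the current database for the rest of the session
use demo_db;
```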

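II. Creating Tables

1. Create a managed (internal) table with a given name: CREATE TABLE table_name. If a table with that name already exists, an exception is thrown; add IF NOT EXISTS to suppress it.
2. Create an external table: CREATE EXTERNAL TABLE table_name
3. Create a table with LIKE: copies the structure of an existing table but not its data, for example: create table table_name1 like table_name2
4. Row format (how each record is split into fields): ROW FORMAT DELIMITED
5. Column (field) delimiter: fields terminated by '\t'
6. Delimiters for collection items and map keys: collection items terminated by ',' and map keys terminated by ':'
7. File storage format: stored as textfile
8. HDFS storage path: location 'hdfs_path'
9. Create a partitioned table: create table table_name (…) partitioned by (…)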
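
The clauses listed above can be combined in a single statement. The following is only a sketch: the table names emp_ext_demo and emp_ext_demo_copy, the skills and attrs columns, and the path /data/hive/emp_ext_demo are all assumptions made for illustration.

```sql
-- hypothetical external table that uses every clause from the list
create external table if not exists emp_ext_demo (
  empno   int,
  empname string,
  skills  array<string>,        -- split by the collection items delimiter
  attrs   map<string,string>    -- key and value split by the map keys delimiter
)
row format delimited
  fields terminated by '\t'
  collection items terminated by ','
  map keys terminated by ':'
stored as textfile
location '/data/hive/emp_ext_demo';   -- assumed HDFS directory

-- copy only the structure, not the data
create table if not exists emp_ext_demo_copy like emp_ext_demo;
```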

Examples

1. Creating a managed (internal) table

create table emp(
empno int,empname string,job string)
row format delimited fields terminated by '\t'
collection items terminated by ',' 
map keys terminated by ':'
stored as textfile;

2. Creating an external table

create external table emp_external(
empno int,empname string,job string)
row format delimited fields terminated by '\t'
collection items terminated by ',' 
map keys terminated by ':'
stored as textfile;

3. Creating a partitioned table

create table order_partition(
orderno string,event_time string)
partitioned by (event_month string)
row format delimited fields terminated by '\t'
collection items terminated by ',' 
map keys terminated by ':'
stored as textfile;

III. Altering Tables

1. Rename a table: alter table tb_name rename to new_tb_name
2. Add or replace columns: alter table tb_name add|replace columns (col_name data_type, …)
Example of adding and replacing columns:
-- create a test table

 create table student(id int,age int,name string)
 row format delimited fields terminated by '\t';
 -- add a column named address
 alter table student add columns(address string);
 -- replace the whole column list
 alter table student replace columns(id int,name string);
 -- check the table structure; the table now has only the id and name columns
 desc student;
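
The rename syntax from item 1 has no example above, so here is a short sketch. The new name student_info is hypothetical.

```sql
-- student_info is a placeholder for the new table name
alter table student rename to student_info;

-- the structure is unchanged; only the name differs
desc student_info;
```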

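IV. Display Commands

List all databases: show databases;
List all tables in the current database: show tables;
List all partitions of a table: show partitions tb_name;
List all functions supported by Hive: show functions;
View table information: desc extended tb_name;
View more detailed, formatted table information: desc formatted tb_name;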
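
A short session that strings these commands together might look like the sketch below. It assumes the emp and order_partition tables from the earlier examples were created in the built-in default database.

```sql
show databases;

-- switch to the database whose tables should be listed
use default;
show tables;

-- only tables whose names start with emp
show tables 'emp*';

-- partitions are listed per table, so a table name is required
show partitions order_partition;

-- storage location, table type, delimiters, and so on
desc formatted emp;
```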

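V. Loading Data with LOAD

LOAD is a plain copy/move operation: it places the data files in the location that backs the Hive table, without transforming them.
(1) filepath can be:
a relative path, for example: project/data1
an absolute path, for example: /user/hive/project/data1
a full URI that includes the scheme
(2) The local keyword:
With local, the load command looks for filepath in the local file system.
Without local, filepath is resolved as a URI, so the file is looked up in the file system that the URI refers to (normally HDFS).
(3) The overwrite keyword:
With overwrite, the existing contents of the target are deleted first, and then the files or directory that filepath points to are added to the table/partition.
If the target table or partition already holds a file whose name collides with a file name in filepath, the new file replaces the old one.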

Example 1: Load a local file into a Hive table

load data local inpath '/tmp/warehouse/emp.txt' into table emp;

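Example 2: Load an HDFS file into a Hive table
# upload the local file to HDFS

cd /root/data
hdfs dfs -mkdir -p /data/hive
hdfs dfs -put /tmp/warehouse/emp.txt /data/hive

-- load the HDFS file into the Hive table (no local keyword, so the path refers to HDFS)
load data inpath '/data/hive/emp.txt' into table emp;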

Example 3: Use overwrite to replace the data already in the table

load data local inpath '/tmp/warehouse/emp.txt'  overwrite into table emp;

Example 4: Load data into a Hive partitioned table

load data local inpath '/tmp/warehouse/emp.txt' overwrite into table order_partition partition(event_month='2020-07');
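
After a load into a partitioned table, it can help to confirm which partition actually received the data. This is only a sketch; it assumes the order_partition table defined earlier.

```sql
-- list the partitions that now exist on the table
show partitions order_partition;

-- row count per partition; event_month is the partition column
select event_month, count(*) as row_cnt
from order_partition
group by event_month;
```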

VI. INSERT

Insert query results into a Hive table/partition:

 insert overwrite table tb_name1 [partition(partcol1=val1, ...)] select_statement1 from from_statement

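Multiple inserts from one source (the from clause names the source table/partition and comes first):

 from from_statement
 insert overwrite table tb_name1 [partition(partcol1=val1, ...)] select_statement1
 [insert overwrite table tb_name2 [partition(partcol1=val1, ...)] select_statement2] ...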
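
A concrete multi-insert could look like the sketch below. The target tables emp_clerk and emp_manager and the job values are hypothetical; the source is the emp table created earlier, and it is scanned only once.

```sql
-- hypothetical targets with the same structure as emp
create table if not exists emp_clerk like emp;
create table if not exists emp_manager like emp;

-- one pass over emp feeds both tables
from emp
insert overwrite table emp_clerk
  select empno, empname, job where job = 'clerk'
insert overwrite table emp_manager
  select empno, empname, job where job = 'manager';
```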

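Dynamic partition insert (a partition column may be listed without a value, in which case its value comes from the query result):

insert overwrite table tb_name1 partition(partcol1[=val1], partcol2[=val2], ...) select_statement1 from from_statement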
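
The sketch below shows a dynamic partition insert into the order_partition table from the earlier example. The source table order_staging is hypothetical, and event_time is assumed to start with a yyyy-MM prefix such as 2020-07.

```sql
-- allow partition values to be taken entirely from the query
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- order_staging: assumed unpartitioned table with columns orderno, event_time
insert overwrite table order_partition partition (event_month)
select orderno,
       event_time,
       substr(event_time, 1, 7) as event_month   -- partition column goes last
from order_staging;
```

Hive maps the trailing select columns to the partition columns by position, which is why event_month is listed last.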

Insert into specified columns:

 insert into employee(name) select 'John' from test limit 1;

Insert specified values:

insert into employee(name) values('Judy'),('John');

Write from one source to a local directory, an HDFS directory, and a table:

from ctas_employee
insert overwrite local directory '/tmp/out1' select *
insert overwrite directory '/tmp/out1' select *
insert overwrite table employee_internal select *;

Write data in a specified format:

insert overwrite directory '/tmp/out3'
row format delimited fields terminated by ','
select * from ctas_employee;

VII. SELECT

Examples are the most direct way to show these.

Example 1: Full-table query and querying specific columns

select * from emp;
select empno,empname from emp;

Example 2: Filtering with conditions
-- 1. Equality filter

select * from emp where empno=10;
select * from emp where empname='jdk';

-- 2. >= and <= filters

select * from emp where empno>=10;
select * from emp where empno<=10;

-- 3. Range filter with between ... and

select empname,empno from emp where empno between 80 and 100;

-- 4. limit caps the number of rows returned

select * from emp limit 4;

-- 5. in / not in

select * from emp where empname in('opt','tmp');
select * from emp where empname not in('opt','tmp');

-- 6. is null / is not null

select * from emp where comm is null;
select * from emp where comm is not null;

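Example 3: Aggregate functions (max/min/count/avg/sum)

-- 1. Count the employees in department 10

select count(*) from emp where deptno=10;

-- 2. Highest salary, lowest salary, average salary, and total salary

select max(sal),min(sal),avg(sal),sum(sal) from emp;

-- 3. Average salary of each department

select deptno,avg(sal) from emp group by deptno;

Note: group by groups rows for aggregation; without it, the aggregate is computed over the whole table. Questions such as the average salary per department therefore need a group by on the relevant column of the Hive table, and every non-aggregated column in the select list must also appear in group by.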
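
To make the grouping rule concrete, here is a sketch that returns one row per department. It assumes, as the queries above do, that emp has deptno and sal columns; the aliases emp_cnt and avg_sal are chosen for illustration.

```sql
-- one output row per department
select deptno,
       count(*) as emp_cnt,   -- employees in the department
       avg(sal) as avg_sal    -- average salary in the department
from emp
group by deptno
order by avg_sal desc;
```

Only deptno appears outside an aggregate function, so it is the only column that has to be listed in group by.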

Example 4: CTEs and nested queries
-- 1. CTE

with tab1 as (select id,name,age from people)
select * from tab1;

-- 2. Nested query

SELECT * FROM (SELECT * FROM employee) a;
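
The two forms are interchangeable for simple cases. The sketch below writes the same query both ways; it assumes emp has deptno and sal columns, and the threshold 2000 is arbitrary.

```sql
-- CTE form: name the intermediate result, then query it
with dept_avg as (
  select deptno, avg(sal) as avg_sal
  from emp
  group by deptno
)
select deptno, avg_sal
from dept_avg
where avg_sal > 2000;

-- nested form: the same intermediate result as an aliased subquery
select a.deptno, a.avg_sal
from (
  select deptno, avg(sal) as avg_sal
  from emp
  group by deptno
) a
where a.avg_sal > 2000;
```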
