Common Hive Operations

Latest recommended article published on 2024-08-27 23:41:00

菜虚虚

Article tags: hive, hadoop, data warehouse

Article link: https://blog.csdn.net/m0_72075031/article/details/135717215

Common Hive interactive commands

-e: execute a SQL statement

[root@master ~]# hive -e "show databases;"

-f: execute a SQL script file

[root@master ~]# hive -f /usr/local/demo.sql

View all commands entered in the Hive CLI

[root@master ~]# cat ~/.hivehistory

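Database operations

1. Create a database

hive (default)> create database database_name;

2. View databases

2.1 List all databases

hive (default)> show databases;

2.2 View database information

hive (default)> desc database database_name;

2.3 View detailed database information

hive (default)> desc database extended database_name;

3. Switch to a database

hive (default)> use demo;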

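4. Alter a database

Use the alter database command to set key-value pairs in a database's dbproperties; these describe the database. Other database metadata, such as the database name, cannot be changed. In older releases the database directory location could not be changed either (Hive 2.2.1 and later add alter database ... set location, which only affects where new tables are created).

hive (default)> alter database database_name set dbproperties('createtime'='20220620');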

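5. Drop a database

5.1 Drop an empty database

hive (default)> drop database database_name;

5.2 Drop a database only if it exists

hive (default)> drop database if exists database_name;

5.3 Drop a non-empty database

hive (default)> drop database database_name cascade;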

Table operations

Create a table

Syntax:

create [external] table [if not exists] table_name
[(col_name data_type [comment col_comment], ...)]
[comment table_comment]
[partitioned by (col_name data_type [comment col_comment], ...)]
[clustered by (col_name, col_name, ...)
  [sorted by (col_name [asc|desc], ...)] into num_buckets buckets]
[row format row_format]
[stored as file_format]
[location hdfs_path]
[like existing_table_name]

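create table:
- Creates a table with the given name. If a table with the same name already exists, an exception is thrown; the if not exists option suppresses this exception.
external:
- Creates an external table; a location path pointing to the data can be given when the table is created. When Hive creates a managed (internal) table, it moves the data into the path the data warehouse points to. When it creates an external table, it only records the path where the data lives and does not move the data. When a table is dropped, a managed table's metadata and data are deleted together, whereas for an external table only the metadata is deleted and the data is kept.
comment:
- Adds comments to the table and its columns.
partitioned by:
- Creates a partitioned table.
clustered by:
- Creates a bucketed table.
sorted by:
- Sorts the rows within each bucket.
stored as:
- Specifies the storage file format. Common formats are binary sequence files (sequencefile), plain text (textfile), and columnar storage files (rcfile). If the data is plain text, use stored as textfile; if it needs to be stored compressed, stored as sequencefile can be used.
location:
- Specifies where the table is stored on HDFS.
like:
- Copies the structure of an existing table without copying its data, in the form create table new_table like existing_table.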
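The difference between dropping a managed table and dropping an external table can be illustrated with a small toy model. This is only a sketch in plain Python, not Hive code: the `metastore` dictionary, the `drop_table` helper, and the table names are hypothetical stand-ins, and a temporary local directory plays the role of HDFS.

```python
import shutil
import tempfile
from pathlib import Path


def drop_table(metastore: dict, name: str) -> None:
    """Toy model of DROP TABLE (hypothetical helper, not a Hive API)."""
    table = metastore.pop(name)  # the metadata entry is removed in both cases
    if not table["external"]:
        # Managed table: the data directory is deleted along with the metadata.
        shutil.rmtree(table["location"])


# A temporary local directory stands in for the warehouse on HDFS.
root = Path(tempfile.mkdtemp())
metastore = {}
for name, external in [("managed_tab", False), ("external_tab", True)]:
    location = root / name
    location.mkdir()
    (location / "000000_0").write_text("1\talice\n")
    metastore[name] = {"external": external, "location": location}

drop_table(metastore, "managed_tab")
drop_table(metastore, "external_tab")

managed_data_left = (root / "managed_tab").exists()
external_data_left = (root / "external_tab").exists()
print(managed_data_left, external_data_left)
```

In this model both metadata entries disappear, but only the managed table's directory is removed, which mirrors the behaviour described for external above.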

1. Create a partitioned table

hive (demo)> create table if not exists demo_tab_01(colume01 string)
           > partitioned by (colume02 string)
           > row format delimited
           > fields terminated by '\t';

2. Create an external table

hive (demo)> create external table if not exists demo_tab_02(id int, name string)
           > row format delimited fields terminated by '\t';

3. Drop a table

hive (demo)> drop table student;

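Alter a table

1. Change a managed table into an external table

hive (demo)> alter table demo_tab_01 set tblproperties('EXTERNAL'='TRUE');

2. Change an external table into a managed table

hive (demo)> alter table demo_tab_01 set tblproperties('EXTERNAL'='FALSE');

3. Rename a table

hive (demo)> alter table demo_tab_01 rename to demo_new_tab_01;

4. Add columns

hive (demo)> alter table demo_tab_02 add columns (age int);

5. Change a column

hive (demo)> alter table demo_tab_02 change column id new_id string;

6. Replace columns

Replacing columns changes only the column definitions in the metadata; the data stored in HDFS is not modified. If the data stored in HDFS is a string and the column is replaced with an int column, queries will not return the corresponding data.

hive (demo)> alter table demo_tab_02 replace columns (age int);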
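Why a metadata-only change can make data unreadable is easier to see with a toy model of schema-on-read. The sketch below is plain Python, not Hive: `read_as_int` is a hypothetical helper, `None` stands in for a NULL result, and the sample values are made up. How a real table behaves depends on its file format and SerDe.

```python
def read_as_int(raw: str):
    """Interpret a stored text field as an int; None stands in for NULL."""
    try:
        return int(raw)
    except ValueError:
        return None


# The stored values never change; only the declared column type does.
stored_rows = ["alice", "bob", "42"]

as_string = list(stored_rows)                     # column declared as string
as_int = [read_as_int(v) for v in stored_rows]    # column redeclared as int

print(as_string)
print(as_int)
```

The stored values are identical in both reads; only their interpretation differs, so values that do not fit the new type are lost to the query even though they are still on disk.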

Delete a table

1. Drop a table

hive (demo)> drop table demo_new_01;

2. Truncate a table

Only managed tables can be truncated; external tables cannot.

hive (demo)> truncate table demo_new_01;

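Partitioned tables

Concept

A partition of a table corresponds to a separate directory on the HDFS file system, and that directory holds all of the partition's data files. Partitioning in Hive simply means splitting data into directories: a large dataset is cut into several smaller ones, and at query time a where clause can select the required partitions so that only their data is read.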
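The directory-per-partition idea can be sketched in a few lines of Python. This is an illustration only: `partition_path` and `prune` are hypothetical helpers, the warehouse root is assumed to be the common default `/user/hive/warehouse`, and the database, table, and partition values are taken from the examples in this article.

```python
from posixpath import join

# Assumption: the warehouse root uses the common default location.
WAREHOUSE = "/user/hive/warehouse"


def partition_path(db: str, table: str, spec: dict) -> str:
    """Build the directory for one partition, one `col=value` level per column."""
    levels = [f"{col}={val}" for col, val in spec.items()]
    return join(WAREHOUSE, f"{db}.db", table, *levels)


def prune(paths: list, col: str, val: str) -> list:
    """Keep only the directories that match a `where col = val` filter."""
    wanted = f"{col}={val}"
    return [p for p in paths if wanted in p.split("/")]


paths = [
    partition_path("demo", "demo_tab_01", {"colume02": v})
    for v in ("yyyy", "hhhh", "xxxx")
]
selected = prune(paths, "colume02", "yyyy")
print(selected)
```

A filter on the partition column narrows the scan to a single directory here, which is the reason partitioned queries can avoid reading the whole dataset.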

Operations

1. Create a partitioned table

hive (demo)> create table if not exists demo_tab_01(colume01 string)
           > partitioned by (colume02 string)
           > row format delimited
           > fields terminated by '\t';

2. List the partitions

hive (demo)> show partitions demo_tab_01;

3. View the structure of a partitioned table

hive (demo)> desc formatted demo_tab_01;

4. Query partition data

hive (demo)> select * from demo_tab_01;
hive (demo)> select * from demo_tab_01 where colume02 = 'yyyy';

5. Add a single partition

hive (demo)> alter table demo_tab_01 add partition (colume02='yyyy');

6. Add multiple partitions

hive (demo)> alter table demo_tab_01 add partition (colume02='yyyy') partition (colume02='hhhh') partition (colume02='xxxx');

7. Drop a single partition

hive (demo)> alter table demo_tab_01 drop partition (colume02='yyyy');

8. Drop multiple partitions

hive (demo)> alter table demo_tab_01 drop partition (colume02='yyyy'), partition (colume02='hhhh'), partition (colume02='xxxx');
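When many partitions have to be added or dropped, the statements are often generated by a script. The following is a minimal sketch of that idea in Python; the helper names are hypothetical, the values are not escaped, and it only builds the statement text. Note that Hive separates partition specs with whitespace in add partition and with commas in drop partition.

```python
def partition_clauses(column: str, values: list) -> list:
    """One `partition (col='value')` clause per value (values are not escaped)."""
    return [f"partition ({column}='{v}')" for v in values]


def add_partitions_sql(table: str, column: str, values: list) -> str:
    # Partition specs in `add` are separated by whitespace.
    return f"alter table {table} add " + " ".join(partition_clauses(column, values)) + ";"


def drop_partitions_sql(table: str, column: str, values: list) -> str:
    # Partition specs in `drop` are separated by commas.
    return f"alter table {table} drop " + ", ".join(partition_clauses(column, values)) + ";"


add_sql = add_partitions_sql("demo_tab_01", "colume02", ["yyyy", "hhhh"])
drop_sql = drop_partitions_sql("demo_tab_01", "colume02", ["yyyy", "hhhh"])
print(add_sql)
print(drop_sql)
```

The generated text could then be passed to hive -e or written to a file for hive -f.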
