Basic Hive operations

Latest recommended article published on 2022-02-10 20:07:15

且从容.

Tags: hive

Article link: https://blog.csdn.net/qq_43592674/article/details/109303202


Start the Metastore server

hive --service metastore &

Start HiveServer2

hive --service hiveserver2 &

List all databases

show databases;

List all tables in the current database

show tables;

View all rows in a table

select * from table_name;

Hive data types (complex types)
Struct (STRUCT)
Key-value map (MAP)
Array (ARRAY)
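
As a rough sketch of how these three complex types look in practice, the table below declares one column of each. The table name employees_demo, its columns, and the delimiters are made up for illustration and do not come from the original post.

```sql
-- Hypothetical table with one column of each complex type.
CREATE TABLE IF NOT EXISTS employees_demo (
  name         STRING,
  subordinates ARRAY<STRING>,                               -- array
  deductions   MAP<STRING, FLOAT>,                          -- key-value map
  address      STRUCT<street:STRING, city:STRING, zip:INT>  -- struct
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':';

-- Array elements are read by index, map values by key, struct fields by dot notation.
SELECT name, subordinates[0], deductions['tax'], address.city
FROM employees_demo;
```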
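Hive creates a directory for each database, and the tables in that database are stored as subdirectories of the database directory. The one exception is tables in the default database, because the default database has no directory of its own.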

Create a database

create (database|schema) [if not exists] database_name
[comment database_comment]
[location hdfs_path]
[with dbproperties (property_name=property_value, ...)];

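if not exists: create the database only if it does not already exist
comment: add a description to the database
location: the HDFS path for the database; if omitted, the default warehouse location is used
with dbproperties: set property key-value pairs on the database
database and schema mean the same thing and are interchangeable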
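
To make the clauses above concrete, here is a minimal sketch that uses all of them at once. The database name demo_db, the HDFS path, and the property names and values are hypothetical placeholders.

```sql
-- All names, paths and property values here are illustrative only.
CREATE DATABASE IF NOT EXISTS demo_db
COMMENT 'scratch database for examples'
LOCATION '/user/hive/demo_db.db'
WITH DBPROPERTIES ('creator' = 'demo_user', 'purpose' = 'testing');
```
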
Switch databases

use database_name;

Drop a database

drop (database|schema) [if exists] database_name [restrict|cascade];

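IF EXISTS: no error is raised if the database does not exist
RESTRICT|CASCADE: RESTRICT is the default, and with it the drop fails if the database still contains tables. With CASCADE the database is dropped regardless, together with any tables it contains.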
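
The difference between the two modes can be sketched as follows, assuming a hypothetical database demo_db that still contains at least one table:

```sql
-- RESTRICT is the default: this statement fails while demo_db still has tables.
DROP DATABASE IF EXISTS demo_db;

-- CASCADE drops the tables first and then the database itself.
DROP DATABASE IF EXISTS demo_db CASCADE;
```
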
List only the databases whose names start with db_hive

show databases like 'db_hive*';

Show the database currently in use

select current_database();

Make the CLI prompt show the current database

set hive.cli.print.current.db=true;

To turn this off again, change true to false.

Add a description to a database

 create database financials1 comment 'holds all financial tables';

View the description added with comment

describe database database_name;

Show detailed information

 describe database extended database_name;

Modify a database

alter database database_name set dbproperties ('property_name'='property_value', ...);

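Note that there is no way to delete or unset a database property. Besides dbproperties, the owner and the location can also be changed in Hive versions that support it; other database metadata cannot be changed. Changing the location does not move the existing contents: it only changes the default directory under which new tables are created.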
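
A minimal sketch of these alterations, assuming a hypothetical database demo_db and user demo_user. SET OWNER is only available in newer Hive releases (0.13 and later, as far as I know), so treat that line as version dependent.

```sql
-- Add or overwrite a property; existing properties cannot be removed.
ALTER DATABASE demo_db SET DBPROPERTIES ('edited-by' = 'demo_user');

-- Change the owner (version dependent, see the note above).
ALTER DATABASE demo_db SET OWNER USER demo_user;

-- Check the result, including the properties.
DESCRIBE DATABASE EXTENDED demo_db;
```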

Hive table operations

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]

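EXTERNAL: declares an external table
PARTITIONED BY: defines the partition columns
CLUSTERED BY: defines the buckets
…
There are two ways to choose the database a table is created in: (1) run USE before creating the table to set the current database, or (2) prefix the table name with the database name, e.g. database_name.table_name.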
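
Both approaches can be sketched like this. test_db is the database used later in this post; the table names emp_demo and emp_demo2 are hypothetical.

```sql
-- (1) Set the current database first, then create the table.
USE test_db;
CREATE TABLE IF NOT EXISTS emp_demo (id INT, name STRING);

-- (2) Qualify the table name with the database; the current database does not matter.
CREATE TABLE IF NOT EXISTS test_db.emp_demo2 (id INT, name STRING);
```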

View the table structure

desc table_name;

View the detailed table structure

desc formatted table_name;

Load a local file into Hive

load data local inpath '/path/to/your/file' into table table_name;

If you are not inside the Hive CLI, run the same statement from the shell with hive -e:

hive -e "load data local inpath '/path/to/your/file' into table table_name;"

Drop a table

drop table if exists table_name;

We can copy the schema of an existing table without copying its data

create table if not exists database_name.newtable_name  like database_name.oldtable_name;

Create an external table

create external table test_db.emp(id int,name string);

Create an external table and use the location keyword to specify where its data is stored

hive> create external table test_db.emp2(
    > id int,
    > name string)
    > row format delimited fields
    > terminated by '\t' location '/input/hive';

Drop an external table

drop table database_name.table_name;
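
Dropping an external table removes only its metadata; the files under its location stay in HDFS. A small sketch of how to check this from the Hive CLI, reusing the emp2 table and the /input/hive location from the example above:

```sql
-- Remove the table definition from the metastore.
DROP TABLE IF EXISTS test_db.emp2;

-- Hive CLI command (not SQL): list the location; the data files should still be there.
dfs -ls /input/hive;
```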

Partitioned tables
Create a partitioned table

hive> create table partitiontable(
    > id int ,name string,gender string)
    > partitioned by (age int)
    > row format delimited fields terminated by '\t';

Query a partition
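
The original screenshot for this step is not available. As a stand-in sketch, a partition is queried by filtering on the partition column like any other column; the value 22 is only an example.

```sql
-- Only the files of partition age=22 are read, not the whole table.
SELECT id, name, gender
FROM partitiontable
WHERE age = 22;
```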

Add a partition

alter table table_name add partition (partition_spec);

Add several partitions at once, e.g.:

alter table studentx add partition (age=22) partition (age=33);

Drop a partition

alter table studentx drop partition (age=33);

List the partitions of a table

show partitions partitiontable_name;

Specify the partition when loading data

load data local inpath '/../../..' into table table_name partition(..);

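Data loaded with the load statement goes into a static partition, i.e. the partition value is stated explicitly in the statement.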
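
A minimal sketch of a static-partition load into the partitiontable created earlier. The local file path is a hypothetical placeholder; the file is assumed to hold tab-separated id, name and gender columns.

```sql
-- The partition value (age=22) is fixed in the statement, not read from the file.
LOAD DATA LOCAL INPATH '/tmp/students_age22.txt'
INTO TABLE partitiontable PARTITION (age = 22);

-- The new partition should now be listed.
SHOW PARTITIONS partitiontable;
```
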
Create a bucketed table

hive> create table user_info2(user_id int, name string)
    > clustered by (user_id)
    > into 6 buckets
    > row format delimited fields terminated by '\t';

Create an intermediate (staging) table

create table user_info_tmp2(user_id int,name string)
 row format delimited fields terminated by '\t';

Insert the data from the staging table into the bucketed table

insert into table table_name select col1, col2 from tmptable_name;

This runs as a MapReduce job, so it is somewhat slow.
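
Applied to the two tables defined above, the insert might look like the sketch below. The commented-out setting is an assumption about older installations: on Hive 1.x bucketing had to be enforced explicitly, while Hive 2.0 and later always enforce it.

```sql
-- Uncomment on Hive 1.x only; not needed on Hive 2.0 and later.
-- SET hive.enforce.bucketing = true;

-- Rows are distributed over the 6 buckets by the hash of user_id.
INSERT INTO TABLE user_info2
SELECT user_id, name
FROM user_info_tmp2;
```
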
View the HDFS warehouse location of the bucket files
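
The original screenshots are not available. One way to find the files, sketched under the assumption that user_info2 lives in the default database and the warehouse directory is the default /user/hive/warehouse:

```sql
-- The Location field in the output shows the table's HDFS directory.
DESC FORMATTED user_info2;

-- Hive CLI command (not SQL): list that directory; the path is an assumed default.
-- A table with 6 buckets normally has one file per bucket here.
dfs -ls /user/hive/warehouse/user_info2;
```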