A Beginner's Hadoop Diary, Day 10: Hive Database and Table Operations

兰翎翡竹

Article link: https://blog.csdn.net/qq_42515611/article/details/118595932

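Hive comments start with --
Hive SQL statements still end with a semicolon (;), which also tells Hive to execute them.
To run a Linux command from the Hive CLI, prefix the command with ! and end it with a semicolon.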
hive (default)> !pwd;
/root
hive (default)> !hdfs dfs -ls / ;
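Case rules:
1. Hive database names and table names are case-insensitive.
2. Writing keywords in uppercase is recommended, with names such as user_base_info in lowercase.
Naming rules:
1. Names must not start with a digit.
2. Names must not be keywords.
3. Avoid special characters where possible.
4. If there are many tables, consider a convention that adds prefixes to table and column names.
5. A name should make its meaning obvious.
-- Creating a database really means creating a directory (named <database>.db) under Hive's warehouse directory
-- Creating a table really means creating a directory, named after the table, under the corresponding database directory
-- Switch database: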
hive> use test;
-- Create a database
hive> create database qfdb comment 'this is database of my';
-- Show the database description
hive (test)> desc database qfdb;
-- Create a table
hive> create table t_user(id int,name string);
-- Create a table using the database.table form:
hive> create table default.t_user(id int,name string);
-- List the tables in the current database
hive> show tables;
-- List the tables in another database
hive> show tables in test;
-- Show the table description
desc test.t_user;
describe test.t_user;
desc formatted test.t_user;
show create table test.t_user; -- recommended
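
The point that a database and a table are really just directories can be checked from inside the Hive CLI. The sketch below assumes the warehouse sits at the default path /user/hive/warehouse (the value of hive.metastore.warehouse.dir may differ on your cluster) and reuses the qfdb and t_user names from above.

```sql
-- Assumption: the warehouse directory is the default /user/hive/warehouse
create database if not exists qfdb;
create table if not exists qfdb.t_user(id int, name string);

-- The Hive CLI's dfs command runs HDFS shell commands without leaving the CLI.
-- The first listing should contain a qfdb.db directory.
dfs -ls /user/hive/warehouse;
-- The second listing should contain a t_user directory.
dfs -ls /user/hive/warehouse/qfdb.db;
```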

Table creation syntax

Hive primitive data types:

Create-table syntax:

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] TABLENAME
    [(COLUMNNAME COLUMNTYPE [COMMENT 'COLUMN COMMENT'],...)]
    [COMMENT 'TABLE COMMENT']
    [PARTITIONED BY (COLUMNNAME COLUMNTYPE [COMMENT 'COLUMN COMMENT'],...)]
    [CLUSTERED BY (COLUMNNAME,...) [SORTED BY (COLUMNNAME [ASC|DESC],...)] INTO NUM_BUCKETS BUCKETS]
    [ROW FORMAT ROW_FORMAT]
    -- ROW_FORMAT may include LINES TERMINATED BY '\n'
    [STORED AS FILEFORMAT]
    [LOCATION HDFS_PATH];
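
To make the template concrete, here is a minimal sketch that fills in every optional clause in the order shown above. The table name t_user_log, the partition column dt, the bucket count, and the path /data/t_user_log are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: every name and the HDFS path are illustrative only
create external table if not exists test.t_user_log(
id int comment 'user id',
name string comment 'user name'
)
comment 'example table that uses each optional clause'
partitioned by (dt string comment 'date partition')
clustered by (id) sorted by (id asc) into 4 buckets
row format delimited
fields terminated by '\t'
lines terminated by '\n'
stored as textfile
location '/data/t_user_log';
```

Note that the partition column is declared with a type in partitioned by and must not be repeated in the column list, while clustered by only names columns that are already defined.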

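Table operations
Tables fall into two categories: internal (managed) tables and external tables. They differ as follows:
Creation: tables are internal by default; an external table needs the external keyword.
Deletion: dropping an internal table deletes both the metadata and the data in HDFS; dropping an external table deletes only the metadata.
Usage: internal tables suit temporary data; external tables suit data that must persist.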
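
The difference in drop behavior can be seen with a short experiment. This is a sketch under assumptions: the table name t_ext and the path /data/t_ext are hypothetical, and the test database is assumed to exist.

```sql
-- Hypothetical external table backed by a directory outside the warehouse
create external table if not exists test.t_ext(id int, name string)
row format delimited fields terminated by '\t'
location '/data/t_ext';

-- Dropping an external table removes only its metadata
drop table test.t_ext;

-- The directory and any files in it should still be there
dfs -ls /data/t_ext;
```

Running the same steps without the external keyword should remove the directory together with the table.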
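Modifying tables

In essence, these statements modify the metadata and, where relevant, the location of the corresponding data in HDFS.
Rename a table -- rename to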
alter table t_user rename to t_user1;
Rename a column -- change column
alter table t_user1 change column name namestr string;
alter table t_user1 change column namestr name int; -- renames the column and changes its type
Add columns -- add columns
alter table t_user1 add columns(age int,sex int,facevalue double);
Change column position -- change column
after: place the column after the named column
first: move the column to the first position
alter table t_user1 change column sex sex string after id;
alter table t_user1 change column age age string after facevalue;
alter table t_user1 change column facevalue facevalue double after age;
Replace columns -- replace columns
This overwrites the table's entire column list.
alter table t_user1 replace columns(
id1 int,
sex1 string,
namestr1 string,
age string,
facevalue double
);
Convert between internal and external tables -- set tblproperties
alter table t_user1 set tblproperties('EXTERNAL'='TRUE');  -- write EXTERNAL and TRUE in uppercase
alter table t_user1 set tblproperties('EXTERNAL'='FALSE');  -- EXTERNAL must be uppercase
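
One way to confirm that the conversion took effect is to read the table type back from the metadata. A minimal sketch, assuming the t_user1 table from above exists in the current database:

```sql
alter table t_user1 set tblproperties('EXTERNAL'='TRUE');
-- In the output, the "Table Type:" row should now read EXTERNAL_TABLE
desc formatted t_user1;

alter table t_user1 set tblproperties('EXTERNAL'='FALSE');
-- The "Table Type:" row should read MANAGED_TABLE again
desc formatted t_user1;
```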

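Adding data
In essence, adding data means placing it in the directory that backs the table.
1. Put the data file into the table directory directly with a command (hdfs, linux, and so on).
2. Load the data with load (local load or HDFS load).
3. Insert it with insert into.
4. Clone a table together with its data.
5. Use CTAS (create table ... as select ...) to put the select result into the newly created table.
6. Use the location keyword when creating the table.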

create table if not exists test.tu(
id int,
name string,
age int
)
row format delimited fields terminated by '\t';

create table if not exists test.tu1(
id int,
name string,
age int
)
row format delimited fields terminated by '\t';

create table if not exists test.tu2(
id int,
name string
)
row format delimited fields terminated by '\t';

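1. Upload to the table directory with an hdfs command (common):
-- Prepare the data (the file /home/tu, fields separated by tabs)
1       goudan  19
2       mazi    20
3       nange   30
# Upload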

[root@hadoop01 ~]# hdfs dfs -put /home/tu /user/hive/warehouse/test.db/tu
[root@hadoop01 ~]# hdfs dfs -put /home/tu /user/hive/warehouse/test.db/tu/tu1

-- Query

select * from tu;

2. The load approach:

-- Local load (uploads the local data to the table's directory):

hive (test)> load data local inpath '/home/tu' into table tu;
-- To overwrite existing data, use overwrite into
hive (test)> load data local inpath '/home/tu' overwrite into table tu;

-- HDFS load (moves the HDFS data into the table's directory):
[root@hadoop01 ~]# hdfs dfs -put /home/tu /
hive (test)> load data inpath '/tu' into table tu;
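
The word "moves" matters for the HDFS load: unlike a local load, which copies the file, the source file no longer exists at its old path afterwards. The sketch below checks this from the Hive CLI. It assumes the test.tu table from above, the default warehouse path, and that /tu does not yet exist in HDFS.

```sql
-- Put the local file into HDFS, then confirm that it is there
dfs -put /home/tu /tu;
dfs -ls /tu;

-- Load from HDFS: the file is moved, not copied
load data inpath '/tu' into table test.tu;

-- The file should now appear under the table directory
-- (assumption: default warehouse path)
dfs -ls /user/hive/warehouse/test.db/tu;

-- Listing the old path should now fail because the file was moved
dfs -ls /tu;
```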

3. The insert into approach (used fairly often)

-- Single-table insert

insert into tu1
select id,name,age from tu;

-- Multi-table insert

from tu
insert into tu1
select id,name,age
insert into tu2
select id,name;

-- Overwrite -- overwrite table
insert overwrite table tu1
select id,name,age from tu;

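4. Clone a table together with its data:

create table tu3 like tu2;  -- clones the structure only
create table tu4 like tu2 location '/user/hive/warehouse/test.db/tu2';  -- clones the structure and points at tu2's data directory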
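
One caveat with the location form: tu4 is created as an internal table that shares tu2's directory, so by the deletion rule for internal tables, dropping tu4 would also delete the files that tu2 reads. Two possible alternatives are sketched below. The names tu4_ext and tu4_copy are hypothetical, and the path assumes the default warehouse location.

```sql
-- Option 1: an external clone that shares tu2's directory.
-- Dropping it removes only the metadata and leaves tu2's files in place.
create external table if not exists tu4_ext like tu2
location '/user/hive/warehouse/test.db/tu2';

-- Option 2: an independent copy with its own directory and its own data
create table if not exists tu4_copy like tu2;
insert into tu4_copy
select id,name from tu2;
```

Option 1 avoids copying data but keeps both tables tied to the same files; option 2 costs a query but gives a copy that can be changed or dropped safely.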

5. CTAS (the new table's column names come from the names or aliases in the select list):

create table tu5
as
select
id,
name
from tu2
group by id,name;
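
To see the effect of aliases on the resulting columns, here is a small variation on the statement above. The table name tu5_alias and the aliases user_id and user_name are hypothetical.

```sql
-- The select aliases become the column names of the new table
create table if not exists tu5_alias
as
select
id as user_id,
name as user_name
from tu2;

-- The output should list the columns user_id and user_name
desc tu5_alias;
```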

6. Using location:
create table if not exists tu6(
content string
)
location '/input';
