Hive简单用法总结

最新推荐文章于 2023-03-07 16:40:06 发布

夜空07

最新推荐文章于 2023-03-07 16:40:06 发布

阅读量221

点赞数

文章标签： hadoop linux hive 数据库

本文链接：https://blog.csdn.net/weixin_38655836/article/details/78040580

版权

1》创建hive表
CREATE TABLE IF NOT EXISTS employee (
id int,
name String,
salary String)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

STORED AS TEXTFILE;

2》从本地文件加载数据到数据库(有overwrite 是重写没有overwrite 是追加)
load data local inpath '/opt/modules/hadoop-2.5.0-cdh5.3.6/text/employee.txt' overwrite into table employee;
employee.txt
0   zhangsan   5000
1   lisi   6000
2   wnangwu   7000
3   css   50000

3》将文件上传到hdfs文件系统上
bin/hdfs dfs -put text/employee.txt /usr/css/mapreduce/wordcount/input/

4》从hdfs加载数据到数据库(有overwrite 是重写没有overwrite 是追加)
load data inpath '/usr/css/mapreduce/wordcount/input/employee.txt' overwrite into table employee;
加载hdfs文件系统文件到hive（文件实际也是存在hdfs文件系统上）之后，hdfs系统上的原来文件会被删除。

根据查询表来创建子表(相当于复制)
create table if not exists default.employee02 as select *from employee;
where 条件会选择特定的数据插入新表
create table if not exists default.employee03 as select *from employee where name='zhangsan';

5》.bin/hive
   从本地文件加载数据到数据库
   >load data local inpath '/opt/app/datas/dep.txt' overwrite into table dep;

   根据查询表来创建子表
   >create table if exists default.dept_cats
   as select *from dept;

   清除一个表的数据
   >truncate table dept_cats;

   创建表
   >create table if not exists default.dept_like
   like
   default.dept;
   修改表
   >alter table dept_like rename to dept_like_rename;
   删除表
   drop table if exists table_like_rename;
   重命名表
   alter table employee02 rename to employee;
   查看Hive表的详细信息
   desc formatted employee;

6》创建外部表 external是外部表关键词 location 代表在hdfs系统上文件存放的位置，外部表删除之后不会删除hdfs上的文件
create external table exter_table(
id int,
name string,
birthday string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
location '/user/hive/warehouse/external';

//加载本地文件到外部表复制文件到外部表的hdfs文件系统存放位置
load data local inpath '/opt/modules/hadoop-2.5.0-cdh5.3.6/text/employee.txt' into table exter_table;

7》分区建表分为2种，一种是单分区，也就是说在表文件夹目录下只有一级文件夹目录。另外一种是多分区，表文件夹下出现多文件夹嵌套模式。
单分区建表语句：create table day_table (id int, content string) partitioned by (dt string);单分区表，按天分区，在表结构中存在id，content，dt三列。
双分区建表语句：create table day_hour_table (id int, content string) partitioned by (dt string, hour string);双分区表，按天和小时分区，在表结构中新增加了dt和hour两列。

创建分区表
create table part_table(
id int,
name string)
partitioned by (dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
location '/user/hive/warehouse/part';

//加载文件到table
LOAD DATA LOCAL INPATH '/opt/modules/hadoop-2.5.0-cdh5.3.6/text/employee.txt' INTO TABLE part_table PARTITION(dt='2008-08- 08');

夜空07

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
2
评论
Hive简单用法总结

1》创建hive表CREATE TABLE IF NOT EXISTS employee ( id int,name String,salary String)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'STORED AS TEXTFILE;2》从本地文件加载数据到数据库(有overwrite 是重写没有overwri
复制链接

扫一扫