Hive Usage Notes

葳蕤_wish

Categories: linux, sql, hive. Tags: hive, oracle, database

Article link: https://blog.csdn.net/weixin_37846736/article/details/118304491

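Run the hive command to enter the Hive CLI; press Ctrl+C to exit (quit; and exit; also work).
Every statement entered in Hive must end with a semicolon.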

Set the queue

SET mapreduce.job.queuename=huhe005;

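List databases

show databases;

Create a database

create database aaa location '/user/admin/aaa.db';

Check that the database was created (run from the Linux shell, not inside Hive)

hadoop fs -du -h /user/admin/aaa.db/

Switch to the database

use aaa;

List tables

show tables;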

List a table's partitions:

show partitions table_name;

Show the CREATE TABLE statement of an existing table

show create table temp_aaa;

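Add a column to a table

alter table aaa.TEMP_AAA add columns (a1 string comment 'column comment');

Change a column's position

alter table TEMP_AAA change a1 a1 string after a2;

Rename a table

alter table table_name rename to new_table_name;

Rename a column (the type must be restated and can be changed at the same time)

alter table aaa.temp_aaa change column a1 a6 INT;

Modify a column

alter table table_name change column a a1 string;

Add a column (the column list goes in parentheses and each column needs a type)

alter table tablename add columns (new_column_name string);

Drop a partition and its data

alter table aaa.TEMP_AAA drop partition(month_id='202106',DAY_ID ='01');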

CREATE TABLE statement

CREATE TABLE aaa(
  a1 string COMMENT '111',
  a2 string COMMENT '222',
  a3 string COMMENT '333'
)
PARTITIONED BY (
  month_id string,
  day_id string
)
row format delimited
lines terminated by '\n'
stored as textfile;

Use fields terminated by ',' (placed right after row format delimited) to specify the field delimiter

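Copy data from table a into table b
The dynamic-partition insert syntax is as follows

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
INSERT OVERWRITE TABLE b PARTITION (province, city) SELECT ....... , a.province, a.city FROM a;

Note:
The dynamic partition columns must come last in the SELECT list, after all ordinary columns. Here the business columns come first, and the trailing a.province, a.city are assigned to PARTITION (province, city) by position.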
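The column-ordering rule above can be sketched as a small shell script that writes the statements to a file for hive -f. This is only an illustration: the business columns a.id and a.name, the file name copy_a_to_b.hql, and the RUN_HIVE switch are hypothetical and are not part of the original notes, which elide the real column list.

```shell
#!/bin/sh
# Hypothetical business columns; the original note elides the real column list.
BUSINESS_COLS="a.id, a.name"
# Dynamic partition columns, named in the same order in PARTITION (...) and in SELECT.
PARTITION_COLS="province, city"

# Business columns first, dynamic partition columns last.
SELECT_LIST="$BUSINESS_COLS, a.province, a.city"

# Hypothetical output file for the generated statements.
HQL_FILE="${TMPDIR:-/tmp}/copy_a_to_b.hql"
cat > "$HQL_FILE" <<EOF
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
INSERT OVERWRITE TABLE b PARTITION ($PARTITION_COLS)
SELECT $SELECT_LIST FROM a;
EOF

# Only call the hive client when explicitly asked to; otherwise just print the file.
if [ "${RUN_HIVE:-0}" = "1" ]; then
  hive -f "$HQL_FILE"
else
  cat "$HQL_FILE"
fi
```

Setting RUN_HIVE=1 before running the script submits the file to Hive; without it the script only prints the generated statements so they can be reviewed first.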

Drop a table (both its data and its structure)

drop table if exists iot_devicelocation;

Create only the table structure

create table newtable like oldtable;

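Create the table structure and copy the old table's data
-- Method 1 (Hive does not support select * into newtable from oldtable; for a non-partitioned table, create the table first and then insert):

create table newtable like oldtable;
insert into table newtable select * from oldtable;

-- Method 2:

create table newtable as select * from oldtable;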

Convert a string to lowercase

lower()

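Convert the encoding when a txt file exported from Hive contains Chinese and is written to Oracle
The Chinese text in the exported file is UTF-8 encoded, so the file is converted before it goes into Oracle

iconv -c -f "utf-8" -t "gbk" a_1.txt > a_2.txt

This converts the UTF-8 file to GBK; load the converted file afterwards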
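A quick way to check the conversion before loading anything is a round trip on a tiny sample file. The sketch below is an assumption-laden illustration: the temporary directory and the two-character sample text are made up, and it assumes the local iconv supports the gbk charset name.

```shell
#!/bin/sh
# Temporary working directory for the sample files (hypothetical location).
WORK_DIR=$(mktemp -d)

# Sample UTF-8 input: two Chinese characters written as octal byte escapes,
# so the script itself stays plain ASCII.
printf '\344\270\255\346\226\207\n' > "$WORK_DIR/a_1.txt"

# Same command as in the notes: UTF-8 in, GBK out, -c drops unconvertible characters.
iconv -c -f "utf-8" -t "gbk" "$WORK_DIR/a_1.txt" > "$WORK_DIR/a_2.txt"

# Convert back and compare with the original to confirm nothing was lost.
if iconv -f "gbk" -t "utf-8" "$WORK_DIR/a_2.txt" | cmp -s - "$WORK_DIR/a_1.txt"; then
  ROUND_TRIP="ok"
else
  ROUND_TRIP="mismatch"
fi
echo "round trip: $ROUND_TRIP"
```

Because -c silently discards characters that GBK cannot represent, a mismatch here is a hint that the real file may lose data during conversion.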

Load a txt file into a Hive table

load data local inpath '/data/admin/1.txt' OVERWRITE into table aaa.temp_aaa;

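Export Hive data to a txt file
(1) Method 1

insert overwrite local directory '/data/aaa/'
row format delimited fields terminated by '\t'
select * from TEMP_AAA
WHERE MONTH_ID='202103' AND DAY_ID='16';

(2) Method 2
Run from the Linux shell; the output is comma-separated
hive -e "sql" | tr "\t" "," > test.csv
iconv -f UTF-8 -c -t GBK test.csv > test_new.csv
Convert the encoding before transferring the file to a local Windows machine over FTP, so that Excel opens it without garbled characters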
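The tab-to-comma step can be tried without a Hive cluster. In the sketch below, fake_hive_output is a hypothetical stand-in for hive -e "sql", and the two sample rows are made up; it also shows one limitation of this approach, namely that a value which already contains a comma ends up looking like two fields.

```shell
#!/bin/sh
# Hypothetical stand-in for: hive -e "sql"
# Prints two tab-separated rows; the second row has a value containing a comma.
fake_hive_output() {
  printf 'a1\ta2\ta3\n'
  printf '1\tx,y\t3\n'
}

# Hypothetical output location for the demo file.
CSV_FILE="${TMPDIR:-/tmp}/test.csv"

# Same pipeline shape as in the notes: replace every tab with a comma.
fake_hive_output | tr "\t" "," > "$CSV_FILE"

# Count comma-separated fields per line; the embedded comma inflates the second row.
FIELD_COUNTS=$(awk -F',' '{print NF}' "$CSV_FILE" | tr '\n' ' ')
echo "fields per line: $FIELD_COUNTS"
```

If the data may contain commas, exporting with a delimiter that cannot appear in the values, as in Method 1, is likely the safer choice.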

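Summary of syntax differences between Hive and Oracle
1. Hive uses limit to restrict the number of rows returned
2. Oracle string columns are declared like varchar(10); in Hive the string type is string
3. Comments are added differently. Oracle uses a separate statement after the table is created, while in Hive you add comment 'text' directly in the CREATE TABLE statement
4. In a Hive CREATE TABLE statement, partition columns are not listed among the regular columns; they go after PARTITIONED BY and are automatically appended after all other columns of the table
5. In Hive, a derived table (the result of a subquery) must have an alias
6. Specify a queue when running SQL in Hive, otherwise jobs can easily pile up
7. If a table name is not qualified with its database, run use database_name first
8. Use overwrite to replace existing data when inserting
9. When querying multiple tables, use left join … on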
