[Linux] A Summary of Common Hive Commands


softbook


Category: shell. Tags: hive, data warehouse

Article link: https://blog.csdn.net/bk120/article/details/103859278


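Preface: Hive is a data warehouse tool built on Hadoop for extracting, transforming, and loading data; it provides a way to store, query, and analyze large-scale data kept in Hadoop. Hive maps structured data files to database tables, offers SQL querying, and translates SQL statements into MapReduce jobs for execution. (Source: a Baidu encyclopedia entry.)
Below are some commonly used Hive commands.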

1. Create a database
>create database abc;
>create database if not exists abc;

2. List databases
>show databases;
>show databases like 'h_*';

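3. Specify the storage path when creating a database (LOCATION is a clause of CREATE DATABASE, not a separate statement)
>create database abc
	location "******";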

4. Add a description to a database (COMMENT is also a clause of CREATE DATABASE)
>create database abc
	comment "******";

5. Show a database's description and location
>describe database abc;

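6. Switch to a database
>use abc;

7. Show the current database in the CLI prompt
>set hive.cli.print.current.db=true;

8. Drop a database
>drop database if exists abc;

9. Drop a database together with its tables (the tables are dropped first)
>drop database if exists abc cascade;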

10. Create a table
>create table if not exists mydb.ttb(
		name string comment 'name ss',
		salary int comment 'name ss',
		sub float comment 'name ss',
		id array<string> comment 'name ss',
		name_map map<string,string> comment 'name ss'
	)
	comment 'fdfdfd'
	location 'fdf/fdfd/fdfd';


11. List all tables and views in a database
>show tables;
>show tables in myDb;
>show tables 'huu_*';

12. Show a table's structure
>describe extended myDb.tb;
>describe myDb.tb.salary;

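13. Create an external table over external data: every file under the given directory is read, using the given field delimiter (exchange is a reserved word in newer Hive versions, hence the backticks)
>create external table if not exists stocks(
		`exchange` string,
		symbol string,
		ymb string,
		volum int
	)
	row format delimited fields terminated by ','
	location '/data/stocks/';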

14. Drop a table
>drop table if exists tb1;

15. Rename a table
>alter table mess rename to message;

16. Add columns to a table
>alter table log add columns(
		app_name string comment "dsdsd",
		session_id int comment "fdfd"
	);


17. Modify a column
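
Item 17 is listed without a command. A minimal sketch, assuming the log table and the session_id column from item 16; the new column name sid and the comment text are made up for illustration:

```sql
-- Rename column session_id to sid (hypothetical name) and widen it from INT to BIGINT.
-- CHANGE COLUMN only rewrites the table metadata; the data files are left untouched.
ALTER TABLE log CHANGE COLUMN session_id sid BIGINT COMMENT 'session identifier';
```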

18. Drop a column
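
Item 18 is also listed without a command. The usual route in Hive is REPLACE COLUMNS, which redefines the column list so that anything left out disappears from the schema. A sketch, assuming the log table has only the three columns named in the comment (the message column is hypothetical):

```sql
-- Assumed starting schema: log(message STRING, app_name STRING, session_id INT).
-- Only the columns listed here survive, so session_id is dropped from the schema.
-- This changes metadata only and works for tables that use a native SerDe.
ALTER TABLE log REPLACE COLUMNS (
    message  STRING COMMENT 'log message',
    app_name STRING COMMENT 'application name'
);
```
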
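Note on complex data types: Hive supports array, map, and struct. A struct is similar to a map, except that its fields are fixed and named in the schema.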

19. Select the second element of an array column (array indexes start at 0).
>select name,sub[1] from empl;
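
The same idea extends to the other complex types. A sketch of how each one is typically accessed; the props and addr columns and the 'dept' key are hypothetical and only serve the illustration:

```sql
-- Hypothetical table:
--   empl(name STRING, sub ARRAY<FLOAT>, props MAP<STRING,STRING>,
--        addr STRUCT<city:STRING, zip:STRING>)
-- sub[0]        : first array element (indexes start at 0)
-- props['dept'] : map value looked up by key
-- addr.city     : struct field accessed with dot notation
SELECT name, sub[0], props['dept'], addr.city
FROM empl;
```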

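20. Select columns by regular expression, here every column whose name starts with price. The pattern goes in backticks; Hive 0.13 and later also need hive.support.quoted.identifiers set to none.
>set hive.support.quoted.identifiers=none;
>select id,`price.*` from empl;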

21. Let Hive run a query in local mode instead of MapReduce whenever it can. Complex queries generally still run as MapReduce jobs.
>set hive.exec.mode.local.auto=true;

22. A column alias cannot be used directly in a WHERE clause; use a nested query instead.
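
A sketch of the nested-query workaround, assuming the empl table has name and salary columns; the 0.9 factor and the 1000 threshold are arbitrary illustration values:

```sql
-- This form fails, because the alias net is not visible to WHERE:
--   SELECT name, salary * 0.9 AS net FROM empl WHERE net > 1000;
-- Wrapping the aliased expression in a subquery turns net into an ordinary column.
SELECT e.name, e.net
FROM (
    SELECT name, salary * 0.9 AS net
    FROM empl
) e
WHERE e.net > 1000;
```

Hive requires the subquery to carry an alias (e here), even if the outer query never refers to it.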

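23. A decimal literal typed into a Hive query is handled as DOUBLE at run time, and 0.2 has no exact binary representation: it is stored as roughly 0.2000000030 in a FLOAT and roughly 0.2000000000000000111 in a DOUBLE. Take care with numeric comparisons. All data encoded as IEEE floating point has this problem.
	// Two ways to handle it
	a. Declare the column as DOUBLE, so the comparison is DOUBLE against DOUBLE and no conversion takes place. The cost is higher memory use during queries.
	b. Convert the literal explicitly with cast(0.2 as float), so the comparison is FLOAT against FLOAT.
	Avoid the FLOAT type for money-related columns.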
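
Option b can be sketched as follows, reusing the mydb.ttb table from item 10, where sub is declared FLOAT. It assumes the bare literal is treated as DOUBLE as described above, which may not hold in every Hive version:

```sql
-- Comparing sub with the bare literal 0.2 would widen sub to DOUBLE,
-- so a stored 0.2 could test as greater than 0.2.
-- Casting the literal keeps both sides of the comparison FLOAT.
SELECT name, sub
FROM mydb.ttb
WHERE sub > CAST(0.2 AS FLOAT);
```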
	
24. RLIKE takes a regular expression in a query.
>select id,name,book from empl where name rlike '(.*de)|(.*ff)';

25. Cartesian-product joins are flexible, but they can take a very long time.
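
A sketch of a Cartesian-product join, using two hypothetical tables tb1 and tb2. Some configurations, such as strict mode, may reject a query like this:

```sql
-- Every row of tb1 is paired with every row of tb2:
-- 1,000 rows x 1,000 rows already yields 1,000,000 output rows.
SELECT a.*, b.*
FROM tb1 a
CROSS JOIN tb2 b;
```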


26. Load query results into a table (INSERT OVERWRITE replaces the table's existing data)
>insert overwrite table tbsa select ****;

27. Scan a table once and load the results into two tables
>from tb_his
>insert overwrite table tb_new1 select *****
>insert overwrite table tb_new2 select *****;
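
A filled-in sketch of the multi-insert form. The country column and its values are hypothetical, and both target tables are assumed to exist already with the same schema as tb_his:

```sql
-- tb_his is read once; each INSERT clause applies its own filter.
FROM tb_his h
INSERT OVERWRITE TABLE tb_new1
    SELECT h.* WHERE h.country = 'us'
INSERT OVERWRITE TABLE tb_new2
    SELECT h.* WHERE h.country = 'ca';
```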

28. Create a view
>create view viw1 as select *****;
>create view if not exists shipe11 comment 'descc' as select ****;
	
29. Create a table with the same schema as a view
>create table tb1 like viw1;

30. Drop a view
>drop view if exists viw1;

32. Rebuild an index for one partition of a table; without a partition clause, all partitions are rebuilt (indexes were removed in Hive 3.0)
>alter index empl_index on tb1 partition(country='us') rebuild;

33. Show a table's index information (INDEX and INDEXES are both accepted)
>show formatted index on tb1;
	
34. Drop an index
>drop index if exists empl_index on tb1;

35. List all functions
>show functions;

36. Show a function's usage
>describe function concat;

37. Create a table and load a data file into it
>create table peop_info 
(a string,
b string,
c string
) row format serde 
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
with serdeproperties("field.delim"='\t',"serialization.encoding"='UTF-8')
stored as textfile;
LOAD DATA LOCAL INPATH '/home/demo/a.txt' OVERWRITE INTO TABLE peop_info;
