2020.9.16 Hive Basics

Operation	HiveServer2 Beeline	HiveServer1 CLI
Server Connection	beeline -u <jdbcurl> -n <username> -p <password>	hive -h <hostname> -p <port>
Help	beeline -h or beeline --help	hive -H
Run Query	beeline -e <query in quote> or beeline -f <query file name>	hive -e <query in quote> or hive -f <query file name>
Define Variable	beeline --hivevar key=value	hive --hivevar key=value
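
The Beeline column of the table above can be assembled programmatically. The following is a minimal sketch that only builds the argument list and does not launch Beeline; the helper name build_beeline_command and the account hive_user are hypothetical, while the flags -u, -n, -p, -e, -f and --hivevar are the ones listed in the table.

```python
import shlex


def build_beeline_command(jdbc_url, username=None, password=None,
                          query=None, query_file=None, hivevars=None):
    """Assemble a non-interactive Beeline command line as an argv list."""
    argv = ["beeline", "-u", jdbc_url]
    if username is not None:
        argv += ["-n", username]
    if password is not None:
        argv += ["-p", password]
    # --hivevar is passed once per key=value pair.
    for key, value in (hivevars or {}).items():
        argv += ["--hivevar", f"{key}={value}"]
    if query is not None:
        argv += ["-e", query]
    if query_file is not None:
        argv += ["-f", query_file]
    return argv


argv = build_beeline_command(
    "jdbc:hive2://localhost:10000",
    username="hive_user",          # hypothetical account name
    query="show tables;",
)
# shlex.join quotes the query so the line can be pasted into a shell.
print(shlex.join(argv))
```

Keeping the command as a list avoids quoting mistakes if it is later handed to subprocess.run.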

Interactive mode

Operation	HiveServer2 Beeline	HiveServer1 CLI
Enter Mode	beeline	hive
Connect	!connect <jdbcurl>	N/A
List Tables	!table	show tables;
List Columns	!column <table_name>	desc table_name;
Save Result	!record <file_name> (start), !record (stop)	N/A
Run Shell CMD	!sh ls	!ls;
Run DFS CMD	dfs -ls	dfs -ls;
Run SQL File	!run <file_name>	source <file_name>;
Check Version	!dbinfo	!hive --version;
Quit Mode	!quit	quit;

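7. Entering interactive mode

(1) hiveserver2 start

(2) Open a second terminal window and run:

beeline -u "jdbc:hive2://localhost:10000"

If a permission-denied error appears, grant permissions on the /tmp directory in HDFS:

hdfs dfs -chmod -R 777 /tmp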

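(3) Create a table:

create table ccc (aname string);

If a permission-denied error appears at this point,

grant permissions on the /opt directory in the same way:

hdfs dfs -chmod -R 777 /opt

(4) Insert rows:

insert into ccc values('hello'),('happy');

(5) Drop the table:

drop table ccc;

If this reports an error, check whether the following is present in the lib directory under the Hive installation directory:

If it is missing, add it.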

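II. Hive commands

1. Create a database

create database database_name;

2. Drop a database

drop database database_name cascade;

List databases:

show databases;

Switch to a database:

use database_name;

Show the current database:

select current_database();

To show it in the hive CLI prompt instead:

set hive.cli.print.current.db=true;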

2. Loading files (works for all tables)

Load a local file into a table:

load data local inpath "local_path/employee.txt"
into table employee_partition
partition(country='china',add='nanjing');

Load an HDFS file into a table:

load data inpath 'path/employee.txt'
into table employee_partition
partition (country='china',add='liaoning');

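3. Database-level operations

Same as in MySQL.

4. Table-level operations

Summary: tables are managed (internal) tables by default, and Hive creates a matching directory for each one under the configured storage location.

Once a file is placed in that directory, the table can read its data (the file must match the table schema).

A partitioned table gets subdirectories under the table directory, and the data lives in the per-partition subdirectories.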
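
The directory layout described above can be sketched as plain path arithmetic. This is an illustration only: the warehouse root /opt/hive/warehouse and the database hivetest2 are taken from the commands used later in these notes, placing employee_partition in that database is an assumption, and the helper names are hypothetical.

```python
import posixpath


def table_dir(warehouse, database, table):
    """Directory of a managed table; the default database has no .db folder."""
    if database == "default":
        return posixpath.join(warehouse, table)
    return posixpath.join(warehouse, f"{database}.db", table)


def partition_dir(warehouse, database, table, partition_spec):
    """Append one key=value directory level per partition column, in order."""
    path = table_dir(warehouse, database, table)
    for key, value in partition_spec:
        path = posixpath.join(path, f"{key}={value}")
    return path


employee_path = table_dir("/opt/hive/warehouse", "hivetest2", "employee")
nanjing_path = partition_dir(
    "/opt/hive/warehouse", "hivetest2", "employee_partition",
    [("country", "china"), ("add", "nanjing")],
)
print(employee_path)
print(nanjing_path)
```

Each partition column adds one directory level, which is why a file dropped into the right subdirectory is read as part of that partition once the partition is registered.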

(1) Steps to create a table:

The file contents are as follows:

Michael|Montreal,Toronto|Male,30|DB:80|Product:DeveloperLead

Will|Montreal|Male,35|Perl:85|Product:Lead,Test:Lead

Shelley|New York|Female,27|Python:80|Test:Lead,COE:Architect

Lucy|Vancouver|Female,57|Sales:89,HR:94|Sales:Lead

(1) Create the table first:

create table employee(
name string,
address array<string>,
Info struct<sex:string,age:int>,
salary map<string,int>,
jobs map<string,string>)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';

Note: the clauses from row format through lines terminated by must stay in this order; they cannot be swapped.
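
To see what the three delimiters do to one row of the sample file, here is a small sketch that splits a line the same way the DDL describes it: '|' between fields, ',' between collection items, ':' between a map key and its value. It is an illustration in Python, not Hive's own SerDe, and the function name parse_employee_line is hypothetical.

```python
def parse_employee_line(line):
    """Split one text row according to the delimiters declared in the DDL."""
    name, address, info, salary, jobs = line.rstrip("\n").split("|")

    def to_map(field, value_type=str):
        # Items are separated by ',', each key from its value by ':'.
        pairs = (item.split(":", 1) for item in field.split(","))
        return {key: value_type(value) for key, value in pairs}

    sex, age = info.split(",")
    return {
        "name": name,                              # string
        "address": address.split(","),             # array<string>
        "info": {"sex": sex, "age": int(age)},     # struct<sex:string,age:int>
        "salary": to_map(salary, int),             # map<string,int>
        "jobs": to_map(jobs),                      # map<string,string>
    }


row = parse_employee_line("Lucy|Vancouver|Female,57|Sales:89,HR:94|Sales:Lead")
print(row)
```

The struct and the array share the same collection delimiter, which is why the info field is written as Female,57 in the file.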

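Then put the file into the table's directory on HDFS.

Normally, once a database is created, a directory for it appears on HDFS; when a table is created, a table directory appears under the database directory. Putting the file there is enough.

(2) hdfs dfs -put employee.txt /opt/hive/warehouse/hivetest2.db/employee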

Example 1:

The file contents are as follows:

Create-table statement:

create table employee_id(
name string,
id int,
address array<string>,
Info struct<sex:string,age:int>,
salary map<string,int>,
jobs map<string,string>)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';

Upload the file:

hdfs dfs -put employee_id.txt /opt/hive/warehouse/hivetest2.db/employee_id

Example 2:

The file is as follows:

Matias McGrirl|1|945-639-8596|2011-11-24

Gabriela Feldheim|2|706-232-4166|2017-12-16

Billy O'Driscoll|3|660-841-7326|2017-02-17

Kevina Rawet|4|955-643-0317|2012-01-05

Patty Entreis|5|571-792-2285|2013-06-11

Claudetta Sanderson|6|350-766-4559|2016-11-04

Bentley Oddie|7|446-519-0975|2016-05-02

Theressa Dowker|8|864-330-9976|2012-09-26

Jenica Belcham|9|347-248-4379|2011-05-02

Reube Preskett|10|918-740-2357|2015-03-26

Mary Skeldon|11|361-159-8710|2016-03-09

Ethelred Divisek|12|995-145-7392|2016-10-18

Field McGraith|13|149-133-9607|2015-10-06

Andeee Wiskar|14|315-207-5431|2012-05-10

Lloyd Nayshe|15|366-495-5398|2014-06-28

Mike Luipold|16|692-803-9373|2011-05-14

Tallie Swaine|17|570-709-6561|2011-08-06

Worth Ledbetter|18|905-586-2348|2012-09-25

Reine Leyborne|19|322-644-5798|2015-01-05

Norby Bellson|20|736-881-5785|2012-12-31

Nellie Jewar|21|551-505-3957|2017-06-18

Hoebart Deeth|22|780-240-0213|2011-09-19

Shel Haddrill|23|623-169-5495|2014-02-04

Christalle Cervantes|24|275-309-7794|2017-01-01

Dorita Miche|25|476-242-9769|2014-10-26

Conny Bowmen|26|398-181-4961|2011-10-21

Sabra O' Donohoe|27|327-773-8515|2015-01-28

Rahal Ashbe|28|561-777-0202|2012-12-13

Tye Greenstreet|29|499-510-1700|2012-01-17

Gordy Cristoforetti|30|955-110-7073|2015-10-09

Marsha Sharkey|31|221-696-5744|2017-01-29

Corbie Cruden|32|979-583-4252|2011-08-20

Anya Easen|33|428-602-5117|2011-08-16

Clea Brereton|34|909-198-4992|2018-01-08

Kimberley Pinnijar|35|608-177-4402|2015-06-03

Wilma Mackriell|36|637-304-3580|2012-06-23

Mitzi Gorman|37|134-675-2460|2017-07-16

Ashlin Rennick|38|816-635-9974|2014-04-20

Whitaker Shedd|39|614-792-6663|2016-05-19

Mandi Stronack|40|753-688-2327|2016-04-24

Niki Driffield|41|225-867-0712|2014-02-15

Regine Agirre|42|784-395-9982|2017-05-01

Evelina Craddy|43|274-850-6569|2017-06-14

Yasmin Ubsdall|44|679-739-9660|2012-03-10

Vivianna Shoreman|45|873-271-7100|2014-09-06

Chance Murra|46|248-160-3759|2017-12-31

Ferdy Adriano|47|735-447-2642|2013-11-11

Nikolos Tichner|48|869-871-9057|2014-02-15

Doro Rushman|49|861-337-3364|2011-08-27

Lela Hinzer|50|147-386-3735|2011-06-03

Hoyt Winspar|51|120-561-6266|2016-05-05

Vicki Rimington|52|257-204-8227|2014-11-21

Louis Dalwood|53|735-885-8087|2014-02-17

Joseph Zohrer|54|178-152-4726|2015-11-04

Kennett Senussi|55|182-904-2652|2017-05-20

Letta Musk|56|534-353-2038|2013-11-04

Giulietta Glentz|57|761-390-2806|2011-09-08

Wright Frostdyke|58|932-838-9710|2015-07-15

Bat Hannay|59|404-841-2981|2015-04-04

Devlen Hutsby|60|830-520-6401|2015-07-12

Lynnea Bembrigg|61|408-264-4116|2013-02-24

Udall Nelle|62|485-420-4327|2011-07-01

Kyle Matheson|63|153-149-2140|2011-07-03

Jarid Sprowell|64|848-408-9569|2017-11-08

Jeanie Griffitt|65|442-599-1231|2018-03-09

Joana Sleith|66|264-979-0388|2017-02-13

Doris Ilyushkin|67|877-472-3918|2015-08-03

Michaelina Rennels|68|949-522-9333|2012-07-05

Onofredo Butchard|69|392-833-3926|2017-11-05

Beatrice Amis|70|963-487-6585|2015-01-24

Joyan O'Hanlon|71|952-969-7279|2017-09-22

Mikaela Cardoo|72|960-275-3958|2015-01-24

Lori Dale|73|530-116-2773|2017-07-05

Stevena Roloff|74|241-314-8328|2015-12-21

Fayth Carayol|75|907-502-3752|2012-12-04

Carita Bruun|76|117-771-8056|2017-05-31

Darnell Hardwell|77|718-247-8505|2012-05-09

Jonathon Grealy|78|136-515-3637|2014-03-29

Laurice Rosini|79|352-594-3238|2017-02-15

Emelia Auten|80|311-899-1782|2014-09-10

Trace Fontelles|81|414-607-8366|2016-03-09

Hope Sket|82|461-595-7667|2017-09-30

Cilka Heijne|83|772-704-7366|2011-08-27

Maurise Gallico|84|546-158-7983|2011-12-21

Casey Greenfield|85|204-108-7707|2012-03-18

Wes Jaffrey|86|848-465-5131|2016-02-14

Jilly Eisikowitz|87|431-355-2777|2017-02-18

Auguste Kobel|88|562-494-1360|2012-02-29

Zackariah Pietrusiak|89|810-738-9846|2012-02-25

Pearline Marcq|90|200-835-9497|2016-02-10

Sayre Osbaldeston|91|340-132-2361|2011-11-30

Floyd Cano|92|133-768-6535|2016-02-27

Ciro Arendt|93|792-967-0588|2015-11-07

Auguste Kares|94|230-184-3438|2014-03-13

Skipp Spurden|95|747-133-1382|2012-03-15

Alyssa Prydden|96|963-170-0545|2014-11-07

Orlando Pallatina|97|354-125-1208|2012-07-12

Zoe Adacot|98|704-987-0702|2015-09-29

Blaine Fawdry|99|477-109-9014|2012-07-14

Cleon Haresnape|100|625-338-3965|2014-12-04

Create-table statement:

create table info(
name string,
id int,
iphone string,
data date)
row format delimited
fields terminated by '|'
lines terminated by '\n';

Upload the file: hdfs dfs -put employee_hr.txt /opt/hive/warehouse/hivetest2.db/info

(2) Create a partitioned table

create table employee_partition(
name string,
address array<string>,
info struct<gender:string,age:int>,
techol map<string,int>,
jobs map<string,string>)
partitioned by (country string,add string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';

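(3) Create an external table

create external table emp_id(
name string,
id int,
address array<string>,
info struct<sex:string,age:int>,
workAndRole map<string,int>,
jobAndRole map<string,string>)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
location '/usr/test/employ';

An external table's data is not stored under the database's warehouse directory; the table only reads the data that already sits at the given HDFS location.

Dropping an external table removes only its metadata, whereas dropping a managed (internal) table also deletes the underlying data.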
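
The difference in drop behaviour can be modelled with a toy catalog. This is a deliberately simplified sketch, not Hive code: the class MiniCatalog is hypothetical, its metadata dict stands in for the metastore and its files dict stands in for HDFS directories.

```python
class MiniCatalog:
    """Toy model: `metadata` plays the metastore, `files` plays HDFS."""

    def __init__(self):
        self.metadata = {}   # table name -> {"location": str, "external": bool}
        self.files = {}      # location -> list of rows stored there

    def create_table(self, name, location, external=False):
        self.metadata[name] = {"location": location, "external": external}
        self.files.setdefault(location, [])

    def drop_table(self, name):
        entry = self.metadata.pop(name)
        if not entry["external"]:
            # Managed table: the data directory goes away with the metadata.
            self.files.pop(entry["location"], None)


catalog = MiniCatalog()
catalog.create_table("employee", "/opt/hive/warehouse/hivetest2.db/employee")
catalog.create_table("emp_id", "/usr/test/employ", external=True)
catalog.files["/usr/test/employ"].append("row that survives the drop")

catalog.drop_table("employee")
catalog.drop_table("emp_id")

remaining_locations = sorted(catalog.files)
print(remaining_locations)
```

After both drops the metadata is empty, but only the external table's location still holds data.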

(4) Create a bucketed table

1. Enforce bucketing on insert:

set hive.enforce.bucketing=true;

2. Set the number of reducers to match the number of buckets:

set mapreduce.job.reduces=3;

3. Create the table:

create table emp_id2(
name string,
id int,
address array<string>,
info struct<sex:string,age:int>,
workAndRole map<string,int>,
jobAndRole map<string,string>)
clustered by (id) into 3 buckets
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';

Insert data, which is bucketed automatically:

insert into emp_id2 select * from emp_id;

List the table's bucket files:

dfs -ls /opt/hive/warehouse/hivetest2.db/emp_id2;
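
Which bucket a row lands in follows from hashing the clustered by column and taking the result modulo the bucket count. The sketch below assumes the simplest case, where an int key hashes to its own value; the hash Hive actually applies depends on the Hive version and column type, so treat this as an illustration of the idea rather than Hive's exact algorithm. The ids used are made up.

```python
JAVA_INT_MAX = 0x7FFFFFFF


def bucket_for(id_value, num_buckets):
    """Bucket index for an int key, assuming the key hashes to itself."""
    # Masking keeps the hash non-negative before taking the modulus.
    return (id_value & JAVA_INT_MAX) % num_buckets


ids = [1, 2, 3, 4, 5, 6, 7]          # hypothetical id values
buckets = {b: [i for i in ids if bucket_for(i, 3) == b] for b in range(3)}
print(buckets)
```

With 3 buckets, rows whose ids share the same remainder end up in the same file, which is what the dfs -ls listing of three files reflects.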

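(2). Queries:

1. List tables

show tables;

2. Show a table's structure

desc table_name;

3. View the contents of a table

(1) If the column type is array, it is an ordinary array and the index selects the element:

select column_name[0] from table_name;

(2) If the column type is map<key,value>:

1. select column_name['key'] from table_name;

This returns, for each row, the value stored under that key.

2. select map_keys(map_column) from table_name;

This returns all keys of the map column.

3. select size(map_column) from table_name;

This returns the number of entries in the map for each row.

4. select map_column from table_name;

This returns all key-value pairs of the column.

(3) Querying a struct column:

select column_name.field from table_name;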
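
The accessors above map closely onto ordinary list and dictionary operations. The following sketch mirrors each query against two rows written as Python literals; the rows are adapted from the sample file and the variable names are hypothetical. A missing map key yields None here, in the same spirit as the NULL Hive returns.

```python
rows = [
    {"name": "Will", "address": ["Montreal"],
     "info": {"sex": "Male", "age": 35},
     "salary": {"Perl": 85}},
    {"name": "Lucy", "address": ["Vancouver"],
     "info": {"sex": "Female", "age": 57},
     "salary": {"Sales": 89, "HR": 94}},
]

first_address = [r["address"][0] for r in rows]           # address[0]
sales_salary = [r["salary"].get("Sales") for r in rows]   # salary['Sales']
salary_keys = [list(r["salary"]) for r in rows]           # map_keys(salary)
salary_size = [len(r["salary"]) for r in rows]            # size(salary)
ages = [r["info"]["age"] for r in rows]                   # info.age

print(first_address, sales_salary, salary_keys, salary_size, ages)
```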

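4. Inserting data into a table

Method 1: on HDFS, put the text file into the table's directory.

Method 2: insert into table target_table select column_names from source_table;

Method 3: load a local text file:

load data local inpath "path/employee.txt"
into table employee_partition
partition(country='china',add='nanjing');

Method 4: load a file from an HDFS path:

load data inpath 'path/employee.txt'
into table employee_partition
partition (country='china',add='liaoning');

Note: after loading this way, the file no longer exists at its original HDFS path, because Hive moves it into the table's directory.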
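
The difference between methods 3 and 4 comes down to copy versus move. The sketch below imitates that on the local filesystem with temporary directories standing in for the local disk, the HDFS staging path and the warehouse; the function load_data and all directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def load_data(src, table_dir, local):
    """Copy the file when `local` is true, otherwise move it."""
    table_dir.mkdir(parents=True, exist_ok=True)
    dest = table_dir / src.name
    if local:
        shutil.copy(src, dest)              # source file stays where it was
    else:
        shutil.move(str(src), str(dest))    # source path disappears
    return dest


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    local_file = root / "local" / "employee.txt"
    staged_file = root / "staging" / "employee.txt"
    for f in (local_file, staged_file):
        f.parent.mkdir(parents=True)
        f.write_text("Lucy|Vancouver|Female,57|Sales:89,HR:94|Sales:Lead\n")

    warehouse = root / "warehouse" / "employee_partition" / "country=china"
    load_data(local_file, warehouse / "add=nanjing", local=True)
    load_data(staged_file, warehouse / "add=liaoning", local=False)

    local_still_there = local_file.exists()
    staged_still_there = staged_file.exists()

print(local_still_there, staged_still_there)
```

If the staged file is needed again after a non-local load, it has to be copied somewhere else first.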

Method 5: insert into a partitioned table:

Table definition:

create table p_test(
pid int,
pname string)
partitioned by (person string)
row format delimited
fields terminated by ','
lines terminated by '\n';

Insert: insert into p_test partition (person='sam') values(1,'a'),(2,'b'),(3,'c');

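(5) Order for creating a database and adding content through SQLyog

Find the hive database (the metastore),

then open and fill in the following tables in order:

CDS-->SERDES-->SDS-->TBLS-->DBS-->COLUMNS_V2

COLUMNS_V2

Its columns are, in order: the reference to the owning table, the comment, the column name, the column type, and the column index.

DBS

Its columns are, in order: the database ID, whether the database is the default one or user-created, the database URI, the database name, the owner name, and the owner type (usually USER).

SERDES

Stores the reference to the jar; the methods are looked up through SERDE_PARAMS.

CDS

Stores CD_ID.

SDS stores the jar and method references together with CD_ID and SERDE_ID.

TBLS: everything about the table itself.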
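
How these metastore tables hang together can be tried out on a scaled-down copy. The sketch below uses an in-memory SQLite database instead of the MySQL metastore and keeps only a handful of columns per table, so it is a simplified model and not the real schema; placing p_test in hivetest2 and the sample row values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CDS (CD_ID INTEGER PRIMARY KEY);
CREATE TABLE SERDES (SERDE_ID INTEGER PRIMARY KEY, SLIB TEXT);
CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER,
                  SERDE_ID INTEGER, LOCATION TEXT);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER,
                   SD_ID INTEGER, TBL_NAME TEXT);
CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE COLUMNS_V2 (CD_ID INTEGER, COLUMN_NAME TEXT,
                         TYPE_NAME TEXT, INTEGER_IDX INTEGER);
""")

# Rows are filled in the same order as the chain listed in the notes.
conn.execute("INSERT INTO CDS VALUES (1)")
conn.execute("INSERT INTO SERDES VALUES "
             "(1, 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe')")
conn.execute("INSERT INTO SDS VALUES "
             "(1, 1, 1, '/opt/hive/warehouse/hivetest2.db/p_test')")
conn.execute("INSERT INTO TBLS VALUES (1, 1, 1, 'p_test')")
conn.execute("INSERT INTO DBS VALUES (1, 'hivetest2')")
conn.executemany("INSERT INTO COLUMNS_V2 VALUES (?, ?, ?, ?)",
                 [(1, "pid", "int", 0), (1, "pname", "string", 1)])

# Resolve a table's columns by walking TBLS -> SDS -> COLUMNS_V2.
columns = conn.execute("""
    SELECT d.NAME, t.TBL_NAME, c.COLUMN_NAME, c.TYPE_NAME
    FROM TBLS t
    JOIN DBS d ON d.DB_ID = t.DB_ID
    JOIN SDS s ON s.SD_ID = t.SD_ID
    JOIN COLUMNS_V2 c ON c.CD_ID = s.CD_ID
    ORDER BY c.INTEGER_IDX
""").fetchall()
print(columns)
```

The join shows why SDS sits in the middle: TBLS points at a storage descriptor, and the storage descriptor points at both the column set (CD_ID) and the SerDe (SERDE_ID).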

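III. Supplement:

Differences between hive (HiveServer) and beeline (HiveServer2):

hive can be used directly, without starting a service first.

beeline requires the server side to be started first, and then connects to it as a client.

beeline is more efficient than hive for queries; beeline does not support update and delete.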

普普通通小易
