目录
目录
CTE (CTAS with Common Table Expression)
Hive数据排序 - sort bydistribute by
hive两种启动方式
第一种:[root@gree143 ~]# hive
第二种:
[root@gree143 ~]# hive --service hiveserver2beeline -u jdbc:hive2://IP地址:10000
[root@gree143 ~]# beeline -u jdbc:hive2://192.168.61.146:10000
注意:hive --service hiveserver2这样的话就会占用两个窗口,一个显示操作是否成功,一个为hive操作。可以使用下面的语句让反馈信息放到一个黑洞里面这样就不用占用一个窗口用来反馈信息
[root@hadoop3 stufile]nohup hiveserver2 1>/dev/null 2>&1 &
安装网络工具
[root@gree143 ~]# yum -y install net-tools
在hdfs上给定文件执行,读写权限
[root@gree143 ~]# hdfs dfs -chmod -R 777 /tmp
[root@gree143 ~]# hadoop fs -chmod -R 777 /tmp
查看数据库详细信息
0: jdbc:hive2://192.168.61.146:10000>describe database default;
查看当前所在的数据库
0: jdbc:hive2://192.168.61.146:10000>select current_database();
建库,建表
0: jdbc:hive2://192.168.61.146:10000>create database if not exists aabb;
0: jdbc:hive2://192.168.61.146:10000>create table if not exists exam(id int, name string);
查看表信息
0: jdbc:hive2://192.168.61.146:10000>describe exam;
查看建表语句
0: jdbc:hive2://192.168.61.146:10000>show create table examStu1;
删除表
0: jdbc:hive2://192.168.61.146:10000>drop table if exists examStu1;
删库
0: jdbc:hive2://192.168.61.146:10000>drop database if exists aabb;
强制删库
0: jdbc:hive2://192.168.61.146:10000>drop database if exists bigdata cascade;
hive复杂类型--集合数据类型
map 是一种(key-value)键值对类型;
array 是一种数组类型,array 中存放相同类型的数据;
struct 是一种集合类型。
1.map
map中保存的是k-v对的数据,map名[key值] = 对应的value值
select map(“aa”,1,1,“bb”) > {“aa”:“1”,“1”:“bb”}
select map(“aa”,1,1,“bb”)[“aa”] > 1定义通常定义为:MAP<string,string>
2.struct
struct中保存的是value值,相当于对象,stuct名.col1 = value1
select struct(1,2,3) > {“col1”:1,“col2”:2,“col3”:3}
select struct(1,2,3).col1 > 1定义通常定义为:STRUCT<fruit:string,weight:int>
3.array
array中保存的是value值,array名[index值] = 对应位置的value值(index从0开始)
select array(1,2,3) > [1,2,3]
select array(1,2,3)[0] > 1定义通常定义为:ARRAY<string>
hive的数据结构
创建一个student.txt文件,将下面的信息放入该文件
vim student.txt
id,姓名,爱好,住址
1,xiaoming1,lol-book-movie,beijing:bdqn-nanjing:zhongbo
2,xiaoming2,lol-book-movie,beijing:bdqn-nanjing:zhongbo
3,xiaoming3,lol-book-movie,beijing:bdqn-nanjing:zhongbo
4,xiaoming4,lol-book-movie,beijing:bdqn-nanjing:zhongbo
5,xiaoming5,lol-movie,beijing:bdqn-nanjing:zhongbo
6,xiaoming6,book-movie,beijing:bdqn-nanjing:zhongbo
7,xiaoming7,lol-book,beijing:bdqn-nanjing:zhongbo
8,xiaoming8,lol-book,beijing:bdqn-nanjing:zhongbo
9,xiaoming9,lol-book-movie,beijing:bdqn-nanjing:zhongbo
数据表
数据表分为内部表和外部表
内部表(管理表)
- HDFS中为所属数据库目录下的子文件夹
- 数据完全由Hive管理,删除表(元数据)会删除数据
外部表(External Tables)
- 数据保存在指定位置的HDFS路径中
- Hive不完全管理数据,删除表(元数据)不会删除数据
创建内部表
create table student(
id int,
name string,
likes array<string>,
address map<string,string>
)
row format delimited fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'
lines terminated by '\n';
row format delimited 含义是:分隔符设置开始语句
fields terminated by '|' 含义是:设置字段与字段之间的分隔符为“|”
collection items terminated by ',' 含义是:设置一个复杂类型(array,struct)字段的各个item之间的分隔符为 “,”
map keys terminated by ':' 含义是:设置一个复杂类型(Map)字段的key value之间的分隔符为 “:”
lines terminated by '\n'; 含义是:设置行与行之间的分隔符为 “\n”
加载本地数据
0: jdbc:hive2://192.168.61.146:10000>load data local inpath '/opt/student.txt' into table student;
加载HDFS文件
0: jdbc:hive2://192.168.61.146:10000>load data inpath '/student.txt' into table student;
加载数据后覆盖原数据
0: jdbc:hive2://192.168.61.146:10000>load data inpath '/student.txt' overwrite into table student;
创建分区表
create table student2(
id int,
name string,
likes array<string>,
address map<string,string>
)
partitioned by (age int)
row format delimited fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'
lines terminated by '\n';
创建分区表插入数据
0: jdbc:hive2://192.168.61.146:10000>load data local inpath '/opt/student.txt' into table student2 partition(age=20);
0: jdbc:hive2://192.168.61.146:10000>load data local inpath '/opt/student.txt' into table student2 partition(age=30);
查看分区表的信息
show partitions student2;
多字段分区表
create table student3(
id int,
name string,
likes array<string>,
address map<string,string>
)
partitioned by (age int, gender string)
row format delimited fields terminated by ','
collection items terminated by '-'<