hive的行转列函数

collect_set函数

1. 建表语句

		drop table if exists stud; 
		create table stud (name string,area string ,course string,score int);

2.插入数据

		insert into table stud values ('zhangsan','baijin','math',88);
		insert into table stud values ('lisi','shanghai','chese',90);
		insert into table stud values ('wangwu','tianjin','eningsh',95);
		insert into table stud values ('zhaoliu','baijin','eningsh',90);
		insert into table stud values ('tan7','shanghai','chese',80);

3. 查看表格

hive (default)> select * from stud
              > ;
OK
stud.name	stud.area	stud.course	stud.score
zhangsan	baijin		math			88
lisi		shanghai	Chinese			90
wangwu		tianjin		English			95
zhaoliu		baijin		English			90
tan7		shanghai	Chinese			80
Time taken: 0.06 seconds, Fetched: 5 row(s)

4.算出各课程的平均数,要附带上地区(这个时候就需要行转列函数了,因为课程平均数只有一行,但是地区有多个,就需要把多个地区放在一起,用到了行转列函数(collect_set))

hive (default)> select course ,collect_set(area) areas ,avg(score) avg_score from stud group by course;
Query ID = atguigu_20191219111821_b9373434-fee3-4083-be12-2ba180a39e99
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1576713221960_0006)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 6.00 s     
--------------------------------------------------------------------------------
OK
course	areas					avg_score
Chinese	["shanghai"]			85.0
English	["tianjin","baijin"]	92.5
math	["baijin"]				88.0
Time taken: 7.257 seconds, Fetched: 3 row(s)

5.也可以单独的取指定的位置合并后的地区

hive (default)> select course ,collect_set(area)[1] area ,avg(score) avg_score from stud group by course;
Query ID = atguigu_20191219112100_79c34b1d-61a7-4716-9545-05dbe6aa0f9f
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1576713221960_0006)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 5.71 s     
--------------------------------------------------------------------------------
OK
course	area	avg_score
Chinese	NULL	85.0
English	baijin	92.5
math	NULL	88.0
Time taken: 6.687 seconds, Fetched: 3 row(s)
  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值