CC00014.hadoop——|Hadoop&Hive.V14|——|Hive.v14|Hive函数用法.v02|

一、UDTF函数【重要】
### --- UDTF函数【重要】

~~~     # UDTF : 
~~~     User Defined Table-Generating Functions。
~~~     用户定义表生成函数,一行输入,多行输出。
### --- explode,炸裂函数
~~~     就是将一行中复杂的 array 或者 map 结构拆分成多行

hive (mydb)> select explode(array('A','B','C')) as col;
col
A
B
C

hive (mydb)> select explode(map('a', 8, 'b', 88, 'c', 888));
key value
a   8
b   88
c   888
~~~     UDTF's are not supported outside the SELECT clause, nornested in expressions
~~~     SELECT pageid, explode(adid_list) AS myCol... is not supported
~~~     SELECT explode(explode(adid_list)) AS myCol... is not supported
~~~     lateral view 常与 表生成函数explode结合使用
~~~     lateral view 语法:

lateralView: LATERAL VIEW udtf(expression) tableAlias AS
columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
~~~     # lateral view 的基本使用

hive (mydb)>  with t1 as (
select 'OK' cola, split('www.yanqi.com', '\\.') colb
)
select cola, colc
from t1
lateral view explode(colb) t2 as colc;
--输出参数
cola    colc
OK  www
OK  yanqi
OK  com
二、UDTF 案例1:
~~~     # 数据(uid tags):

[root@linux123 ~]# vim /home/hadoop/data/tab1.txt
1   1,2,3
2   2,3
3   1,2
--编写sql,实现如下结果:
1   1
1   2
1   3
2   2
2   3
3   1
3   2
~~~     # 建表加载数据

hive (mydb)> create table market(
id int,
storage string,
allocation string,
outdt string
)
row format delimited fields terminated by '\t';

hive (mydb)> load data local inpath '/home/hadoop/data/tab1.txt' into table market;
~~~     # SQL

hive (mydb)> select uid, tag
from t1
lateral view explode(split(tags,",")) t2 as tag;
三、DTF 案例2:
~~~     # 数据准备

[root@linux123 ~]# vim /home/hadoop/data/score.dat
lisi|Chinese:90,Math:80,English:70
wangwu|Chinese:88,Math:90,English:96
maliu|Chinese:99,Math:65,English:60
~~~     # 创建表
hive (mydb)> create table studscore(
name string
,score map<String,string>)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':';

~~~     # 加载数据
hive (mydb)> load data local inpath '/home/hadoop/data/score.dat' overwrite
into table studscore;
~~~     # 需求:找到每个学员的最好成绩
~~~     第一步,使用 explode 函数将map结构拆分为多行
hive (mydb)> select explode(score) as (subject, socre) from studscore;
subject socre
Chinese 90
Math    80
English 70
Chinese 88
Math    90
English 96
Chinese 99
Math    65
English 60

~~~     # 但是这里缺少了学员姓名,加上学员姓名后出错。下面的语句有是错的
hive (mydb)> select name, explode(score) as (subject, socre) from studscore;
~~~     # 第二步:explode常与 lateral view 函数联用,这两个函数结合在一起能关联其他字段

hive (mydb)> select name, subject, score1 as score from studscore
lateral view explode(score) t1 as subject, score1;
name    subject score
lisi    Chinese 90
lisi    Math    80
lisi    English 70
wangwu  Chinese 88
wangwu  Math    90
wangwu  English 96
maliu   Chinese 99
maliu   Math    65
maliu   English 60
~~~     # 第三步:找到每个学员的最好成绩
hive (mydb)> select name, max(mark) maxscore
from (select name, subject, mark
from studscore lateral view explode(score) t1 as
subject, mark) t1
group by name;
--输出参数
name    maxscore
wangwu  96
lisi    90
maliu   99

hive (mydb)>  with tmp as (
select name, subject, mark
from studscore lateral view explode(score) t1 as subject,
mark
)
select name, max(mark) maxscore
from tmp
group by name;
--输出参数
name    maxscore
wangwu  96
lisi    90
maliu   99
四、小结:
### --- 小结:

~~~     将一行数据转换成多行数据,可以用于array和map类型的数据;
~~~     lateral view 与 explode 联用,解决 UDTF 不能添加额外列的问题
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

yanqi_vip

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值