Hive: Row-to-Column and Column-to-Row (Aggregation) Transformations

Row <=> Column Conversion

Original data:

-- Data: id course
1 java
1 hadoop
1 hive
1 hbase
2 java
2 hive
2 spark
2 flink
3 java
3 hadoop
3 hive
3 kafka
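
The query below reads this data from a table named rowline1. The original post does not show the DDL; a minimal sketch, assuming id is an int, course is a string, and the rows sit in a space-delimited local file (the file path is a placeholder):

-- Hypothetical DDL for the sample table (not shown in the original post)
create table if not exists rowline1 (
    id     int,
    course string
)
row format delimited fields terminated by ' ';

-- '/root/rowline1.dat' is a placeholder path for the sample data file
load data local inpath '/root/rowline1.dat' into table rowline1;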

Desired result:

-- Expected result (1 = enrolled, 0 = not enrolled)
id java hadoop hive hbase spark flink kafka
1   1     1      1    1     0     0     0
2   1     0      1    0     1     1     0
3   1     1      1    0     0     0     1

Approach: combine CASE WHEN with GROUP BY + SUM. CASE WHEN maps each row to 1 (enrolled) or 0 (not enrolled) for a given course, SUM totals those values into one column per course, and GROUP BY collapses the rows for each id:

hive (default)> select id,
              > sum(case when course="java" then 1 else 0 end) as java,
              > sum(case when course="hadoop" then 1 else 0 end) as hadoop,
              > sum(case when course="hive" then 1 else 0 end) as hive,
              > sum(case when course="hbase" then 1 else 0 end) as hbase,
              > sum(case when course="spark" then 1 else 0 end) as spark,
              > sum(case when course="flink" then 1 else 0 end) as flink,
              > sum(case when course="kafka" then 1 else 0 end) as kafka
              > from rowline1
              > group by id;
Query ID = root_20221218153246_1df8e0f4-f57e-4f48-92fe-07bc86490bbb
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Status: Running (Executing on YARN cluster with App id application_1671347868984_0002)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 9.32 s     
----------------------------------------------------------------------------------------------
OK
id	java	hadoop	hive	hbase	spark	flink	kafka
1	1	1	1	1	0	0	0
2	1	0	1	0	1	1	0
3	1	1	1	0	0	0	1
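
As an aside, Hive's built-in if(condition, valueTrue, valueFalse) can express the same 0/1 logic a bit more compactly. A sketch of the equivalent query against the same assumed rowline1 table (only two course columns shown; the rest follow the same pattern):

-- if() is shorthand for a two-branch CASE WHEN
select id,
       sum(if(course = 'java',   1, 0)) as java,
       sum(if(course = 'hadoop', 1, 0)) as hadoop
from rowline1
group by id;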

Column-to-Row (Aggregation)

Original data:

id1 id2 flag
a b 2
a b 1
a b 3
c d 6
c d 8
c d 8
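
As with the first example, the DDL for rowline2 is not shown in the original post. A minimal sketch, assuming id1 and id2 are strings and flag is an int (the cast(flag as string) in the query below implies flag is not already a string):

-- Hypothetical DDL for the second sample table
create table if not exists rowline2 (
    id1  string,
    id2  string,
    flag int
)
row format delimited fields terminated by ' ';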

Desired result:

id1 id2 flag
a b 2|1|3
c d 6|8

Solution

  1. First collect the flag values for each (id1, id2) group, then concatenate them into a single string:
hive (default)> select id1,id2,concat_ws("|",collect_set(cast (flag as string))) flag
              > from rowline2
              > group by id1,id2;
Query ID = root_20221218154421_20995988-3030-42f2-a207-286c9bb7ed9c
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1671347868984_0003)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 5.58 s     
----------------------------------------------------------------------------------------------
OK
id1	id2	flag
a	b	2|1|3
c	d	6|8
Time taken: 6.041 seconds, Fetched: 3 row(s)
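
Note that collect_set deduplicates and does not guarantee order: the input has two rows with flag = 8 for (c, d), yet 8 appears only once in the output. If duplicates must be preserved, collect_list keeps every value; a sketch of that variant:

-- collect_list keeps duplicates, so (c, d) would yield 6|8|8 instead of 6|8
select id1,
       id2,
       concat_ws('|', collect_list(cast(flag as string))) as flag
from rowline2
group by id1, id2;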