HIVE统计WordCount

HIVE WORDCOUNT

目录

HIVE WORDCOUNT

一、WORDCOUNT

1.我们先创建一个新的数据库

2.创建表并插入数据

3.统计WORDCOUNT

4.UNION ALL 用法

5.WITH AS 用法


1.WORDCOUNT

1)我们先创建一个新的数据库

create database learn3;

use learn3;

2)创建表并插入数据

CREATE TABLE learn3.wordcount(
word STRING COMMENT "单词"
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

INSERT INTO TABLE learn3.wordcount (word) VALUES ("hello,word"),("hello,java"),("hive,hello");

3)统计WORDCOUNT

select split(word,",")[0] clo1
from learn3.wordcount;


select split(word,",")[1] clo2
from learn3.wordcount;

select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount;

select count(*) as num
,c1.clo1 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c1 group by clo1;

select count(*) as num
,c2.clo2 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c2 group by clo2;

select count(*) as num
,c1.clo1 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c1 group by clo1
UNION ALL
select count(*) as num
,c2.clo2 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c2 group by clo2;

select sum(num)
,c.clo from
(
select count(*) as num
,c1.clo1 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c1 group by clo1
UNION ALL
select count(*) as num
,c2.clo2 as clo from
(
select split(word,",")[0] clo1
,split(word,",")[1] clo2
from learn3.wordcount
) c2 group by clo2) c group by c.clo;

2.UNION ALL 用法

在 Hive 中,`UNION ALL` 用于合并多个查询结果集,包括所有重复的行。

假设我们有两张表 `table1` 和 `table2`,它们具有相同的列结构。现在,我们想要将这两张表中的数据合并成一个结果集,而不去除任何重复的行,就可以使用 `UNION ALL`。

下面是一个简单的示例:

```sql
SELECT * FROM table1
UNION ALL
SELECT * FROM table2;
```

这个查询将会返回 `table1` 和 `table2` 中所有的行,包括重复的行。如果你只想返回不重复的行,可以使用 `UNION` 而不是 `UNION ALL`。 

需要注意的是,使用 `UNION ALL` 时,两个查询的列数和数据类型必须完全相同。

3.WITH AS 用法

格式:

WITH table1  AS (
 SELECT 查询语句1
)
, table2 AS (
 SELECT 查询语句2
)

[INSERT INTO TABLE] SELECT

FROM


 

WITH split_res AS
(
select split(word,",")[0] as clo1,
split(word,",")[1] as clo2
from learn3.wordcount
)
,c1 AS
(
select clo1 as clo,count(*) as num
from split_res group by clo1
)
,c2 AS
(
select clo2 as clo,count(*) as num
from split_res group by clo2
)
select
c.clo,sum(num)
from
(
select * from c1
UNION ALL
select * from c2
) c group by clo;

  • 9
    点赞
  • 13
    收藏
    觉得还不错? 一键收藏
  • 2
    评论
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值