Create a table named alice:
create table alice(line string);
Load alice-in-wonderland.txt into the table.
Split each line into an array of words (first three non-empty lines):
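The notes do not show the load statement itself. A minimal sketch, assuming the file sits on the local filesystem of the machine running the Hive client; the path /tmp/alice-in-wonderland.txt is hypothetical and not taken from the original:

```sql
-- Hypothetical local path; adjust it to wherever the file actually lives.
load data local inpath '/tmp/alice-in-wonderland.txt' into table alice;

-- Sanity check: each line of the file should have become one row.
select count(*) from alice;
```

Because the table has a single string column and uses the default text format, each line of the file is expected to land in the line column as-is.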
select split(line,' ') from alice where line!='' limit 3;
Count the number of words on each line:
select size(split(line,' ')) from alice;
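However, the result still counts empty strings as words (an empty line splits into one empty string). Filter out the empty lines, then sum the per-line counts to get the total word count: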
desc function array;
select * from alice where line!='';
select sum(size(split(line,' '))) from alice where line!='';
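To see where the empty strings come from, it may help to run split on literal values. This is only an illustrative sketch; the input strings are made up and are not from the book text:

```sql
-- An empty line still produces a one-element array holding the empty string,
-- so size() reports 1 rather than 0.
select size(split('', ' '));

-- Two consecutive spaces leave an empty string between the real words.
select split('a  b', ' ');
```

Filtering with line!='' handles the first case only; empty strings caused by repeated or leading spaces are still counted.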
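Functions in Hive fall into three categories:
UDF: scalar (single-row) function; one row in, one result out
UDAF: aggregate function; many rows in, one result out
UDTF: table-generating function; one row in, multiple rows out, e.g. explode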
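The three categories can be compared side by side on the alice table. The sketch below uses only built-in functions (length, count, max, explode); the choice of examples is mine, not the original author's:

```sql
-- UDF: evaluated once per row, returns one value per row.
select length(line) from alice limit 3;

-- UDAF: consumes many rows, returns a single value.
select count(*), max(length(line)) from alice;

-- UDTF: one input row can produce several output rows (one per array element).
select explode(split(line, ' ')) as word from alice limit 3;
```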
Sorting: list the 50 most frequent words, ordered by count in descending order:
with av as (select explode(split(line,' ')) as one from alice
where line!='')
select av.one,count(1) as one_num from av group by one order by one_num desc limit 50;
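The query above filters out empty lines but not empty tokens, so indented lines or repeated spaces can make the empty string show up as a "word". A possible variant, offered as a sketch rather than the author's method, splits on runs of whitespace and drops empty tokens:

```sql
with av as (
  -- trim() strips leading/trailing spaces; '\\s+' is the regex for one or more
  -- whitespace characters.
  select explode(split(trim(line), '\\s+')) as one
  from alice
  where trim(line) != ''
)
select one, count(1) as one_num
from av
where one != ''          -- drop any empty token that survives (e.g. a leading tab)
group by one
order by one_num desc
limit 50;
```

Punctuation and letter case are left untouched here, so "Alice" and "Alice," would still be counted separately.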