Hive SQL practice: join queries and sorting

一把秀儿

Category: Linux and the Hadoop ecosystem

Article link: https://blog.csdn.net/m0_52106226/article/details/110477321


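count(1)    -- counts the rows in the table (or in each group), NULLs included; count(col) counts only the rows where col is not NULL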
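A quick way to see the difference between count(1) and count(col) is to run both over a column that contains a NULL. The sketch below uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption made so the example can run anywhere); the table tb_demo and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table, used only to compare count(1) with count(col).
conn = sqlite3.connect(":memory:")
conn.execute("create table tb_demo (id integer, name text)")
conn.executemany(
    "insert into tb_demo values (?, ?)",
    [(1, "x"), (2, None), (3, "z")],  # the second row has a NULL name
)

total, non_null = conn.execute(
    "select count(1), count(name) from tb_demo"
).fetchone()
print(total, non_null)  # 3 2
```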
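SQL clause execution order
select      5  evaluated after having, once the groups are fixed
from        1
tb_name
where       2  filters rows
group by    3  grouping
having      4  filters groups
order by    6  global sort
limit       7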
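The clause order can be checked with a small query in which where and having both filter: where removes rows before the groups are built, having removes whole groups afterwards. This is a sketch run on SQLite through Python's sqlite3 module rather than on Hive, and the table tb_score with its values is made up for the illustration.

```python
import sqlite3

# Hypothetical table and values, used only to illustrate clause order.
conn = sqlite3.connect(":memory:")
conn.execute("create table tb_score (name text, score integer)")
conn.executemany(
    "insert into tb_score values (?, ?)",
    [("a", 50), ("a", 90), ("b", 95), ("b", 85), ("c", 40)],
)

rows = conn.execute(
    """
    select name, count(1) as cnt   -- computed per surviving group
    from tb_score                  -- read the table
    where score >= 60              -- drop rows before grouping
    group by name                  -- build groups from the remaining rows
    having count(1) >= 2           -- drop whole groups
    order by name                  -- sort the final rows
    limit 10                       -- cap the output
    """
).fetchall()
print(rows)  # [('b', 2)]
```

Group a loses its 50 in the where step, so it has only one row left when having runs and is dropped; c never reaches the grouping step at all.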
insert into table_name values (value1, value2, ...)    -- inserts one row into the table; separate the values with ','

Summary of join statements

join, inner join, left join, right join, full join, union, left semi join
join    -- join query, for example:
select *
from a
join b on a.id = b.id
join c on b.id = c.id ;
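left join   -- left outer join: keeps every row of the left table, with NULLs where the right table has no match
right join  -- right outer join: keeps every row of the right table
inner join  -- same result as a plain join, as in the query below; without an on condition either form produces a Cartesian product
select *
from a
inner join b
on a.id = b.id ;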
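To make the Cartesian product concrete, the sketch below counts the rows of a join written without an on condition and of the same join with one. It runs on SQLite through Python's sqlite3 module as a stand-in for Hive; the contents of tables a and b are hypothetical.

```python
import sqlite3

# Hypothetical three-row tables; only ids 2 and 3 appear in both.
conn = sqlite3.connect(":memory:")
conn.execute("create table a (id integer)")
conn.execute("create table b (id integer)")
conn.executemany("insert into a values (?)", [(1,), (2,), (3,)])
conn.executemany("insert into b values (?)", [(2,), (3,), (4,)])

# No join condition: every row of a is paired with every row of b.
cross_count = conn.execute("select count(1) from a join b").fetchone()[0]
# With a join condition: only the pairs whose ids match survive.
inner_count = conn.execute(
    "select count(1) from a inner join b on a.id = b.id"
).fetchone()[0]
print(cross_count, inner_count)  # 9 2
```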
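left semi join    -- works like an in/exists subquery: the left table drives the query, and a left row is returned when the join condition matches a row in the right table
select            -- note: only left-table columns are returned, and only for rows whose key also exists in the right table
*
from
tb_b
left semi join 
tb_a
on tb_a.id = tb_b.id ;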
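SQLite has no left semi join keyword, so the sketch below expresses the same idea with the exists subquery that the note compares it to, and contrasts it with an inner join. It is an illustration of the semantics under that substitution, not Hive syntax; the rows in tb_a and tb_b and the column v are hypothetical.

```python
import sqlite3

# Hypothetical data: id 2 appears twice in each table.
conn = sqlite3.connect(":memory:")
conn.execute("create table tb_a (id integer)")
conn.execute("create table tb_b (id integer, v text)")
conn.executemany("insert into tb_a values (?)", [(2,), (2,), (4,)])
conn.executemany(
    "insert into tb_b values (?, ?)",
    [(1, "b1"), (2, "b2"), (2, "b3"), (3, "b4")],
)

# Semi-join semantics: keep a tb_b row if at least one tb_a row matches.
semi = conn.execute(
    """
    select id, v from tb_b
    where exists (select 1 from tb_a where tb_a.id = tb_b.id)
    order by v
    """
).fetchall()

# Inner join: each tb_b row is repeated once per matching tb_a row.
inner = conn.execute(
    "select tb_b.id, tb_b.v from tb_b join tb_a on tb_a.id = tb_b.id"
).fetchall()

print(semi)        # [(2, 'b2'), (2, 'b3')]
print(len(inner))  # 4
```

The semi join returns each matching left row once, while the inner join multiplies it by the number of matches on the right.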
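union  -- combines the result sets of two queries; both must have the same number of columns with matching data types. union all keeps duplicates, union removes them
select  -- note: the two queries must line up column by column, with matching types, before they can be combined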
*
from
tb_a
union all
select
*
from
tb_a ; 
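The effect of deduplication shows up clearly when a table is combined with itself. The sketch below runs on SQLite through Python's sqlite3 module as a stand-in for Hive; the two rows in tb_a are hypothetical.

```python
import sqlite3

# Hypothetical two-row table, combined with itself.
conn = sqlite3.connect(":memory:")
conn.execute("create table tb_a (id integer, name text)")
conn.executemany("insert into tb_a values (?, ?)", [(1, "x"), (2, "y")])

all_rows = conn.execute(
    "select id, name from tb_a union all select id, name from tb_a"
).fetchall()
dedup_rows = conn.execute(
    "select id, name from tb_a union select id, name from tb_a"
).fetchall()

print(len(all_rows))       # 4: union all keeps every row from both sides
print(sorted(dedup_rows))  # [(1, 'x'), (2, 'y')]: union drops the duplicates
```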
full join   -- joins two tables and keeps the unmatched rows from both sides, filling the missing columns with NULL
select *
from a
full join b
on a.id = b.id ;
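Older SQLite builds do not support full join, so the sketch below emulates it with a left join plus the right-only rows; the result should have the same shape as what the full join above returns. It runs through Python's sqlite3 module as a stand-in for Hive, and the rows and the column v are hypothetical.

```python
import sqlite3

# Hypothetical data: id 1 exists only in a, id 3 only in b.
conn = sqlite3.connect(":memory:")
conn.execute("create table a (id integer, v text)")
conn.execute("create table b (id integer, v text)")
conn.executemany("insert into a values (?, ?)", [(1, "a1"), (2, "a2")])
conn.executemany("insert into b values (?, ?)", [(2, "b2"), (3, "b3")])

full = conn.execute(
    """
    select a.id, a.v, b.id, b.v from a left join b on a.id = b.id
    union all
    -- rows of b that found no partner in a
    select a.id, a.v, b.id, b.v from b left join a on a.id = b.id
    where a.id is null
    """
).fetchall()

for row in full:
    print(row)  # NULLs come back as None
```

The matched pair (id 2) appears once with both sides filled in, while ids 1 and 3 each appear with None on the side that has no match.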

Sorting

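Small queries like these usually run with a single reduce task (Hive's default of -1 for mapreduce.job.reduces lets it estimate the count from the input size).
set mapreduce.job.reduces=n;  -- set the number of reduce tasks
set mapreduce.job.reduces;  -- show the current setting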

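order by        -- global sort of the final result
distribute by   -- picks the column that decides which partition (reducer) each row goes to
sort by         -- sorts the rows inside each partition
cluster by      -- shorthand for distribute by + sort by when both use the same column and the order is ascending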

order by -- global sort of the final result
select * from tb_a order by id ;        -- ascending (asc) by default
select * from tb_a order by id desc ;   -- descending
---------------------------------------------------------------------------------------
distribute by -- picks the partitioning column
select * from tb_x distribute by name; -- rows with the same name go to the same partition
--------------------------------------------------------------------------------------
sort by -- sorts the rows inside each partition
select *  from tb_x  distribute by  name  sort by name  desc; -- when combined with distribute by, sort by comes second
---------------------------------------------------------------------------------------
cluster by -- replaces distribute by + sort by when both use the same column and the order is ascending
select *  from tb_x  distribute by  name  sort by name ;
select *  from tb_x  cluster by  name ; -- same as the line above; cluster by only sorts ascending
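The difference between a global sort and distribute by + sort by is easier to see in a small simulation. The sketch below is plain Python, not Hive: the function distribute_and_sort and its byte-sum hash are hypothetical stand-ins for how rows are routed to reducers, chosen only so the result is deterministic.

```python
from collections import defaultdict


def distribute_and_sort(names, n_reducers, descending=False):
    """Simulate `distribute by name sort by name` over n_reducers partitions."""
    partitions = defaultdict(list)
    for name in names:
        # Hypothetical hash: any deterministic function of the key will do,
        # as long as equal keys always land in the same partition.
        bucket = sum(name.encode("utf-8")) % n_reducers
        partitions[bucket].append(name)
    # sort by: each partition is ordered on its own, with no global order.
    return {b: sorted(p, reverse=descending) for b, p in partitions.items()}


names = ["d", "a", "c", "b", "a", "d"]
parts = distribute_and_sort(names, n_reducers=2)
print(parts[0])       # ['b', 'd', 'd']
print(parts[1])       # ['a', 'a', 'c']
print(sorted(names))  # ['a', 'a', 'b', 'c', 'd', 'd'], what order by would give
```

Each partition is sorted, and equal names stay together, but reading the partitions one after another does not give a globally sorted list. That is the trade-off against order by, which funnels everything through one reducer.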

王奔	A	男
娜娜	A	男
宋宋	B	男
凤姐	A	女
热巴	B	女
慧慧	B	女
create table tb_emp(
name string ,
dname string ,
gender string 
)
row format delimited fields terminated by "\t" ;
load data local inpath "/data/12.2/" into table tb_emp ;
Group by department (A / B), then count the men (男) and women (女) in each group
      男     女
A     2       1
B     1       2

select 
dname ,
sum(case gender when '男' then 1 else 0 end) m ,  -- 男 = male: each male row contributes 1, and sum adds them up per group
sum(case gender when '女' then 1 else 0 end) f    -- 女 = female
from
tb_emp
group by
dname ;
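The query can be checked against the expected table without a Hive cluster. The sketch below loads the six sample rows into SQLite through Python's sqlite3 module (a stand-in for Hive, so the create table statement is simplified and the load data step is replaced by inserts) and runs the same sum(case ...) aggregation; the order by is added only to make the output order deterministic.

```python
import sqlite3

# The six sample rows from the exercise: name, department, gender.
emps = [
    ("王奔", "A", "男"),
    ("娜娜", "A", "男"),
    ("宋宋", "B", "男"),
    ("凤姐", "A", "女"),
    ("热巴", "B", "女"),
    ("慧慧", "B", "女"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tb_emp (name text, dname text, gender text)")
conn.executemany("insert into tb_emp values (?, ?, ?)", emps)

rows = conn.execute(
    """
    select
        dname,
        sum(case gender when '男' then 1 else 0 end) as m,  -- men per department
        sum(case gender when '女' then 1 else 0 end) as f   -- women per department
    from tb_emp
    group by dname
    order by dname
    """
).fetchall()
print(rows)  # [('A', 2, 1), ('B', 1, 2)]
```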
