关于jps进程中的runjar解决办法
kill -9 runjar进程号
– win电脑不要在云实验平台中使用ctrl+c 复制内容
su - root # 切换到root用户,并且使用root用户对应的环境变量
– 注意在无论进行何种表链接时,一定要将各种表进行重命名,防止错误发生
一 内连接
join 和inner join 是相同的
1.找出既在user_list_1也在user_list_2的用户:
方法一:
select *
from user_list_1 a
join user_list_2 b on a.user_id=b.user_id;
方法二:
select *
from user_list_1 a
inner join user_list_2 b on a.user_id=b.user_id
limit 10;
需求1:找出在2019年购买后又退款的用户
select a.user_name
from
(select distinct user_name
from user_trade
where year(dt)=2019)a
join
(select distinct user_name
from user_refund
where year(dt)=2019)b
on a.user_name=b.user_name;
需求2:在2017年和2018年都购买的用户
select a.user_name
from
(select distinct user_name
from user_trade
where year(dt)=2017)a
join
(
select distinct user_name
from user_trade
where year(dt)=2018)b
on a.user_name=b.user_name;
需求3:在2017年、2018年、2019都有交易的用户
方法1:推荐使用
select distinct a.user_name
from
((select distinct user_name
from trade_2017)a
join
(select distinct user_name
from trade_2018)b
on a.user_name=b.user_name
join
(select distinct user_name
from trade_2019)c
on b.user_name=c.user_name);
方法2:前提是三个表数据量都特别少
select distinct a.user_name
from trade_2017 a
join trade_2018 b
on a.user_name=b.user_name
join trade_2019 c
on b.user_name=c.user_name;
总结:内连接就是将多个表合并,以一定的字段为依据取出共有的行
二.左连接
对表1和表2进行左连接
select *
from user_list_1 a
left join user_list_2 b
on a.user_id=b.user_id;
如何取出,在user_list_1表中但是不在user_list_2的用户?
select a.user_id,a.user_name
from user_list_1 a
left join user_list_2 b
on a.user_id=b.user_id
where b.user_id is null;
– 注意: 此时查询时,一定要查a表的 此时差b表结果显示空
需求4:在2019年购买,但是没有退款的用户
select a.user_name
from
((select distinct user_name
from user_trade
where year(dt)=2019)a
left join
(select distinct user_name
from user_refund
where year(dt)=2019)b
on a.user_name=b.user_name)
where b.user_name is null;
需求5:在2019年有购买的用户的学历分布
select b.edu,count(a.user_name)
from
((select distinct user_name
from user_trade
where year(dt)=2019)a
left join
(select distinct user_name,get_json_object(extra1,'$.education')as edu
from user_info)b
on a.user_name=b.user_name)
group by b.edu;
需求6:在2017和2018年都购买,但是没有在2019年购买的用户
select a.user_name
from
(select distinct user_name
from trade_2017)a
left join
(select distinct user_name
from trade_2018)b
on a.user_name=b.user_name
left join
(select distinct user_name
from trade_2019)c
on b.user_name=c.user_name
where c.user_name is null;
总结:left join 为左连接,以左边的表的某一个字段为依据,将多个表
进行拼接,若第二个或第三个表中的数据在最左边的数据没有,则显示空值.
若要取出空值,查询时用左表查询,条件为空值的表某字段为空值.
三 全连接 full join
对表1和表2进行全连接
select *
from user_list_1 a
full join user_list_2 b
on a.user_id=b.user_id;
注意:全连接是纵向连接,和左连接的连接方法一样,没有的显示空值
四.union all/union
将user_list_1和user_list_3合并在一起:
select user_id,user_name
from user_list_1
union all
select user_id,user_name
from user_list_2;
注意:全连接是横向连接,将所有字段都拼接在一块,
字段名称必须一致!
字段顺序必须一致!
没有连接条件!
需求7:2017-2019年有交易的所有用户数
方法3: 推荐写法
select count(distinct a.user_name)
from
(select distinct user_name
from trade_2017
union all
select distinct user_name
from trade_2018
union all
select distinct user_name
from trade_2019)a;
需求7:2017-2019年有交易的所有用户数
方法一:
select count(distinct a.user_name),
count(a.user_name)
from
(select user_name
from trade_2017
union all
select user_name
from trade_2018
union all
select user_name
from trade_2019)a;
方法2:
select count(distinct a.user_name),
count(a.user_name)
from
(select user_name
from trade_2017
union
select user_name
from trade_2018
union
select user_name
from trade_2019)a;
需求8:2019年每个用户的支付和退款金额汇总
1. 从两个表中分别筛选出2019年的每个用户的支付和退款总金额
2. 汇总
select a.user_name,sum(total_amount),sum(total_refund)
from
(select user_name,sum(pay_amount) total_amount,0 as total_refund
from user_trade
where year(dt)=2019
group by user_name
union all
select user_name,0 as total_amount,sum(refund_amount) total_refund
from user_refund
where year(dt)=2019
group by user_name)a
group by a.user_name;
需求9:2019年每个支付用户的支付金额和退款金额
1. 把2019年每个支付用户求出来, 并且拿出来总的支付金额
2. 得出2019年每个退款用户, 总退款金额拿出来
3, 左连接获取到以支付金额为主表的需求结果
select a.user_name,a.total_amount,b.total_refund
from
(select user_name,sum(pay_amount) total_amount
from user_trade
where year(dt)=2019
group by user_name)a
left join
(select user_name,sum(refund_amount) total_refund
from user_refund
where year(dt)=2019
group by user_name)b
on a.user_name=b.user_name;
– 解决b.refund_amount的null值
select a.user_name,a.total_amount,
if(b.refund_amount is null, 0, b.refund_amount)
from
(select user_name, sum(pay_amount) total_amount
from user_trade
where year(dt)=2019
group by user_name)a
left join
(select user_name,sum(refund_amount) refund_amount
from user_refund
where year(dt)=2019
group by user_name)b
on a.user_name=b.user_name;
多表连接容易犯的错误
select * FROM
a join b join c on a.xx=b.xx=c.xx;
需求10:首次激活时间在2017年,但是一直没有支付的用户年龄段分布
/*
1.限制时间在2017年的用户并且把年龄段筛选出来
2.没有支付的人选出来
3.年龄段分布
*/
select a.age_range,count(a.user_name)
from
(select user_name,
case when age<20 then '20岁以下'
when age>=20 and age<30 then '20-30岁'
when age>=30 and age<40 then '30-40岁'
else '40岁以上'
end age_range
from user_info
where year(firstactivetime)=2017)a
left join
(select distinct user_name
from user_trade
where dt>'0')b
on a.user_name=b.user_name
where b.user_name is null
group by a.age_range;
注意:dt是个分区字段,使用where 加分区字段来构建分区表,分区的最终
目的是在查询时,使用分区列过滤!
需求11:2018 2019年交易的用户,其激活时间段分布
/*
1.去除2018 2019年有交易的用户的全集
2.取出所有用户的激活时间
3.统计时间段分布
*/
select hour(firstactivetime),count(a.user_name)
from
(select user_name
from trade_2018
union
select user_name
from trade_2019)a
left join
user_info b
on a.user_name=b.user_name
group by hour(firstactivetime);
注意:union 去重且排序
union all 不去重且不排序