知乎er最想娶的数据女神 @空白白白白 组建的社区竞赛。
为了防止被投诉打广告,就不列相关信息了。
SQL挑战赛第一期
题目:
1: 编写一个查询,列出员工姓名列表,员工员工每月工资超过2000美元且员工人数少于10个月。通过提升employee_id对结果进行排序。
select * from
employee where
salary > 2000 and
months <10
order by employee_id
2: 查询 Employee 表格中以元音字母开头的 name 名字。结果不包含名字重复记录。
select distinct name
from employee
where name like 'a%'
or name like 'e%'
or name like'i%'
or name like'o%'
or name like 'u%'
3:编写一个查询,去掉一个最高收入,去掉一个最低收入,该公司员工平均收入是多少?
select avg(salary) from(
select salary from (
select salary from employee
order by salary
desc limit 11
)b
order by salary
limit 10
)c
4:简述NULL 空字符串 0的区别。
空字符串0是不占用空间的;
NULL是占用空间的。
SQL挑战赛第二期
关于查询所用的表,market_data表的字段介绍为:order_id(订单ID),order_time(订单时间),customer_name(用户名称),quantity(购买数量),sale(销售额),profit(利润)。
各项指标的定义为:
R值为:用户最后一次购买到现在(2016年12月31日)的时间间隔,输出月份。
L值为:用户第一次购买和最后一次购买之间的时间间隔,输出月份。
F值为:用户的总共购买次数,仅计算2016年的即可。
M值为:用户的全部销售额,仅计算2016年的即可。
题目:
1. 查询所有用户的R值和L值。
select customer_name,
TIMESTAMPDIFF(month,maxdate,curdate()) as R,
TIMESTAMPDIFF(month,mindate,maxdate) as L
from(
select customer_name,
max(order_time) as maxdate,
min(order_time) as mindate
from market_data
group by customer_name
)b
2.查询用户的R值,F值和M值,注意F值和M值,仅计算2016年度的数字。
select customer_name,
TIMESTAMPDIFF(month,maxdate,curdate()) as R,
F,
ROUND(M)
from(
select customer_name,
max(order_time) as maxdate,
curdate(),
count(order_time) as F,
sum(sale) as M
from market_data
group by customer_name
)b
3.查询用户的R值,L值和用户生命周期划分。生命周期划分如下:
(新用户:R<=6 and L<=12;忠诚用户:R<=6 and L>12;
流失的老用户:R>6 and L>12; 一次性用户:R>6 and L<=12)。
select customer_name,
case when R<=6 and L<=12 then '新用户'
when R<=6 and L>12 then '忠诚用户'
when R>6 and L>12 then '流失的老用户'
when R>6 and L<=12 then '一次性用户'
else null
end as 用户生命周期
from (
select customer_name,
TIMESTAMPDIFF(month,maxdate,curdate()) as R,
TIMESTAMPDIFF(month,mindate,maxdate) as L
from (
select customer_name,
max(order_time) as maxdate,
min(order_time) as mindate
from market_data
group by customer_name
) b
) c
SQL挑战赛第三期
Cinema表结构各字段介绍如下:
Seat_id(座位号,依次递增),free(0表示有人,1表示空座),fare(对应座位的票价)。
题目为:
1:查找表中最便宜的票价是多少?
select seat_id,min(fare) from cinema
2:女友要求你定的座位必须是连续的(输出可用位置的seat_id)。
select a.seat_id,b.seat_id
from cinema a
join
cinema b
on a.seat_id+1=b.seat_id
where a.free=1
and b.free=1
3:女友要求买连续的座位中总价最低的两个座位(输出对应的seat_id和总价)。
select a.seat_id,b.seat_id,a.fare+b.fare
from cinema a
join
cinema b
on a.seat_id+1=b.seat_id
where a.free=1
and b.free=1
order by a.fare + b.fare limit 1
SQL挑战赛第四期
表内字段的解释如下: position(职位名称),min_salary(最低薪资,单位元),max_salary(最高薪资,单位元),city(工作城市),educational(学历要求),people(招聘人数),industry(行业)
题目:
1. 查找不同学历要求的平均最低薪资和平均最高薪资。
select educational,
avg(min_salary) as 平均最低薪资,
avg(max_salary) as 平均最高薪资
from employ
group by educational
2. 查找每个行业,最高工资和最低工资差距最大的职位名称。
select industry,
position_name,
max(工资差距)
from (
select industry,
position_name,
(max_salary-min_salary) as 工资差距
from employ
group by industry,position_name
) b
group by industry
3. 查找各个城市电商行业的平均薪资水平,并按照薪资高低水平进行排序。(岗位的薪资按照最低薪资和最高薪资的中位数来计算,注意要考虑到职位招聘人数)。
select city,
industry,
sum((min_salary+max_salary)/2*people)/sum(people) as 行业平均薪资
from employ
where industry='互联网/电子商务'
group by city
order by 行业平均薪资 desc
4.问答题:说明UNION和UNION ALL的差别。
两者都是对两个结果集进行并集操作。
但是前者不包括重复行;
后者包括重复行。
SQL挑战赛第五期
题目:
表中两列数据,一列是userid 一列是用户标签tag,标签类别总共10个,每个用户可以拥有不同的标签,不同标签之间用逗号间隔。
原始数据:
1:输出每个标签的用户数?
select tag,count(*) from (
select
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.tag
2:谁拥有的标签最多?
select userid,count(*) as count from (
select
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.userid order by count desc limit 1
3:那个标签拥有用户最多?
select tag,count(*) as count from (
select
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.tag order by count desc limit 1
SQL小挑战第六期 车流量问题
某市交通局为了监控车流信息,新进了一批摄像头,安装在交通要道,用于检测日常车流信息,让我们看看怎么处理这些信息呢?
摄像头收集到信息经过简化后数据结构如下,id(摄像头编号),date(时间),car(车流)
原始数据:
题目要求:
1:那天的车流量最高?
select date,car from traffic order by car desc limit 1
2:找出高峰期时段,要求连续三天及以上,并且每天人流量均不少于100。
select count(date) as countd,
GROUP_CONCAT(date)
from(
select date,
car,
datediff(now(),date)-(@b:=@b+1)
as dat1
from
traffic a,
(SELECT @b := 0) tmp_b
where car >= 100 order by date desc
)d
group by dat1 having countd >=3
3:如果要求连续10天以上,并且每天人流量均不少于100。你上述的方法还可行吗?有更好的建议吗?
可行,这个方法就不错。
SQL挑战赛第七期
数据
题目:
1.查找所有科目均及格(>=60分)的同学并列出其各科分数;
select b.id,
Cscore,
Escore,
math.score as Mscore
from(
select chinese.id as id,
chinese.score as Cscore,
english.score as Escore
from chinese
left join english on chinese.id=english.id
)b
left join math on b.id=math.id
where Cscore>=60 and Escore>=60 and math.score>=60
2.查找所有同学的总分(没有分数的科目按照0分计算)和总分名次,并按照名次进行排序;
select b.id,
Cscore +IFNULL(Escore,0)+IFNULL(math.score,0) as score
from(
select chinese.id as id,
chinese.score as Cscore,
english.score as Escore
from chinese
left join english on chinese.id=english.id
)b
left join math on b.id=math.id
union all
select id,score from english where id not in(select id from chinese)
order by score
3.查找三个学科分数的平均数,中位数和众数:
select ma.标签,median_score,avg_score,众数
from(select mmm.标签 as 标签,mmm.median_score as median_score,avg_score from(SELECT '英语'标签,AVG(ee.score) as median_score
FROM (
SELECT e.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
FROM english e, (SELECT @rownum:=0) r
WHERE e.score is NOT NULL
ORDER BY e.score
) as ee
WHERE ee.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
union
SELECT '数学'标签,AVG(mm.score) as median_score
FROM (
SELECT m.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
FROM math m, (SELECT @rownum:=0) r
WHERE m.score is NOT NULL
ORDER BY m.score
) as mm
WHERE mm.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
union
SELECT '语文'标签,AVG(aa.score) as median_score
FROM (
SELECT a.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
FROM chinese a, (SELECT @rownum:=0) r
WHERE a.score is NOT NULL
ORDER BY a.score
) as aa
WHERE aa.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) ))mmm
left join
(select '语文'标签,avg(score) as avg_score from chinese
union
select '英语'标签,avg(score) as avg_score from english
union
select '数学'标签,avg(score) as avg_score from math)aaa on mmm.标签=aaa.标签)ma
left join
(select '语文'标签,众数 from( select c.score as 众数,count(c.score) as ccscore from chinese c group by c.score having ccscore >=
(select max(计分) from
(select score,count(score) as 计分 from chinese c group by score)r))rr
union
select '英语'标签,众数 from(select e.score as 众数,count(e.score) as eescore from english e
group by e.score having eescore >=
(select max(计分) from
(select score,count(score) as 计分 from english e group by score)r))rr
union
select '数学'标签,众数 from(select m.score as 众数,count(m.score) as mmscore from math m
group by m.score having mmscore >=
(select max(计分) from
(select score,count(score) as 计分 from math m group by score)r))rr)rrr on ma.标签=rrr.标签
第一次写这么长的代码。不知道为什么会出现null,晚点再看看吧。
SQL挑战赛第八期
数据
select a.name as 一级地名,b.name as 二级地名,
c.name as 三级地名 from address a
left join address b on a.id=b.ParentId
left join address c on b.id=c.ParentId
where c.name is not null
SQL挑战赛第九期
题目为:
1.查找每个订单id及其分布时间段。
(输出整数的时间,如19:27:23输出19:00:00,超过30min则输出下一时间段)
2.查找订单数量最多的时间段。(订单id需要排重)
3.求每个用户的订单间隔时间,以月为单位输出。
(注意,这里需要基于同一用户id才能进行比较,也要进行订单id的排重,第一次购买的订单id后输出null即可)
4.在3的基础上求平均间隔时间,null值不用进行计算。
数据