sql 中位数_【SQL】小点点SQL挑战赛第一至八期

最新推荐文章于 2024-06-24 14:41:46 发布

weixin_39761822

最新推荐文章于 2024-06-24 14:41:46 发布

阅读量120

点赞数

文章标签： sql 中位数 sql 查询 tag sql中位数

知乎er最想娶的数据女神 @空白白白白组建的社区竞赛。

为了防止被投诉打广告，就不列相关信息了。

SQL挑战赛第一期

题目：

1: 编写一个查询，列出员工姓名列表，员工员工每月工资超过2000美元且员工人数少于10个月。通过提升employee_id对结果进行排序。

select * from 
         employee where 
			salary > 2000 and 
			months <10 
			order by employee_id

2: 查询 Employee 表格中以元音字母开头的 name 名字。结果不包含名字重复记录。

select distinct name 
       from employee 
		 where name like 'a%' 
           or name like 'e%'
           or name like'i%'
           or name like'o%'
           or name like 'u%'

3:编写一个查询，去掉一个最高收入，去掉一个最低收入，该公司员工平均收入是多少？

select avg(salary) from(
                   select salary from (
                          select salary from employee 
								  order by salary 
								  desc limit 11
                                      )b 
			               order by salary 
								limit 10
                       )c

4：简述NULL 空字符串 0的区别。

空字符串0是不占用空间的；

NULL是占用空间的。

SQL挑战赛第二期

关于查询所用的表，market_data表的字段介绍为：order_id(订单ID),order_time(订单时间),customer_name(用户名称),quantity(购买数量），sale(销售额），profit(利润)。

各项指标的定义为：

R值为：用户最后一次购买到现在（2016年12月31日）的时间间隔，输出月份。

L值为：用户第一次购买和最后一次购买之间的时间间隔，输出月份。

F值为：用户的总共购买次数，仅计算2016年的即可。

M值为：用户的全部销售额，仅计算2016年的即可。

题目：

1. 查询所有用户的R值和L值。

select customer_name,
       TIMESTAMPDIFF(month,maxdate,curdate()) as R,
       TIMESTAMPDIFF(month,mindate,maxdate) as L
       from(
       select customer_name,
		 max(order_time) as maxdate,
		 min(order_time) as mindate
		 from market_data
		 group by customer_name
		 )b

2.查询用户的R值，F值和M值，注意F值和M值，仅计算2016年度的数字。

select customer_name,
       TIMESTAMPDIFF(month,maxdate,curdate()) as R,
       F,
       ROUND(M)
       from(
       select customer_name,
		 max(order_time) as maxdate,
		 curdate(),
		 count(order_time) as F,
		 sum(sale) as M
		 from market_data
		 group by customer_name
		 )b

3.查询用户的R值，L值和用户生命周期划分。生命周期划分如下：

（新用户：R<=6 and L<=12；忠诚用户：R<=6 and L>12;

流失的老用户：R>6 and L>12; 一次性用户：R>6 and L<=12）。

select customer_name, 
       case when R<=6 and L<=12 then '新用户'
            when R<=6 and L>12 then '忠诚用户'
            when R>6 and L>12 then '流失的老用户'
            when R>6 and L<=12 then '一次性用户'
            else null 
       end as 用户生命周期
from (
      select customer_name,
             TIMESTAMPDIFF(month,maxdate,curdate()) as R,
             TIMESTAMPDIFF(month,mindate,maxdate) as L
      from (
            select customer_name,
                   max(order_time) as maxdate,
                   min(order_time) as mindate
            from market_data
            group by customer_name
           ) b
     ) c

SQL挑战赛第三期

Cinema表结构各字段介绍如下：

Seat_id（座位号，依次递增），free（0表示有人，1表示空座），fare(对应座位的票价）。

题目为：

1：查找表中最便宜的票价是多少？

select seat_id,min(fare) from cinema

2：女友要求你定的座位必须是连续的（输出可用位置的seat_id）。

select a.seat_id,b.seat_id
       from cinema a
       join
       cinema b
       on a.seat_id+1=b.seat_id
       where a.free=1 
		 and b.free=1

3：女友要求买连续的座位中总价最低的两个座位(输出对应的seat_id和总价)。

select a.seat_id,b.seat_id,a.fare+b.fare
       from cinema a
       join
       cinema b
       on a.seat_id+1=b.seat_id
       where a.free=1 
		 and b.free=1
		 order by a.fare + b.fare limit 1

SQL挑战赛第四期

表内字段的解释如下： position（职位名称），min_salary（最低薪资，单位元），max_salary（最高薪资，单位元），city（工作城市），educational（学历要求），people（招聘人数），industry（行业）

题目：

1. 查找不同学历要求的平均最低薪资和平均最高薪资。

      select educational,
       avg(min_salary) as 平均最低薪资,
       avg(max_salary) as 平均最高薪资
       from employ
       group by educational

2. 查找每个行业，最高工资和最低工资差距最大的职位名称。

select industry,
       position_name,
       max(工资差距)
       from (
		 select industry,
             position_name,
             (max_salary-min_salary) as 工资差距
              from employ
              group by industry,position_name
				  ) b
       group by industry

3. 查找各个城市电商行业的平均薪资水平，并按照薪资高低水平进行排序。（岗位的薪资按照最低薪资和最高薪资的中位数来计算，注意要考虑到职位招聘人数）。

select city,
       industry,
     sum((min_salary+max_salary)/2*people)/sum(people) as 行业平均薪资
       from  employ
       where industry='互联网/电子商务'
       group by city
       order by 行业平均薪资 desc

4.问答题：说明UNION和UNION ALL的差别。

两者都是对两个结果集进行并集操作。

但是前者不包括重复行；

后者包括重复行。

SQL挑战赛第五期

题目：

表中两列数据，一列是userid 一列是用户标签tag，标签类别总共10个，每个用户可以拥有不同的标签，不同标签之间用逗号间隔。

原始数据：

1：输出每个标签的用户数？

select tag,count(*) from (
select
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.tag

2：谁拥有的标签最多？

select userid,count(*) as count  from (
select 
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from 
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.userid order by count desc limit 1

3：那个标签拥有用户最多？

select tag,count(*) as count from (
select 
userid,
substring_index(substring_index(user_tag.tag,',',idx.n),',',-1)as tag
from 
(select 1 as n union select 2 union select 3 union select 4)as idx
join user_tag on length(user_tag.tag)-length(replace(user_tag.tag,',','')) +1 >=idx.n
) as tb1 group by tb1.tag order by count desc limit 1

SQL小挑战第六期车流量问题

某市交通局为了监控车流信息，新进了一批摄像头，安装在交通要道，用于检测日常车流信息，让我们看看怎么处理这些信息呢？

摄像头收集到信息经过简化后数据结构如下，id（摄像头编号），date（时间），car（车流）

原始数据：

题目要求：

1：那天的车流量最高？

select date,car from traffic  order by car desc limit 1

2：找出高峰期时段，要求连续三天及以上，并且每天人流量均不少于100。

select count(date) as countd,
GROUP_CONCAT(date) 
from(
select date,
car,
datediff(now(),date)-(@b:=@b+1)
as dat1 
from 
traffic a,
(SELECT @b := 0) tmp_b  
where car >= 100 order by date desc 
)d 
group by dat1 having countd >=3

3：如果要求连续10天以上，并且每天人流量均不少于100。你上述的方法还可行吗？有更好的建议吗？

可行，这个方法就不错。

SQL挑战赛第七期

数据

题目：

1.查找所有科目均及格（>=60分）的同学并列出其各科分数；

select b.id,
       Cscore,
		 Escore,
       math.score as Mscore 
		 from(
		 select chinese.id as id,
		 chinese.score as Cscore,
       english.score as Escore
       from chinese 
       left join english on chinese.id=english.id
		 )b
       left join math on b.id=math.id 
		 where Cscore>=60 and Escore>=60 and math.score>=60

2.查找所有同学的总分（没有分数的科目按照0分计算）和总分名次，并按照名次进行排序；

select b.id,
       Cscore +IFNULL(Escore,0)+IFNULL(math.score,0) as score
		 from(
		 select chinese.id as id,
	    chinese.score as Cscore,
       english.score as Escore
       from chinese 
       left join english on chinese.id=english.id
		 )b
       left join math on b.id=math.id 
		 union all 
       select id,score from english where id not in(select id from chinese)
      order by score

3.查找三个学科分数的平均数，中位数和众数：

select ma.标签,median_score,avg_score,众数 
from(select mmm.标签 as 标签,mmm.median_score as median_score,avg_score  from(SELECT '英语'标签,AVG(ee.score) as median_score
FROM (
SELECT e.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
  FROM english e, (SELECT @rownum:=0) r
  WHERE e.score is NOT NULL
  ORDER BY e.score
) as ee
WHERE ee.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
union
SELECT '数学'标签,AVG(mm.score) as median_score
FROM (
SELECT m.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
  FROM math m, (SELECT @rownum:=0) r
  WHERE m.score is NOT NULL
  ORDER BY m.score
) as mm
WHERE mm.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
union
SELECT '语文'标签,AVG(aa.score) as median_score
FROM (
SELECT a.score, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
  FROM chinese a, (SELECT @rownum:=0) r
  WHERE a.score is NOT NULL
  ORDER BY a.score
) as aa
WHERE aa.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) ))mmm
left join 
(select '语文'标签,avg(score) as avg_score from chinese
union
select '英语'标签,avg(score) as avg_score from english	 
union
select '数学'标签,avg(score) as avg_score from math)aaa on mmm.标签=aaa.标签)ma
left join
(select '语文'标签,众数 from( select c.score as 众数,count(c.score) as ccscore from chinese c group by c.score having ccscore >=
(select max(计分) from 
(select score,count(score) as 计分 from chinese c group by score)r))rr
union 
select '英语'标签,众数 from(select e.score as 众数,count(e.score) as eescore from english e 
group by e.score having eescore >=
(select max(计分) from 
(select score,count(score) as 计分 from english e group by score)r))rr
union 
select '数学'标签,众数 from(select m.score as 众数,count(m.score) as mmscore from math m 
group by m.score having mmscore >=
(select max(计分) from 
(select score,count(score) as 计分 from math m group by score)r))rr)rrr on ma.标签=rrr.标签

第一次写这么长的代码。不知道为什么会出现null，晚点再看看吧。

SQL挑战赛第八期

数据

select a.name as 一级地名,b.name as 二级地名,
c.name as 三级地名 from address a
left join address b on a.id=b.ParentId
left join address c  on b.id=c.ParentId
where c.name is not null

SQL挑战赛第九期

题目为：

1.查找每个订单id及其分布时间段。

（输出整数的时间，如19:27:23输出19:00:00,超过30min则输出下一时间段）

2.查找订单数量最多的时间段。（订单id需要排重）

3.求每个用户的订单间隔时间，以月为单位输出。

（注意，这里需要基于同一用户id才能进行比较，也要进行订单id的排重，第一次购买的订单id后输出null即可）

4.在3的基础上求平均间隔时间，null值不用进行计算。

数据

weixin_39761822

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
sql 中位数_【SQL】小点点SQL挑战赛第一至八期

知乎er最想娶的数据女神 @空白白白白组建的社区竞赛。为了防止被投诉打广告，就不列相关信息了。SQL挑战赛第一期题目：1: 编写一个查询，列出员工姓名列表，员工员工每月工资超过2000美元且员工人数少于10个月。通过提升employee_id对结果进行排序。select * from employee where salary > 2000 and mo...
复制链接

扫一扫