目录
基本语法
select
from
join/left join/right join
on
where
group by
having
order by asc/desc
limit a,b(截断a行,输出b行)
常见问题
1 各种连接的区别?
左连接(A left join B on…):两表关联,左表全部保留,右表关联不上用null表示。
右连接(A right join B on…):两表关联,右表全部保留,左表关联不上的用null表示。
内连接(A join B on…):两表关联,保留两表中交集的记录。
笛卡尔积(A,B / A join B):两表关联,把左表的列和右表的列通过笛卡尔积的形式表达出来。
2 查询顺序
from -> join -> on -> where -> group by(此处开始可以用别名) -> 聚合函数如AVG,SUM -> having -> select -> distinct -> order by -> limit
3 如何实现对一个字段升序,另一个字段降序?
字段1 asc,字段2 desc
4 SQL优化
详情见此
好题汇总
leetcode(按出现频率排列)
题目名字改成更直观的内容。
569. 员工薪水中位数
题目详情
此题中的中位数,指当行数n为偶的时候,同时输出n/2和n/2 + 1行,而不是求这两行的平均。
select id,company, salary
from
(select id,company,salary,row_number() over(partition by company order by salary) as rn,
count(id) over (partition by company) as cnt
from employee)t
where rn >= cnt/2 and rn <= cnt/2 +1
技巧点:
1.巧妙地将窗口函数和聚合函数结合起来。
2.求中位数的通用技巧 rn >= cnt/2 and rn <= cnt/2 +1。
185. 部门工资前三高的所有员工
select d.name as department , t.name as employee,t.salary
from
(select name,salary,departmentid,dense_rank() over(partition by departmentid order by salary desc) as rn
from employee) t join department d
on d.id = t.departmentid
where rn <= 3
技巧点:
1.利用dense_rank窗口函数。
262. 每天的取消率
解法1:
select request_at as Day,
round(sum(case when status <> 'completed' then 1 else 0 end )/count(*),2) as 'Cancellation Rate'
from trips
where request_at between '2013-10-01' and '2013-10-03' and
driver_id in
(select users_id from users where banned = 'No' and Role = 'driver') and
client_id in
(select users_id from users where banned = 'No' and Role = 'client')
group by 1
order by 1
解题思路:
1.通过where函数筛选出时间和用户符合的记录
2.利用sum函数和case when函数的组合计算取消用户数
解法2:
SELECT
request_at as 'Day', round(avg(Status!='completed'), 2) as 'Cancellation Rate'
FROM
trips t JOIN users u1 ON (t.client_id = u1.users_id AND u1.banned = 'No')
JOIN users u2 ON (t.driver_id = u2.users_id AND u2.banned = 'No')
WHERE
request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY
request_at
解题思路:在筛选符合条件的用户上,此解法用了两次内连接,效率会比解法1的in好。在计算取消率上,此解法直接用avg函数,很巧妙。
1097. 每天的新用户数和1日留存率
select t1.event_date as install_dt,count(t1.player_id) as installs,
round(count(t2.player_id)/count(t1.player_id),2) as Day1_retention
from
(select player_id,event_date,row_number()over(partition by player_id order by event_date) as rn
from activity) t1 left join activity t2
on t1.player_id = t2.player_id and datediff(t2.event_date,t1.event_date) = 1
where t1.rn = 1
group by 1
解题思路:
1.构造一张新表,记录每天的新用户id
2.与原表格左连接,连接方式为用户id相等和登录日期差1天。
579. 员工的累计薪水
select id,month,salary
from
(select id,month,sum(salary) over(partition by id order by month rows 2 preceding) as salary,
row_number() over (partition by id order by month desc) as rn
from employee)t
where rn <> 1
order by id,month desc
技巧点:
1.窗口函数和sum函数结合
2. 聚合函数中的rows 2 preceding语法。
180. 连续出现的数字
解法1:
select distinct num as ConsecutiveNums
from
(select num, cast(id as signed) - cast(rn as signed)
from
(select id,num, row_number()over(partition by num) as rn
from logs) t
group by 1,2
having count(*) >=3)t
技巧点:
1.连续问题一般可以转换为id - rn 问题
2.id - rn 为负数时,mysql会报错。因此要用cast函数将数据类型转换为有符号整数类型。
解法2:
select distinct t1.num as ConsecutiveNums
from logs t1,logs t2,logs t3
where t1.id = t2.id +1 and t2.id = t3.id + 1
and t1.num = t2.num and t2.num = t3.num
解题思路:直接三表连接。
1205. 每月交易
select month,country,
sum(case when state = 'approved' then 1 else 0 end) as approved_count,
sum(case when state = 'approved' then amount else 0 end) as approved_amount,
sum(case when state = 'chargeback' then 1 else 0 end) as chargeback_count,
sum(case when state = 'chargeback' then amount else 0 end) as chargeback_amount
from
(select id,country,state,amount,date_format(trans_date,'%Y-%m') as month
from transactions
where state = 'approved'
union all
select trans_id as id,country, 'chargeback' as state,amount,
date_format(c.trans_date,'%Y-%m') as month
from chargebacks c left join transactions t
on c.trans_id = t.id) t1
group by 1,2
解题思路:这是一道比较难理解题意的题目。
根据题意构造这样的新表:表格含批准状态交易的所有信息和退款状态交易的所有信息,利用union all 函数连接两类信息,state字段是用来区分两种状态的,相当于打了一个tag。
技巧点:
1.date_format(date,format)
2. sum函数和case语句结合,需要熟练掌握
550. 1日留存
select round(count(distinct a.player_id)/count(t.player_id),2) as fraction
from
(select player_id,min(event_date) as event_date
from activity
group by 1)t left join activity a
on t.player_id = a.player_id and datediff(a.event_date,t.event_date) = 1
解题思路:1097题的简单版,依旧构造一张新用户表,与原表左连接,连接方式为用户id相等,登陆日期差一天。
1454. 活跃用户
select distinct t.id,a.name
from
(select id, date_sub(login_date,interval rn day ) as dif
from
(select id,login_date, row_number()over(partition by id order by login_date) as rn
from
(select id,login_date
from logins
group by 1,2)t)t
group by 1,2
having count(*)>=5)t join accounts a
on t.id = a.id
order by 1
解题思路:一共用了四个select。
1.第一个select目的是去重,解决用户一天登录多次的情况
2.第二第三个select是连续型问题的老操作了,利用窗口函数构造新列rn,转换为日期-rn问题
3.最后一个select是id去重,并与表accounts相连接得到name的信息
571. 给定数字的频率查询中位数
select round(avg(number),1) as median
from
(select number,
sum(frequency) over(order by number) as asc_number,
sum(frequency) over (order by number desc) as desc_number
from numbers)t1,
(select sum(frequency) as cnt from numbers) t2
where asc_number >= cnt/2 and desc_number >= cnt/2
技巧点:
1.利用正序累积值和倒序累积值计算中位数asc_number >= cnt/2 and desc_number >= cnt/2,很巧妙!
601. 体育馆的人流量
题目详情
与之前连续题不同的是,它要输出满足连续条件的所有记录。
解法1:
select id,visit_date,people
from
(select id,visit_date,people,count(id) over(partition by diff) as cnt
from
(select id,visit_date,people, cast(id as signed) - cast(rn as signed) as diff
from
(select id,visit_date,people,row_number()over(order by id) as rn
from
(select *
from stadium
where people >= 100)t)t)t)t
where cnt >= 3
order by 1
解题思路:一共5个select。
1.最里面3个select是所有连续题一样的套路,计算id-rn
2.第4个select是利用窗口函数和count函数结合又生成了新的列,用以计算id-rn相同行的行数cnt
3.最后一个select是选择行数>=3的行
解法2:
select distinct t1.*
from stadium t1,stadium t2,stadium t3
where t1.people>=100 and t2.people>=100 and t3.people>=100
and ((t1.id-t2.id = 1 and t2.id-t3.id = 1) or (t2.id-t1.id = 1 and t3.id -t2.id = 1)
or ( t2.id -t1.id = 1 and t1.id -t3.id = 1))
order by t1.id
解题思路:三表连接。由于最后选择t1的所有列,所以t1分别作连续三行的第一行,第二行,第三行。与解法1相比,解法2简洁不少。
1127. 用户购买平台
select t1.spend_date,t1.platform,ifnull(total_amount,0) as total_amount,
ifnull(total_users,0) as total_users
from
(select distinct spend_date,t.platform
from spending,
(select 'desktop' as platform union select 'mobile' as platform union
select 'both' as platform )t) t1 left join
(select spend_date, case when tag1 = 1 and tag2 = 0 then 'mobile' when
tag1 = 0 and tag2 = 1 then 'desktop' else 'both' end as platform,
sum(amount) as total_amount,count(user_id) as total_users
from
(select spend_date,user_id,
sum(case when platform = 'mobile' then 1 else 0 end) as tag1,
sum(case when platform = 'desktop' then 1 else 0 end) as tag2,
sum(amount) as amount
from spending
group by 1,2)t
group by 1,2) t2
on t1.platform = t2.platform and t1.spend_date = t2.spend_date
解题思路:t2表格是主要部分。
1.通过sum函数和case when函数打了两个标签tag1,tag2,分别表示是否用mobile/desktop登陆过。之后根据标签构造t2表格
2.构造t1表格左连接t2表格,是保证没有数据的记录也有输出
1501. 可以放心投资的国家
select c1.name as country
from person p,country c1, calls c2
where left(phone_number,3) =
country_code and (id = caller_id or id = callee_id)
group by 1
having avg(duration)>
(select avg(duration) from calls)
技巧点:
1.left函数截取部分字符串
2. id = caller_id or id = callee_id
1212. 查询球队积分
select t.team_id,t.team_name,ifnull(sum(point),0) as num_points
from
(select host_team as team_id,case when
host_goals = guest_goals then 1 when host_goals > guest_goals then 3 else 0 end as point
from matches
union all
select guest_team as team_id,case when
host_goals = guest_goals then 1 when host_goals > guest_goals then 0 else 3 end as point
from matches) k right join teams t
on k.team_id = t.team_id
group by 1
order by 3 desc, 1
技巧点:
1.将host和guest化列为行