10 SQL进阶 -- 综合练习题 -- 10道经典SQL题目,配套数据与解答

在这里插入图片描述

1. 创建表结构和导入数据

1.1 新建数据库

在这里插入图片描述

1.2 执行建表语句

  • 点击下方链接直接下载

创建数据表脚本:http://tianchi-media.oss-cn-beijing.aliyuncs.com/dragonball/SQL/create_table.sql

  • 执行建表语句

在这里插入图片描述

  • 执行成功

在这里插入图片描述

  • 查看创建的表

在这里插入图片描述

1.3 导入数据

  • 点击下方链接直接下载

插入数据脚本:https://tianchi-media.oss-cn-beijing.aliyuncs.com/dragonball/SQL/data.zip

大家下载好脚本后,先在MySQL环境中运行create_table.sql脚本,创建数据表,然后解压下载好的data.zip,解压后目录如下:

在这里插入图片描述

脚本文件名前面的序号表示用到该数据集的题目序号,例如1-7market data.sql表示第1题和第7题用到了该数据集。

同样的,这里给大家的也是sql脚本,里面是插入数据的语句,大家只需打开后在MySQL环境中运行即可将数据导入到数据表中。

  • 导入 1-7market data.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 1-9company operating.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 1-9income statement.sql 脚本

在这里插入图片描述
在这里插入图片描述

  • 导入 2-winequality-red.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 3-ccf_offline_stage1_test_revised.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 4-macro industry.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 5-8-10ccf_online_stage1_train.sql 脚本,因为数据量比较大,耗时稍长一些,请耐心等待

在这里插入图片描述

在这里插入图片描述

  • 导入 6-winequality-white.sql 脚本

在这里插入图片描述

在这里插入图片描述

  • 导入 8-10ccf_offline_stage1_train.sql 脚本

在这里插入图片描述

在这里插入图片描述

2. 练习题

2.1 练习题1

以Market Data为主表,将三张表中的TICKER_SYMBOL为600383和600048的信息合并在一起。只需要显示以下字段。

在这里插入图片描述

  • 执行查询语句:

SELECT i.TICKER_SYMBOL AS I_TICKER_SYMBOL, i.END_DATE AS I_END_DATA, i.T_REVENUE AS I_T_REVENUE, i.T_COGS AS I_T_COGS, i.N_INCOME AS I_N_INCOME, m.TICKER_SYMBOL AS M_TICKER_SYMBOL, m.END_DATE AS M_END_DATE, m.CLOSE_PRICE AS M_CLOSE_PRICE, c.TICKER_SYMBOL AS C_TICKER_SYMBOL, c.INDIC_NAME_EN AS C_INDIC_NAME_EN, c.END_DATE AS C_END_DATE, c.VALUE AS C_VALUE
FROM
(SELECT TICKER_SYMBOL, END_DATE, CLOSE_PRICE
FROM market data
WHERE TICKER_SYMBOL IN (‘600383’, ‘600048’)) m
LEFT JOIN income statement i ON m.TICKER_SYMBOL=i.TICKER_SYMBOL
LEFT JOIN company operating c ON m.TICKER_SYMBOL=c.TICKER_SYMBOL;

在这里插入图片描述

注:大家可以看到查询的速度比较慢,主要原因是关联的 TICKER_SYMBOL 字段没有加索引。

  • 对关联字段 TICKER_SYMBOL 增加索引 SQL 脚本
    ALTER TABLE market data ADD INDEX INDEX_M_TICKER_SYMBOL (TICKER_SYMBOL);
    ALTER TABLE income statement ADD INDEX INDEX_I_TICKER_SYMBOL (TICKER_SYMBOL);
    ALTER TABLE company operating ADD INDEX INDEX_C_TICKER_SYMBOL (TICKER_SYMBOL);

在这里插入图片描述

重新执行查询语句:

在这里插入图片描述

查询速度有了明显的提升。

2.2 练习题2

请使用 winequality-red 表,找出 pH=3.03的所有红葡萄酒,然后,对其 citric acid 进行中式排名(相同排名的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”)

SELECT *, DENSE_RANK() OVER (ORDER BY citric acid) AS dense_ranking
FROM winequality-red
WHERE pH=3.03;

在这里插入图片描述

2.3 练习题3

请使用 ccf_offline_stage1_test_revised 表,试分别找出在2016年7月期间,发放优惠券总金额最多和发放优惠券张数最多的商家。这里只考虑满减的金额,不考虑打几折的优惠券。

在这里插入图片描述

  • 发放优惠券总金额最多的商家

SELECT m.*
FROM
(SELECT Merchant_id, SUM(SUBSTRING_INDEX(Discount_rate, ‘:’, 1)) as Coupon_amount
FROM ccf_offline_stage1_test_revised
WHERE DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-07%’ AND Discount_rate LIKE ‘%:%’
GROUP BY Merchant_id) m
ORDER BY Coupon_amount DESC
LIMIT 1;

在这里插入图片描述

  • 发放优惠券张数最多的商家

SELECT m.*
FROM
(SELECT Merchant_id, COUNT(Coupon_id) AS Coupon_Num
FROM ccf_offline_stage1_test_revised
WHERE DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-07%’ AND Discount_rate LIKE ‘%:%’
GROUP BY Merchant_id) m
ORDER BY Coupon_Num DESC
LIMIT 1;

在这里插入图片描述

2.4 练习题4

请使用 macro industry 表中的 PERIOD_DATE,请计算全社会用电量:第一产业:当月值在2015年用电最高峰是发生在哪月?并且相比去年同期增长/减少了多少个百分比?

  • 2015年用电最高峰的月份

SELECT m.peak_month, SUM(m.DATA_VALUE) as DATA_VALUE
FROM
(SELECT DATA_VALUE, DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) AS
peak_month
FROM macro industry
WHERE DATE_FORMAT(PERIOD_DATE, ‘%Y’) LIKE ‘2015%’) m
GROUP BY m.peak_month

在这里插入图片描述

  • 2014年12月的用电量

SELECT m.peak_month, SUM(m.DATA_VALUE) as DATA_VALUE
FROM
(SELECT DATA_VALUE, DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) AS
peak_month
FROM macro industry
WHERE DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) LIKE ‘2014-12%’) m
GROUP BY m.peak_month;

在这里插入图片描述

SELECT CONCAT(FORMAT(((4268310.33806 - 3594152.12446) / 4268310.33806) * 100, 2), ‘%’) AS increase_rate;

SELECT CONCAT(FORMAT((((SELECT m1.DATA_VALUE
FROM
(SELECT m.peak_month, SUM(m.DATA_VALUE) as DATA_VALUE
FROM
(SELECT DATA_VALUE, DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) AS
peak_month
FROM macro industry
WHERE DATE_FORMAT(PERIOD_DATE, ‘%Y’) LIKE ‘2015%’) m
GROUP BY m.peak_month) m1
ORDER BY m1.DATA_VALUE DESC
LIMIT 1) - (SELECT SUM(m.DATA_VALUE) as DATA_VALUE
FROM
(SELECT DATA_VALUE, DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) AS
peak_month
FROM macro industry
WHERE DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) LIKE ‘2014-12%’) m
GROUP BY m.peak_month)) / (SELECT m1.DATA_VALUE
FROM
(SELECT m.peak_month, SUM(m.DATA_VALUE) as DATA_VALUE
FROM
(SELECT DATA_VALUE, DATE_FORMAT(PERIOD_DATE, ‘%Y-%m’) AS
peak_month
FROM macro industry
WHERE DATE_FORMAT(PERIOD_DATE, ‘%Y’) LIKE ‘2015%’) m
GROUP BY m.peak_month) m1
ORDER BY m1.DATA_VALUE DESC
LIMIT 1)) * 100, 2), ‘%’) AS increase_rate;
在这里插入图片描述

2.5 练习题5

请使用 ccf_online_stage1_train 表,试统计在2016年6月期间,线上总体优惠券弃用率为多少?并找出优惠券弃用率最高的商家。弃用率 = 被领券但未使用的优惠券张数 / 总的被领取优惠券张数。

在这里插入图片描述

  • 被领券但未使用的优惠券张数

SELECT COUNT(Coupon_id) AS Coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NULL AND Coupon_id IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’;

在这里插入图片描述

  • 总的被领用的优惠券张数

SELECT (SELECT COUNT(Coupon_id) AS Coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NULL AND Coupon_id IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’) + (SELECT COUNT(Coupon_id) AS Coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’) AS sum_coupon_num;

在这里插入图片描述

  • 弃用率

SELECT CONCAT(FORMAT((79053/90406) * 100, 2), ‘%’) give_up_rate;

SELECT CONCAT(FORMAT(((SELECT COUNT(Coupon_id) AS coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NULL AND Coupon_id IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’)/(SELECT (SELECT COUNT(Coupon_id) AS Coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NULL AND Coupon_id IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’) + (SELECT COUNT(Coupon_id) AS Coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’) AS sum_coupon_num)) * 100, 2), ‘%’) give_up_rate;

在这里插入图片描述

  • 优惠券弃用率最高的商家

SELECT m4.*
FROM

(SELECT m3.Merchant_id, m3.unused_coupon_num, m3.used_coupon_num, m3.sum_coupon_num, FORMAT((m3.unused_coupon_num/m3.sum_coupon_num) * 100, 2) AS give_up_rate
FROM

(SELECT m1.Merchant_id, m1.unused_coupon_num, m2.used_coupon_num, (m1.unused_coupon_num + m2.used_coupon_num) AS sum_coupon_num
FROM

(SELECT Merchant_id, COUNT(Coupon_id) AS unused_coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NULL AND Coupon_id IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’
GROUP BY Merchant_id) m1

INNER JOIN

(SELECT Merchant_id, COUNT(Coupon_id) AS used_coupon_num
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’
GROUP BY Merchant_id) m2

ON m1.Merchant_id = m2.Merchant_id) m3

) m4

ORDER BY m4.give_up_rate DESC
LIMIT 1;

在这里插入图片描述

2.6 练习题6

请使用 winequality-white 表,找出 pH=3.63的所有白葡萄酒,然后,对其 residual sugar 量进行英式排名(非连续的排名)

– 对其 residual sugar 量进行英式排名(非连续的排名)

SELECT *, RANK() OVER (ORDER BY residual sugar) AS ranking
FROM winequality-white
WHERE pH=3.63;

在这里插入图片描述

2.7 练习题7

请使用 Market Data 表,计算截止到2018年底,市值最大的三个行业是哪些?以及这三个行业里市值最大的三个公司是哪些?(每个行业找出前三大的公司,即一共要找出9个)

SELECT m.TYPE_ID, m.TYPE_NAME_CN, MAX(m.MARKET_VALUE) AS MARKET_VALUE
FROM
(SELECT TYPE_ID, TYPE_NAME_CN, CAST(MARKET_VALUE AS SIGNED) AS MARKET_VALUE, END_DATE
FROM market data
WHERE DATE_FORMAT(END_DATE, ‘%Y-%m-%d’) = ‘2018-05-31’) m
GROUP BY m.TYPE_ID, m.TYPE_NAME_CN
LIMIT 3;

在这里插入图片描述

(SELECT m.SECURITY_ID, m.TYPE_ID, m.TYPE_NAME_CN, m.MARKET_VALUE
FROM
(SELECT SECURITY_ID, TYPE_ID, TYPE_NAME_CN, CAST(MARKET_VALUE AS SIGNED) AS MARKET_VALUE, END_DATE
FROM market data
WHERE TYPE_ID=‘010303210101’ AND DATE_FORMAT(END_DATE, ‘%Y-%m-%d’) = ‘2018-05-31’) m
ORDER BY m.MARKET_VALUE DESC
LIMIT 3)

UNION ALL

(SELECT m.SECURITY_ID, m.TYPE_ID, m.TYPE_NAME_CN, m.MARKET_VALUE
FROM
(SELECT SECURITY_ID, TYPE_ID, TYPE_NAME_CN, CAST(MARKET_VALUE AS SIGNED) AS MARKET_VALUE, END_DATE
FROM market data
WHERE TYPE_ID=‘010303200101’ AND DATE_FORMAT(END_DATE, ‘%Y-%m-%d’) = ‘2018-05-31’) m
ORDER BY m.MARKET_VALUE DESC
LIMIT 3)

UNION ALL

(SELECT m.SECURITY_ID, m.TYPE_ID, m.TYPE_NAME_CN, m.MARKET_VALUE
FROM
(SELECT SECURITY_ID, TYPE_ID, TYPE_NAME_CN, CAST(MARKET_VALUE AS SIGNED) AS MARKET_VALUE, END_DATE
FROM market data
WHERE TYPE_ID=‘010303170301’ AND DATE_FORMAT(END_DATE, ‘%Y-%m-%d’) = ‘2018-05-31’) m
ORDER BY m.MARKET_VALUE DESC
LIMIT 3);

在这里插入图片描述

2.8 练习题8

请使用 ccf_online_stage1_train 和 ccf_offline_stage1_train 表,试找出在2016年6月期间,线上线下累计优惠券使用次数最多的顾客。

SELECT m.User_id, m.Coupon_num
FROM
(SELECT c.User_id, COUNT(c.Coupon_id) as Coupon_num
FROM
((SELECT User_id, Coupon_id
FROM ccf_offline_stage1_train
WHERE Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’)
UNION ALL
(SELECT User_id, Coupon_id
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date_received, ‘%Y-%m’) LIKE ‘2016-06%’)) c
GROUP BY c.User_id) m
ORDER BY m.Coupon_num DESC
LIMIT 1;

在这里插入图片描述

2.9 练习题9

请使用 income statement 表中的 sheet-General Business 和 company operating 表中的sheet-EN 找出在数据集所有年份中,按季度统计,白云机场旅客吞吐量最高的那一季度对应的净利润是多少?(注意,是单季度对应的净利润,非累计净利润。)

SELECT *
FROM (SELECT TICKER_SYMBOL,
YEAR(END_DATE) Year,
QUARTER(END_DATE) QUARTER,
SUM(VALUE) Amount
FROM company operating
WHERE INDIC_NAME_EN = ‘Baiyun Airport:Passenger throughput’
GROUP BY TICKER_SYMBOL,YEAR(END_DATE),QUARTER(END_DATE)
ORDER BY SUM(VALUE) DESC
LIMIT 1 ) BaseData
LEFT JOIN – income statement
(SELECT TICKER_SYMBOL,
YEAR(END_DATE) Year,
QUARTER(END_DATE) QUARTER,
SUM(N_INCOME) Amount
FROM income statement
GROUP BY TICKER_SYMBOL,YEAR(END_DATE),QUARTER(END_DATE) ) Income
ON BaseData.TICKER_SYMBOL = Income.TICKER_SYMBOL
AND BaseData.Year = Income.Year
AND BaseData.QUARTER = Income.QUARTER;

在这里插入图片描述

2.10 练习题10

请使用 ccf_online_stage1_train 和 ccf_offline_stage1_train 表,试找出在2016年6月期间,线上线下累计被使用优惠券满减最多的前3名商家。比如商家A,消费者A在其中使用了一张200减50的,消费者B使用了一张30减1的,那么商家A累计被使用优惠券满减51元。

SELECT s.Merchant_id, s.Coupon_sum
FROM
(SELECT m.Merchant_id, SUM(m.sum_) AS Coupon_sum
FROM
(SELECT Merchant_id,
SUM(SUBSTRING_INDEX(Discount_rate, ‘:’, -1))
OVER (PARTITION BY Merchant_id) AS sum_
FROM ccf_online_stage1_train
WHERE Coupon_id != ‘fixed’ AND Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date, ‘%Y-%m’) LIKE ‘2016-06%’
UNION ALL
SELECT Merchant_id,
SUM(SUBSTRING_INDEX(Discount_rate, ‘:’, -1))
OVER (PARTITION BY Merchant_id) AS sum_
FROM ccf_offline_stage1_train
WHERE Date IS NOT NULL AND Date_received IS NOT NULL AND DATE_FORMAT(Date, ‘%Y-%m’) LIKE ‘2016-06%’) m
GROUP BY m.Merchant_id) s
ORDER BY s.Coupon_sum DESC
LIMIT 3;

在这里插入图片描述

  • 33
    点赞
  • 18
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

腾飞开源

你的鼓励将是我创作的最大动力!

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值