练习题6:
数据来源:数据集-阿里云天池
请使用 Wine Quality Data 数据集《winequality-white.csv》,找出 pH=3.63的所有白葡萄酒,然后,对其 residual sugar 量进行英式排名(非连续的排名);
select *,
RANK() OVER (ORDER BY `RESIDUAL SUGAR`) RANKING
FROM `winequality-white` where `PH`=3.63;
练习题7:
数据来源:数据集-阿里云天池
请使用A股上市公司季度营收预测中的数据集《Market Data.xlsx》中的sheet-DATA,
计算截止到2018年底,市值最大的三个行业是哪些?以及这三个行业里市值最大的三个公司是哪些?(每个行业找出前三大的公司,即一共要找出9个)
#市值最大的三个行业
select type_name_cn ,sum( `market_value`) from `market data` where end_date <='2018-12-31'
group by type_name_cn
order by sum( `market_value`) desc
LIMIT 3
#每个行业找出前三大的公司
select a.type_name_cn ,a.ticker_symbol from (select type_name_cn,ticker_symbol,market_value,
ROW_NUMBER() over (PARTITION by type_name_cn order by market_value desc)as rankingn
from `market data`
)a
LEFT JOIN
(select type_name_cn ,sum( `market_value`) from `market data` where end_date <='2018-12-31'
group by type_name_cn
order by sum( `market_value`) desc
LIMIT 3) b
on a.type_name_cn=b.type_name_cn
where a.rankingn<=3
and b.type_name_cn is not null
练习题8:
数据来源:天池新人实战赛o2o优惠券使用预测赛题与数据-天池大赛-阿里云天池
使用Coupon Usage Data for O2O中的数据集《ccf_online_stage1_train.csv》和《ccf_offline_stage1_train.csv》,试找出在2016年6月期间,线上线下累计优惠券使用次数最多的顾客。
select a.`User_id`,
a.`Count_online`,
c.`Count_offline`,
(a.`Count_online`+ c.`Count_offline`) sumcount
from(
select `User_id`, COUNT(*) as Count_online
FROM `ccf_online_stage1_train`
where `Date` IS NOT NULL
and `Coupon_id` IS NOT NULL
group by `User_id`) a
inner join(
select `User_id`, COUNT(*) as Count_offline
FROM `ccf_offline_stage1_train`
where `Date` IS NOT NULL
and `Coupon_id` IS NOT NULL
group by `User_id`) c on a.`User_id`= c.`User_id` order by (a.`Count_online`+ c.`Count_offline`) desc LIMIT 1
练习题9:
数据来源:数据集-阿里云天池
请使用A股上市公司季度营收预测数据集《Income Statement.xls》中的sheet-General Business和《Company Operating.xlsx》中的sheet-EN。
找出在数据集所有年份中,按季度统计,白云机场旅客吞吐量最高的那一季度对应的净利润是多少?(注意,是单季度对应的净利润,非累计净利润。)
SELECT *
FROM(
select `ticker_symbol`, YEAR(`END_DATE`) YEAR, QUARTER(`END_DATE`) QUARTER, SUM(`VALUE`) VALUES_DATE
FROM `company operating`
where `INDIC_NAME_EN` LIKE 'Baiyun Airport%'
GROUP BY `TICKER_SYMBOL`, YEAR(`END_DATE`), QUARTER(`END_DATE`)
ORDER BY SUM(`VALUE`) DESC
LIMIT 1) AS C
LEFT JOIN(
SELECT `ticker_symbol`, YEAR(`END_DATE`) YEAR, QUARTER(`END_DATE`) QUARTER, SUM(`N_INCOME`) VALUES_DATE
FROM `income statement`
GROUP BY `TICKER_SYMBOL`, YEAR(`END_DATE`), QUARTER(`END_DATE`))
AS D ON C.ticker_symbol= D.ticker_symbol
AND C.YEAR= D.YEAR
AND C.QUARTER= D.QUARTER
练习题10:
数据来源:天池新人实战赛o2o优惠券使用预测赛题与数据-天池大赛-阿里云天池
使用Coupon Usage Data for O2O中的数据集《ccf_online_stage1_train.csv》和《ccf_offline_stage1_train.csv》,试找出在2016年6月期间,线上线下累计被使用优惠券满减最多的前3名商家。
比如商家A,消费者A在其中使用了一张200减50的,消费者B使用了一张30减1的,那么商家A累计被使用优惠券满减51元。
select A.`Merchant_id` ID,
SUM(A.DISCOUNT) as discountsum
from(
select `Merchant_id`, substring_index(`discount_rate`, ':', -1) DISCOUNT
from `ccf_online_stage1_train`
where `Date` like '%2016-06%'
and Date IS not NULL
and Coupon_id IS not NULL
and `Discount_rate`!= 'fixed'
union all
select `Merchant_id`, substring_index(`discount_rate`, ':', -1) DISCOUNT
from `ccf_offline_stage1_train`
where `Date` like '%2016-06%'
and Date IS not NULL
and Coupon_id IS not NULL
and `Discount_rate`!= 'fixed') AS A
group by A.`Merchant_id`
order by SUM(A.Discount) DESC
limit 3