Task06:综合练习题-10道经典题目-天池龙珠计划SQL训练营 笔记

本笔记为阿里云天池龙珠计划SQL训练营的学习内容,链接为:https://tianchi.aliyun.com/specials/promotion/aicampsql

 

第一题答案

SELECT MarketData.*,                                                                                                                                                                                                                                                                                                                                                                                                                                        
       OperatingData.INDIC_NAME_EN,
       OperatingData.VALUE,
       IncomeStatement.T_REVENUE,
       IncomeStatement.T_COGS,
       IncomeStatement.N_INCOME
FROM(SELECT TICKER_SYMBOL,END_DATE,
            CLOSE_PRICE
     FROM `market data`
     WHERE TICKER_SYMBOL IN ('600383','600048')) MarketData
LEFT  JOIN
(SELECT TICKER_SYMBOL,
        INDIC_NAME_EN,
        END_DATE,
        VALUE
     FROM `company operating`
     WHERE  TICKER_SYMBOL in ('600383','600048')) OperatingData
      ON MarketData.TICKER_SYMBOL=OperatingData.TICKER_SYMBOL
      AND MarketData.END_DATE=OperatingData.END_DATE
LEFT JOIN  
(SELECT DISTINCT TICKER_SYMBOL,
             END_DATE,
             T_REVENUE,
             T_COGS,
             N_INCOME
      FROM `income statement`
      WHERE TICKER_SYMBOL IN ('600383' AND '600048')) IncomeStatement
      ON MarketData.TICKER_SYMBOL=IncomeStatement.TICKER_SYMBOL    
      AND MarketData.END_DATE=IncomeStatement.END_DATE 
      ORDER BY MarketData.TICKER_SYMBOL,MarketData.END_DATE

①LEFT JOIN 后面不用加FROM,加了会出问题

② SQL查询的执行顺序:

FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY

③DISTINCT 关键词用于返回唯一不同的值。
SQL SELECT DISTINCT 语法
SELECT DISTINCT column_name,column_name
FROM table_name;

 

第二题答案

SELECT pH,
       `citric_acid`
FROM `winequality-red`
DENSE_RANK() OVER (ORDER BY `citric_acid`)  AS rankn
WHERE pH =3.03;

 

第三题答案

-- 发放优惠券总额最多的商家
SELECT Merchant_id,
 -- SUM(SUBSTRING_INDEX(`Discount_rate`,':', 1)) AS sale_amount,
SUM(SUBSTRING_INDEX(`Discount_rate`,':',-1)) AS discount_amount
FROM ccf_offline_stage1_test_revised
WHERE Date_received BETWEEN '2016-07-01' AND '2016-07-31'
GROUP BY Merchant_id
ORDER BY discount_amount DESC
LIMIT 1;
-- 发放优惠券张数最多的商家
SELECT Merchant_id,COUNT(1) AS cnt
FROM ccf_offline_stage1_test_revised
WHERE Date_received BETWEEN '2016-07-01' AND '2016-07-31'
GROUP BY Merchant_id
ORDER BY cnt DESC
LIMIT 1;

 

 

①日期查询限定格式:

WHERE column(日期) BETWEEN 'xxxx-xx-xx' AND 'yyyy-yy-yy'

substring_index()函数(笔记在task03SUBSTRING_INDEX – 字符串按索引截取)

substring_index()函数用来截取字符串

    substring_index(str,delim,count)
    str:要处理的字符串
    delim:分隔符
    count:计数

 

该函数用来获取原始字符串按照分隔符分割后,第 n 个分隔符之前(或之后)的子字符串,支持正向和反向索引,索引起始值分别为 1 和 -1。

个人见解:

例如这个句子:
SELECT SUBSTRING_INDEX('www.mysql.com.qq', '.', -3)

可以理解为分隔符“.”从右往左数第三个“.”的右边所有部分:

也就是mysql.com.qq

 

SELECT SUBSTRING_INDEX('www.mysql.com.qq', '.', 3)

这个句子可以理解为分隔符“.”从左往右数第三个“.”的左边所有部分

也就是www.mysql.com

 

③LIMIT 1的作用:

在根据非索引字段查询时,若查询结果只有一条记录,可以在sql语句中使用  " limit 1",来有效缩短查询时间。

缩短查询时间的原因:加入Limit后,查询结果能够被一级缓存保存!从底层分析的话,加入Limit 1后,匹配到一条数据后,就不会再往下查询了,所以,性能的提升和数据量的大小有很大关系。
 

④Substring_INDEX在Substring后面记得加“_”。

 

 

第四题

--2015用电高峰月(本人看答案后更改)
SELECT PERIOD_DATE,MAX(DATA_VALUE) FinalValue
FROM `macro industry`
WHERE INDIC_ID='2020101522'
AND YEAR(PERIOD_DATE)=2015
GROUP BY PERIOD_DATE
ORDER BY FinalValue
LIMIT 1;  
--相比去年同期增长了多少个百分比
--本人做答(不会)
/*SELECT PERIOD_DATE,MAX(DATA_VALUE) FinalValue
FROM `macro industry`
WHERE INDIC_ID='2020101522'
AND YEAR(PERIOD_DATE)=2015
LEFT JOIN(
SELECT PERIOD_DATA,MAX(DATA_VALUE) FinalValue2
FROM `macro industry` AS P2
AND YEAR(PERIOD_DATE)=2014)
LEFT JOIN
(SELECT MAX(DATA_VALUE) FinalValue,
        MAX(DATA_VALUE) FinalValue2,
        ((MAX(DATA_VALUE) FinalValue-MAX(DATA_VALUE) FinalValue2)/
        MAX(DATA_VALUE) FinalValue2) AS Rate
GROUP BY PERIOD_DATE
ORDER BY FinalValue
LIMIT 1; */
--本人看答案后修改版本
SELECT Basedata.*,
(BaseData.Finalvalue-YOY.Finalvalue2)/YOY.Finalvalue2 AS YOY
FROM (SELECT PERIOD_DATE,
MAX(DATA_VALUE) AS Finalvalue
FROM `macro industry`
WHERE INDIC_ID='2020101522'
AND YEAR(PERIOD_DATE)=2015
GROUP BY PERIOD_DATE
ORDER BY Finalvalue DESC
LIMIT 1) Basedata
LEFT JOIN 
(SELECT PERIOD_DATE,MAX(DATA_VALUE) AS Finalvalue2
FROM `macro industry`
WHERE INDIC_ID='2020101522'
AND YEAR(PERIOD_DATE)=2014
GROUP BY PERIOD_DATE) AS YOY
ON YEAR(Basedata.PERIOD_DATE)=YEAR(YOY.PERIOD_DATE)+1
AND MONTH(Basedata.PERIOD_DATE)=MONTH(YOY.PERIOD_DATE);

 

 

①日期函数的年份单独查询方法:

YEAR(时间列名)=年份,或具体年月日:

YEAR(时间列名)=‘xxxx-xx-xx'

使用ORDER BY后默认的排序是升序,升序可能会将题目要求的数据颠倒,要注意这个细节,加上DESC

③AS是可以省略的

 

第六题

/*请使用 Wine Quality Data 数据集《winequality-white.csv》,找出 pH=3.63的
所有白葡萄酒,然后,对其 residual sugar 量进行英式排名(非连续的排名)*/
--本人做答
SELECT white.*
FROM `winequality-white` AS white
WHERE pH='3.63'
RANK() OVER (ORDER BY `residual sugar`) AS ranking

--标准答案
SELECT pH, 
 `residual sugar`,
 RANK() OVER (ORDER BY `residual sugar`) AS rankn
 FROM `winequality-white`
WHERE pH= 3.63;

 ①RANK() OVER (ORDER BY column名 ) AS rankn 为英式排名

 

 

 

第七题

/*请使用A股上市公司季度营收预测中的数据集《Market Data.xlsx》中的sheet-DATA,

计算截止到2018年底,市值最大的三个行业是哪些?以及这三个行业里市值
最大的三个公司是哪些?(每个行业找出前三大的公司,即一共要找出9个)*/
--本人作答(看答案后修改的版本)
-- 计算截止到2018年底,市值最高的三个行业业是哪些?
SELECT TYPE_NAME_CN,
       SUM(MARKET_VALUE) AS Sum_value
FROM `market data`
WHERE YEAR(END_DATE) = '2018-12-31'
GROUP BY TYPE_NAME_CN
ORDER BY Sum_value DESC
LIMIT 3;

--标准答案
-- 计算截止到2018年底,市值最高的三个行业业是哪些?
SELECT TYPE_NAME_CN,
 SUM(MARKET_VALUE)
 FROM `market data`
WHERE YEAR(END_DATE) = '2018-12-31'
GROUP BY TYPE_NAME_CN
ORDER BY SUM(MARKET_VALUE) DESC
LIMIT 3


--这三个行业里市值最大的三个公司是哪些?
--本人作答(不会做)
SELECT TICKER_SYMBOL,TYPE_NAME_CN,
       SUM(MARKET_VALUE) AS Maxium_in_3_type
FROM(
SELECT TYPE_NAME_CN,
       SUM(MARKET_VALUE) AS Sum_value
FROM `market data`
WHERE YEAR(END_DATE) = '2018-12-31'
GROUP BY TYPE_NAME_CN
ORDER BY Sum_value DESC
LIMIT 3)
GROUP BY TICKER_SYMBOL
ORDER BY Maxium_in_3_type DESC


--标准答案
--这三个行业里市值最大的三个公司是哪些?
SELECT BaseData.TYPE_NAME_CN,
 BaseData.TICKER_SYMBOL
 FROM (SELECT TYPE_NAME_CN,
 TICKER_SYMBOL,
 MARKET_VALUE,
 ROW_NUMBER() OVER(PARTITION BY TYPE_NAME_CN ORDER BY MARKET_VALUE) AS 
 CompanyRanking --运用窗口函数将每种行业中的市值进行排序
 FROM `market data` ) BaseData
 LEFT JOIN --左连结
 ( SELECT TYPE_NAME_CN,
 SUM(MARKET_VALUE)
 FROM `market data`
 WHERE YEAR(END_DATE) = '2018-12-31'
 GROUP BY TYPE_NAME_CN
 ORDER BY SUM(MARKET_VALUE) DESC
 LIMIT 3 ) top3Type --第一问中的答案,也就是三大市值总值最高的行业
 ON BaseData.TYPE_NAME_CN = top3Type.TYPE_NAME_CN
WHERE CompanyRanking <= 3  --限定每个行业中的不同市值的公司个数
 AND top3Type.TYPE_NAME_CN IS NOT NULL --行业名不能为空

 

 

 

 

①PARTITION BY的用法让同一商品种类中的物品能够优先比较某些方面的参数(ORDER BY),然后按照升序排列

PARTITION BY 能够设定窗口对象范围

②LIMIT x的作用:

在根据非索引字段查询时,若查询结果只有x条记录,可以在sql语句中使用  " limit x",来有效缩短查询时间。

缩短查询时间的原因:加入Limit后,查询结果能够被一级缓存保存!从底层分析的话,加入Limit 1后,匹配到一条数据后,就不会再往下查询了,所以,性能的提升和数据量的大小有很大关系。

③日期函数的年份单独查询方法:

YEAR(时间列名)=年份,或具体年月日:

YEAR(时间列名)=‘xxxx-xx-xx'

 

 

第八题

--标准答案
SELECT User_id,
 SUM(couponCount) couponCount
 FROM (SELECT User_id,
 count(*) AS couponCount
 FROM `ccf_online_stage1_train`
 WHERE (Date != 'null' AND Coupon_id != 'null')
 AND (LEFT(DATE,4)=2016 )
 GROUP BY User_id
 UNION ALL 
 SELECT User_id,
 COUNT(*) AS couponCount
 FROM `ccf_offline_stage1_train`
 WHERE (Date != 'null' AND Coupon_id != 'null')
 AND (LEFT(DATE,4)=2016 )
 GROUP BY User_id ) AS BaseData
GROUP BY User_id
ORDER BY SUM(couponCount) DESC
LIMIT 1

 

 

 

 

 

 

第九题】

不会做

本人根据答案作答:

SELECT *
FROM(
SELECT TICKER_SYMBOL,
SUM(VALUE) AS Amount,
YEAR(END_DATE) AS YEAR,
QUARTER(END_DATE) AS QUARTER
FROM `company operating`
WHERE INDIC_NAME_EN ='Baiyun Airport:Passenger throughput'
GROUP BY TICKER_SYMBOL,
         YEAR,
         QUARTER
ORDER BY Amount DESC) BASEDATA
LEFT JOIN
(SELECT TICKER_SYMBOL,
SUM(N_INCOME) AS Amount2,
YEAR(END_DATE) AS YEAR,
QUARTER(END_DATE) AS QUARTER
FROM `income statement`
GROUP BY TICKER_SYMBOL,
         YEAR,
         QUARTER) AS INCOME
ON BASEDATA.TICKER_SYMBOL=INCOME.TICKER_SYMBOL

 

 

标准答案:

SELECT *
 FROM (SELECT TICKER_SYMBOL,
 YEAR(END_DATE) AS Year,
 QUARTER(END_DATE) AS QUARTER,
 SUM(VALUE) AS Amount
 FROM `company operating`
 WHERE INDIC_NAME_EN = 'Baiyun Airport:Passenger throughput'
 GROUP BY TICKER_SYMBOL,
       YEAR(END_DATE),
       QUARTER(END_DATE)
 ORDER BY SUM(VALUE) DESC
 LIMIT 1 ) AS BaseData
 LEFT JOIN -- income statement
 (SELECT TICKER_SYMBOL,
 YEAR(END_DATE) AS Year,
 QUARTER(END_DATE) AS QUARTER,
 SUM(N_INCOME) Amount
 FROM `income statement`
 GROUP BY TICKER_SYMBOL,
          YEAR(END_DATE),
          QUARTER(END_DATE) ) AS Income
 ON BaseData.TICKER_SYMBOL = Income.TICKER_SYMBOL

 

 

 

第十题

--标准答案
SELECT Merchant_id,
 SUM(discount_amount) discount_amount
 FROM (SELECT Merchant_id,
 SUM(SUBSTRING_INDEX(`Discount_rate`,':',-1)) AS discount_amount
 FROM `ccf_online_stage1_train`
 WHERE (Date != 'null' AND Coupon_id != 'null')
 AND (LEFT(DATE,4)=2016 ) 
 AND MID(DATE,5,2) = '06'
 GROUP BY Merchant_id
 UNION ALL 
 SELECT Merchant_id,
 SUM(SUBSTRING_INDEX(`Discount_rate`,':',-1)) AS discount_amount
 FROM `ccf_offline_stage1_train`
 WHERE (Date != 'null' AND Coupon_id != 'null')
 AND (LEFT(DATE,4)=2016 )
 AND MID(DATE,5,2) = '06'
 GROUP BY Merchant_id ) BaseData
GROUP BY Merchant_id
ORDER BY SUM(discount_amount) DESC
LIMIT 1

 

--本人看答案后修正版本
SELECT Merchant_id,SUM(discount_amount) AS discount_amount
FROM(
SELECT Merchant_id,
SUM(SUBSTRING_INDEX(Discount_rate,':',-1)) AS discount_amount
FROM ccf_offline_stage1_train
WHERE (Date !='null' AND Coupon_id !='null')
AND(LEFT(Date,4)=2016 )
AND MID(Date,5,2)=06  
GROUP BY Merchant_id
UNION ALL
SELECT Merchant_id,
SUM(SUBSTRING_INDEX(Discount_rate,':',-1)) AS discount_amount
FROM ccf_online_stage1_train
WHERE (Date !='null' AND Coupon_id !='null')
AND(LEFT(Date,4)=2016 )
AND MID(Date,5,2)=06 
GROUP BY Merchant_id 

) AS BaseData
GROUP BY Merchant_id
ORDER BY SUM(discount_amount) DESC
LIMIT 1
 

 

①MID函数
此函数为截取字符串一部分。MID(column_name,start[,length])

参数    描述
column_name    必需。要提取字符的字段。
start    必需。规定开始位置(起始值是 1)。
length    可选。要返回的字符数。如果省略,则 MID() 函数返回剩余文本。
例:str=“123456” mid(str,2,1) 结果为2

②“union all”的意思为“联合所有”,是指对两个结果集进行并集操作,包括重复行,不进行排序。

③ORDER BY SUM(discount_amount) DESC
LIMIT 1

该句代码指令可以求出倒序的第一位的数据,也就是最高的那一位

(LIMIT 1)

 

 


引用部分:

CSDN博主「石头成说」
原文链接:https://blog.csdn.net/qq_27127145/article/details/83443306)

CSDN博主「惜木兮」

原文链接:https://blog.csdn.net/weixin_46023346/article/details/120817379

 

 

 

 

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值