MySQL Interview Question: Average Travel Distance Between Two Places Across Multiple Trips

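  The problem: given distance records for multiple trips from a starting point to a destination, write a SQL query that returns the average travel distance between each pair of places. Direction does not matter: Chengdu to Chongqing and Chongqing to Chengdu both count as trips between the same two places and must be merged in the calculation. The data is as follows.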
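  To make the target concrete, here is a minimal sketch in plain Python of the result the SQL should produce. The rows are hypothetical sample values, not the article's actual data, and `average_by_pair` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical sample trips: (start, end, distance). Not the article's real data.
trips = [
    ("Chengdu", "Chongqing", 300),
    ("Chongqing", "Chengdu", 320),
    ("Chengdu", "Chongqing", 310),
    ("Beijing", "Tianjin", 120),
    ("Tianjin", "Beijing", 130),
]


def average_by_pair(rows):
    """Return the average distance per unordered pair of places."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [sum of distances, trip count]
    for start, end, distance in rows:
        pair = tuple(sorted((start, end)))  # same key for both directions
        totals[pair][0] += distance
        totals[pair][1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


expected = average_by_pair(trips)
for pair, avg in sorted(expected.items()):
    print(pair, avg)
```

  With these sample rows, the three Chengdu/Chongqing trips average to 310 and the two Beijing/Tianjin trips to 125, regardless of direction.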

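  Two approaches follow. The first aggregates by start and destination to get the total distance and trip count for each direction, numbers the aggregated rows with a simple window function, then self-joins the aggregated result and applies a simple condition to get the result we want.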

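  The first step, the aggregation, is straightforward: a GROUP BY does it. We also add a ROW_NUMBER function to give each row an id for later use. The SQL statement and its result are as follows.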

SELECT
  start, end,
  SUM(distance) AS tot_distance,
  COUNT(*) AS no_of_time,
  ROW_NUMBER() OVER (ORDER BY start) AS id
FROM
  `distance`
GROUP BY
  start, end

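  Next, self-join that result: use it twice and join the first copy's start column to the second copy's end column, which lines up the two directions between the same two places. Each pair would then appear twice, so one more condition is needed: the first copy's id must be less than the second copy's id. The SQL statement and its result are as follows.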

WITH cte AS (
  SELECT
    start, end,
    SUM(distance) AS tot_distance,
    COUNT(*) AS no_of_time,
    ROW_NUMBER() OVER (ORDER BY start) AS id
  FROM
    `distance`
  GROUP BY
    start, end
)
SELECT
  t1.start, t1.end,
  t1.tot_distance, t2.tot_distance,
  t1.no_of_time, t2.no_of_time
FROM
  cte AS t1
JOIN cte AS t2 ON t1.start = t2.end AND t1.id < t2.id

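  That is nearly the whole solution: add up the two total distances and the two trip counts, then divide the former by the latter. The final result is as follows.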
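  The final query for this step is not shown in the text, so here is a hedged sketch of what it might look like. It runs against an in-memory SQLite database from Python rather than MySQL (window functions need SQLite 3.25 or later); the columns are renamed to the hypothetical `start_city` and `end_city` to sidestep keyword issues, and the rows are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distance (start_city TEXT, end_city TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO distance VALUES (?, ?, ?)",
    [  # hypothetical sample rows
        ("Chengdu", "Chongqing", 300),
        ("Chongqing", "Chengdu", 320),
        ("Chengdu", "Chongqing", 310),
        ("Beijing", "Tianjin", 120),
        ("Tianjin", "Beijing", 130),
    ],
)

QUERY = """
WITH cte AS (
    SELECT start_city, end_city,
           SUM(distance) AS tot_distance,
           COUNT(*)      AS no_of_time,
           ROW_NUMBER() OVER (ORDER BY start_city) AS id
    FROM distance
    GROUP BY start_city, end_city
)
SELECT t1.start_city, t1.end_city,
       -- combine both directions, then divide total distance by total trips
       (t1.tot_distance + t2.tot_distance)
           / (t1.no_of_time + t2.no_of_time) AS avg_dist
FROM cte AS t1
JOIN cte AS t2
  ON t1.start_city = t2.end_city AND t1.id < t2.id
ORDER BY t1.start_city
"""

approach1 = conn.execute(QUERY).fetchall()
for row in approach1:
    print(row)
```

  Because this is an inner join, a pair with trips in only one direction would not appear in the output at all; that is a property of the sketch worth checking against the real data.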

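  The first approach works, but only because no place in this data belongs to more than one pair. If we add a Chengdu to Shanghai route it breaks, since the join matches on a single city; hence the second approach.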
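  The failure can be reproduced with a small sketch. It uses an in-memory SQLite database (3.25 or later) instead of MySQL, hypothetical column names `start_city` and `end_city`, and made-up rows. It also tries one possible repair that is not part of the article: additionally requiring `t1.end_city = t2.start_city` in the join. The window ordering includes `end_city` only to make the ids deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distance (start_city TEXT, end_city TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO distance VALUES (?, ?, ?)",
    [  # hypothetical sample rows
        ("Chengdu", "Chongqing", 300), ("Chongqing", "Chengdu", 320),
        ("Chengdu", "Chongqing", 310),
        ("Beijing", "Tianjin", 120), ("Tianjin", "Beijing", 130),
        # The extra route that shares Chengdu with an existing pair.
        ("Chengdu", "Shanghai", 1950), ("Shanghai", "Chengdu", 1970),
    ],
)


def pair_averages(join_condition):
    """Run the first approach with the given ON condition (trusted text only)."""
    query = f"""
    WITH cte AS (
        SELECT start_city, end_city,
               SUM(distance) AS tot_distance,
               COUNT(*)      AS no_of_time,
               ROW_NUMBER() OVER (ORDER BY start_city, end_city) AS id
        FROM distance
        GROUP BY start_city, end_city
    )
    SELECT t1.start_city, t1.end_city,
           (t1.tot_distance + t2.tot_distance)
               / (t1.no_of_time + t2.no_of_time) AS avg_dist
    FROM cte AS t1
    JOIN cte AS t2 ON {join_condition} AND t1.id < t2.id
    ORDER BY t1.start_city, t1.end_city, t2.start_city
    """
    return conn.execute(query).fetchall()


# Join on one city only, as in the first approach: Chengdu routes cross-match.
loose = pair_averages("t1.start_city = t2.end_city")
# Assumed repair: require the reverse route on both columns.
strict = pair_averages(
    "t1.start_city = t2.end_city AND t1.end_city = t2.start_city"
)

print(len(loose), "rows with the single-column join")
for row in strict:
    print(row)
```

  With these rows the single-column join returns five rows, two of them mixing Chongqing and Shanghai figures, while the two-column condition returns one row per pair.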


  First, add two rows of Chengdu-Shanghai data. The data is as follows.

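  Then concatenate the start and the destination into a single string, which gives each route a unique identifier. The SQL statement and its result are as follows.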

SELECT
  CONCAT(start, '-', end) AS start,  -- route key, e.g. start-end
  CONCAT(end, '-', start) AS end,    -- key of the reverse route
  distance
FROM
  distance

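  Then, by also using the concatenation with start and destination swapped, we get the total distance and trip count between the two places, and a little more arithmetic gives the average distance. The SQL statement and its result are as follows.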

WITH ts AS (
  SELECT
    CONCAT(start, '-', end) AS start,
    CONCAT(end, '-', start) AS end,
    distance
  FROM
    distance
)
SELECT
  start,
  SUM(distance),
  COUNT(*),
  SUM(distance) / COUNT(*) AS avg_dist
FROM (
  -- each trip is counted once under its own key and once under the reverse key
  SELECT start, distance FROM ts
  UNION ALL
  SELECT end, distance FROM ts
) tss
GROUP BY
  start
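  A runnable sketch of the same idea, under stated assumptions: it uses SQLite from Python instead of MySQL, so `CONCAT` is replaced by the `||` operator, the columns are renamed to the hypothetical `start_city` and `end_city`, and the rows are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distance (start_city TEXT, end_city TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO distance VALUES (?, ?, ?)",
    [  # hypothetical sample rows
        ("Chengdu", "Chongqing", 300), ("Chongqing", "Chengdu", 320),
        ("Chengdu", "Chongqing", 310),
        ("Beijing", "Tianjin", 120), ("Tianjin", "Beijing", 130),
        ("Chengdu", "Shanghai", 1950), ("Shanghai", "Chengdu", 1970),
    ],
)

QUERY = """
WITH ts AS (
    SELECT start_city || '-' || end_city AS start_key,  -- SQLite stand-in for CONCAT
           end_city || '-' || start_city AS end_key,
           distance
    FROM distance
)
SELECT pair_key,
       SUM(distance) AS tot_distance,
       COUNT(*)      AS no_of_time,
       SUM(distance) / COUNT(*) AS avg_dist
FROM (
    SELECT start_key AS pair_key, distance FROM ts
    UNION ALL
    SELECT end_key, distance FROM ts
) AS tss
GROUP BY pair_key
ORDER BY pair_key
"""

rows = conn.execute(QUERY).fetchall()
by_key = {key: avg for key, _total, _trips, avg in rows}
for key, avg in by_key.items():
    print(key, avg)
```

  Note that every pair shows up under both orderings of its key (for example `Chengdu-Chongqing` and `Chongqing-Chengdu`) with the same average, so the output has two rows per pair.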

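  Finally, the first column can be split back into two: just apply a string-splitting function on top of the query above. The SQL statement and its result are as follows.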

WITH ts AS (
  SELECT
    CONCAT(start, '-', end) AS start,
    CONCAT(end, '-', start) AS end,
    distance
  FROM
    distance
),
ts1 AS (
  SELECT
    start,
    SUM(distance) / COUNT(*) AS avg_dist
  FROM (
    SELECT start, distance FROM ts
    UNION ALL
    SELECT end, distance FROM ts
  ) tss
  GROUP BY
    start
)
SELECT
  SUBSTRING_INDEX(start, '-', 1) AS start,   -- part before the first '-'
  SUBSTRING_INDEX(start, '-', -1) AS end,    -- part after the last '-'
  ROUND(avg_dist, 2) AS avg_dist
FROM
  ts1
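  To show what the split does to each key, here is a rough Python model of MySQL's `SUBSTRING_INDEX` for a non-empty delimiter. The function is an illustrative approximation written for this example, not MySQL's implementation, and the place names are hypothetical.

```python
def substring_index(text, delim, count):
    """Rough model of MySQL SUBSTRING_INDEX(text, delim, count), non-empty delim."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # everything left of the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])  # everything right of it, counting from the end
    return ""                             # count = 0 gives an empty string


key = "Chengdu-Chongqing"
print(substring_index(key, "-", 1), substring_index(key, "-", -1))

# Caveat: a place name that itself contains the separator splits in the wrong spot.
tricky = "Stratford-upon-Avon-London"
print(substring_index(tricky, "-", 1), substring_index(tricky, "-", -1))
```

  If place names may contain `-`, a separator that cannot occur in the data would be a safer choice for the concatenated key.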

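  At this point, couldn't we use the concatenated key to make each start-destination combination unique, and then solve the problem with the first approach after all? The SQL statement and its result are as follows.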

WITH cte AS (
  SELECT
    start, end,
    SUM(distance) AS tot_distance,
    COUNT(*) AS no_of_time,
    ROW_NUMBER() OVER (ORDER BY start) AS id
  FROM (
    -- concatenated keys make each route, and its reverse, unique
    SELECT
      CONCAT(start, '-', end) AS start,
      CONCAT(end, '-', start) AS end,
      distance
    FROM
      distance
  ) ts
  GROUP BY
    start, end
)
SELECT
  t1.start,
  (t1.tot_distance + t2.tot_distance) / (t1.no_of_time + t2.no_of_time) AS avg_dist
FROM
  cte AS t1
JOIN cte AS t2 ON t1.start = t2.end AND t1.id < t2.id
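  A related variant, which is not one of the article's approaches, is to build a direction-independent key by ordering the two city names and then grouping once. The sketch below assumes SQLite run from Python, where the two-argument scalar `MIN` and `MAX` play the role that `LEAST` and `GREATEST` would play in MySQL; column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distance (start_city TEXT, end_city TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO distance VALUES (?, ?, ?)",
    [  # hypothetical sample rows
        ("Chengdu", "Chongqing", 300), ("Chongqing", "Chengdu", 320),
        ("Chengdu", "Chongqing", 310),
        ("Beijing", "Tianjin", 120), ("Tianjin", "Beijing", 130),
        ("Chengdu", "Shanghai", 1950), ("Shanghai", "Chengdu", 1970),
        ("Chengdu", "Kunming", 850),  # one direction only
    ],
)

QUERY = """
SELECT MIN(start_city, end_city) AS city_a,   -- alphabetically smaller name
       MAX(start_city, end_city) AS city_b,   -- alphabetically larger name
       AVG(distance)             AS avg_dist
FROM distance
GROUP BY MIN(start_city, end_city), MAX(start_city, end_city)
ORDER BY city_a, city_b
"""

canonical = conn.execute(QUERY).fetchall()
for row in canonical:
    print(row)
```

  In this sketch each pair appears exactly once, and a pair with trips in only one direction is still reported, which the join-based queries would drop.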

  That covers it: the result is now available in several forms, so use whichever one the requirement calls for.

 

 
