Table: Person

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| id             | int     |
| name           | varchar |
| phone_number   | varchar |
+----------------+---------+
id is the primary key for this table.
Each row of this table contains the name of a person and their phone number.
Phone numbers have the form 'xxx-yyyyyyy', where xxx is the country code (3 characters) and yyyyyyy is the phone number (7 characters); x and y are digits. Both parts can contain leading zeros.

Table: Country

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| name           | varchar |
| country_code   | varchar |
+----------------+---------+
country_code is the primary key for this table.
Each row of this table contains the country name and its code. country_code has the form 'xxx', where each x is a digit.

Table: Calls

+-------------+------+
| Column Name | Type |
+-------------+------+
| caller_id   | int  |
| callee_id   | int  |
| duration    | int  |
+-------------+------+
This table has no primary key; it may contain duplicates.
Each row of this table contains the caller id, the callee id, and the duration of the call in minutes. caller_id != callee_id

A telecommunications company wants to invest in new countries. The company intends to invest in the countries where the average call duration of the calls in this country is strictly greater than the global average call duration.
Write an SQL query to find the countries where this company can invest.
Return the result table in any order.
The query result format is in the following example.

Person table:
+----+----------+--------------+
| id | name     | phone_number |
+----+----------+--------------+
| 3  | Jonathan | 051-1234567  |
| 12 | Elvis    | 051-7654321  |
| 1  | Moncef   | 212-1234567  |
| 2  | Maroua   | 212-6523651  |
| 7  | Meir     | 972-1234567  |
| 9  | Rachel   | 972-0011100  |
+----+----------+--------------+

Country table:
+----------+--------------+
| name     | country_code |
+----------+--------------+
| Peru     | 051          |
| Israel   | 972          |
| Morocco  | 212          |
| Germany  | 049          |
| Ethiopia | 251          |
+----------+--------------+

Calls table:
+-----------+-----------+----------+
| caller_id | callee_id | duration |
+-----------+-----------+----------+
| 1         | 9         | 33       |
| 2         | 9         | 4        |
| 1         | 2         | 59       |
| 3         | 12        | 102      |
| 3         | 12        | 330      |
| 12        | 3         | 5        |
| 7         | 9         | 13       |
| 7         | 1         | 3        |
| 9         | 7         | 1        |
| 1         | 7         | 7        |
+-----------+-----------+----------+

Result table:
+----------+
| country  |
+----------+
| Peru     |
+----------+
The average call duration for Peru is (102 + 102 + 330 + 330 + 5 + 5) / 6 = 145.666667
The average call duration for Israel is (33 + 4 + 13 + 13 + 3 + 1 + 1 + 7) / 8 = 9.37500
The average call duration for Morocco is (33 + 4 + 59 + 59 + 3 + 7) / 6 = 27.5000
Global call duration average = (2 * (33 + 4 + 59 + 102 + 330 + 5 + 13 + 3 + 1 + 7)) / 20 = 55.70000
Since Peru is the only country where the average call duration is greater than the global average, it is the only recommended country.

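The arithmetic in the example can be re-checked with a few lines of Python. This is only a sanity-check sketch, not part of the SQL answer: the `person_country` mapping is written out by hand from the phone-number prefixes (051 = Peru, 212 = Morocco, 972 = Israel), and all variable names are illustrative.

```python
# Example data copied from the tables above.
# person id -> country, derived by hand from the first three digits of each phone number.
person_country = {3: "Peru", 12: "Peru", 1: "Morocco", 2: "Morocco", 7: "Israel", 9: "Israel"}
calls = [(1, 9, 33), (2, 9, 4), (1, 2, 59), (3, 12, 102), (3, 12, 330),
         (12, 3, 5), (7, 9, 13), (7, 1, 3), (9, 7, 1), (1, 7, 7)]

# Each call contributes its duration twice: once for the caller, once for the callee.
per_country = {}
all_durations = []
for caller, callee, duration in calls:
    for person in (caller, callee):
        per_country.setdefault(person_country[person], []).append(duration)
        all_durations.append(duration)

country_avg = {name: sum(d) / len(d) for name, d in per_country.items()}
global_avg = sum(all_durations) / len(all_durations)

# Keep only countries whose average is strictly greater than the global average.
invest = sorted(name for name, avg in country_avg.items() if avg > global_avg)

print(country_avg)
print(global_avg)
print(invest)
```

Counting every call once per participant is what makes the global denominator 20 rather than 10, and it is the same idea the UNION ALL in the SQL solution relies on.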
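Notes:
1. To determine the country code of each phone number, take its first three characters with LEFT(phone_number, 3).
2. An incorrect query in MySQL 8.0: if t2.name and the plain aggregate AVG(t1.duration) are both selected, a GROUP BY is required (the default ONLY_FULL_GROUP_BY mode rejects the query otherwise). The attempt below, which puts the windowed per-country average next to the whole-table average in a single SELECT, therefore fails to run. This part was eventually replaced by the final solution further down: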
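-- Incorrect attempt: continues the WITH clause after tmp1 and tmp2 (defined in the final solution below).
tmp3 AS (
    SELECT t2.name,
           AVG(t1.duration) OVER (PARTITION BY t2.name) avg_code_d,
           -- Plain aggregate next to the non-aggregated t2.name, with no GROUP BY: this is the error.
           AVG(t1.duration) avg_d
    FROM tmp1 t1, tmp2 t2
    WHERE t1.uid = t2.id
)
SELECT name AS country
FROM tmp3
-- The problem also asks for "strictly greater", so this comparison should be > rather than >=.
WHERE avg_code_d >= avg_d;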
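For comparison, here is a hedged sketch of how the side-by-side idea from the notes above could be made to run: both averages are written as window functions, with an empty OVER () for the overall average, so no GROUP BY is needed. It runs on SQLite through Python's sqlite3 module rather than on MySQL, so substr(phone_number, 1, 3) stands in for LEFT(phone_number, 3), and it assumes a SQLite build with window-function support (3.25 or later). The in-memory tables simply reload the example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
    CREATE TABLE Country (name TEXT, country_code TEXT PRIMARY KEY);
    CREATE TABLE Calls (caller_id INTEGER, callee_id INTEGER, duration INTEGER);
    INSERT INTO Person VALUES
        (3, 'Jonathan', '051-1234567'), (12, 'Elvis', '051-7654321'),
        (1, 'Moncef', '212-1234567'), (2, 'Maroua', '212-6523651'),
        (7, 'Meir', '972-1234567'), (9, 'Rachel', '972-0011100');
    INSERT INTO Country VALUES
        ('Peru', '051'), ('Israel', '972'), ('Morocco', '212'),
        ('Germany', '049'), ('Ethiopia', '251');
    INSERT INTO Calls VALUES
        (1, 9, 33), (2, 9, 4), (1, 2, 59), (3, 12, 102), (3, 12, 330),
        (12, 3, 5), (7, 9, 13), (7, 1, 3), (9, 7, 1), (1, 7, 7);
""")

window_query = """
WITH tmp1 AS (
    SELECT caller_id AS uid, duration FROM Calls
    UNION ALL
    SELECT callee_id, duration FROM Calls
),
tmp2 AS (
    -- substr(..., 1, 3) plays the role of MySQL's LEFT(..., 3) here.
    SELECT p.id, c.name
    FROM Person p
    JOIN Country c ON c.country_code = substr(p.phone_number, 1, 3)
),
tmp3 AS (
    SELECT t2.name,
           AVG(t1.duration) OVER (PARTITION BY t2.name) AS avg_code_d,
           AVG(t1.duration) OVER () AS avg_d  -- average over all joined rows
    FROM tmp1 t1
    JOIN tmp2 t2 ON t1.uid = t2.id
)
SELECT DISTINCT name AS country
FROM tmp3
WHERE avg_code_d > avg_d
"""

window_result = conn.execute(window_query).fetchall()
print(window_result)  # [('Peru',)]
```

One caveat: AVG(...) OVER () averages only the rows that survive the join, so if some person had no matching country the "global" figure would differ from a plain average over tmp1. On the example data every person matches, so the two agree.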
Final solution:
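WITH tmp1 AS (
    -- One row per call participant: each call counts for both the caller and the callee.
    SELECT caller_id AS uid, duration
    FROM calls
    UNION ALL
    SELECT callee_id, duration
    FROM calls
),
tmp2 AS (
    -- Map each person to a country via the first three characters of the phone number.
    SELECT c.country_code, p.id, c.name
    FROM person p
    JOIN country c ON LEFT(p.phone_number, 3) = c.country_code
)
SELECT t2.name AS country
FROM tmp1 t1
JOIN tmp2 t2 ON t1.uid = t2.id
GROUP BY t2.name
-- Keep countries whose average is strictly greater than the global average.
HAVING AVG(t1.duration) > (SELECT AVG(duration) FROM tmp1);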
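
To check the final solution against the example, the query can be run on an in-memory SQLite database from Python. This is a test-harness sketch under stated assumptions, not the MySQL submission itself: SQLite has no LEFT(), so substr(phone_number, 1, 3) is substituted, and the name `final_query` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
    CREATE TABLE Country (name TEXT, country_code TEXT PRIMARY KEY);
    CREATE TABLE Calls (caller_id INTEGER, callee_id INTEGER, duration INTEGER);
    INSERT INTO Person VALUES
        (3, 'Jonathan', '051-1234567'), (12, 'Elvis', '051-7654321'),
        (1, 'Moncef', '212-1234567'), (2, 'Maroua', '212-6523651'),
        (7, 'Meir', '972-1234567'), (9, 'Rachel', '972-0011100');
    INSERT INTO Country VALUES
        ('Peru', '051'), ('Israel', '972'), ('Morocco', '212'),
        ('Germany', '049'), ('Ethiopia', '251');
    INSERT INTO Calls VALUES
        (1, 9, 33), (2, 9, 4), (1, 2, 59), (3, 12, 102), (3, 12, 330),
        (12, 3, 5), (7, 9, 13), (7, 1, 3), (9, 7, 1), (1, 7, 7);
""")

# Same structure as the final solution; only LEFT() is swapped for substr().
final_query = """
WITH tmp1 AS (
    SELECT caller_id AS uid, duration FROM Calls
    UNION ALL
    SELECT callee_id, duration FROM Calls
),
tmp2 AS (
    SELECT c.country_code, p.id, c.name
    FROM Person p
    JOIN Country c ON substr(p.phone_number, 1, 3) = c.country_code
)
SELECT t2.name AS country
FROM tmp1 t1
JOIN tmp2 t2 ON t1.uid = t2.id
GROUP BY t2.name
HAVING AVG(t1.duration) > (SELECT AVG(duration) FROM tmp1)
"""

final_result = conn.execute(final_query).fetchall()
print(final_result)  # [('Peru',)]
```

Germany and Ethiopia have no people in Person, so they never appear in the join and cannot show up in the result.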