The Numbers table stores numbers and their frequencies.
+----------+-------------+
| Number   | Frequency   |
+----------+-------------+
| 0        | 7           |
| 1        | 1           |
| 2        | 3           |
| 3        | 1           |
+----------+-------------+
In this table the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0) / 2 = 0.
+--------+
| median |
+--------+
| 0.0000 |
+--------+
Write a query to find the median of all the numbers, and name the result median.
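As a sanity check on the example above, the frequency table can be expanded back into the full list of numbers and the median read off directly. The following is a minimal brute-force sketch in Python, not the SQL answer the problem asks for; the `rows` list and the `expand` helper are names assumed here for illustration only.

```python
from statistics import median

# (Number, Frequency) rows copied from the example Numbers table.
rows = [(0, 7), (1, 1), (2, 3), (3, 1)]

def expand(rows):
    """Repeat each number by its frequency, in ascending order of number."""
    expanded = []
    for number, frequency in sorted(rows):
        expanded.extend([number] * frequency)
    return expanded

values = expand(rows)
# For an even count, statistics.median averages the two middle values.
result = median(values)
print(values, result)
```

Expanding the table this way costs memory proportional to the sum of the frequencies, which is why the queries in these notes work on cumulative frequencies instead.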
DROP TABLE IF EXISTS Numbers;
CREATE TABLE IF NOT EXISTS Numbers (Number INT, Frequency INT);
insert into Numbers values (0,7);
insert into Numbers values (1,1);
insert into Numbers values (2,3);
insert into Numbers values (3,1);
select * from Numbers;
select
avg(cast(number as float)) median
from
(
select
Number,
Frequency,
sum(Frequency) over(order by Number) - Frequency prev_sum,
sum(Frequency) over(order by Number) curr_sum
from Numbers
) t1,
(
select
sum(Frequency) total_sum
from Numbers
) t2
where
t1.prev_sum <= (cast(t2.total_sum as float) / 2) and
t1.curr_sum >= (cast(t2.total_sum as float) / 2)
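-- The key is to understand that prev_sum and curr_sum are cumulative frequencies:
-- prev_sum counts the values before the current Number, curr_sum also includes it.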
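If n1.Number is a median, the cumulative count up to and including n1.Number must be greater than or equal to total / 2, and the cumulative count before n1.Number (excluding it) must be less than or equal to total / 2.
Example: 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3 (12 numbers in total)
Median 0: the cumulative count including 0 is 7 >= 6; the cumulative count excluding 0 is 0 <= 6
Example: 0, 0, 0, 3, 3, 3 (6 numbers in total)
Median 0: the cumulative count including 0 is 3 >= 3; the cumulative count excluding 0 is 0 <= 3
Median 3: the cumulative count including 3 is 6 >= 3; the cumulative count excluding 3 is 3 <= 3
Both 0 and 3 qualify, so the median is their average, (0 + 3) / 2 = 1.5.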
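The cumulative-count rule above can be sketched outside SQL to check it against both examples. This is a minimal Python illustration, assuming a non-empty table in which every frequency is positive; the function name `median_from_frequencies` is hypothetical, and the variables `prev_sum` and `curr_sum` mirror the column aliases in the window-function query.

```python
def median_from_frequencies(rows):
    """Average every number whose cumulative counts straddle total / 2.

    rows: list of (number, frequency) pairs; assumed non-empty with
    positive frequencies (an assumption of this sketch).
    """
    total = sum(frequency for _, frequency in rows)
    half = total / 2
    picked = []
    running = 0
    for number, frequency in sorted(rows):
        prev_sum = running              # count of values before this number
        curr_sum = running + frequency  # count including this number
        if prev_sum <= half <= curr_sum:
            picked.append(number)
        running = curr_sum
    # One number qualifies in the usual case, two when the middle
    # falls exactly on the boundary between two distinct numbers.
    return sum(picked) / len(picked)

print(median_from_frequencies([(0, 7), (1, 1), (2, 3), (3, 1)]))
print(median_from_frequencies([(0, 3), (3, 3)]))
```

The first call covers the 12-number example, where only 0 qualifies; the second covers the 6-number example, where both 0 and 3 qualify and are averaged.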
SELECT
AVG(Number) AS median
FROM
(SELECT n1.Number FROM Numbers n1 JOIN Numbers n2 ON n1.Number>=n2.Number
GROUP BY
n1.Number
HAVING
SUM(n2.Frequency)>=(SELECT SUM(Frequency) FROM Numbers)/2
AND
SUM(n2.Frequency)-AVG(n1.Frequency)<=(SELECT SUM(Frequency) FROM Numbers)/2
)s