中位数(Median)又称中值,统计学中的专有名词,是按顺序排列的一组数据中居于中间位置的数,代表一个样本、种群或概率分布中的一个数值,其可将数值集合划分为相等的上下两部分。对于有限的数集,可以通过把所有观察值高低排序后找出正中间的一个作为中位数。如果观察值有偶数个,通常取最中间的两个数值的平均数作为中位数。

方法1:利用窗口函数,排序,编号,找位置,取值

SELECT
	avg(a.num) AS median
FROM
	(
	SELECT
		ROW_NUMBER() OVER(
		ORDER BY num) AS rn,
		num,
		count(*) OVER() AS n
	FROM
		tmp) AS a
WHERE
	rn IN (floor(n / 2)+ 1, IF(MOD(n,
	2)= 0,
	floor(n / 2),
	floor(n / 2)+ 1));

注:floor函数返回小于等于指定值的最大整数并取整

方法2:无论正排还是倒排,对于奇数个数来说,其编号始终不变,而对于偶数个数来说,两个中位数(取均值)的编号相差±1

select avg(num)
from 
(
    select id
        ,num 
        ,row_number() over(order by num, id) as rn1
        ,row_number() over(order by num desc, id desc) as rn2
    from tmp
)as t 
where rn1 = rn2 or abs(rn1-rn2) = 1;

 

进阶:分组排序取中位数

SELECT * FROM EMPLOYEE
id company salary
1 A 2,341
2 A 341
3 A 15
4 A 15,314
5 A 451
6 A 513
7 B 15
8 B 13
9 B 1,154
SELECT
	id,
	salary,
	company
FROM
	(
	SELECT
		id,
		company,
		salary,
		ROW_NUMBER () OVER(PARTITION BY company
	ORDER BY
		salary) AS rn,
		count(*)OVER(PARTITION BY company) AS n
	FROM
		employee ) AS a
WHERE
	rn IN (floor(n / 2)+ 1, IF(MOD(n,
	2)= 0,
	floor(n / 2),
	floor(n / 2)+ 1));

输出:

 
id salary company
5 451 A
6 513 A
7 15 B

进阶:给定数字的频率求中位数

select round(avg(num),1) as median
from
(select Num, frequency,
        sum(frequency) over(order by num asc) as total,
        sum(frequency) over(order by num desc) as total1
from Numbers
order by num asc)as a
where total>=(select sum(frequency) from Numbers)/2
and total1>=(select sum(frequency) from Numbers)/2

 

posted on 2023-02-21 10:26  白的枫叶  阅读(22)  评论(0)    收藏  举报