The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.
Examples:
[2,3,4], the median is 3
[2,3], the median is (2 + 3) / 2 = 2.5
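The definition can be sketched in a few lines of Python. The helper name median is my own and is not part of the problem statement; it simply sorts the list and reads off the middle.

```python
def median(values):
    """Return the median of a non-empty list of numbers as a float."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd size: the single middle value.
        return float(ordered[mid])
    # Even size: the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2.0


print(median([2, 3, 4]))  # 3.0
print(median([2, 3]))     # 2.5
```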
Given an array nums, there is a sliding window of size k which moves from the very left of the array to the very right. You can only see the k numbers in the window. Each time, the sliding window moves right by one position. Your job is to output the median of each window in the original array.
For example,
Given nums = [1,3,-1,-3,5,3,6,7], and k = 3.
Window position                Median
---------------                ------
[1  3  -1] -3  5  3  6  7        1
 1 [3  -1  -3] 5  3  6  7       -1
 1  3 [-1  -3  5] 3  6  7       -1
 1  3  -1 [-3  5  3] 6  7        3
 1  3  -1  -3 [5  3  6] 7        5
 1  3  -1  -3  5 [3  6  7]       6
Therefore, return the median sliding window as [1,-1,-1,3,5,6].
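As a reference point, the example can be reproduced by sorting every window from scratch. This is only a naive baseline under my own function name, not one of the solutions discussed in this post, and it costs O(n * k log k).

```python
def median_sliding_window_naive(nums, k):
    """Sort each window independently and read off its median."""
    result = []
    for start in range(len(nums) - k + 1):
        window = sorted(nums[start:start + k])
        if k % 2 == 1:
            result.append(float(window[k // 2]))
        else:
            result.append((window[k // 2 - 1] + window[k // 2]) / 2.0)
    return result


print(median_sliding_window_naive([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
```

A baseline like this is handy for checking faster implementations against random inputs.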
Note:
You may assume k is always valid, i.e. k is always smaller than the input array's size for a non-empty array.
Solutions
Solution 1: Two heaps
Based on @ArizonaTea's answer in the discussion section.
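Consider a sliding window of length k. If we sort the numbers in the window in ascending order and split them into two parts, A=[:k//2] and B=[k//2:], then:
- B holds the larger numbers and A holds the smaller ones.
- When k is even, A and B have the same size; when k is odd, B has one more number than A.
- We make A a max-heap and B a min-heap. When k is even, the median is the mean of the two heap tops; when k is odd, the median is the top of B.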
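The split described above can be sketched for a single window as follows. The function names are my own, and since heapq only offers min-heaps, A is assumed to be stored as negated values.

```python
import heapq


def build_heaps(window):
    """Split a window into a max-heap A (smaller half) and a min-heap B (larger half)."""
    k = len(window)
    ordered = sorted(window)
    # heapq only provides min-heaps, so A stores negated values.
    A = [-x for x in ordered[:k // 2]]
    B = ordered[k // 2:]
    heapq.heapify(A)
    heapq.heapify(B)
    return A, B


def median_from_heaps(A, B, k):
    """Read the median off the heap tops without touching the other elements."""
    if k % 2 == 1:
        return float(B[0])
    return (B[0] - A[0]) / 2.0  # -A[0] is the largest value in A


A, B = build_heaps([1, 3, -1])
print(median_from_heaps(A, B, 3))  # 1.0
A, B = build_heaps([4, 1, 3, 2])
print(median_from_heaps(A, B, 4))  # 2.5
```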
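For a long time I could not figure out how to delete a number from the middle of a heap, but it turns out we never need to.
As the window slides, how do we maintain the two heaps so that the number of elements in each one stays the same?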
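(1) Each time the window slides by one position, a new number new enters and an old number old leaves. We need to check which heap old is in and then put exactly one number into that heap; the only question is whether that number is new or the top of the other heap.
For example, suppose old is in A: (a) if new<=B[0], the number that goes into A must be new; (b) otherwise, move B[0] into A and insert new into B.
After this step the two heaps may physically hold unequal numbers of elements, but once the numbers that are due for removal are discounted, they are still balanced.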
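Step (1) can be sketched in isolation as below. The function name add_new is my own; the sketch assumes k >= 2, that A is a max-heap stored as negated values, and that both heap tops are still inside the window when it is called.

```python
import heapq


def add_new(A, B, old, new):
    """Add `new` so that each heap keeps its live count once `old` is lazily removed."""
    if old >= B[0]:
        # `old` lives in B, so B must gain exactly one element.
        if new >= -A[0]:
            heapq.heappush(B, new)
        else:
            # `new` belongs in A; A's current maximum moves up to B.
            heapq.heappush(A, -new)
            heapq.heappush(B, -heapq.heappop(A))
    else:
        # `old` lives in A, so A must gain exactly one element.
        if new <= B[0]:
            heapq.heappush(A, -new)
        else:
            # `new` belongs in B; B's current minimum moves down to A.
            heapq.heappush(B, new)
            heapq.heappush(A, -heapq.heappop(B))


# Window [1, 3, -1] slides to [3, -1, -3]: old = 1 leaves, new = -3 enters.
A, B = [1], [1, 3]          # A holds -1 (negated), B holds 1 and 3
add_new(A, B, old=1, new=-3)
print([-x for x in A])      # [-3]
print(sorted(B))            # [-1, 1, 3]  (the 1 is stale and awaits deletion)
```

Note that old itself is not removed here; B temporarily holds three numbers, of which only -1 and 3 are still in the window.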
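Each round only touches the tops of the two heaps, so it is enough to guarantee that, at the start of every round, both heap tops are inside the sliding window. We can use lazy deletion: keep a record of every number that is waiting to be deleted. After step (1):
(2) While the top of A or B is a number waiting to be deleted, pop it.
(3) Compute the median of the current window.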
Note: the case k//2==0 (that is, k=1, where A is empty) must be handled separately.
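The lazy deletion in step (2) can be illustrated on a single heap. This is a minimal sketch with a hypothetical helper prune_top and a Counter for the pending deletions; the emptiness guard is an extra safety check of mine.

```python
import heapq
from collections import Counter


def prune_top(heap, pending, sign=1):
    """Pop the heap top while it is marked for deletion.
    `sign` is -1 for a max-heap stored as negated values."""
    while heap and pending[sign * heap[0]] > 0:
        pending[sign * heap[0]] -= 1
        heapq.heappop(heap)


heap = [1, 3, 5]
heapq.heapify(heap)
pending = Counter()

pending[3] += 1            # 3 left the window, but it is buried: nothing to pop yet
prune_top(heap, pending)
print(heap[0])             # 1

pending[1] += 1            # now the top itself is stale
prune_top(heap, pending)   # pops 1, then the previously marked 3
print(heap[0])             # 5
```

A stale number costs nothing while it stays buried; it is only paid for when it surfaces at the top.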
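import heapq


class Solution(object):
    def medianSlidingWindow(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[float]
        """
        if k == 0:
            return []
        if k == 1:
            # small would be empty (k // 2 == 0), so handle this case directly
            return [float(x) for x in nums]
        tmp = sorted(nums[:k])
        # big: min-heap of the larger half (a sorted list is already a valid heap)
        # small: max-heap of the smaller half, stored as negated values
        big, small = tmp[k // 2:], [-x for x in tmp[:k // 2]]
        heapq.heapify(small)
        ans = [big[0] * 1.0 if k % 2 != 0 else (big[0] - small[0]) / 2.0]
        pending = {}  # value -> number of copies waiting for lazy deletion
        for i, c in enumerate(nums[k:]):
            # nums[i] leaves the window, c enters it
            pending[nums[i]] = pending.get(nums[i], 0) + 1
            if nums[i] >= big[0]:
                # the outgoing number is in big, so big gains one number
                if c >= -small[0]:
                    heapq.heappush(big, c)
                else:
                    heapq.heappush(small, -c)
                    heapq.heappush(big, -heapq.heappop(small))
            else:
                # the outgoing number is in small, so small gains one number
                if c <= big[0]:
                    heapq.heappush(small, -c)
                else:
                    heapq.heappush(big, c)
                    heapq.heappush(small, -heapq.heappop(big))
            # drop stale numbers that have surfaced at either heap top
            while pending.get(big[0], 0):
                pending[big[0]] -= 1
                heapq.heappop(big)
            while pending.get(-small[0], 0):
                pending[-small[0]] -= 1
                heapq.heappop(small)
            ans.append(big[0] * 1.0 if k % 2 != 0 else (big[0] - small[0]) / 2.0)
        return ans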
Solution 2: Discretization + binary indexed tree
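Suppose every value x has an associated a[x]. A binary indexed tree (Fenwick tree) can efficiently compute the sum of a[y] over all y<=x.
If a[x] is the number of occurrences of x, then the prefix sum up to x counts the numbers <=x, i.e. it gives the rank of x (the position of its last copy in sorted order).
This property can be used to find the median efficiently.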
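To make the "prefix sum equals rank" idea concrete, here is a sketch that uses a plain count array and itertools.accumulate in place of a binary indexed tree. It assumes the values are small non-negative integers, and the function names are my own; a BIT gives the same prefix sums while also supporting O(log n) updates as the window moves.

```python
from bisect import bisect_left
from itertools import accumulate


def median_from_counts(counts, k):
    """counts[x] is how many times value x occurs among k values in total."""
    prefix = list(accumulate(counts))  # prefix[x] = number of values <= x

    def kth_smallest(rank):
        # Smallest x whose prefix count reaches `rank` (rank is 1-based).
        return bisect_left(prefix, rank)

    if k % 2 == 1:
        return float(kth_smallest(k // 2 + 1))
    return (kth_smallest(k // 2) + kth_smallest(k // 2 + 1)) / 2.0


# Values [1, 1, 3, 4]: a[1] = 2, a[3] = 1, a[4] = 1.
print(median_from_counts([0, 2, 0, 1, 1], 4))  # 2.0
# Values [0, 3, 3]: a[0] = 1, a[3] = 2.
print(median_from_counts([1, 0, 0, 2], 3))     # 3.0
```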
However, the values in nums are sparse, so they have to be discretized (mapped to dense ranks) first.
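Discretization can be sketched as follows. This version, with a hypothetical helper discretize, maps each distinct value to a 1-based rank; it is a simplification, since the code in this post instead ranks every position of nums separately so that duplicates get their own slot.

```python
def discretize(nums):
    """Map each distinct value to a dense 1-based rank, preserving order."""
    values = sorted(set(nums))
    rank_of = {v: i + 1 for i, v in enumerate(values)}
    return rank_of, values


nums = [1000000, -5, 42, -5]
rank_of, values = discretize(nums)
print([rank_of[x] for x in nums])  # [3, 1, 2, 1]
print(values[rank_of[42] - 1])     # 42
```

The ranks fit in an array of size len(values) no matter how large or negative the original numbers are, and values maps a rank back to the number it stands for.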
class BIT(object):
    def __init__(self, i2n):
        self.i2n = i2n           # i2n[r - 1] is the number with rank r
        self.n = len(i2n)
        self.c = [0] * self.n    # tree array
        self.a = [0] * self.n    # a[r - 1] is the count stored at rank r

    def lowbit(self, k):
        return k & -k

    def add(self, k, val):  # k in [1,n]
        self.a[k - 1] += val
        while k <= self.n:
            self.c[k - 1] += val
            k += self.lowbit(k)

    def sum(self, k):
        res = 0
        while k >= 1:
            res += self.c[k - 1]
            k -= self.lowbit(k)
        return res

    def median(self, k):
        if self.n == 1:
            return self.i2n[0] * 1.0
        l1 = 1
        r1 = self.n
        tar = (k + 1) / 2.0
        # binary search for the smallest rank whose prefix count reaches tar
        while l1 < r1:
            mid = (l1 + r1) >> 1
            if self.sum(mid) < tar:
                l1 = mid + 1
            else:
                r1 = mid
        if k % 2 != 0:
            return self.i2n[r1 - 1] * 1.0
        # even k: r1 is the upper middle; walk left to the previous occupied rank
        res = self.i2n[r1 - 1]
        while self.a[r1 - 2] == 0:
            r1 -= 1
        return (res + self.i2n[r1 - 2]) / 2.0
class Solution(object):
    def medianSlidingWindow(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[float]
        """
        if k == 0:
            return []
        n = len(nums)
        # tmp: indices of nums ordered by value
        tmp = sorted(range(n), key=lambda x: nums[x])
        # n2i[i]: 1-based rank of nums[i]; equal values get distinct ranks
        n2i = sorted(range(1, n + 1), key=lambda x: tmp[x - 1])
        # i2n[r - 1]: the number with rank r
        i2n = [nums[x] for x in tmp]
        bit = BIT(i2n)
        ans = []
        for i in range(n):
            bit.add(n2i[i], 1)
            if i >= k - 1:
                if i >= k:
                    # nums[i - k] has left the window
                    bit.add(n2i[i - k], -1)
                ans.append(bit.median(k))
        return ans
...and it turns out to be extremely slow.