LeetCode 295. Find Median from Data Stream


The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, so the median is the mean of the two middle values.

For example,

[2,3,4], the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

  • void addNum(int num) - Add an integer number from the data stream to the data structure.
  • double findMedian() - Return the median of all elements so far.

 

Example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3) 
findMedian() -> 2

 

Follow up:

  1. If all integer numbers from the stream are between 0 and 100, how would you optimize it?
  2. If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?

--------------------------------------------------------------------------

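Using heaps is the obvious idea. The trouble is that the two heap tops and the newly added number can be ordered in 6 ways, and for each of them there are two choices of which heap should take the number, so the code turns into a pile of if/else branches.

So instead, push the number into one heap first and only then decide whether to rebalance; the code becomes much more concise: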

import heapq

class MedianFinder:

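    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.mi_heap = []  # min-heap holding the larger half
        self.ma_heap = []  # max-heap (values stored negated) holding the smaller half

    def addNum(self, num: int) -> None:
        # Push onto the max-heap first, then move its largest value to the min-heap,
        # so every value in ma_heap stays <= every value in mi_heap.
        heapq.heappush(self.ma_heap, -num)
        top = -heapq.heappop(self.ma_heap)
        heapq.heappush(self.mi_heap, top)
        # Rebalance so ma_heap never holds fewer elements than mi_heap.
        if len(self.mi_heap) > len(self.ma_heap):
            mi_top = heapq.heappop(self.mi_heap)
            heapq.heappush(self.ma_heap, -mi_top)

    def findMedian(self) -> float:
        l1, l2 = len(self.ma_heap), len(self.mi_heap)
        if l1 == 0:
            return None  # no numbers added yet
        if l1 == l2:
            # Even count: mean of the two middle values (ma_heap stores negated values).
            return (self.mi_heap[0] - self.ma_heap[0]) / 2
        # Odd count: ma_heap holds the extra element.
        return -self.ma_heap[0]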
        

# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()

Copied from the discussion thread, on the follow-up questions:

Followup #1 - If all integer numbers from the stream are between 0 and 100, how would you optimize it?

  1. Create 101 buckets (one per value from 0 to 100 inclusive) using an array of size 101.
  2. Store the numbers into these buckets.
  3. Find median by looping through this array.
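
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the discussion: the class name `BucketMedianFinder` and the helper `_kth` are made up here, and the sketch assumes every value is an integer in the inclusive range 0 to 100.

```python
class BucketMedianFinder:
    """Counting-based median finder; assumes every num is an int in [0, 100]."""

    def __init__(self):
        self.counts = [0] * 101  # counts[v] = how many times v has been added
        self.size = 0            # total number of values added

    def addNum(self, num: int) -> None:
        if not 0 <= num <= 100:
            raise ValueError("this sketch assumes 0 <= num <= 100")
        self.counts[num] += 1
        self.size += 1

    def _kth(self, k: int) -> int:
        """Return the k-th smallest value (0-indexed) by scanning the buckets."""
        seen = 0
        for value, count in enumerate(self.counts):
            seen += count
            if seen > k:
                return value
        raise IndexError("k out of range")

    def findMedian(self):
        if self.size == 0:
            return None  # same convention as the heap version: no data yet
        mid = self.size // 2
        if self.size % 2 == 1:
            return self._kth(mid)
        return (self._kth(mid - 1) + self._kth(mid)) / 2
```

Both operations take constant time here, since the scan touches at most 101 buckets no matter how many numbers have been added.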

Followup #2 - If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?

  1. Divide problem into 3 subproblems. Here are the groupings:
    1. Numbers < 0: You have 2 options:
      1. Use 2-heap solution (that we coded in original solution), or
      2. Use 1 array, which represents 1 bucket
    2. 0 <= Numbers <= 100: Use 101 buckets using an array of size 101
    3. 100 < Numbers: You have 2 options:
      1. Use 2-heap solution (that we coded in original solution), or
      2. Use 1 array, which represents 1 bucket
  2. For each number we get in the stream, insert it into 1 of the 3 groupings, keeping track of the count of numbers in each of these 3 groupings
  3. To find the median, see which grouping the median must fall into and find it there.

 

For Numbers < 0 and 100 < Numbers, using a plain array (one bucket) for each group is the more practical solution, since it is very unlikely that the median will fall into either of them. This makes findMedian() O(1) in the average case. In the worst case, all numbers fall into one array, and we would have to use either Quickselect (O(n) average case, O(n^2) worst case) or sorting (O(n log n)) to find the median.
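
A rough sketch of the array-based variant described above, under stated assumptions: the names `MostlyBoundedMedianFinder`, `low`, `high` and `_kth` are invented for this illustration, and the rare out-of-range path simply sorts the outlier array (Quickselect could be substituted there).

```python
class MostlyBoundedMedianFinder:
    """Three groupings: values < 0, values in [0, 100], values > 100."""

    def __init__(self):
        self.low = []            # values < 0, kept unsorted
        self.counts = [0] * 101  # counts[v] for 0 <= v <= 100
        self.in_range = 0        # how many values fell into [0, 100]
        self.high = []           # values > 100, kept unsorted

    def addNum(self, num: int) -> None:
        if num < 0:
            self.low.append(num)
        elif num > 100:
            self.high.append(num)
        else:
            self.counts[num] += 1
            self.in_range += 1

    def _kth(self, k: int) -> int:
        """Return the k-th smallest value (0-indexed) across all groupings."""
        if k < len(self.low):
            return sorted(self.low)[k]  # rare path: O(n log n)
        k -= len(self.low)
        if k < self.in_range:
            seen = 0
            for value, count in enumerate(self.counts):
                seen += count
                if seen > k:
                    return value  # common path: at most 101 buckets scanned
        k -= self.in_range
        return sorted(self.high)[k]     # rare path: O(n log n)

    def findMedian(self):
        total = len(self.low) + self.in_range + len(self.high)
        if total == 0:
            return None
        mid = total // 2
        if total % 2 == 1:
            return self._kth(mid)
        return (self._kth(mid - 1) + self._kth(mid)) / 2
```

Because the group sizes are tracked, `_kth` can tell which grouping holds the requested rank before looking at any values, so the sorting cost is only paid when the median actually lands among the outliers.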

 

If you use 2 heaps instead, you will get findMedian() of O(1) average case, O(log n) worst case.
