LeetCode Problem 4: Median of Two Sorted Arrays

Problem

There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be $O(\log(m+n))$.

You may assume nums1 and nums2 cannot be both empty.

Example 1:

nums1 = [1, 3]
nums2 = [2]
The median is 2.0

Example 2:

nums1 = [1, 2]
nums2 = [3, 4]
The median is (2 + 3) / 2 = 2.5

Solution

Approach: Recursive Approach
Algorithm

To solve this problem, we need to understand “What is the use of the median”. In statistics, the median is used for:

  • Dividing a set into two subsets of equal length, such that every element of one subset is greater than or equal to every element of the other.

If we understand the use of median for dividing, we are very close to the answer.

First, let’s cut $A$ into two parts at an arbitrary position $i$:

leftA                  |  rightA
A[0], A[1], …, A[i-1]  |  A[i], A[i+1], …, A[m-1]

Since $A$ has $m$ elements, there are $m + 1$ possible cuts ($i = 0, 1, \ldots, m$).

And we know:

  • $\text{len}(\text{leftA}) = i$, $\text{len}(\text{rightA}) = m - i$.
  • Note: When $i = 0$, $\text{leftA}$ is empty, and when $i = m$, $\text{rightA}$ is empty.

In the same way, cut $B$ into two parts at an arbitrary position $j$:

leftB                  |  rightB
B[0], B[1], …, B[j-1]  |  B[j], B[j+1], …, B[n-1]

Put $\text{leftA}$ and $\text{leftB}$ into one set, and put $\text{rightA}$ and $\text{rightB}$ into another set. Let’s name them $\text{leftPart}$ and $\text{rightPart}$:

leftPart               |  rightPart
A[0], A[1], …, A[i-1]  |  A[i], A[i+1], …, A[m-1]
B[0], B[1], …, B[j-1]  |  B[j], B[j+1], …, B[n-1]

If we can ensure:

  • $\text{len}(\text{leftPart}) = \text{len}(\text{rightPart})$
  • $\max(\text{leftPart}) \le \min(\text{rightPart})$

then we have divided all the elements in $\{A, B\}$ into two parts of equal length, and every element of one part is greater than or equal to every element of the other. Then

$$\text{median} = \frac{\max(\text{leftPart}) + \min(\text{rightPart})}{2}$$
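
To make the two conditions concrete, here is a small sketch that checks them for a given pair of cuts. The helper name `isBalancedPartition` is hypothetical and not part of the original solution. It builds both parts explicitly, which is only meant for illustration, and it assumes $m + n$ is even and positive so that neither part is empty.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper for illustration: returns true when cutting A at i and
// B at j yields equal-length parts with max(leftPart) <= min(rightPart).
// Assumes both parts end up non-empty (m + n even and positive).
bool isBalancedPartition(const std::vector<int>& A, const std::vector<int>& B,
                         int i, int j) {
    assert(0 <= i && i <= static_cast<int>(A.size()));
    assert(0 <= j && j <= static_cast<int>(B.size()));

    // leftPart = A[0..i-1] + B[0..j-1], rightPart = A[i..m-1] + B[j..n-1]
    std::vector<int> leftPart(A.begin(), A.begin() + i);
    leftPart.insert(leftPart.end(), B.begin(), B.begin() + j);
    std::vector<int> rightPart(A.begin() + i, A.end());
    rightPart.insert(rightPart.end(), B.begin() + j, B.end());

    if (leftPart.size() != rightPart.size() || leftPart.empty()) {
        return false;
    }
    int maxLeft = *std::max_element(leftPart.begin(), leftPart.end());
    int minRight = *std::min_element(rightPart.begin(), rightPart.end());
    return maxLeft <= minRight;
}
```

With A = [1, 2] and B = [3, 4] from Example 2, the cut i = 2, j = 0 satisfies both conditions, and the median is (2 + 3) / 2 = 2.5. The cut i = 1, j = 1 has equal lengths but fails the ordering condition, since 3 ends up in the left part and 2 in the right part.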

To ensure these two conditions, we just need to ensure:

  • $i + j = m - i + n - j$ (or: $m - i + n - j + 1$)

    If $n \ge m$, we just need to let $i$ range over $0, 1, \ldots, m$ and set $j = \frac{m + n + 1}{2} - i$ (using integer division).

  • $B[j - 1] \le A[i]$ and $A[i - 1] \le B[j]$

PS.1 For simplicity, we presume $A[i - 1]$, $B[j - 1]$, $A[i]$, $B[j]$ are always valid even if $i = 0$, $i = m$, $j = 0$ or $j = n$. How to deal with these edge values is covered at the end.

PS.2 Why $n \ge m$? Because we have to make sure $j$ is non-negative, since $0 \le i \le m$ and $j = \frac{m + n + 1}{2} - i$. If $n < m$, then $j$ may be negative, which leads to a wrong result.
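
The effect of the $n \ge m$ requirement can be seen by computing $j$ directly. This is a minimal sketch: `cutB` is a hypothetical name, and the division is integer division, as in the `halfLen` computation in the code at the end of this post.

```cpp
#include <cassert>

// Hypothetical helper: position of the cut in B for a given cut i in A,
// using integer (floor) division for (m + n + 1) / 2.
int cutB(int m, int n, int i) {
    assert(m >= 0 && n >= 0 && 0 <= i && i <= m);
    return (m + n + 1) / 2 - i;
}
```

For m = 1, n = 5, both possible cuts give a valid j (3 for i = 0, 2 for i = 1). For m = 5, n = 1, the cut i = 5 gives j = -2, which is not a valid position in B.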

So, all we need to do is:

  • Search for $i$ in $[0, m]$ to find a target $i$ such that:
    $B[j - 1] \le A[i]$ and $A[i - 1] \le B[j]$, where $j = \frac{m + n + 1}{2} - i$

And we can do a binary search following steps described below:

  1. Set iMin = 0, iMax = $m$, then start searching in [iMin, iMax]

  2. Set $i = \frac{\text{iMin} + \text{iMax}}{2}$, $j = \frac{m + n + 1}{2} - i$

  3. Now we have $\text{len}(\text{leftPart}) = \text{len}(\text{rightPart})$. And there are only 3 situations that we may encounter:

  • $B[j - 1] \le A[i]$ and $A[i - 1] \le B[j]$
    This means we have found the target $i$, so stop searching.
  • $B[j - 1] > A[i]$
    This means $A[i]$ is too small. We must adjust $i$ to get $B[j - 1] \le A[i]$.
    Can we increase $i$?
    Yes. Because when $i$ is increased, $j$ is decreased.
    So $B[j - 1]$ does not increase and $A[i]$ does not decrease (both arrays are sorted), and $B[j - 1] \le A[i]$ may be satisfied.
    Can we decrease $i$?
    No! Because when $i$ is decreased, $j$ is increased.
    So $B[j - 1]$ does not decrease and $A[i]$ does not increase, and $B[j - 1] \le A[i]$ will never be satisfied.
    So, set iMin = $i + 1$, and go to step 2.
  • $A[i - 1] > B[j]$
    This means $A[i - 1]$ is too big, and we must decrease $i$ to get $A[i - 1] \le B[j]$.
    That is, we must adjust the searching range to [iMin, $i - 1$].
    So, set iMax = $i - 1$, and go to step 2.
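
The three steps above can be sketched as a loop. To mirror the simplifying assumption in PS.1, this sketch treats a missing neighbour as $-\infty$ or $+\infty$ through two hypothetical helpers, `leftOf` and `rightOf`, which return `long long` sentinels. It assumes $m \le n$, sorted inputs, and at least one non-empty array. It is an illustration of the search, not the code given at the end of this post.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Value just left of cut k in v, or -infinity when the left side is empty.
long long leftOf(const std::vector<int>& v, int k) {
    return k == 0 ? LLONG_MIN : static_cast<long long>(v[k - 1]);
}

// Value just right of cut k in v, or +infinity when the right side is empty.
long long rightOf(const std::vector<int>& v, int k) {
    return k == static_cast<int>(v.size()) ? LLONG_MAX
                                           : static_cast<long long>(v[k]);
}

// Hypothetical helper: returns a cut i in A such that
// B[j-1] <= A[i] and A[i-1] <= B[j], where j = (m + n + 1) / 2 - i.
int findCut(const std::vector<int>& A, const std::vector<int>& B) {
    int m = static_cast<int>(A.size());
    int n = static_cast<int>(B.size());
    assert(m <= n && n > 0);

    int iMin = 0, iMax = m;              // step 1
    int halfLen = (m + n + 1) / 2;
    while (iMin <= iMax) {
        int i = (iMin + iMax) / 2;       // step 2
        int j = halfLen - i;
        if (leftOf(B, j) > rightOf(A, i)) {
            iMin = i + 1;                // A[i] is too small: increase i
        } else if (leftOf(A, i) > rightOf(B, j)) {
            iMax = i - 1;                // A[i-1] is too big: decrease i
        } else {
            return i;                    // both conditions hold
        }
    }
    return -1;  // not expected for sorted inputs with m <= n
}
```

For A = [1, 2] and B = [3, 4], the loop first tries i = 1 (too small, since B[0] = 3 > A[1] = 2) and then settles on i = 2, j = 0.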

When the target $i$ is found, the median is:

  • $\max(A[i - 1], B[j - 1])$, when $m + n$ is odd.
  • $\frac{\max(A[i - 1], B[j - 1]) + \min(A[i], B[j])}{2}$, when $m + n$ is even.
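
As a quick illustration of the two cases, the following sketch computes the median from a known cut. The name `medianFromCut` is hypothetical, and the function assumes an interior cut ($0 < i < m$ and $0 < j < n$) so that all four neighbouring elements exist.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: median once a valid cut (i, j) is known.
// Assumes 0 < i < m and 0 < j < n, so A[i-1], A[i], B[j-1], B[j] all exist.
double medianFromCut(const std::vector<int>& A, const std::vector<int>& B,
                     int i, int j) {
    int m = static_cast<int>(A.size());
    int n = static_cast<int>(B.size());
    assert(0 < i && i < m && 0 < j && j < n);

    int maxLeft = std::max(A[i - 1], B[j - 1]);
    if ((m + n) % 2 == 1) {
        return maxLeft;  // odd total: the left part holds the extra element
    }
    int minRight = std::min(A[i], B[j]);
    // Convert before adding so the sum cannot overflow int.
    return (static_cast<double>(maxLeft) + minRight) / 2.0;
}
```

For A = [1, 3, 5] and B = [2, 4, 6] with the cut i = 2, j = 1, the left part is {1, 3, 2} and the right part is {5, 4, 6}, so the result is (3 + 4) / 2 = 3.5.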

Now let’s consider the edge values $i = 0$, $i = m$, $j = 0$, $j = n$ where $A[i - 1]$, $B[j - 1]$, $A[i]$, $B[j]$ may not exist. Actually it is easier than you think.

What we need to do is ensure that $\max(\text{leftPart}) \le \min(\text{rightPart})$. So, if $i$ and $j$ are not edge values (meaning $A[i - 1]$, $B[j - 1]$, $A[i]$, $B[j]$ all exist), then we must check both $B[j - 1] \le A[i]$ and $A[i - 1] \le B[j]$. But if some of $A[i - 1]$, $B[j - 1]$, $A[i]$, $B[j]$ don’t exist, then we don’t need to check one (or both) of these two conditions. For example, if $i = 0$, then $A[i - 1]$ doesn’t exist, so we don’t need to check $A[i - 1] \le B[j]$. So, what we need to do is:

  • Search for $i$ in $[0, m]$ to find a target $i$ such that:
    ($j = 0$ or $i = m$ or $B[j - 1] \le A[i]$) and
    ($i = 0$ or $j = n$ or $A[i - 1] \le B[j]$), where $j = \frac{m + n + 1}{2} - i$

And in a searching loop, we will encounter only three situations:

  • ($j = 0$ or $i = m$ or $B[j - 1] \le A[i]$) and
    ($i = 0$ or $j = n$ or $A[i - 1] \le B[j]$)
    This means $i$ is perfect, and we can stop searching.

  • $j > 0$ and $i < m$ and $B[j - 1] > A[i]$
    This means $i$ is too small, and we must increase it.

  • $i > 0$ and $j < n$ and $A[i - 1] > B[j]$
    This means $i$ is too big, and we must decrease it.
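
The three situations, including the edge checks, can be written as a small classifier. This is a sketch under the assumption that $m \le n$; the enum `Cut` and the function `classifyCut` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <vector>

enum class Cut { Perfect, TooSmall, TooBig };

// Hypothetical helper: classifies a candidate cut i in A (with m <= n).
// Each array access is guarded by the edge checks listed above.
Cut classifyCut(const std::vector<int>& A, const std::vector<int>& B, int i) {
    int m = static_cast<int>(A.size());
    int n = static_cast<int>(B.size());
    assert(m <= n && 0 <= i && i <= m);

    int j = (m + n + 1) / 2 - i;
    if (j > 0 && i < m && B[j - 1] > A[i]) {
        return Cut::TooSmall;  // i must increase
    }
    if (i > 0 && j < n && A[i - 1] > B[j]) {
        return Cut::TooBig;    // i must decrease
    }
    return Cut::Perfect;       // neither violation applies
}
```

For A = [1, 2] and B = [3, 4], the cuts i = 0 and i = 1 are classified as too small and i = 2 as perfect. Swapping the contents (A = [3, 4], B = [1, 2]) makes i = 0 perfect and i = 1, i = 2 too big.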

Code: C++
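#include <algorithm>
#include <vector>

using namespace std;

class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        int m = nums1.size();
        int n = nums2.size();
        // Make nums1 the shorter array so that j = halfLen - i is never negative.
        // Swapping the arguments avoids copying or modifying the caller's vectors.
        if (m > n)
        {
            return findMedianSortedArrays(nums2, nums1);
        }
        int iMin = 0;
        int iMax = m;
        int halfLen = (m + n + 1) / 2;
        while (iMin <= iMax)
        {
            int i = (iMin + iMax) / 2;
            int j = halfLen - i;
            // i < iMax implies i < m and therefore j > 0 (because m <= n).
            if (i < iMax && nums2[j - 1] > nums1[i])
            {
                // i is too small, increase it.
                iMin = i + 1;
            }
            // i > iMin implies i > 0 and therefore j < n (because m <= n).
            else if (i > iMin && nums1[i - 1] > nums2[j])
            {
                // i is too big, decrease it.
                iMax = i - 1;
            }
            else
            {
                // i is perfect: compute max(leftPart), handling empty sides.
                int maxLeft = 0;
                if (i == 0)
                {
                    maxLeft = nums2[j - 1];
                }
                else if (j == 0)
                {
                    maxLeft = nums1[i - 1];
                }
                else
                {
                    maxLeft = max(nums1[i - 1], nums2[j - 1]);
                }
                if ((m + n) % 2 == 1)
                {
                    return maxLeft;
                }

                // Even total: also need min(rightPart), handling empty sides.
                int minRight = 0;
                if (i == m)
                {
                    minRight = nums2[j];
                }
                else if (j == n)
                {
                    minRight = nums1[i];
                }
                else
                {
                    minRight = min(nums2[j], nums1[i]);
                }
                // Convert before adding so the sum cannot overflow int.
                return (static_cast<double>(maxLeft) + minRight) / 2.0;
            }
        }
        // Not reached for valid (sorted, not both empty) inputs.
        return 0.0;
    }
};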
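
When testing the solution above, it can help to compare its output against a straightforward reference. The sketch below merges both arrays with `std::merge` and reads the median directly. It runs in $O(m + n)$ time and space, so it is a checking aid rather than a solution that meets the required bound. The name `medianByMerging` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical reference implementation for cross-checking results.
// Assumes both inputs are sorted and not both empty.
double medianByMerging(const std::vector<int>& a, const std::vector<int>& b) {
    assert(!a.empty() || !b.empty());

    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));

    std::size_t total = merged.size();
    if (total % 2 == 1) {
        return merged[total / 2];  // odd count: single middle element
    }
    // Even count: average of the two middle elements, computed as double.
    return (static_cast<double>(merged[total / 2 - 1]) + merged[total / 2]) / 2.0;
}
```

On the two examples from the problem statement, this returns 2.0 for [1, 3] and [2], and 2.5 for [1, 2] and [3, 4].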
Complexity Analysis
  • Time complexity: $O(\log(\min(m, n)))$.
    At first, the searching range is $[0, m]$. The length of this searching range is reduced by half after each loop, so we only need $\log(m)$ loops. Since we do a constant number of operations in each loop, the time complexity is $O(\log(m))$. Since $m \le n$, the time complexity is $O(\log(\min(m, n)))$.

  • Space complexity: $O(1)$.
    We only need constant memory to store local variables, so the space complexity is $O(1)$.
