Median of Two Sorted Arrays

There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

You may assume nums1 and nums2 cannot be both empty.

Example 1: nums1 = [1, 3], nums2 = [2]. The median is 2.0.

Example 2: nums1 = [1, 2], nums2 = [3, 4]. The median is (2 + 3)/2 = 2.5.

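Given two sorted arrays of lengths m and n, find the median. If the two arrays are merged into a new sorted array X, the total number of elements is N = m + n.

If N is even, return (X[N/2] + X[(N-1)/2]) / 2; if N is odd, return X[N/2] (indices are 0-based and the divisions inside the brackets are integer divisions). The required time complexity is O(log(m+n)).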
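
As a baseline, the definition above can be written out directly: merge the two arrays, then index into the result. The sketch below is only a reference for checking answers; it runs in O(m + n) and therefore does not meet the O(log(m+n)) requirement. The class and method names (MedianByMerge, median) are made up for this illustration.

```java
// Reference implementation of the definition: merge, then index.
// O(m + n) time and space, so this is a correctness baseline only.
final class MedianByMerge {
    static double median(int[] a, int[] b) {
        int n = a.length + b.length;
        int[] x = new int[n];
        int i = 0, j = 0, k = 0;
        // Standard two-pointer merge of two sorted arrays.
        while (i < a.length && j < b.length) {
            x[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) x[k++] = a[i++];
        while (j < b.length) x[k++] = b[j++];
        // For odd N the two indices coincide, so one formula covers both cases.
        // The cast makes the addition happen in double, avoiding int overflow.
        return ((double) x[n / 2] + x[(n - 1) / 2]) / 2;
    }
}
```

For the two examples above, MedianByMerge.median returns 2.0 and 2.5.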

Reference: https://blog.csdn.net/yutianzuijin/article/details/11499917/

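Solution 1

Suppose we have a method that returns the k-th smallest element of the two sorted arrays. The median then follows directly: for an odd total m+n it is the element with k = (m+n)/2 + 1, and for an even total it is the average of the elements with k = (m+n)/2 and k = (m+n)/2 + 1.

Assume that A and B each contain more than k/2 elements, and compare A[k/2-1] with B[k/2-1]. These are the (k/2)-th smallest element of A and the (k/2)-th smallest element of B, respectively.

The comparison has three possible outcomes: >, < and =. If A[k/2-1] < B[k/2-1], then the elements A[0] through A[k/2-1] all lie among the k smallest elements of the merged array. In other words, A[k/2-1] cannot be larger than the k-th smallest value of the merged array, so these elements can be discarded.

The proof is simple and goes by contradiction. Suppose A[k/2-1] is larger than the k-th smallest merged value; say it is the (k+1)-th smallest. Since A[k/2-1] < B[k/2-1], B[k/2-1] is at least the (k+2)-th smallest. But at most k/2-1 elements of A are smaller than A[k/2-1], and at most k/2-1 elements of B are smaller than A[k/2-1], so at most k/2 + k/2 - 2 elements are smaller than A[k/2-1]. That is fewer than k, which contradicts A[k/2-1] being the (k+1)-th smallest.

A symmetric conclusion holds when A[k/2-1] > B[k/2-1].

When A[k/2-1] = B[k/2-1], we have already found the k-th smallest element: it is this common value, call it v. A and B each contain k/2-1 elements smaller than v, so v is the k-th smallest. (One may object that this does not hold exactly when k is odd. The description here is idealized; the actual code differs slightly: it takes k/2 elements from one array and k - k/2 from the other.)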

Boundary conditions:

  • If A or B is empty, return B[k-1] or A[k-1] directly;
  • If k is 1, return the smaller of A[0] and B[0];
  • If A[k/2-1] = B[k/2-1], return either one.

public double findMedianSortedArrays(int[] nums1, int[] nums2) {
    int m = nums1.length;
    int n = nums2.length;
    int total = m + n;

    // Odd total: the median is the (total/2 + 1)-th smallest element.
    if ((total & 0x01) == 1)
        return findKth(nums1, 0, m, nums2, 0, n, total / 2 + 1);
    // Even total: average the (total/2)-th and (total/2 + 1)-th smallest elements.
    else
        return (findKth(nums1, 0, m, nums2, 0, n, total / 2) + findKth(nums1, 0, m, nums2, 0, n, total / 2 + 1)) / 2;
}

// Returns the k-th smallest (1-indexed) element among
// nums1[l1 .. l1+m-1] and nums2[l2 .. l2+n-1].
private double findKth(int[] nums1, int l1, int m, int[] nums2, int l2, int n, int k) {
    // Ensure m <= n by swapping the two arrays if necessary.
    if (m > n)
        return findKth(nums2, l2, n, nums1, l1, m, k);

    // The shorter array is exhausted, so the answer lies in the other one.
    if (m == 0)
        return nums2[l2 + k - 1];

    if (k == 1)
        return Math.min(nums1[l1], nums2[l2]);

    // Take pa elements from nums1 and pb from nums2, with pa + pb = k.
    int pa = Math.min(k / 2, m);
    int pb = k - pa;
    if (nums1[l1 + pa - 1] < nums2[l2 + pb - 1])
        // The first pa elements of nums1 are among the k smallest: discard them.
        return findKth(nums1, l1 + pa, m - pa, nums2, l2, n, k - pa);
    else if (nums1[l1 + pa - 1] > nums2[l2 + pb - 1])
        // The first pb elements of nums2 are among the k smallest: discard them.
        return findKth(nums1, l1, m, nums2, l2 + pb, n - pb, k - pb);
    else
        // Equal values: this common value is the k-th smallest.
        return nums1[l1 + pa - 1];
}
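
The same elimination idea can also be written as a loop instead of a recursion. The following is a minimal sketch, not the code from the referenced post: it clamps k/2 to the remaining length of each array separately, and the names KthIterative, kth and median are hypothetical.

```java
// Iterative k-th smallest by discarding up to k/2 elements per step.
final class KthIterative {
    // Returns the k-th smallest (1-indexed) element of two sorted arrays.
    // Assumes 1 <= k <= a.length + b.length.
    static int kth(int[] a, int[] b, int k) {
        int ia = 0, ib = 0; // start of the not-yet-discarded part of each array
        while (true) {
            if (ia == a.length) return b[ib + k - 1];
            if (ib == b.length) return a[ia + k - 1];
            if (k == 1) return Math.min(a[ia], b[ib]);
            // Candidate counts, clamped to what is left in each array.
            int pa = Math.min(k / 2, a.length - ia);
            int pb = Math.min(k / 2, b.length - ib);
            if (a[ia + pa - 1] <= b[ib + pb - 1]) {
                // At most pa + pb - 2 <= k - 2 elements precede a[ia+pa-1],
                // so the first pa elements of a can be dropped safely.
                ia += pa;
                k -= pa;
            } else {
                ib += pb;
                k -= pb;
            }
        }
    }

    static double median(int[] a, int[] b) {
        int total = a.length + b.length;
        if (total % 2 == 1) return kth(a, b, total / 2 + 1);
        return ((double) kth(a, b, total / 2) + kth(a, b, total / 2 + 1)) / 2;
    }
}
```

Each iteration removes about k/2 elements from consideration, so the loop runs O(log k) times without using the call stack.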

Solution 2

Approach 1: Recursive Approach

The concept of the median

To solve this problem, we need to understand what the median is used for. In statistics, the median is used for:

Dividing a set into two subsets of equal length, such that every element of one subset is greater than or equal to every element of the other.

Once we understand this use of the median for dividing, we are very close to the answer.

Cutting array A: there are m+1 possible cuts

First, let's cut A into two parts at an arbitrary position i:

          left_A             |        right_A
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]

Since A has m elements, there are m+1 possible cuts (i = 0∼m).

For a given cut, the left side has i elements and the right side has m-i elements

And we know:

len(left_A)=i,len(right_A)=m−i.

Note: when i = 0, left_A is empty, and when i = m, right_A is empty.

In the same way, cut B into two parts at an arbitrary position j:


          left_B             |        right_B
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

Put left_A and left_B into one set, and put right_A and right_B into another set.

Let's name them left_part and right_part:

After cutting A and B, the left parts of A and B together form left_part; likewise, the right parts form right_part:

          left_part          |        right_part
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

If we can ensure:

  1. len(left_part)=len(right_part)
  2. max(left_part)≤min(right_part)

then we have divided all elements in {A,B} into two parts of equal length, and every element of one part is greater than or equal to every element of the other.

In other words, when the left and right parts have the same length and the largest value on the left is no greater than the smallest value on the right:

median=(max(left_part)+min(right_part))/2
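
To make the two conditions concrete, here is a small sketch that tests them for a given pair of cuts (i, j). It is an illustration only: the class PartitionCheck and its methods are hypothetical, and the long sentinels used for cuts at an array edge are an assumption of this sketch.

```java
// Checks the two partition conditions for cuts i (in A) and j (in B).
final class PartitionCheck {
    // Condition 1: both parts contain the same number of elements.
    static boolean sameLength(int[] a, int[] b, int i, int j) {
        return i + j == (a.length - i) + (b.length - j);
    }

    // Condition 2: max(left_part) <= min(right_part).
    // A missing neighbor at an array edge is replaced by a long sentinel,
    // which can never collide with a real int value.
    static boolean ordered(int[] a, int[] b, int i, int j) {
        long maxLeft = Math.max(i == 0 ? Long.MIN_VALUE : a[i - 1],
                                j == 0 ? Long.MIN_VALUE : b[j - 1]);
        long minRight = Math.min(i == a.length ? Long.MAX_VALUE : a[i],
                                 j == b.length ? Long.MAX_VALUE : b[j]);
        return maxLeft <= minRight;
    }
}
```

With A = [1, 2] and B = [3, 4], the cuts i = 2, j = 0 satisfy both conditions (left_part = {1, 2}, right_part = {3, 4}, median (2 + 3)/2 = 2.5), while i = 1, j = 1 satisfies only the length condition because max(left_part) = 3 exceeds min(right_part) = 2.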

To ensure these two conditions, we just need to ensure the following. Ignoring edge cases for now, equal lengths correspond to condition 1, and max(left_part)≤min(right_part) corresponds to condition 2:

  1. i + j = m - i + n - j (or: m - i + n - j + 1)
    if n≥m, we just need to set: i=0∼m, j=(m+n+1)/2−i (integer division)

  2. B[j−1]≤A[i] and A[i−1]≤B[j]

ps.1 For simplicity, I presume A[i−1], B[j−1], A[i], B[j] are always valid even if i=0, i=m, j=0, or j=n. I will talk about how to deal with these edge values at the end.

ps.2 Why n≥m? Because I have to make sure j is non-negative, since 0≤i≤m and j=(m+n+1)/2−i. If n < m, then j may be negative, which would lead to a wrong result.
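
The effect of the n≥m requirement is easy to see numerically. The sketch below just evaluates j = (m+n+1)/2 − i with Java's integer division; the helper names (CutIndex, jFor, minJ) are invented for this example.

```java
// Evaluates the cut position j in B that corresponds to a cut i in A.
final class CutIndex {
    // j = (m + n + 1) / 2 - i, using integer division.
    static int jFor(int m, int n, int i) {
        return (m + n + 1) / 2 - i;
    }

    // Smallest j reached while i ranges over [0, m]; it occurs at i = m.
    static int minJ(int m, int n) {
        return jFor(m, n, m);
    }
}
```

For m = 2, n = 5 the smallest j is 2, so every cut is valid. For m = 5, n = 2 the smallest j is -1, which is why the shorter array must play the role of A.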

So, all we need to do is:

Search for i in [0, m], with j=(m+n+1)/2−i determined by i, such that:

B[j−1]≤A[i] and A[i−1]≤B[j]

And we can do a binary search following the steps described below:

  1. Set imin=0, imax=m, then start searching in [imin,imax]

  2. Set i = (imin+imax)/2, j = (m+n+1)/2−i;

  3. Now we have len(left_part)=len(right_part). And there are only 3 situations that we may encounter:

    • B[j−1]≤A[i] and A[i−1]≤B[j]
      Means we have found the object i, so stop searching.

    • B[j−1]>A[i]
      Means A[i] is too small. We must adjust i to get B[j−1]≤A[i].
      Can we increase i?
            Yes. Because when i is increased, j will be decreased.
            So B[j−1] is decreased and A[i] is increased, and B[j−1]≤A[i] may be satisfied.
      Can we decrease i?
            No! Because when i is decreased, j will be increased.
            So B[j−1] is increased and A[i] is decreased, and B[j−1]≤A[i] will never be satisfied.
      So we must increase i. That is, we must adjust the searching range to [i+1,imax].
      So, set imin=i+1, and goto 2.

    • A[i−1]>B[j]:
      Means A[i−1] is too big. And we must decrease i to get A[i−1]≤B[j].
      That is, we must adjust the searching range to [imin,i−1].
      So, set imax=i−1, and goto 2.

When the object i is found, the median is:

max(A[i−1],B[j−1]), when m + n is odd

(max(A[i−1],B[j−1])+min(A[i],B[j]))/2, when m + n is even

Now let's consider the edge values i=0, i=m, j=0, j=n, where A[i−1], B[j−1], A[i], B[j] may not exist.

Actually this situation is easier than you think.

What we need to do is ensure that max(left_part)≤min(right_part). So, if i and j are not edge values (meaning A[i−1], B[j−1], A[i], B[j] all exist), then we must check both B[j−1]≤A[i] and A[i−1]≤B[j].

But if some of A[i−1], B[j−1], A[i], B[j] don't exist, then we don't need to check one (or both) of these two conditions. For example, if i=0, then A[i−1] doesn't exist, so we don't need to check A[i−1]≤B[j]. So, what we need to do is:

Search for i in [0, m] such that:

(j = 0 or i = m or B[j−1]≤A[i]) and
(i = 0 or j = n or A[i−1]≤B[j]), where j=(m+n+1)/2−i

And in a searching loop, we will encounter only three situations:

  1. (j = 0 or i = m or B[j−1]≤A[i]) and (i = 0 or j = n or A[i−1]≤B[j])
    Means i is perfect, we can stop searching.
  2. j > 0 and i < m and B[j−1]>A[i]
    Means i is too small, we must increase it.
  3. i > 0 and j < n and A[i−1]>B[j]
    Means i is too big, we must decrease it.

Thanks to @Quentin.chen for pointing out that: i<m⟹j>0 and i>0⟹j<n. Because:

m≤n, i<m ⟹ j=(m+n+1)/2−i > (m+n+1)/2−m ≥ (2m+1)/2−m ≥ 0

m≤n, i>0 ⟹ j=(m+n+1)/2−i < (m+n+1)/2 ≤ (2n+1)/2 ≤ n

So in situations 2 and 3, we don't need to check whether j > 0 and whether j < n.
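
The two implications can also be checked by brute force for small sizes. This is only a sanity check, not a proof; the class BoundCheck and the method holdsUpTo are hypothetical names.

```java
// Exhaustive check of: i < m implies j > 0, and i > 0 implies j < n,
// where j = (m + n + 1) / 2 - i and m <= n.
final class BoundCheck {
    static boolean holdsUpTo(int limit) {
        for (int n = 0; n <= limit; n++) {
            for (int m = 0; m <= n; m++) {          // only m <= n is claimed
                for (int i = 0; i <= m; i++) {
                    int j = (m + n + 1) / 2 - i;    // integer division
                    if (i < m && j <= 0) return false;
                    if (i > 0 && j >= n) return false;
                }
            }
        }
        return true;
    }
}
```

BoundCheck.holdsUpTo(30) returns true. If the restriction m ≤ n is dropped the claim fails: with m = 3, n = 1, i = 2 we get j = 0 although i < m.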

Code:

class Solution {
    public double findMedianSortedArrays(int[] A, int[] B) {
        int m = A.length;
        int n = B.length;
        if (m > n) { // to ensure m<=n
            int[] temp = A; A = B; B = temp;
            int tmp = m; m = n; n = tmp;
        }
        int iMin = 0, iMax = m, halfLen = (m + n + 1) / 2;
        while (iMin <= iMax) {
            int i = (iMin + iMax) / 2;
            int j = halfLen - i;
            if (i < iMax && B[j-1] > A[i]){
                iMin = i + 1; // i is too small
            }
            else if (i > iMin && A[i-1] > B[j]) {
                iMax = i - 1; // i is too big
            }
            else { // i is perfect
                int maxLeft = 0;
                if (i == 0) { maxLeft = B[j-1]; }
                else if (j == 0) { maxLeft = A[i-1]; }
                else { maxLeft = Math.max(A[i-1], B[j-1]); }
                if ( (m + n) % 2 == 1 ) { return maxLeft; }

                int minRight = 0;
                if (i == m) { minRight = B[j]; }
                else if (j == n) { minRight = A[i]; }
                else { minRight = Math.min(B[j], A[i]); }

                return (maxLeft + minRight) / 2.0;
            }
        }
        return 0.0;
    }
}

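Solution 3

Let the two arrays be X = X[0:x-1] and Y = Y[0:y-1]. Find a partition (px, py) of X and Y such that

px+py=(x+y+1)/2 and X[:px-1]<=Y[py:] && Y[:py-1]<=X[px:]

Here X[:px-1] denotes the elements X[0], ..., X[px-1] and Y[py:] the elements Y[py], ..., Y[y-1]. The condition says that everything to the left of px in X is no greater than everything from py onward in Y, and everything to the left of py in Y is no greater than everything from px onward in X. Because px+py=(x+y+1)/2, the median must then lie right at the cuts px and py.

Assume X is the shorter of the two arrays (x <= y; the code below swaps the arguments to guarantee this). Then for every px in [0, x], the matching py=(x+y+1)/2-px is a valid cut position in Y, that is, 0 <= py <= y.

Search for px in X, and use px+py=(x+y+1)/2 to obtain py in Y.

If X[:px-1]<=Y[py:] && Y[:py-1]<=X[px:] holds, the median has been found.

If max(X[:px-1])>min(Y[py:]), px is too far to the right and must move left: px decreases, and py increases accordingly.

As px decreases, max(X[:px-1]) decreases; as py increases, min(Y[py:]) increases. So as px moves, some position is reached where X[:px-1]<=Y[py:] holds. Symmetrically, if max(Y[:py-1])>min(X[px:]), px is too far to the left and must increase.

Code: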

public double findMedianSortedArrays(int[] nums1, int[] nums2) {
    // If one array is empty, the answer is simply the median of the other.
    // The cast makes the addition happen in double, avoiding int overflow.
    if (nums1 == null || nums1.length < 1)
        return ((double) nums2[nums2.length / 2] + nums2[(nums2.length - 1) / 2]) / 2;
    if (nums2 == null || nums2.length < 1)
        return ((double) nums1[nums1.length / 2] + nums1[(nums1.length - 1) / 2]) / 2;
    int x = nums1.length;
    int y = nums2.length;
    // Binary-search the shorter array so that partitionY stays within [0, y].
    if (y < x)
        return findMedianSortedArrays(nums2, nums1);

    int left = 0, right = x, partitionX, partitionY;
    while (left <= right) {
        partitionX = (left + right) / 2;
        partitionY = (x + y + 1) / 2 - partitionX;
        // Sentinels stand in for a missing neighbor when a cut is at an array edge.
        int maxLeftX = (partitionX == 0) ? Integer.MIN_VALUE : nums1[partitionX - 1];
        int minRightX = (partitionX == x) ? Integer.MAX_VALUE : nums1[partitionX];

        int maxLeftY = (partitionY == 0) ? Integer.MIN_VALUE : nums2[partitionY - 1];
        int minRightY = (partitionY == y) ? Integer.MAX_VALUE : nums2[partitionY];

        if (maxLeftX <= minRightY && maxLeftY <= minRightX) {
            if ((x + y) % 2 == 0) {
                return ((double) Math.max(maxLeftX, maxLeftY) + Math.min(minRightX, minRightY)) / 2;
            } else {
                return Math.max(maxLeftX, maxLeftY);
            }
        } else if (maxLeftX > minRightY) {
            right = partitionX - 1; // partitionX is too far to the right
        } else {
            left = partitionX + 1;  // partitionX is too far to the left
        }
    }
    // Not reached when both inputs are sorted.
    return Double.MIN_VALUE;
}
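
One way to gain confidence in a partition search like the one above is to compare it against a sort-based reference on random inputs. The sketch below is a self-contained variant written for that purpose, not the code above verbatim: it uses long sentinels and throws instead of returning a placeholder, and all names (MedianCrossCheck, byPartition, bySorting, agreeOnRandomInputs) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class MedianCrossCheck {
    // Partition search with long sentinels, so a sentinel can never equal a real int.
    static double byPartition(int[] a, int[] b) {
        if (a.length + b.length == 0) throw new IllegalArgumentException("both arrays are empty");
        if (a.length > b.length) return byPartition(b, a); // search the shorter array
        int x = a.length, y = b.length;
        int left = 0, right = x;
        while (left <= right) {
            int px = (left + right) / 2;
            int py = (x + y + 1) / 2 - px;
            long maxLeftX = (px == 0) ? Long.MIN_VALUE : a[px - 1];
            long minRightX = (px == x) ? Long.MAX_VALUE : a[px];
            long maxLeftY = (py == 0) ? Long.MIN_VALUE : b[py - 1];
            long minRightY = (py == y) ? Long.MAX_VALUE : b[py];
            if (maxLeftX <= minRightY && maxLeftY <= minRightX) {
                long maxLeft = Math.max(maxLeftX, maxLeftY);
                if ((x + y) % 2 == 1) return maxLeft;
                return (maxLeft + Math.min(minRightX, minRightY)) / 2.0;
            } else if (maxLeftX > minRightY) {
                right = px - 1; // px is too far to the right
            } else {
                left = px + 1;  // px is too far to the left
            }
        }
        throw new IllegalArgumentException("inputs must be sorted");
    }

    // Reference answer: concatenate, sort, and apply the definition of the median.
    static double bySorting(int[] a, int[] b) {
        int[] all = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        Arrays.sort(all);
        int n = all.length;
        return ((double) all[n / 2] + all[(n - 1) / 2]) / 2;
    }

    // Compares both methods on small random sorted arrays (b is never empty).
    static boolean agreeOnRandomInputs(int rounds, long seed) {
        Random rnd = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            int[] a = rnd.ints(rnd.nextInt(6), -20, 20).sorted().toArray();
            int[] b = rnd.ints(1 + rnd.nextInt(6), -20, 20).sorted().toArray();
            if (byPartition(a, b) != bySorting(a, b)) return false;
        }
        return true;
    }
}
```

Small arrays with a narrow value range are used on purpose: they produce many duplicates and empty-array cases, which is where partition searches tend to go wrong.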

 
