Median of Two Sorted Arrays

There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

You may assume nums1 and nums2 cannot be both empty.

Example 1: nums1 = [1, 3], nums2 = [2]. The median is 2.0.

Example 2: nums1 = [1, 2], nums2 = [3, 4]. The median is (2 + 3)/2 = 2.5.
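As a baseline for checking the two examples, here is a minimal merge-based sketch. It runs in O(m+n), so it does not meet the O(log (m+n)) requirement; it is only meant as a reference to compare faster solutions against. The class name MergeMedianBaseline and the method medianByMerging are hypothetical names chosen for this illustration.

```java
final class MergeMedianBaseline {
    // Hypothetical helper: merges both sorted arrays, then reads the middle.
    // Assumes the arrays are sorted ascending and not both empty.
    static double medianByMerging(int[] nums1, int[] nums2) {
        int[] merged = new int[nums1.length + nums2.length];
        int a = 0, b = 0, k = 0;
        while (a < nums1.length && b < nums2.length) {
            merged[k++] = (nums1[a] <= nums2[b]) ? nums1[a++] : nums2[b++];
        }
        while (a < nums1.length) merged[k++] = nums1[a++];
        while (b < nums2.length) merged[k++] = nums2[b++];

        int total = merged.length;
        if (total % 2 == 1) {
            return merged[total / 2];
        }
        // Cast one operand to double so the sum cannot overflow int.
        return (merged[total / 2 - 1] + (double) merged[total / 2]) / 2;
    }

    public static void main(String[] args) {
        System.out.println(medianByMerging(new int[]{1, 3}, new int[]{2}));    // Example 1: 2.0
        System.out.println(medianByMerging(new int[]{1, 2}, new int[]{3, 4})); // Example 2: 2.5
    }
}
```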








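The proof is straightforward and can be done by contradiction. Suppose A[k/2-1] is greater than the k-th smallest element of the merged array; without loss of generality, say it is the (k+1)-th smallest. Since A[k/2-1] is less than B[k/2-1], B[k/2-1] is at least the (k+2)-th smallest. However, at most k/2-1 elements of A are smaller than A[k/2-1], and at most k/2-1 elements of B are smaller than A[k/2-1], so at most k/2 + k/2 - 2 elements are smaller than A[k/2-1]. That is fewer than k, which contradicts A[k/2-1] being the (k+1)-th smallest element.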
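The claim behind this proof (when A[k/2-1] < B[k/2-1], the element A[k/2-1] cannot lie beyond the k-th smallest element of the merged array) can be spot-checked on concrete inputs. The sketch below is only an illustration, not part of the algorithm; DiscardStepCheck, kthByMerging and claimHolds are hypothetical names, and the check assumes k >= 2 with k/2 no larger than either array length.

```java
import java.util.Arrays;

final class DiscardStepCheck {
    // k-th smallest (1-based) of the merged arrays, found by sorting a copy.
    static int kthByMerging(int[] a, int[] b, int k) {
        int[] merged = new int[a.length + b.length];
        System.arraycopy(a, 0, merged, 0, a.length);
        System.arraycopy(b, 0, merged, a.length, b.length);
        Arrays.sort(merged);
        return merged[k - 1];
    }

    // Returns true when the claim holds for this input,
    // or when its premise A[k/2-1] < B[k/2-1] is not met.
    // Assumes k >= 2, k/2 <= a.length and k/2 <= b.length.
    static boolean claimHolds(int[] a, int[] b, int k) {
        int pivotA = a[k / 2 - 1];
        int pivotB = b[k / 2 - 1];
        if (pivotA >= pivotB) {
            return true; // premise not met, nothing to check
        }
        return pivotA <= kthByMerging(a, b, k);
    }
}
```

For A = [1, 3, 5, 7], B = [2, 4, 6, 8] and k = 4, the pivots are A[1] = 3 and B[1] = 4, and the 4th smallest merged element is 4, so A[0..1] can be discarded safely.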




  • If A or B is empty, return B[k-1] or A[k-1] directly;
  • If k is 1, return the smaller of A[0] and B[0];
  • If A[k/2-1] = B[k/2-1], return either one.
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
	int m = nums1.length;
	int n = nums2.length;
	int total = m + n;
	// Odd total: the median is the (total/2 + 1)-th smallest element.
	if((total & 0x01) == 1)
		return findKth(nums1, 0, m, nums2, 0, n, total/2+1);
	// Even total: average the two middle elements.
	return (findKth(nums1, 0, m, nums2, 0, n, total/2) + findKth(nums1, 0, m, nums2, 0, n, total/2+1))/2;
}

// Returns the k-th smallest (1-based) element among
// nums1[l1 .. l1+m-1] and nums2[l2 .. l2+n-1].
private double findKth(int[] nums1, int l1, int m, int[] nums2, int l2, int n, int k) {
	// Keep the first range the shorter one so that pa never exceeds m.
	if(m > n)
		return findKth(nums2, l2, n, nums1, l1, m, k);
	if(m == 0)
		return nums2[l2+k-1];
	if(k == 1)
		return Math.min(nums1[l1], nums2[l2]);
	int pa = Math.min(k/2, m);
	int pb = k - pa;
	if(nums1[l1+pa-1] < nums2[l2+pb-1])
		// The first pa elements of nums1 cannot contain the answer: discard them.
		return findKth(nums1, l1+pa, m-pa, nums2, l2, n, k-pa);
	else if(nums1[l1+pa-1] > nums2[l2+pb-1])
		// The first pb elements of nums2 cannot contain the answer: discard them.
		return findKth(nums1, l1, m, nums2, l2+pb, n-pb, k-pb);
	else
		return nums1[l1+pa-1];
}
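The same discard-half-of-k idea can also be written as a loop instead of recursion. The following is a minimal sketch of that variant, not the code above: it probes k/2 elements on each side (capped by what is left) rather than using pa and pb with pa + pb = k. KthIterative, kthSmallest and median are hypothetical names, and the sketch assumes sorted input with 1 <= k <= a.length + b.length.

```java
final class KthIterative {
    // Returns the k-th smallest (1-based) element of the two sorted arrays.
    static int kthSmallest(int[] a, int[] b, int k) {
        int ia = 0, ib = 0; // start offsets of the parts not yet discarded
        while (true) {
            if (ia == a.length) return b[ib + k - 1];
            if (ib == b.length) return a[ia + k - 1];
            if (k == 1) return Math.min(a[ia], b[ib]);

            int half = k / 2;
            int na = Math.min(ia + half, a.length) - 1; // last index probed in a
            int nb = Math.min(ib + half, b.length) - 1; // last index probed in b
            if (a[na] <= b[nb]) {
                // a[ia..na] all lie before the k-th smallest: discard them.
                k -= na - ia + 1;
                ia = na + 1;
            } else {
                // b[ib..nb] all lie before the k-th smallest: discard them.
                k -= nb - ib + 1;
                ib = nb + 1;
            }
        }
    }

    // Median built on kthSmallest; assumes the arrays are not both empty.
    static double median(int[] a, int[] b) {
        int total = a.length + b.length;
        if (total % 2 == 1) {
            return kthSmallest(a, b, total / 2 + 1);
        }
        return (kthSmallest(a, b, total / 2) + (double) kthSmallest(a, b, total / 2 + 1)) / 2;
    }
}
```

Each pass removes about k/2 candidates, so the number of passes is logarithmic in k, the same bound as the recursive form.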


Approach 1: Recursive Approach


To solve this problem, we need to understand what the median is used for. In statistics, the median is used for:

Dividing a set into two equal-length subsets such that every element of one subset is greater than or equal to every element of the other.

If we understand the use of median for dividing, we are very close to the answer.


First let's cut A into two parts at a random position i:

          left_A             |        right_A
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]

Since A has m elements, there are m+1 ways to cut it (i = 0∼m).


And we know:

    len(left_A) = i, len(right_A) = m − i
Note: when i = 0, left_A is empty, and when i = m, right_A is empty.

In the same way, cut B into two parts at a random position j:

          left_B             |        right_B
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

Put left_A and left_B into one set, and put right_A and right_B into another set.

Let's name them left_part and right_part:


          left_part          |        right_part
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

If we can ensure: 



  1. len(left_part)=len(right_part)
  2. max(left_part)≤min(right_part)

then we divide all elements in {A,B} into two parts with equal length, and one part is always greater than the other.

Then median = (max(left_part) + min(right_part))/2
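The two conditions can be checked directly for any given cut. The sketch below is an illustration under stated assumptions, not the search itself: CutCheck and its methods are hypothetical names, an empty side of a cut is represented by Long.MIN_VALUE or Long.MAX_VALUE, and the equal-length condition is taken literally (so the total length is even and non-zero).

```java
final class CutCheck {
    // Largest element of left_part; Long.MIN_VALUE stands in for an empty side.
    static long maxLeft(int[] a, int[] b, int i, int j) {
        long la = (i > 0) ? a[i - 1] : Long.MIN_VALUE;
        long lb = (j > 0) ? b[j - 1] : Long.MIN_VALUE;
        return Math.max(la, lb);
    }

    // Smallest element of right_part; Long.MAX_VALUE stands in for an empty side.
    static long minRight(int[] a, int[] b, int i, int j) {
        long ra = (i < a.length) ? a[i] : Long.MAX_VALUE;
        long rb = (j < b.length) ? b[j] : Long.MAX_VALUE;
        return Math.min(ra, rb);
    }

    // True when the cut (i, j) satisfies both conditions.
    // Assumes 0 <= i <= a.length and 0 <= j <= b.length.
    static boolean isMedianCut(int[] a, int[] b, int i, int j) {
        int leftLen = i + j;
        int rightLen = (a.length - i) + (b.length - j);
        if (leftLen != rightLen) {
            return false;                                   // condition 1 fails
        }
        return maxLeft(a, b, i, j) <= minRight(a, b, i, j); // condition 2
    }

    // Median for a cut accepted by isMedianCut.
    static double medianOfCut(int[] a, int[] b, int i, int j) {
        return (maxLeft(a, b, i, j) + minRight(a, b, i, j)) / 2.0;
    }
}
```

With A = [1, 2] and B = [3, 4], the cut i = 2, j = 0 passes both checks and gives 2.5, while i = 1, j = 1 has equal lengths but fails the ordering check because max(left_part) = 3 exceeds min(right_part) = 2.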

To ensure these two conditions, we just need to ensure:


  1. i + j = m − i + n − j (or: m − i + n − j + 1)
    if n≥m, we just need to set: i = 0∼m, j = (m+n+1)/2 − i

  2. B[j−1]≤A[i] and A[i−1]≤B[j]

ps.1 For simplicity, I presume A[i−1], B[j−1], A[i], B[j] are always valid even if i=0, i=m, j=0, or j=n. I will talk about how to deal with these edge values at the end.

ps.2 Why n≥m? Because I have to make sure j is non-negative, since 0≤i≤m and j = (m+n+1)/2 − i. If n < m, then j may be negative, which leads to a wrong result.
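A quick way to see the role of n≥m is to evaluate j = (m+n+1)/2 − i at i = m, where j is smallest. The sketch below does only that arithmetic; CutIndexRange, pairedJ and smallestJ are hypothetical names, and the division is Java integer division.

```java
final class CutIndexRange {
    // j paired with i so that the left part holds (m + n + 1) / 2 elements.
    static int pairedJ(int m, int n, int i) {
        return (m + n + 1) / 2 - i;
    }

    // Smallest j reached while i runs over 0..m (attained at i = m).
    static int smallestJ(int m, int n) {
        return pairedJ(m, n, m);
    }

    public static void main(String[] args) {
        System.out.println(smallestJ(1, 5)); // n >= m: prints 2
        System.out.println(smallestJ(5, 1)); // n <  m: prints -2
    }
}
```

With m = 5 and n = 1 the index j drops to −2, which is why the shorter array is the one that gets searched.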

So, all we need to do is:

In other words, the problem reduces to: with i ranging over 0∼m and j = (m+n+1)/2 − i, find i and j such that B[j−1]≤A[i] and A[i−1]≤B[j].

Searching i in [0, m], to find an object i such that:

B[j−1]≤A[i] and A[i−1]≤B[j], where j = (m+n+1)/2 − i

And we can do a binary search following steps described below:

  1. Set imin=0, imax=m, then start searching in [imin,imax]

  2. Set i = (imin+imax)/2, j = (m+n+1)/2 − i;

  3. Now we have len(left_part)=len(right_part). And there are only 3 situations that we may encounter:

    • B[j−1]≤A[i] and A[i−1]≤B[j]
      Means we have found the object i, so stop searching.

    • B[j−1]>A[i]
      Means A[i] is too small. We must adjust i to get B[j−1]≤A[i].
      Can we increase i?
            Yes. Because when i is increased, j will be decreased.
            So B[j−1] is decreased and A[i] is increased, and B[j−1]≤A[i] may be satisfied.
      Can we decrease i?
            No! Because when i is decreased, j will be increased.
            So B[j−1] is increased and A[i] is decreased, and B[j−1]≤A[i] will never be satisfied.
      So we must increase i. That is, we must adjust the searching range to [i+1,imax].
      So, set imin=i+1, and goto 2.

    • A[i−1]>B[j]:
      Means A[i−1] is too big. And we must decrease i to get A[i−1]≤B[j].
      That is, we must adjust the searching range to [imin,i−1].
      So, set imax=i−1, and goto 2.

When the object i is found, the median is:

max(A[i−1],B[j−1]), when m + n is odd

(max(A[i−1],B[j−1])+min(A[i],B[j]))/2, when m + n is even

Now let's consider the edge values i=0, i=m, j=0, j=n, where A[i−1], B[j−1], A[i], B[j] may not exist.

Actually this situation is easier than you think.

What we need to do is ensure that max(left_part)≤min(right_part). So, if i and j are not edge values (meaning A[i−1], B[j−1], A[i], B[j] all exist), then we must check both B[j−1]≤A[i] and A[i−1]≤B[j].

But if some of A[i−1],B[j−1],A[i],B[j] don't exist, then we don't need to check one (or both) of these two conditions. For example, if i=0, then A[i−1] doesn't exist, then we don't need to check A[i−1]≤B[j]. So, what we need to do is:

Searching i in [0, m], to find an object i such that:

(j = 0 or i = m or B[j−1]≤A[i]) and
(i = 0 or j = n or A[i−1]≤B[j]), where j = (m+n+1)/2 − i

And in a searching loop, we will encounter only three situations:

  1. (j = 0 or i = m or B[j−1]≤A[i]) and (i = 0 or j = n or A[i−1]≤B[j])
    Means i is perfect, we can stop searching.
  2. j > 0 and i < m and B[j−1]>A[i]
    Means i is too small, we must increase it.
  3. i > 0 and j < n and A[i−1]>B[j]
    Means i is too big, we must decrease it.

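Thanks to @Quentin.chen for pointing out that: i<m⟹j>0 and i>0⟹j<n. Because:

    m≤n, i<m ⟹ j = (m+n+1)/2 − i > (m+n+1)/2 − m ≥ (2m+1)/2 − m ≥ 0
    m≤n, i>0 ⟹ j = (m+n+1)/2 − i < (m+n+1)/2 ≤ (2n+1)/2 ≤ n

So in situations 2 and 3, we don't need to check whether j > 0 or whether j < n.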
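The two implications can also be confirmed by brute force over small sizes. This is a sanity-check sketch only; BoundaryImplicationCheck, holdsFor and holdsUpTo are hypothetical names, and the implications are only expected to hold when m ≤ n.

```java
final class BoundaryImplicationCheck {
    // Checks i < m => j > 0 and i > 0 => j < n for every i in 0..m,
    // with j = (m + n + 1) / 2 - i (integer division).
    static boolean holdsFor(int m, int n) {
        int halfLen = (m + n + 1) / 2;
        for (int i = 0; i <= m; i++) {
            int j = halfLen - i;
            if (i < m && j <= 0) return false;
            if (i > 0 && j >= n) return false;
        }
        return true;
    }

    // Checks every pair 0 <= m <= n <= limit.
    static boolean holdsUpTo(int limit) {
        for (int n = 0; n <= limit; n++) {
            for (int m = 0; m <= n; m++) {
                if (!holdsFor(m, n)) return false;
            }
        }
        return true;
    }
}
```

Swapping the roles, for example m = 3 and n = 1, breaks the second implication at i = 1 (j = 1 is not less than n = 1), which is another reason to search over the shorter array.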


class Solution {
    public double findMedianSortedArrays(int[] A, int[] B) {
        int m = A.length;
        int n = B.length;
        if (m > n) { // to ensure m<=n
            int[] temp = A; A = B; B = temp;
            int tmp = m; m = n; n = tmp;
        }
        int iMin = 0, iMax = m, halfLen = (m + n + 1) / 2;
        while (iMin <= iMax) {
            int i = (iMin + iMax) / 2;
            int j = halfLen - i;
            if (i < iMax && B[j-1] > A[i]) {
                iMin = i + 1; // i is too small
            }
            else if (i > iMin && A[i-1] > B[j]) {
                iMax = i - 1; // i is too big
            }
            else { // i is perfect
                int maxLeft = 0;
                if (i == 0) { maxLeft = B[j-1]; }
                else if (j == 0) { maxLeft = A[i-1]; }
                else { maxLeft = Math.max(A[i-1], B[j-1]); }
                if ( (m + n) % 2 == 1 ) { return maxLeft; }

                int minRight = 0;
                if (i == m) { minRight = B[j]; }
                else if (j == n) { minRight = A[i]; }
                else { minRight = Math.min(B[j], A[i]); }

                return (maxLeft + minRight) / 2.0;
            }
        }
        return 0.0; // not reached for sorted, non-empty input
    }
}



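Given array X[0:x-1] and array Y[0:y-1], find a partition (px, py) of X and Y such that

px+py=(x+y+1)/2 and X[:px-1]<=Y[py:] && Y[:py-1]<=X[px:]

That is, every element to the left of px in X is no greater than every element from py onward in Y, and every element to the left of py in Y is no greater than every element from px onward in X. Because px+py=(x+y+1)/2, the median must lie next to the cut at px and py.



If X[:px-1]<=Y[py:] && Y[:py-1]<=X[px:] holds, the median has been found.

If max(X[:px-1])>min(Y[py:]), px is too far to the right and should move left; that is, px decreases and py increases accordingly.

As px decreases, max(X[:px-1]) can only decrease or stay the same, and as py increases, min(Y[py:]) can only increase or stay the same, so moving px eventually reaches a position where X[:px-1]<=Y[py:].

Symmetrically, if max(Y[:py-1])>min(X[px:]), px is too far to the left and should increase.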


public double findMedianSortedArrays(int[] nums1, int[] nums2) {
	// If one array is empty, the median comes from the other array alone.
	if(nums1==null || nums1.length<1)
		return (nums2[nums2.length/2] + (double)nums2[(nums2.length-1)/2])/2;
	if(nums2==null || nums2.length<1)
		return (nums1[nums1.length/2] + (double)nums1[(nums1.length-1)/2])/2;
	int x = nums1.length;
	int y = nums2.length;
	// Always binary-search over the shorter array.
	if(y < x)
		return findMedianSortedArrays(nums2, nums1);
	int left = 0, right = x, partitionX, partitionY;
	while(left <= right){
		partitionX = (left+right) / 2;
		partitionY = (x+y+1)/2 - partitionX;
		// An empty side is replaced by a sentinel so its comparison always passes.
		int maxLeftX = (partitionX == 0) ? Integer.MIN_VALUE : nums1[partitionX-1];
		int minRightX = (partitionX == x) ? Integer.MAX_VALUE : nums1[partitionX];
		int maxLeftY = (partitionY == 0) ? Integer.MIN_VALUE : nums2[partitionY-1];
		int minRightY = (partitionY == y) ? Integer.MAX_VALUE : nums2[partitionY];
		if(maxLeftX <= minRightY && maxLeftY <= minRightX){
			if((x+y)%2 == 0){
				// The cast keeps the sum from overflowing int.
				return (Math.max(maxLeftX, maxLeftY) + (double)Math.min(minRightX, minRightY))/2;
			}else{
				return Math.max(maxLeftX, maxLeftY);
			}
		}else if(maxLeftX > minRightY){
			right = partitionX - 1; // partitionX is too far right
		}else{
			left = partitionX + 1; // partitionX is too far left
		}
	}
	return Double.MIN_VALUE; // not reached when both inputs are sorted
}






