Given two sorted arrays $A$ and $B$ of lengths $m$ and $n$ respectively, find the median of the two arrays combined. The algorithm must run in $O(\log(m+n))$ time. Assume that $A$ and $B$ are not both empty.
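Approach:
Partition the elements of the two original arrays into two new arrays $C$ and $D$. If $C$ and $D$ satisfy the following conditions, the median follows directly from the parity of $m+n$.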
$$\operatorname{length}(D)=\operatorname{length}(C)\ \ \text{or}\ \ \operatorname{length}(D)=\operatorname{length}(C)+1\tag{1}$$
$$\max(C) \le \min(D)\tag{2}$$
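To make the parity step concrete, here is a minimal sketch that takes a split already satisfying the order condition and reads the median off it. It assumes the convention used by the code later in this post, where $C$ holds `(m + n) // 2` elements, so for an odd total the extra element sits in $D$. The name `median_from_split` is hypothetical, and the split is built by sorting purely for illustration; that step is not $O(\log(m+n))$.

```python
def median_from_split(C, D):
    """Median of C + D, assuming max(C) <= min(D) and len(D) - len(C) in (0, 1)."""
    assert len(D) - len(C) in (0, 1), "size condition violated"
    assert not C or max(C) <= min(D), "order condition violated"
    if len(C) < len(D):           # m + n is odd: the middle element is min(D)
        return min(D)
    return (max(C) + min(D)) / 2  # m + n is even: average the two middle elements


# Illustration only: build the split by brute force.
A, B = [1, 3, 8], [2, 7, 9, 10]
merged = sorted(A + B)            # [1, 2, 3, 7, 8, 9, 10]
half = len(merged) // 2
C, D = merged[:half], merged[half:]
print(median_from_split(C, D))    # 7
```

The rest of the approach is about finding such a split without merging the arrays.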
From $(1)$:
$$\operatorname{length}(C)=\operatorname{int}\left(\frac{m+n}{2}\right)\tag{3}$$
Since $A$ and $B$ are sorted, $C$ can take the first $i$ elements of $A$ and the first $j$ elements of $B$, with $D$ taking the rest:
$$C=[A[0],\ A[1],\ ...,\ A[i-1],\ B[0],\ B[1],\ ...,\ B[j-1]]\tag{4}$$
$$D=[A[i],\ A[i+1],\ ...,\ A[m-1],\ B[j],\ B[j+1],\ ...,\ B[n-1]]\tag{5}$$
$$i+j=\operatorname{length}(C)\tag{6}$$
From $(2)$, and because each array is already sorted internally, only the two cross comparisons remain:
$$A[i] \ge B[j-1]\tag{7}$$
$$B[j] \ge A[i-1]\tag{8}$$
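The two cross comparisons are only meaningful when the indices exist. When $i$ or $j$ sits at an end of its array, the corresponding inequality holds vacuously. The sketch below (the names `cross_check` and `split_check` are hypothetical) evaluates the comparisons with those guards, using non-strict inequalities as implied by $\max(C) \le \min(D)$, and confirms on a small example that the result agrees with checking that condition directly on the slices. It assumes `A` is the shorter array, so that `j` never becomes negative.

```python
def cross_check(A, B, i, j):
    """Cross comparisons between A and B, skipping any whose index does not exist."""
    ok_a = i == len(A) or j == 0 or A[i] >= B[j - 1]
    ok_b = j == len(B) or i == 0 or B[j] >= A[i - 1]
    return ok_a and ok_b


def split_check(A, B, i, j):
    """max(C) <= min(D), checked directly on C and D built by slicing."""
    C = A[:i] + B[:j]
    D = A[i:] + B[j:]
    return not C or not D or max(C) <= min(D)


A, B = [1, 3, 8], [2, 7, 9, 10]
half = (len(A) + len(B)) // 2           # number of elements in C
for i in range(len(A) + 1):
    j = half - i                        # elements taken from B
    assert cross_check(A, B, i, j) == split_check(A, B, i, j)
    print(i, j, cross_check(A, B, i, j))  # only i = 2, j = 1 passes
```

In this example the first check fails for every `i` below 2 and the second fails for every `i` above 2, which is the monotone behaviour that makes a binary search over `i` possible.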
Combining $(3)$ through $(8)$, it suffices to binary search for an $i$ that satisfies these conditions.

    def find_median(shorterlist, longerlist):
        shorterlength, longerlength = len(shorterlist), len(longerlist)
        # Make sure shorterlist really is the shorter one; the search runs over it.
        if shorterlength > longerlength:
            shorterlist, longerlist = longerlist, shorterlist
            shorterlength, longerlength = longerlength, shorterlength
        # Number of elements in the left part C.
        target_length = (shorterlength + longerlength) // 2
        lb, ub = 0, shorterlength
        # shorterlist is empty: the answer is just the median of longerlist.
        if shorterlength == 0:
            if longerlength % 2 == 1:
                return longerlist[longerlength // 2]
            else:
                return (longerlist[(longerlength - 1) // 2] + longerlist[longerlength // 2]) / 2
        while lb <= ub:
            left_index = (lb + ub) // 2                # i: elements taken from shorterlist
            right_index = target_length - left_index   # j: elements taken from longerlist
            if left_index < shorterlength and shorterlist[left_index] < longerlist[right_index - 1]:
                lb = left_index + 1
            elif left_index > 0 and shorterlist[left_index - 1] > longerlist[right_index]:
                ub = left_index - 1
            else:
                # If the split does not fall strictly inside shorterlist, there are two boundary cases:
                # (1) left_index == shorterlength -> every element of shorterlist is in the left part C;
                # (2) left_index == 0 -> every element of the left part C comes from longerlist.
                if left_index == shorterlength:
                    # Note: if the lengths differ by less than 2 and max(shorterlist) <= min(longerlist),
                    # no element of longerlist goes into the left part C (right_index == 0).
                    right_min = longerlist[right_index]
                    if right_index != 0:
                        left_max = max(shorterlist[-1], longerlist[right_index - 1])
                    else:
                        left_max = shorterlist[-1]
                elif left_index == 0:
                    left_max = longerlist[target_length - 1]
                    # Note: if the two lists have equal length, every element of longerlist
                    # is in the left part C.
                    if longerlength == target_length:
                        right_min = shorterlist[0]
                    elif longerlength > target_length:
                        right_min = min(shorterlist[0], longerlist[target_length])
                else:
                    right_min = min(longerlist[right_index], shorterlist[left_index])
                    left_max = max(longerlist[right_index - 1], shorterlist[left_index - 1])
                if (shorterlength + longerlength) % 2 == 1:
                    return right_min
                else:
                    return (left_max + right_min) / 2
The key to this problem is understanding the conditions under which the two boundary cases occur.
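One way to see what the boundary cases amount to is to let sentinels absorb them: a missing neighbour on the left of the split is treated as $-\infty$ and a missing neighbour on the right as $+\infty$. The following is an alternative sketch, not the code above. The name `find_median_sentinel` is hypothetical, and the approach assumes numeric elements so that comparisons with `math.inf` are valid. A random comparison against `statistics.median` serves as a sanity check.

```python
import math
import random
import statistics


def find_median_sentinel(A, B):
    """Median of two sorted lists via binary search on the split index i."""
    if not A and not B:
        raise ValueError("both lists are empty")
    if len(A) > len(B):
        A, B = B, A                       # search over the shorter list
    m, n = len(A), len(B)
    half = (m + n) // 2                   # number of elements in the left part
    lb, ub = 0, m
    while lb <= ub:
        i = (lb + ub) // 2                # elements taken from A
        j = half - i                      # elements taken from B
        # Missing neighbours become -inf / +inf, so no special cases are needed.
        a_left = A[i - 1] if i > 0 else -math.inf
        a_right = A[i] if i < m else math.inf
        b_left = B[j - 1] if j > 0 else -math.inf
        b_right = B[j] if j < n else math.inf
        if a_right < b_left:
            lb = i + 1                    # too few elements taken from A
        elif a_left > b_right:
            ub = i - 1                    # too many elements taken from A
        else:
            right_min = min(a_right, b_right)
            if (m + n) % 2 == 1:
                return right_min
            return (max(a_left, b_left) + right_min) / 2
    raise ValueError("inputs must be sorted")


# Sanity check against the standard library on random sorted inputs.
random.seed(0)
for _ in range(1000):
    A = sorted(random.choices(range(20), k=random.randint(0, 6)))
    B = sorted(random.choices(range(20), k=random.randint(1, 6)))
    assert find_median_sentinel(A, B) == statistics.median(A + B)
```

The trade-off is four conditional lookups per iteration in exchange for a single code path. The explicit branches in the original version make the same decisions, only spelled out case by case.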