Problem Description
You are interested in analyzing some hard-to-obtain data from two separate databases. Each database contains n numerical values—so there are 2n values total—and you may assume that no two values are the same. You’d like to determine the median of this set of 2n values, which we will define here to be the nth smallest value.
However, the only way you can access these values is through queries to the databases. In a single query, you can specify a value k to one of the two databases, and the chosen database will return the kth smallest value that it contains. Since queries are expensive, you would like to compute the median using as few queries as possible.
Give an algorithm that finds the median value using at most O(log n) queries.
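Approach
The findInTwo function is the key part. It finds the k-th smallest value of two sorted arrays; the median is the case k = n. Each call compares the (k/2)-th remaining value of a with the (k - k/2)-th remaining value of b, discards the prefix that ends at the smaller of the two, and reduces k by the number of values discarded.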
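The discard rule can also be written as a loop. This is a minimal sketch, not the code from these notes: the name kthOfTwo and the use of std::vector are assumptions made for illustration, and it assumes both vectors are sorted ascending and 1 <= k <= min(a.size(), b.size()), which covers the median case k = n.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: k-th smallest (1-based) value in the union of two
// sorted vectors. Assumes 1 <= k <= min(a.size(), b.size()).
int kthOfTwo(const std::vector<int>& a, const std::vector<int>& b, int k) {
    int l1 = 0, l2 = 0;  // first index of each vector that is still in play
    while (k > 1) {
        int takeA = k / 2;          // prefix length probed in a
        int takeB = k - k / 2;      // prefix length probed in b
        int mid1 = l1 + takeA - 1;  // last index of the probed prefix of a
        int mid2 = l2 + takeB - 1;  // last index of the probed prefix of b
        if (a[mid1] < b[mid2]) {
            // At most k - 2 values are smaller than a[mid1], so nothing in
            // a[l1..mid1] can be the k-th smallest: drop that prefix.
            l1 += takeA;
            k -= takeA;
        } else {
            // Same argument for b[l2..mid2].
            l2 += takeB;
            k -= takeB;
        }
    }
    return std::min(a[l1], b[l2]);
}
```

For a = {1, 3, 5, 7, 9} and b = {2, 4, 6, 8, 10}, calling kthOfTwo(a, b, 5) follows the same steps as the recursive version and gives the median of the ten values.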
Code
#define _CRT_SECURE_NO_WARNINGS 1
#include <iostream>
#include <stdio.h>
#include <math.h>
#include <string>
#include <algorithm>
#include <stdlib.h>
using namespace std;
const int N = 1e6 + 10;
int ans[N]; // scratch buffer used by mergeSort
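// Sorts a[l..r] in ascending order, using the global ans[] as scratch space.
void mergeSort(int a[], int l, int r)
{
    if (l >= r) return;
    int k = 0, i = l, mid = (l + r) >> 1, j = mid + 1;
    mergeSort(a, l, mid);
    mergeSort(a, mid + 1, r);
    // Merge the two sorted halves into ans[0..r-l].
    while (i <= mid && j <= r)
    {
        if (a[i] <= a[j])
            ans[k++] = a[i++];
        else
            ans[k++] = a[j++];
    }
    while (i <= mid)
        ans[k++] = a[i++];
    while (j <= r)
        ans[k++] = a[j++];
    // Copy the merged run back into a[l..r].
    for (int i = l, j = 0; i <= r; i++, j++)
        a[i] = ans[j];
}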
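// Returns the k-th smallest value (1-based) among a[l1..] and b[l2..].
// Both arrays must be sorted ascending. r1 and r2 are carried along but never
// read; every probe stays in bounds when both arrays hold n values and 1 <= k <= n.
int findInTwo(int k, int a[], int b[], int l1, int r1, int l2, int r2)
{
    if (k == 1)
        return min(a[l1], b[l2]);
    int mid1 = l1 + k / 2 - 1;      // last of the first k/2 remaining values of a
    // int mid2 = l2 + k / 2 - 1;   // earlier attempt, annotated "odd: correct"
    int mid2 = l2 + k - k / 2 - 1;  // last of the first k - k/2 remaining values of b
    if (a[mid1] < b[mid2])
    {
        // Key point: a[l1..mid1] all rank below k. Drop those k/2 values
        // (l1 becomes l1 + k/2) and find the (k - k/2)-th smallest of the rest.
        return findInTwo(k - k / 2, a, b, l1 + k / 2, r1, l2, l2 + k - k / 2);
    }
    else
    {
        // b[l2..mid2] all rank below k. Drop those k - k/2 values
        // (l2 becomes l2 + k - k/2) and find the (k/2)-th smallest of the rest.
        return findInTwo(k / 2, a, b, l1, l1 + k / 2, l2 + k - k / 2, r2);
    }
    // Earlier variants that shrank r2 / r1 instead, kept for reference:
    // findInTwo(k - k / 2, a, b, l1 + k / 2, r1, l2, r2 - k + k / 2);
    // findInTwo(k - k / 2, a, b, l1, r1 - k + k / 2, l2 + k / 2, r2);
}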
// Worked trace for k = 5:
// a = 1 3 5 7 9
// b = 2 4 6 8 10
/*
k=5 l1=0 l2=0: mid1=1 mid2=2, a[1]=3 < b[2]=6, drop a[0..1]
k=3 l1=2 l2=0: mid1=2 mid2=1, a[2]=5 > b[1]=4, drop b[0..1]
k=1 l1=2 l2=2: return min(a[2], b[2]) = min(5, 6) = 5
*/
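int main()
{
    // static: two arrays of N ints are too large for the stack
    static int a[N], b[N];
    int n;
    cin >> n;
    int k;
    cout << "please enter k" << endl;
    cin >> k;
    // findInTwo only stays in bounds for 1 <= k <= n; k = n gives the median
    if (n < 1 || n > N || k < 1 || k > n)
    {
        cout << "need 1 <= k <= n <= " << N << endl;
        return 1;
    }
    for (int i = 0; i < n; i++)
    {
        a[i] = rand();
        b[i] = rand();
    }
    mergeSort(a, 0, n - 1);
    mergeSort(b, 0, n - 1);
    for (int i = 0; i < n; i++)
        cout << a[i] << " ";
    cout << endl;
    for (int i = 0; i < n; i++)
        cout << b[i] << " ";
    int mid = findInTwo(k, a, b, 0, n - 1, 0, n - 1);
    cout << "\nThe k-th smallest of the two sorted arrays is " << mid << endl;
    return 0;
}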
// Example: a = 1 3 5 7 9, b = 2 4 6 8 10
// merged:  1 2 3 4 5 6 7 8 9 10
Test Cases
Random: both arrays are filled with rand() and then sorted.
Results
The output is correct, guaranteed.
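Time Complexity Analysis
Sorting: O(n log n). This only prepares the test data; the sorted arrays stand in for the databases, which already answer k-th smallest queries.
Queries: O(log n). Each call reads one value from each array and roughly halves k, so starting from k = n there are about log2 n calls.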
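To make the query count concrete, here is a sketch in the query model of the problem statement. The Database struct, its query method, and medianByQueries are hypothetical names introduced for illustration; the sorted vector inside Database only simulates a real database. Since k shrinks to at most ceil(k/2) per round and each round issues two queries, the loop below should use at most 2 * ceil(log2 n) queries, plus two more for the final comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one database: holds its values in ascending
// order and counts how many times it has been queried.
struct Database {
    std::vector<int> sorted;
    int queries = 0;
    int query(int k) {  // returns the k-th smallest value, 1-based
        ++queries;
        return sorted[k - 1];
    }
};

// Median (n-th smallest of the 2n values) using only query() calls.
// Assumes both databases hold exactly n values and n >= 1.
int medianByQueries(Database& A, Database& B, int n) {
    int skipA = 0, skipB = 0;  // values already ruled out in each database
    int k = n;                 // rank still being searched for
    while (k > 1) {
        int takeA = k / 2;
        int takeB = k - k / 2;
        if (A.query(skipA + takeA) < B.query(skipB + takeB)) {
            skipA += takeA;  // the probed prefix of A cannot hold the answer
            k -= takeA;
        } else {
            skipB += takeB;  // the probed prefix of B cannot hold the answer
            k -= takeB;
        }
    }
    return std::min(A.query(skipA + 1), B.query(skipB + 1));
}
```

After a call, A.queries + B.queries gives the total number of queries spent, which can be compared against the bound above.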
Quick Notes
mid1 depends on k/2.
mid2 depends on k - k/2; this one has to be memorized.
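One way to remember the split instead of memorizing it: the two probed prefixes have lengths k/2 and k - k/2, so together they always cover exactly k values, even when k is odd and integer division rounds down. The helper below is a hypothetical illustration (probeSizes is not part of the code in these notes) that just returns the two lengths.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: prefix lengths probed for a given k (k >= 2).
// first  = k / 2      values from array a (mid1 = l1 + first - 1)
// second = k - k / 2  values from array b (mid2 = l2 + second - 1)
std::pair<int, int> probeSizes(int k) {
    return {k / 2, k - k / 2};
}
```

For k = 5 the lengths are 2 and 3, which matches mid1 = 1 and mid2 = 2 when l1 = l2 = 0.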
Will revisit this later.