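I wrote an implementation of this function using an algorithm that performs better than a plain linear merge on sparse distributions.
For similar† distributions it has O(n) complexity, but for ranges whose distributions differ greatly it should run in sublinear time, approaching O(log n) in the best case. However, I have not been able to prove that the worst case is any better than O(n log n). On the other hand, I have not managed to find such a worst case either.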
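For comparison, a plain linear merge could be sketched roughly as follows. The name `match_indices_linear` and the restriction to `std::vector<int>` are assumptions made for illustration only; the sketch always does O(n) work, no matter how the values are distributed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical baseline: a two-pointer merge that reports the indices at
// which two sorted vectors hold equal values.
std::pair<std::vector<std::size_t>, std::vector<std::size_t>>
match_indices_linear(const std::vector<int>& a, const std::vector<int>& b) {
    // Precondition: both inputs are sorted in ascending order.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<std::size_t> ia, ib;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;                // a is behind: advance it by one element
        } else if (b[j] < a[i]) {
            ++j;                // b is behind: advance it by one element
        } else {
            ia.push_back(i++);  // equal values: record both indices
            ib.push_back(j++);
        }
    }
    return {ia, ib};
}
```

Each step advances by exactly one element, which is what the exponential search in the implementation below tries to avoid when one range has a long run of values smaller than the other's current value.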
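I templated it so that it can be used with any kind of range, such as sub-ranges or raw arrays. Technically it also works with non-random-access iterators, but the complexity is much greater, so that is not recommended. I think it should be possible to modify the algorithm to fall back to linear search in that case, but I haven't bothered.
† By similar distribution, I mean that the pair of arrays has many crossings. By crossing, I mean a point at which you would switch from one array to the other if you were to merge the two arrays together in sorted order.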
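To make the notion of a crossing concrete, here is a small sketch that counts crossings by simulating the merge. The helper name `count_crossings`, the `std::vector<int>` inputs, and the rule that ties are taken from the first array are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often a sorted merge of a and b switches
// from taking elements of one array to taking elements of the other.
std::size_t count_crossings(const std::vector<int>& a, const std::vector<int>& b) {
    // Precondition: both inputs are sorted in ascending order.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::size_t i = 0, j = 0, crossings = 0;
    int last = -1;  // source of the previous element: 0 = a, 1 = b, -1 = none yet
    while (i < a.size() || j < b.size()) {
        // Take from a when b is exhausted, or when a's element is not larger
        // than b's (assumption: ties go to a).
        const bool take_a = j == b.size() || (i < a.size() && a[i] <= b[j]);
        const int source = take_a ? 0 : 1;
        if (last != -1 && source != last) {
            ++crossings;
        }
        last = source;
        if (take_a) {
            ++i;
        } else {
            ++j;
        }
    }
    return crossings;
}
```

With `{1, 3, 5, 7}` and `{2, 4, 6, 8}` the merge alternates on every element and the function returns 7, which is the "similar distribution" case. With `{1, 2, 3, 4}` and `{100, 101, 102, 103}` it returns 1, which is the case where skipping ahead pays off.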
#include <algorithm>
#include <iterator>
#include <utility>

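// helper structure for the search
template<class Range, class Out>
struct search_data {
    // is there any clearer way to get an iterator that might be either
    // a Range::const_iterator or const T*?
    using iterator = decltype(std::cbegin(std::declval<const Range&>()));
    iterator curr;              // current position of the search
    const iterator begin, end;  // bounds of the whole range
    Out out;                    // output iterator that receives the indices
};
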
template<class Range, class Out>
auto init_search_data(const Range& range, Out out) {
    return search_data<Range, Out>{
        std::begin(range),
        std::begin(range),
        std::end(range),
        out,
    };
}

// note: the pointer swap below only compiles when Out1 and Out2 are the same type
template<class Range, class Out1, class Out2>
void match_indices(const Range& in1, const Range& in2, Out1 out1, Out2 out2) {
    auto search_data1 = init_search_data(in1, out1);
    auto search_data2 = init_search_data(in2, out2);

    // initial order is arbitrary
    auto lesser = &search_data1;
    auto greater = &search_data2;

    // if either range is exhausted, we are finished
    while(lesser->curr != lesser->end
            && greater->curr != greater->end) {
        // difference of first values in each range
        auto delta = *greater->curr - *lesser->curr;

        if(!delta) { // matching value was found
            // store both results and increment the iterators
            *lesser->out++ = std::distance(lesser->begin, lesser->curr++);
            *greater->out++ = std::distance(greater->begin, greater->curr++);
            continue; // then start a new iteration
        }

        if(delta < 0) { // set the order of ranges by their first value
            std::swap(lesser, greater);
            delta = -delta; // delta is always positive after this
        }

        // next crossing cannot be farther than the delta
        // this assumption has following pre-requisites:
        // range is sorted, values are integers, values in the range are unique
        auto range_left = std::distance(lesser->curr, lesser->end);
        auto upper_limit =
            std::min(range_left, static_cast<decltype(range_left)>(delta));

        // exponential search for a sub range where the value at upper bound
        // is greater than target, and value at lower bound is lesser
        auto target = *greater->curr;
        auto lower = lesser->curr;
        auto upper = std::next(lower, upper_limit);
        for(decltype(upper_limit) i = 1; i < upper_limit; i *= 2) {
            // probe at offset i from the current position, so that the
            // guess always stays below upper_limit and is dereferenceable
            auto guess = std::next(lesser->curr, i);
            if(*guess >= target) {
                upper = guess;
                break;
            }
            lower = guess;
        }

        // skip all values in lesser,
        // that are less than the least value in greater
        lesser->curr = std::lower_bound(lower, upper, target);
    }
}

#include <iostream>
#include <vector>

int main() {
    std::vector<int> array1 = {4,6,12,34};
    std::vector<int> array2 = {1,3,6,34};

    std::vector<std::size_t> indices1;
    std::vector<std::size_t> indices2;

    match_indices(array1, array2,
                  std::back_inserter(indices1),
                  std::back_inserter(indices2));

    std::cout << "indices in array1: ";
    for(std::size_t i : indices1)
        std::cout << i << ' ';

    std::cout << "\nindices in array2: ";
    for(std::size_t i : indices2)
        std::cout << i << ' ';
    std::cout << std::endl;
}
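
The exponential search step inside `match_indices` can also be looked at in isolation. The following is only a sketch under stated assumptions: the helper name `gallop_lower_bound` is made up, it works on `std::vector<int>` with plain indices instead of iterators, and it leaves out the delta-based upper limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first element of sorted v, at or after
// `from`, that is not less than target (v.size() if there is none).
std::size_t gallop_lower_bound(const std::vector<int>& v, std::size_t from, int target) {
    assert(from <= v.size());
    assert(std::is_sorted(v.begin(), v.end()));

    const std::size_t n = v.size() - from;  // number of elements left to search
    std::size_t lower = 0;  // elements before offset `lower` are known to be < target
    std::size_t upper = n;  // the answer is at offset `upper` at the latest

    // Probe offsets 1, 2, 4, ... until a value >= target is seen.
    for (std::size_t step = 1; step < n; step *= 2) {
        if (v[from + step] >= target) {
            upper = step;
            break;
        }
        lower = step;
    }

    // Finish with a binary search inside the bracket found above.
    const auto first = v.begin() + static_cast<std::ptrdiff_t>(from);
    const auto it = std::lower_bound(first + static_cast<std::ptrdiff_t>(lower),
                                     first + static_cast<std::ptrdiff_t>(upper),
                                     target);
    return static_cast<std::size_t>(it - v.begin());
}
```

Because the probes double in distance, the number of comparisons grows with the logarithm of how far the search position has to move, not with the length of the range. That is where the hoped-for sublinear behaviour on dissimilar distributions would come from.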