Counting Inversions Problem (Count Inversion)
Problem Description
Recall the problem of finding the number of inversions. As in the course, we are given a sequence of n numbers a1,··· ,an, which we assume are all distinct, and we define an inversion to be a pair i < j such that ai > aj.
We motivated the problem of counting inversions as a good measure of how different two orderings are. However, one might feel that this measure is too sensitive. Let's call a pair a significant inversion if i < j and ai > 3aj. Give an O(n log n) algorithm to count the number of significant inversions between two orderings.
The array contains N elements (1<=N<=100,000). All elements are in the range from 1 to 1,000,000,000.
Input
The first line contains one integer N, indicating the size of the array. The second line contains N elements in the array.
In 50% of the test cases, N < 1000.
Output
Output a single integer: the number of significant inversions.
Sample Input
6
13 8 5 3 2 1
Sample Output
6
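To make the definition concrete, a brute-force checker can be sketched as follows. It is not part of the original solution: the name countSignificantBrute is made up for illustration, and its O(n^2) running time only suits the small test cases (N < 1000). It is meant as a reference against which a faster solution can be compared.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): counts pairs i < j
// with a[i] > 3 * a[j] by checking every pair, in O(n^2) time.
long long countSignificantBrute(const std::vector<long long>& a) {
    long long count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            // Values reach 1e9, so 3 * a[j] needs 64-bit arithmetic.
            if (a[i] > 3 * a[j]) ++count;
        }
    }
    return count;
}
```

On the sample sequence 13 8 5 3 2 1 it finds the six pairs (13,3), (13,2), (13,1), (8,2), (8,1) and (5,1).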
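Algorithm Idea
Use divide and conquer in the style of merge sort. Recursively split the sequence A into two halves L and R of (nearly) equal length and count the significant inversions inside L and inside R. Then, while merging L and R back into A, count the significant inversions that cross the two halves. The total for A is the count in L plus the count in R plus the cross count found during the merge.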
Pseudocode
Sort-and-Count(A)
    if length(A) <= 1 then return (0, A)
    Divide A into two subsequences L and R
    (RCL, L) = Sort-and-Count(L)
    (RCR, R) = Sort-and-Count(R)
    (r, A) = Merge-and-Count(L, R)
    return (RC = RCL + RCR + r, A)
Merge-and-Count(L, R)
    InverseCount = 0
    i = 1, j = 1
    for k = 1 to length(L) + length(R) do
        if i > length(L) or (j <= length(R) and L[i] > R[j]) then
            A[k] = R[j]
            j++
        else
            A[k] = L[i]
            i++
        end if
    end for
    i = 1, j = 1
    while i <= length(L) and j <= length(R) do
        if L[i] > 3R[j] then
            InverseCount += length(L) - i + 1
            ++j
        else
            ++i
        end if
    end while
    return InverseCount and A
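The pseudocode above can be turned into C++ roughly as follows. This is a minimal sketch, not the author's submitted program: the function names (countSignificantInversions, sortAndCount, mergeAndCount), the half-open index ranges and the shared scratch buffer are choices made here for illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts cross pairs between the sorted halves a[lo, mid) and a[mid, hi),
// then merges them so that a[lo, hi) is sorted ascending.
long long mergeAndCount(std::vector<long long>& a, std::vector<long long>& buf,
                        std::size_t lo, std::size_t mid, std::size_t hi) {
    long long count = 0;
    // Counting pass: if a[i] > 3 * a[j], every later left element is
    // at least a[i], so all mid - i of them pair with a[j].
    std::size_t i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] > 3 * a[j]) {
            count += static_cast<long long>(mid - i);
            ++j;
        } else {
            ++i;
        }
    }
    // Ordinary merge pass into the scratch buffer, then copy back.
    i = lo;
    j = mid;
    std::size_t k = lo;
    while (i < mid && j < hi) buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) buf[k++] = a[i++];
    while (j < hi) buf[k++] = a[j++];
    for (k = lo; k < hi; ++k) a[k] = buf[k];
    return count;
}

// Sorts a[lo, hi) and returns its number of significant inversions.
long long sortAndCount(std::vector<long long>& a, std::vector<long long>& buf,
                       std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;  // zero or one element: nothing to count
    std::size_t mid = lo + (hi - lo) / 2;
    long long left = sortAndCount(a, buf, lo, mid);
    long long right = sortAndCount(a, buf, mid, hi);
    return left + right + mergeAndCount(a, buf, lo, mid, hi);
}

// Entry point; takes the sequence by value so the caller's copy is untouched.
long long countSignificantInversions(std::vector<long long> a) {
    std::vector<long long> buf(a.size());
    return sortAndCount(a, buf, 0, a.size());
}
```

The elements are stored as long long because 3 * a[j] can reach 3,000,000,000, which does not fit in a 32-bit int. The counting pass is kept separate from the merge pass, as in the pseudocode, since the threshold for counting (3 * a[j]) differs from the threshold for merging (a[j]).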
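Proof of Correctness
Significant inversions inside L and inside R are counted by the recursive calls, so it remains to show that Merge-and-Count counts exactly the pairs (x in L, y in R) with x > 3y. Every element of L precedes every element of R in A, and both L and R are sorted in ascending order when the counting loop runs. If L[i] > 3R[j], then every L[m] with m >= i also satisfies L[m] > 3R[j]. Every L[m] with m < i was skipped because L[m] <= 3R[j'] for some j' <= j, hence L[m] <= 3R[j]. So R[j] takes part in exactly length(L) - i + 1 cross inversions, which are added before j advances. If L[i] <= 3R[j], then L[i] <= 3R[m] for every m >= j, so L[i] forms no significant inversion with any remaining element of R and i can advance.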
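Complexity Analysis
Each loop in Merge-and-Count runs at most n times, so one merge step costs O(n). Because the sequence is halved at every level, the recursion has at most log n levels, and the merge steps on one level cost O(n) in total. The running time therefore satisfies T(n) = 2T(n/2) + O(n), which gives O(n log n).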
Source Code
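#include <iostream>
#include <map>
using namespace std;

// Note: this program uses a Fenwick tree (binary indexed tree) indexed by
// value, not the merge-sort method described above. A std::map serves as
// sparse storage because values range up to 1e9.
const int maxn = 1e5 + 10;
const int MAX = 1e9 + 1;  // tree indices are 1..1e9
int a[maxn];
map<int, int> m;  // sparse Fenwick tree nodes

int lowbit(int x) {
    return x & -x;
}

// Record one occurrence of value x.
void add(int x) {
    while (x < MAX) {
        m[x]++;
        x += lowbit(x);
    }
}

// Number of recorded values <= x.
int sum(int x) {
    int res = 0;
    while (x) {
        auto it = m.find(x);  // find() avoids inserting empty nodes
        if (it != m.end()) res += it->second;
        x -= lowbit(x);
    }
    return res;
}

int main() {
    int n;
    cin >> n;
    for (int i = 0; i < n; i++) cin >> a[i];
    long long res = 0;
    // Scan right to left: when a[i] is visited, the tree holds every a[j] with j > i.
    for (int i = n - 1; i >= 0; i--) {
        // For positive integers, a[i] > 3 * a[j] is equivalent to a[j] <= (a[i] - 1) / 3.
        res += sum((a[i] - 1) / 3);
        add(a[i]);
    }
    cout << res << endl;
    return 0;
}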