算法 {特殊算法知识汇总}

「已注销」

已于 2024-07-18 18:42:17 修改

分类专栏：算法文章标签：算法

于 2023-01-11 12:26:22 首次发布

本文链接：https://blog.csdn.net/qq_66485519/article/details/128642850

`@LOC_1`

杂汇

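#Printing the trailing character#

cout<< A[i]<< ",\n"[i==N-1]; here ""[] simply indexes into the string literal: [0] is the comma and [1] is the newline;

Its drawback: suppose we want to replace \n with nothing, i.e., print no separator when i==N-1. Writing ","[i==N-1] is wrong, because when i==N-1 it prints the terminating '\0' character. cout keeps working afterwards (in cout<< a<< '\0'; cout<< b the value b is still written), but a stray NUL byte ends up in the output, which a judge's checker may reject and a terminal may hide;
So what we need is cout<< a<< ?<< b where ? prints nothing; the empty string "" does exactly that;

That is, cout<< A[i]<< (i==N-1 ? "":","); this is more flexible than the indexing trick;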
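The two variants can be compared side by side. This is a minimal sketch that writes into a `std::ostringstream` instead of `cout` so the result can be inspected; the function names `join_with_commas` and `join_with_index_trick` are made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Joins the elements with "," and prints nothing after the last one.
std::string join_with_commas(const std::vector<int>& A) {
    std::ostringstream out;
    const int N = static_cast<int>(A.size());
    for (int i = 0; i < N; ++i) {
        out << A[i] << (i == N - 1 ? "" : ",");
    }
    return out.str();
}

// The indexing trick with a one-character literal: for the last element
// it streams the terminating '\0' as a real character.
std::string join_with_index_trick(const std::vector<int>& A) {
    std::ostringstream out;
    const int N = static_cast<int>(A.size());
    for (int i = 0; i < N; ++i) {
        out << A[i] << ","[i == N - 1];
    }
    return out.str();
}
```

The second function still produces all elements, but its result is one byte longer because of the embedded NUL.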

@DELI;

子数组：连续； A[l...r]
子序列：可以不连续 A[i1, i2, i3, i4] (满足i1<i2<i3<i4);

@DELI;

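To recover i from a = 1<<i (its base-2 logarithm), (int)std::log2( a) loses no precision in practice as long as a is an exact power of two that a double can represent; LowBit, or a compiler builtin such as __builtin_ctz on GCC/Clang, avoids floating point altogether;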

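`scanf|printf` vs `cin|cout`: running time

If cin,cout gives you a time-limit-exceeded (and the algorithm is not the problem, e.g., the input/output is simply huge):
Replacing cin with scanf saves a great deal of time;
Replacing cout with printf also helps, though less than scanf does;
Ideally, replace both cin and cout;
. See: @LINK: https://editor.csdn.net/md/?articleId=137054231; cin,cout timed out, switching to printf took 9s, and switching to scanf took 4s;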

逆向思维: 首先就单刀直入去考虑(答案)的性质, 得到了答案的性质再正向求解

比如对于最优路径问题, 如果直接去遍历所有路径(BFS) 可能会超时;
逆向思维: 我们先研究最终答案路径他符合哪些性质, 然后根据这个性质再正向求解;
. 比如答案路径beg...ed 可以划分为beg->...->m0->...->m1->...->ed, 即以mi划分为了若干子路径, 然后我们研究beg->...->m0->...->m1, 如果已经得到beg...m0的最优解是否可以更新beg...m1的最优解, 其实就是DP的思路, 即DP[m0] -> DP[m1];
例题: @LINK: https://editor.csdn.net/md/?not_checkout=1&articleId=136650855;

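Sum of two non-overlapping subarrays

Find the best A[l1...r1] + A[l2...r2] (with r1 < l2); note that r1 + 1 = l2 is not required;
First handle A[l1...r1] alone: for each i compute the best value of a subarray of the form A[...i], call it V[i], and let Ma[i] = max( V[1,2,...,i]); then process the array in reverse to get VV[i] = the best value of a subarray of the form A[i...]; VV[i] + Ma[i-1] is then a candidate answer, and the best candidate over all i is the result;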

例題: @LINK: (https://editor.csdn.net/md/?not_checkout=1&articleId=126906501)-(@LOC_0);
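As a sketch of the V / Ma / VV idea, assuming "best" means maximum sum and that both subarrays must be non-empty (the linked problem may define the value differently), the function name `max_two_disjoint_subarrays` is hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Maximum of sum(A[l1..r1]) + sum(A[l2..r2]) with r1 < l2, both non-empty.
long long max_two_disjoint_subarrays(const std::vector<long long>& A) {
    const int n = static_cast<int>(A.size());
    assert(n >= 2);
    // v  = best sum of a subarray ending exactly at i  (the V[i] of the text)
    // Ma[i] = max(V[0..i])
    std::vector<long long> Ma(n);
    long long v = A[0];
    Ma[0] = v;
    for (int i = 1; i < n; ++i) {
        v = std::max(v, 0LL) + A[i];
        Ma[i] = std::max(Ma[i - 1], v);
    }
    // vv = best sum of a subarray starting exactly at i (the VV[i] of the text)
    long long vv = A[n - 1];
    long long best = vv + Ma[n - 2];
    for (int i = n - 2; i >= 1; --i) {
        vv = std::max(vv, 0LL) + A[i];
        best = std::max(best, vv + Ma[i - 1]);
    }
    return best;
}
```

Both passes are linear, so the whole computation is O(n).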

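Deduplicating ordered pairs by symmetry

Call (a,b) a valid ordered pair if a<b; count the valid pairs among all n^2 ordered pairs;

(n^2) / 2 - n is wrong! Dividing by 2 is meant to count (a,b) and (b,a) as one, but each (i,i) occurs only once to begin with, so it must not be halved;
. The correct way: first remove the (i,i) pairs, giving n^2 - n, and then divide by 2;

In general: if a set of ordered pairs has size S and is symmetric, the number of valid pairs is (S - e) / 2 (e is the number of pairs of the form (i,i); after removing them, the rest split into mirrored pairs (a,b), (b,a));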

例题: LINK: https://editor.csdn.net/md/?not_checkout=1&articleId=131933499;

操作可以是后效性

即到了后续未来才真正生效执行, 而不是在当下; 所谓账先记着以后再算; 通常是用个全局delta来记录, 比如对于[...pre..., cur] 在cur时此时的delta 表示pre所欠的那么如果cur有多余就补到delta里;

例题: LINK: https://editor.csdn.net/md/?articleId=131893321;

区间`[l,r]`进行排序后的性质

区间sort后, 比如情况是[1, 5], [1, >=5], [2,3] 一定要注意这个[2,3]的情况;
. 比如当前区间是[1,5], 后面会存在3 < 5的情况, 即虽然左端点是递增的但右端点可不是;

但也不要一概而论, 有特殊情况 (即左右端点都是递增的情况):
1: 当左端点相同时, 此时的右端点确实是递增的, 即[x,2] [x,2] [x,3] [x, >=3];
2: 如果所有区间的长度是相同的! 则此时右端点是递增的! (也就是所有区间都形如[l, l+K])
. 这种情况很常见, 比如滑动窗口 也就是有若干个区间[l, l + K] 然后一维坐标轴上有若干个点 $A$ (通常的做法是: 将点集 $A$ 排序, 将所有区间排序遍历每个区间, 全局维护一个l,r指针指向 $A$ 点集, 对于当前的[L, L + K]区间假如在他区间里的点是[a,b,...,z] 则让l指向a, r指向z (可以发现, 两个区间x,y 一定满足单调性 即: x.l <= y.l, x.r <= y.r);
. 例题: LINK: https://leetcode.cn/problems/count-zero-request-servers/;

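Enumerating cycles of length 3

Given a graph, enumerate all cycles of length 3, a-b-c-a;

Method 1: iterate over a vertex a, then enumerate two of its adjacent vertices b,c and check whether the edge bc exists;

for( int a = 0; a < N; ++a){
    for( int b, e = G.Head[ a]; ~e; e = G.Next[ e]){
        b = G.Vertex[ e];
        for( int c, e2 = G.Next[ e]; ~e2; e2 = G.Next[ e2]){
            c = G.Vertex[ e2];
            if( false == Edge[ b][ c]) continue;
            //-- found a cycle `a-b-c` (in an undirected graph each cycle is reported 3 times, once per choice of `a`)
        }
    }
}

Method 2: iterate over every edge a-b, then enumerate every vertex c and check whether the edges ac and bc exist;

for( `a-b` : all edges){
	for( int c = 0; c < N; ++c){
		if( false == Edge[ a][ c]) continue;
		if( false == Edge[ b][ c]) continue;
		//-- found a cycle `a-b-c`
	}
}

Method 1 costs the sum of deg(a)^2 over all vertices and Method 2 costs O(M * N) for M edges, so the two are roughly comparable on dense graphs;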
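A runnable variant of Method 1 might look like the following sketch. It assumes a simple undirected graph given as an edge list and uses `std::vector` adjacency lists instead of the `G.Head/G.Next` linked list; the name `count_triangles` is made up here. To report each cycle exactly once it only accepts `a` as the smallest vertex of the cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts cycles of length 3 in an undirected graph with vertices [0, N).
int count_triangles(int N, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<char>> adj(N, std::vector<char>(N, 0));
    std::vector<std::vector<int>> nbr(N);
    for (const auto& e : edges) {
        const int u = e.first, v = e.second;
        assert(0 <= u && u < N && 0 <= v && v < N);
        if (u == v || adj[u][v]) continue;  // skip self-loops and duplicate edges
        adj[u][v] = adj[v][u] = 1;
        nbr[u].push_back(v);
        nbr[v].push_back(u);
    }
    int count = 0;
    for (int a = 0; a < N; ++a) {
        for (std::size_t i = 0; i < nbr[a].size(); ++i) {
            for (std::size_t j = i + 1; j < nbr[a].size(); ++j) {
                const int b = nbr[a][i], c = nbr[a][j];
                // Require a to be the smallest vertex so each cycle is counted once.
                if (a < b && a < c && adj[b][c]) ++count;
            }
        }
    }
    return count;
}
```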

虚拟节点 (两集合间的完全映射)

Link

There are $n$ points and $m$ rounds; in each round you are given two sets of points $A, B$ where all points in $A, B$ are distinct, and you need to connect an edge $a \to b$ for every $a\in A, b\in B$ ; if multiple edges $a\to b$ arise, keeping one of them is enough.

In the worst case, every round connects $(n/2) * (n/2) = O(n^2)$ edges; if $n = 1000, m = 1000$ , then $O (m * n * n)$ exceeds the time limit;

--

The usual trick: let $x \notin [1,n]$ be a new auxiliary point; for $A, B$ we previously needed $∣ A ∣ * ∣ B ∣$ edges, now we connect $A \to x$ and $x \to B$ , $∣ A ∣ + ∣ B ∣$ edges in total, so at most $m * n$ edges overall;
. The auxiliary points of different rounds may be distinct or shared; depending on the problem, one auxiliary point may suffice, or several may be needed;

Of course, you must clarify the relation between the previous graph and the new graph, because they are different graphs, and make sure they are equivalent for solving the problem.

Properties

+ Every raw-edge $x\to y$ in the raw-graph corresponds to two edges $x \to a$ and $a \to y$ where $a$ is an auxiliary point; do not read this as every raw-edge being split into two edges of its own, which is a misunderstanding;
. For example, in the raw-graph, two sets $A \to B$ consist of $∣ A ∣ * ∣ B ∣$ raw-edges, but they correspond to only $∣ A ∣ + ∣ B ∣$ edges in the new-graph, because one edge $a \to x$ (with $x$ an auxiliary point) corresponds to multiple raw-edges $a\to b_1, a\to b_2, a \to ...$ where $a\in A, b_i \in B$ ;
. So we say a raw-edge $a\to b$ corresponds to two edges $a\to x, x \to b$ ( $x$ an auxiliary point), not that it is divided into them;

+ In the new graph, for an edge $a \to b$ , either $a$ is an auxiliary point (and $b$ a raw point), or $b$ is an auxiliary point (and $a$ a raw point);
. Because every raw-edge $x\to y$ in the raw-graph corresponds to two edges $x \to a$ and $a \to y$ where $a$ is an auxiliary point.

用DFS递归, 来替代二进制枚举

MARK: @LOC_0;

长度为N的序列, 枚举他所有 $2^N$ 个子序列;

用二进制状态枚举, 时间是: 2^N * N (比如N=20, 这很可能会被卡常数而卡掉…);
看似可以, 但实际上会超时… (而且, 即便你把N: 20 使用Lower_bit 优化为<N, 还是会超时), 只能说遇到卡时间卡的比较极限的题目没办法… 此时还有个备用方案即下面介绍的DFS递归;

当遇到2^n的二进制枚举时, 应该联想到 DFS递归, 两个算法是完全一样的;
虽然说比如对于DP问题同样的递推式用循环遍历一定要比 DFS递归, 要高效多的多;
但不能一概而论, DP里所有状态节点一般都是有效的 (不能去除); 而对于此时枚举所有子序列的情况他的状态节点虽然有 $2^N$ 个但不一定都是有效的, 可能会有很多节点是非法的;

而DFS方式遍历的核心, 比如序列[a,b,c,d,e...], 那么当前DFS(假如当前在c点)会选择一个子序列比如是a,c, 那么当前DFS 一定会遍历所有如同a,c, [...]的子序列 (这是非常非常多的);
如果说当前a,c子序列已经是非法的了, 而且如果a,c非法则a,c, [...]都非法的话, 那么, 在当前的DFS(即a,c), 就可以直接return了, 这就一下子把所有如同a,c, [...]的子序列全部都略过了;
. 换句话说, DFS方式, 他遍历到的任意序列[x,y,z], 都是通过一个个拼接而来的 (这是DFS方式的核心), 即先有的x 然后往后插入x,y, 再往后插入 x,y,z, 这个拼接 (其实DFS递归都是特性, 也就是当前DFS状态不仅包含当前节点信息其实他是个前缀即他还包含前面整个一条连续的DFS路径的信息) 所谓剪枝他的核心原理就是依靠的 DFS递归的这种拼接特性;
所以, 相比于直接遍历所有子序列, DFS递归更有优化/剪枝的空间;

比如, 一组方案 (子序列) a, b, c;
. 如果是二进制枚举他一定是比如st = 100101, 然后你还需要遍历这个子序列, 那么总时间一定是2^N * N, 没有任何可能去优化;
. 但如果是DFS递归, 当我们在a b时, 假如他是非法的, 那么我们就不会访问到a b c这种方案, 在a b时就已经给剪枝掉了;
. . 换句话说, DFS递归它是: 2^N遍历所有方案和 N遍历一组方案, 两个过程是结合到一块了同时进行, 即时间是2^N;

CSDN-129676841
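The pruning argument can be made concrete with a small sketch. The task here is an assumption chosen for illustration (count the subsets of non-negative values whose sum stays within `limit`), and the function names are hypothetical. The bitmask version always does 2^N * N work, while the DFS version abandons a prefix as soon as it is already invalid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// DFS over "skip / take" decisions; `sum` is the sum of the chosen prefix.
void dfs_subsets(const std::vector<int>& A, std::size_t idx, long long sum,
                 long long limit, long long& count) {
    // Prune: values are non-negative, so every extension is invalid as well.
    if (sum > limit) return;
    if (idx == A.size()) { ++count; return; }
    dfs_subsets(A, idx + 1, sum, limit, count);           // skip A[idx]
    dfs_subsets(A, idx + 1, sum + A[idx], limit, count);  // take A[idx]
}

long long count_subsets_dfs(const std::vector<int>& A, long long limit) {
    for (int v : A) assert(v >= 0);  // the pruning rule relies on this
    long long count = 0;
    dfs_subsets(A, 0, 0, limit, count);
    return count;
}

// Reference implementation: plain binary enumeration, no pruning possible.
long long count_subsets_bitmask(const std::vector<int>& A, long long limit) {
    const int n = static_cast<int>(A.size());
    long long count = 0;
    for (int st = 0; st < (1 << n); ++st) {
        long long sum = 0;
        for (int bit = 0; bit < n; ++bit) {
            if ((st >> bit) & 1) sum += A[bit];
        }
        if (sum <= limit) ++count;
    }
    return count;
}
```

How much the pruning saves depends entirely on how many prefixes become invalid early; in the worst case both versions visit all 2^N subsets.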

`i,j`两层循环的次序

方式1: for(i : [0,n]) for(j : [0,m]) (i,j) 和方式2: for(j: [0,m]) for(i: [0,n]) (i,j) 这是常用的遍历两层循环的方式, 时间是一样的, 我们重点讲下他在线性DP里的应用;

他俩的共同点是: 对于当前的(i,j), 对于所有的(<=i, <=j)这些状态节点都已经遍历过了;
. 当然, 准确的说比如对于方式1, 此时(<i, ...)这些状态其实都遍历过了, 但我们这里线性DP这里只关注(<i, <=j)这些状态和 (=i, <=j)这些状态;
这是他俩的共同点, 不管是那种遍历方式都满足(<=i, <=j)这些集合里的点都是遍历过的;

不同点就是内外循环的次序不同, 这是显然的, 但这在线性DP里很重要, 通过你转换下次序就可能把多余的O(N) 优化掉;
. 比如, 最长LCS&LIS这道题 LINK: https://editor.csdn.net/md/?not_checkout=1&articleId=130650605;

线性序列转换为环

以区间DP的石子合并为例子, 对于一个线性序列[a,b,c,d] 我们可以求出他的答案;

现在要将他变成一个环即求[abcd, bcda, cdab, dabc] 这4个序列里的最优解;

一个技巧是 复制序列, 即将序列变成[abcd abcd], 这样你选择他的子序列 比如[0...3], [1...4], [2...5], [3...6] 就对应原来的各个序列;
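As a small illustration of the doubling trick on a simpler task than stone merging (an assumption made to keep the sketch short): the maximum sum of `k` consecutive elements on a ring. Every window of the ring is an ordinary window of the doubled sequence, so one prefix-sum array is enough. The name `max_circular_window_sum` is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Maximum sum of k consecutive elements when A is arranged in a circle.
long long max_circular_window_sum(const std::vector<long long>& A, int k) {
    const int n = static_cast<int>(A.size());
    assert(1 <= k && k <= n);
    // Prefix sums over the doubled sequence A + A.
    std::vector<long long> P(2 * n + 1, 0);
    for (int i = 0; i < 2 * n; ++i) P[i + 1] = P[i] + A[i % n];
    long long best = P[k] - P[0];
    // Window starts 0..n-1 cover every rotation exactly once.
    for (int start = 1; start < n; ++start) {
        best = std::max(best, P[start + k] - P[start]);
    }
    return best;
}
```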

整数的等效二进制拆分

给定一个正数K, 你需要找到一个数集S (一个数集的和为该数集里所有元素之和), 使得S的所有子集的和 所组成的集合, 为{0,1,2,...,K}; (所有子集的意思, 也就是对数集里每个元素进行选/不选操作);

比如K=5, (如果令S = {5}, 显然他无法得到1) (如果令S={1,2,4} 他得到了7 这不属于{0,1,...,5}), 一种可行方案是: 令S = {1,1,1,1,1} 共5个1 他是满足的;

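The optimal scheme is: S = {1, 2, 4, 8, 16, ..., M, R}, where M is the largest power of two with 1+2+4+...+M <= K (i.e., 2M - 1 <= K), and R = K - (1+2+4+...+M) (omit R when it is 0); for K=5 this gives S = {1, 2, 2};
. This set has O(log(K)) elements (the previous scheme needed K);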

@DELIMITER

应用: 背包模型3: 每种物品为有限个;

有K个完全相同的物品, 你需要从中拿若干个, 即你拿取的个数一定是0/1/2/.../K个;
. 做法是: for( i : [0,1,...K]), 即暴力枚举看选择多少个是最优解, 显然这个算法是O(K)的;

他可以优化到O( log(K));
将这个K个物品A (即[A, A, A, ..., A] 共K个物品A), 拆分成log(K)个物品分别是[A, AA, AAAA, ....] (对应元素个数为[1,2,4,8,...,M,R], M,R定义见上面);
. 那么这log(K)个物品的选/不选 (即所有方案), 与物品A的选择个数{0,1,2,...,K}, 是双射的, 也就是原问题等价于新问题;
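A minimal sketch of the split, under the convention that powers of two are taken while they still fit into the remaining count and the remainder becomes the last group. The function name `binary_split` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Splits K identical items into O(log K) groups whose subset sums
// cover every count in {0, 1, ..., K}.
std::vector<int> binary_split(int K) {
    assert(K >= 1);
    std::vector<int> parts;
    // long long avoids overflow of p *= 2 when K is close to INT_MAX.
    for (long long p = 1; p <= K; p *= 2) {
        parts.push_back(static_cast<int>(p));
        K -= static_cast<int>(p);
    }
    if (K > 0) parts.push_back(K);  // the remainder R
    return parts;
}
```

For a bounded knapsack, each returned group of size `g` would then be treated as a single 0/1 item with `g` times the weight and value of one item A.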

几何问题下, 空间放大1倍, 以避免浮点数的比较

https://editor.csdn.net/md/?not_checkout=1&articleId=130368646–@MARK_0;

矩阵重复覆盖问题, 得到所有网格点, 进行2维离散化, 再进行2维差分

https://editor.csdn.net/md/?not_checkout=1&articleId=130368646–@MARK_1;

单位转换

给定d = 13.14天转换为: 天-时-分-秒;
先转换到最小单位 (即秒), x个秒, 得到了x这个整数, 一切就自然明了;
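A sketch of this "convert to the smallest unit first" approach. The struct `Duration` and the function `split_days` are names invented here, and rounding to the nearest whole second is an assumption.

```cpp
#include <cassert>
#include <cmath>

struct Duration {
    long long days, hours, minutes, seconds;
};

// Converts a fractional number of days into days-hours-minutes-seconds.
Duration split_days(double d) {
    assert(d >= 0);
    // Step 1: go to the smallest unit, as an integer number of seconds.
    long long x = std::llround(d * 86400.0);
    // Step 2: peel off the larger units with integer division.
    Duration r;
    r.days = x / 86400;
    x %= 86400;
    r.hours = x / 3600;
    x %= 3600;
    r.minutes = x / 60;
    r.seconds = x % 60;
    return r;
}
```

With d = 13.14 this gives 1135296 seconds, i.e., 13 days, 3 hours, 21 minutes and 36 seconds.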

$[1, n]$ 可以整除 $a$ 的个数

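For $a \in N^+$ , the range $[1, n]$ contains $\lfloor n / a \rfloor$ multiples of $a$ (namely $[a, 2 a, 3 a, ...]$ );

But suppose $a = b_1 * b_2 * ...$ overflows and cannot be stored, while every $b_i$ and $n$ can be stored;
. Then $n / a$ can be computed as $n / b_1 / b_2 /...$ (integer division at every step);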

int ans = n;
for( i : {b1, b2, ...}){
	ans /= i;
}

一个数末尾连续的0, 取决于其质因数分解中 $2, 5$ 的个数 `@Mark_0`

给定一个数 $a$ (非常大 long long肯定是存不下的, 一般题目会通过说明a = x1 * x2 * x3 * ...来指定), 如何判断其末尾0的个数呢?
. 用高精度这种暴力方法, 一定超时;

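One trailing zero of a number $\iff$ its prime factorization contains both a factor $2$ and a factor $5$ ;
. $x00000 = x * 10^5 = x * (2*5)^5 = x * 2^5 * 5^5$ ; here $x$ may still contain factors 2 or factors 5, but not both (otherwise there would be one more trailing zero);
. If $a$ ends with $c$ zeros and $a = 2^x * 5^y * ...$ , then $min(x,y) = c$ ;

Therefore, for each $x_i$ in a = x1 * x2 * x3 * ... * xn, we compute the exponents $p_2, p_5$ of $2, 5$ in its factorization;
. Note that a full factorization of $x_i$ is unnecessary (computing the exponent of $3$ , say, is wasted work); only $p_2, p_5$ are needed;
. That is: for( auto i = xi; i % 2 == 0; i /= 2) ++ power_2; handles 2 alone (and likewise for 5);
. Since we only extract the factors 2 and 5 instead of fully factorizing every $x_i$ , the total time is $O (n + p_2 + p_5)$ where $p_2, p_5$ are the total exponents of 2 and 5 in $a$ : each division step removes exactly one factor;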

一个常见应用是: 求 $A!$ 阶乘这个数的末尾0的个数;

power_2 = power_5 = 0;
for( int a = 2; a <= A;++a){
	for( auto i = a; i % 2 == 0; i /= 2) ++ power_2;
    for( auto i = a; i % 5 == 0; i /= 5) ++ power_5;
}
答案是: min(power_2, power_5);

例题: CSDN--130055705

当朴素最短路会超时, 且每个点的邻接点是连续子段时的优化策略

参见: 128122790--@Mark_0.

最大连续子段和

给定数组 $A$ , 一个连续子段 $[l, ..., r]$ 的价值为 $A [l] + ... + A [r]$ , 求最大的连续子段的价值;

int ans = 0, sum = 0;
for( auto i : A){
	sum += i;
	if( sum < 0){ sum = 0;}
	ans = max( ans, sum);
}

代码倒很简单, 但有很多细节;

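On entering position i, sum is the maximum sum of a subarray ending at [i-1], or 0 for the empty subarray; so sum $\geq 0$ always holds;

Suppose the answer is l, l+1, ..., r and let S[i] = A[l] + ... + A[i]; then $\forall i \in [l, r] \quad S[i] \geq 0$ (and $S[i] > 0$ if we take the shortest optimal subarray);
. On entering l, sum must equal 0 (otherwise extending the answer to the left would improve it); then sum = A[l], then + A[l+1], then + A[l+2] … finally + A[r]; throughout this process sum = S[i], which never drops below 0, so sum is never reset in between;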

统计`单调数列`的个数, 对`单调数列`的等价转换

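1 Let the sequence have length $n$ , i.e., $[a_1, a_2, ..., a_n]$ , monotone (taken as non-decreasing here) with $a_i \in [L, R]$ ; how many different sequences are there;
2
. 1 Equivalent rewrite of the array: apply a uniform offset
. . Subtract $L$ from every element (let $S = R - L$ ), so that every element lies in $[0, S]$ (e.g., $[R, R, R, R]$ becomes $[S, S, S, S]$ );
. 2 Equivalent rewrite of the array: switch to the difference array
. . Keep the first element, and for the others let $a_i = a_i - a_{i-1}$ ;

For example, take the original array $[a, b, c]$ (every element $\in [L, R]$ ) and let $S = R - L$ ;
1 First convert it to $[a - L, b - L, c - L]$ (every element $\in [0, S]$ )
2 Then convert it to $[a - L, (b - L) - (a - L), (c - L) - (b - L)]$ ;
. . Call this final array $[a_1, a_2, ..., a_n]$ (note that this array is itself not necessarily monotone); it has a very important property: $\displaystyle \forall pre \in [1, n], \sum_{i=1}^{pre} a_i \in [0, S]$ , and, because the original sequence is non-decreasing, every $a_i \geq 0$ ;
. . Understanding this condition is essential: it is a necessary and sufficient description of the final array;

So to count the different arrays we only need to look at that condition;
The total $\sum_{i=1}^{n} a_i$ lies in $[0, S]$ but is not fixed, so we split by its value: if the total is $s$ , the count is given by stars and bars ( $s$ balls into $n$ boxes, each box holding $\geq 0$ balls, $C_{s+n - 1}^{n-1}$ ways); summing over $s \in [0, S]$ gives $C_{S+n}^{n}$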

使用`lower_bit`的二进制枚举

vector< int> A{ a1, a2, ..., aN};

for( int st = 0; st < (1 << N); ++st){
	vector< int> nums;
	for( int bit = 0; bit < N; ++bit){
		if( (st >> bit) & 1)  nums.push_back( A[ bit]);
	}
}

This is the classic binary-enumeration code;
The elements of nums appear in the same order as in A! (e.g., if st selects the three elements a1, a5, a7, then nums is [a1, a5, a7]);

@Delimiter

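We can improve the 2^N * N time to 2^N * K (K is slightly smaller than N, as it is the number of 1 bits in st; this optimization does not always rescue a solution and should be seen as a small gain only, e.g., CSDN-129676841);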

vector< int> A{ a1, a2, ..., aN};

int Lower_bit[ (1 << (N-1)) + 1];
for( int i = 0; i < N; ++i)  Lower_bit[ 1 << i] = i;

for( int st = 0; st < (1 << N); ++st){
	vector< int> nums;
	for( int temp = st; temp > 0; temp -= temp & -temp){
		nums.push_back( A[ Lower_bit[ temp & -temp]]);
	}
}

As before, the elements of nums keep the same order as in A;

$O (n)$ 将所有相交区间放入一个并查集内

CSDN-129559755

对阶乘 $n!$ 的质因数分解 / 求 $n!$ 质因数分解中, 某个质数 $p$ 的量级

朴素算法是对 $1, 2, 3, ..., n$ 一个个的质因数分解, 然后累加到一起;

更好的算法是, 将问题转换为: 给定一个质数 $p$ , 问其在 $n!$ 里的量级是多少?
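. The numbers divisible by $p$ (the multiples of p) number $n / p$ ; those divisible by $p^2$ number $n/(p^2)$ , those divisible by $p^3$ number $n/(p^3)$ , …
. So the direct approach is for( q = p; q <= A; q *= p) power += A / q; (the multiplier must be the original p so that q runs over $p, p^2, p^3, ...$ ; writing p *= p would square it and skip $p^3$ );
. Its drawback: the loop only stops once q > A, and the last q * p may overflow!

–

To avoid the overflow we transform once more: the multiples of $p * p$ in $[1, n]$ number $k = n / (p * p)$ ;
They are $p * p, 2 p * p, ..., k p * p$ ; dividing each of them by $p$ gives $p, 2 p, 3 p, ..., k p$ , and $k = n / (p * p) = (n / p) / p$ ;
Hence this count equals the number of multiples of $p$ in $[1, n / p]$ ;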

*求n!中 质数p的量级*

int power = 0;
for( auto a = n; a >= p; a /= p) power += a / p;

相同和不相同的关系矛盾判断

给定一些元素 $a 1, a 2, ... an$ , 和一个二维布尔数组 $F [] []$ 其中 $F [i] [j]$ 为True表示两者相同否则两者不同 (且 $F [i] [i] = t r u e, F [i] [j] = F [j] [i]$ )

我们要判断, 这个关系 (即这个 $F$ 数组) 是否是矛盾的;

当然最简单的是用并查集, 因为"相同-二元关系"满足传递性 ("不相同-二元关系"不满足传递性), 所以用并查集 $O(N^2)$ 可以处理;
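A sketch of the union-find check, assuming `F` is a square symmetric matrix as described above; the function name `is_consistent` is made up. All "same" relations are merged first, then every "different" relation is verified against the resulting classes.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Returns false if the same/different relation F contradicts itself.
bool is_consistent(const std::vector<std::vector<bool>>& F) {
    const int n = static_cast<int>(F.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    auto find = [&](int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    };
    // Pass 1: "same" is transitive, so merge every pair marked as same.
    for (int i = 0; i < n; ++i) {
        assert(static_cast<int>(F[i].size()) == n);
        for (int j = 0; j < i; ++j) {
            if (F[i][j]) parent[find(i)] = find(j);
        }
    }
    // Pass 2: a pair marked "different" must not share a class.
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < i; ++j) {
            if (!F[i][j] && find(i) == find(j)) return false;
        }
    }
    return true;
}
```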

@Delimiter

看一种新的方法, 也是 $O(N^2)$ 的;

我们用flag来给每个元素标记, 相同的元素必须相同标记, 不同元素则不同标记;

int flag = 0;
for( int i = 0; i < n; ++i){
	A[i] = -1;
    for( int j = 0; j < i; ++j){
        if( F[ i][ j]){
            A[ i] = A[ j];
            break;
        }
    }
    //>> Mark_0
    for( int j = 0; j < i; ++j){
        if( F[ i][ j] && A[ i] != A[ j]){ return "Wrong";}
        if( F[ i][ j] == false && A[ i] == A[ j]){ return "Wrong";}
    }
    if( A[ i] != -1){ continue;}
    A[ i] = flag ++;
}

这个算法是很巧妙的, 比如最终A[] = 0 1 0 2 3 0 2 2;
注意, 到了Mark_0时, A[i]是可能等于-1的, 这种情况会执行A[i] = flag ++

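The code at Mark_0 is important: both if checks must run, i.e., the element must be checked against every earlier element;
The check for the F[i][j] == true case is needed too: because of the break above, it can happen that F[i][j] is True while A[i] != A[j];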

边录入数据, 同时递归

CSDN(129279897)

方格矩阵中的直角三角形

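A right triangle with legs $a * b$ whose vertices lie on lattice points (a leg of length $a$ passes through $a + 1$ lattice points) has $gcd (a, b) + 1$ lattice points on its hypotenuse

Proof: let $g = gcd (a, b), A = a / g, B = b / g$ , and split the two legs into $g$ segments of length $A, B$ respectively; the hypotenuse then splits into $g$ segments, each the diagonal of an $A * B$ rectangle; since $A, B$ are coprime, such a diagonal contains no lattice point other than its two endpoints, which gives $g + 1$ points in total;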
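The formula can be cross-checked against a brute-force count. This sketch places the hypotenuse between (a, 0) and (0, b), which is an assumption about orientation that does not affect the count; the function names are hypothetical.

```cpp
#include <cassert>

long long gcd_ll(long long a, long long b) {
    while (b != 0) {
        const long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Brute force: lattice points on the segment from (a, 0) to (0, b).
// The point with abscissa x has y = b - b*x/a, an integer iff a divides b*x.
long long hypotenuse_points_brute(long long a, long long b) {
    assert(a >= 1 && b >= 1);
    long long count = 0;
    for (long long x = 0; x <= a; ++x) {
        if ((b * x) % a == 0) ++count;
    }
    return count;
}

long long hypotenuse_points_formula(long long a, long long b) {
    return gcd_ll(a, b) + 1;
}
```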

选择前`K`大数

Given you a sequence of numbers $A$ , you need choose $K$ (e.g., $K = 5$ ) elements which are the Greatest/Smallest numbers amongst $A$
. Let the $K$ elements be the vector<> B( K) (notice, $B$ maybe not Increasing, the only-requirement is $B [0, 1, ..., K - 1]$ are the utmost- $K$ elements of $A$ );

vector< int> B;
for( auto i : A){
	B.emplace_back( i);
	if( B.size() > K){ // `B.size()` must equals `K + 1`;
		nth_element( B.begin(), B.begin() + K, B.end());
		//< `(..., greater<>())` for the Greatest-Elements;
		B.resize( K);
	}
}

For example, $A = [10, 9, 8, ..., 3, 2, 1]$ , finally, $B . s i ze () = K = 5, B . c a p a c i t y () = 8$ and $B = [3, 1, 5, 2, 4]$ (the elements-order in $B$ is arbitrary);

The time cost is $O (n * K)$ where $n$ is the length of $A$ , since each nth_element call is linear in $K$ on average; this is $O (n)$ when $K$ is a small constant;
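If $K$ is not a small constant, one possible variation (not from the original code) is to let the buffer grow to `2 * K` elements before trimming, so that a linear-time `nth_element` call happens only once per `K` insertions and the total average cost stays near $O(n)$. The function name `k_smallest` is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the K smallest elements of A in arbitrary order
// (all of A if it has fewer than K elements).
std::vector<int> k_smallest(const std::vector<int>& A, std::size_t K) {
    std::vector<int> B;
    if (K == 0) return B;
    B.reserve(2 * K);
    for (int v : A) {
        B.push_back(v);
        if (B.size() == 2 * K) {
            // Afterwards B[0..K) holds the K smallest elements seen so far.
            std::nth_element(B.begin(), B.begin() + K, B.end());
            B.resize(K);
        }
    }
    if (B.size() > K) {
        std::nth_element(B.begin(), B.begin() + K, B.end());
        B.resize(K);
    }
    return B;
}
```

Passing `std::greater<int>()` as a fourth argument to both `nth_element` calls would select the K greatest elements instead.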

组合数 (不含某个特定子字符串)

AcWing-1305. GT考试

Given a string $T$ (length $m = 20$ , consists of $[0, 9]$ ), a string is called Valid if:
0 Its length $n=10^9$ and consists of $[0, 9]$ (there are $10^n$ kinds of such string);
1 Do not contains $T$ as its sub-string;

@Delimiter

If we use the normal- $D P$ , that is $dp[n][10^{20}]$ which is Infeasible;
. $d p [i] [j]$ denotes the number of strings with length $i$ , whose last $m$ bits forms the number $j$ , and do not contains $T$ ;

@Delimiter

The correct-method is still $D P$ ;
Let $S$ be the set of all Valid-Strings with length $i$ , according to the $\text{LSPS-EX}$ (Longest-Same-Prefix-Suffix of Suffixed- $A$ and Prefixed- $B$ );
. Review this notion; $A = d e f ab c, B = ab c d e f$ , the $\text{LSPS-EX}$ of Suffixed-A and Prefixed- $B$ is $ab c$ ;

We divide $S$ into $[0, ..., m)$ types according to the value $\text{LSPS-EX}$ ;
. We use $d p [i] [j]$ to denotes the number of string with these types $\in [0, m)$ ;
. The answer would be $\sum_{i = 0}^{m-1} dp[n][i]$ ;

Now, we consider $d p [i] [j]$ would update how many DP-States $d p [i + 1] [?]$ ;

          i
A: ... same ?
B: ... same n ...

The length of `same` is `j`;
The last-bit of `same` is `A[i]`;
The element `n` is certain, `?` has `[0-9]` choices;

So the question is, calculate the `LSPS-EX` of `A[...,i+1]` and `B[all]`
This is a Template-Algorithm, you can go to review it;

e.g., when `A[i+1] = x`, let `l` be the `LSPS-EX` of `A[..., i+1]` and `B[all]`;
. Then, `dp[i + 1][ l] += dp[i][ j]`;

The vital point is that $l$ is independent of $i$ : once $j$ and the appended digit are fixed, $l$ is determined;
. In other words, suppose $d p [i] [j]$ updates $d p [i + 1] [a, b, c, d]$ ; this process does not depend on the value of $i$ , and whatever $i$ is, the transition $j \to [a,b,c,d]$ stays the same;
. Viewed in reverse, $dp[i][j] = k_1 * dp[i-1][j_1] + k_2 * dp[i-1][j_2] + ...$ is a linear combination whose $k_i, j_i$ are all constants that do not depend on $i$ ;
. Hence the recurrence over the index $i$ of $d p [i] [j]$ can be evaluated with fast matrix exponentiation;

形如 $aaa ...$ 数字的数学公式表示

An integer $aaa ...$ consisting of $k$ digits, all equal to $a$ , can be written as $a * \frac{10^k - 1}{9}$ .

For example, $888...$ with $k$ digits equals $8 * \frac{999...}{9} = 8 * \frac{10^k - 1}{9}$ ;

$O (1)$ 判断一个数是否为2次幂

$x$ has the form 1 << k if and only if $x > 0$ and x & (x - 1) equals $0$ (the bit test alone also accepts $x = 0$ );
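As a short sketch, with the zero case handled explicitly (the function name `is_power_of_two` is chosen here for illustration):

```cpp
// True iff x == 1 << k for some k >= 0.
// x & (x - 1) clears the lowest set bit, so the result is 0
// exactly when x has at most one set bit; x == 0 is excluded separately.
constexpr bool is_power_of_two(unsigned long long x) {
    return x != 0 && (x & (x - 1)) == 0;
}
```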

图的删除边操作

The Algorithm for Finding a Eulerian-Path

Here, we are talking about the real-deletion, not just use a bool flag[] to pretend deleting, cuz this is in bad performance; every time you iterating all the adjacency-edges, you will still visit A-Deleted-Edge and then skip it, which is not effective;
(e.g., there are $n$ self-loop a->a, and we need get the Limited-Path of a (i.e., there are $n$ times performance of dfs(a), ..., dfs(a)), if you use bool flag[] to delete edges, its time is n * n)

----

Based on Linked-List

Firstly, we consider the Directed-Graph;

A chain in Linked-List a->b->c->d (the edge-id are 0,1,2), denotes a has 3 adjacent-points b,c,d; so iterate from b to the tail, we would get all the adjacent-points of a;

Head[a] = 0, Nex[0]=1, Nex[1]=2, Nex[2]=-1, Ver[0,1,2]=b,c,d;

When we want to delete an edge x, it is inefficient in general:
1 If x is the First-Edge (i.e., x=0 here), perform Head[a]=Nex[x]
2 Otherwise, we need to find the edge y such that Nex[y]=x, then perform Nex[y]=Nex[x], which requires scanning the list;

In summary, deleting a specific edge is impractical in a Linked-List, but there is a useful special case: when the edge to delete may be any adjacent edge of a point;

That is, the demand is to delete any one adjacent edge of a point;

Then it is efficient: if a has an adjacent edge, just perform Head[a] = Nex[ Head[a]] to delete the first edge in the Linked-List;

One vital thing is,

dfs( cur){
    for( e = Head[ cur]; ~e; e = Nex[e]){
        $( @Loc-1);
        //>< @Loc-0
    }
}

This code is wrong; let the adjacent-edges of cur be x,y,z;

Once @Loc-1 deleted some edge of cur (e.g., x,y), when we at @Loc-0, we cannot detect the deletion of these edges; so, e would continue visit y,z which is wrong;

The correct method is:

dfs( cur){
    for( e = Head[ cur]; ~e;){
    	auto cur_edge = e;
        $(...);
        if( Head[ cur] == cur_edge){ //< no deletion
        	e = Nex[ e];
        }
        else{ //< `cur_edge,...` has been deleted
        	e = Head[cur];
        }
    }
}

Now, once a edge has been deleted, we would never visit it.

--

For example, there are $n$ self-loop a->a, and we need get the Limited-Path of a (i.e., there are $n$ times performance of dfs(a), ..., dfs(a))

void dfs( int _cur){
    for( int cur_edge, e = graph->Head[ _cur]; ~e;){
        graph->Head[ _cur] = graph->Next[ e]; //< delete cur-edge
        cur_edge = e;
        dfs( graph->Vertex[ e]);
        e = graph->Head[ _cur]; //< update
        //--
        Limited_Path.push_back( cur_edge);
    }
}

The above code to deleting edges, its time is n (if you have not the code-line //< update, it would be n + (n-1) + ... + 1)

--

Undirected-graph
When we delete a Directed-Edge a->b, you need also delete its converse-edge b->a; (i.e., delete two edges)

It is important to realize that, the First-Edge of a (the edge a->b to be deleted), maybe not the First-Edge of b, although the ID of the two edges are i, i^1;
For example, $(a\to b) (b\to a) (c\to b) (b\to c)$ whose ID are [0,1,2,3] respectively (so, the First-Edge of b is 3), when we delete $a\to b$ (0), its converse-edge (1) is not the First-Edge of b;

So, you need a bool flag[] to denote the converse-edge that is to be deleted;

if( flag[ e] == true){ //< this edge has been deleted, but yet be deleted physically;
	Head[ cur] = Nex[ e]; //< delete this edge essentially;
	e = Head[ cur];
}
else{
	if( you wanna delete this edge){
		Head[ cur] = Nex[ e]; //< delete this edge essentially (so, `flag[e]=true` actually is vain);
		flag[ e ^ 1] = true;
		e = Head[ cur];
	}
}

----

Based on Adjacency-Matrix

We use int[a][b] to denote the number of edges a->b; (note that, not bool)
Then, -- [a][b] represents deleting one edge a->b;

无向图的路径方向

AcWing-1184. 欧拉回路

For a path a->b->c in Undirected-Graph, we know that every Directed-Edge in the path corresponds to a Undirected-Edge (i.e., a->b denotes a edge a-b)
Cuz a edge a-b would corresponds to two different Directed-Edges a->b and b->a, how can we represent a path in Undirected-Graph?

For instance, in Directed-Graph, there are two-edges that are a->b (0) and b->c (1), we use a id (x) to denote every edge in Directed-Graph;
. So, the path a->b->c can be represented by a Sequence of Edge-ID, that is [0,1]

But, in Undirected-Graph, there are two-edges that are a-b (1) and b-c (2), then for a path a->b->c, if you use [1,2] to denote it, that is 1 represents a->b and both b->a, which is wrong, cuz you cannot clarify the orientation of a edge;

Of course, you can use another system of Edge-ID a->b (0), b->a (1), ...; but, the problem only identifies the a-b (0) ... not your system, cuz the order of all edges are given by the problem;
If problem stipulates that an edge a-b (x) given by the problem, x denotes a->b, and -x denotes b->a; (x>=1, cuz $0 = - 0$ )
Then we found, one edge a-b (x) (x >= 1) denotes a->b ((x-1)*2) b->a ((x-1)*2 + 1) in our Linked-List;
So, for a directed edge with ID = x, the corresponding input edge is t = x/2 + 1 (integer division); negate it, t = -1 * t, if x is odd (i.e., the edge is the reverse direction b->a); it can be shown that the resulting t of all directed edges are distinct.
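The mapping in both directions, as a sketch under the numbering above (input edge t >= 1 is stored as directed edges 2(t-1) and 2(t-1)+1); the function names are hypothetical.

```cpp
#include <cassert>

// Directed-edge ID in the linked list -> signed label expected by the problem.
// Even IDs are the forward direction a->b (+t), odd IDs the reverse b->a (-t).
int signed_label(int x) {
    assert(x >= 0);
    const int t = x / 2 + 1;
    return (x % 2 == 1) ? -t : t;
}

// Inverse mapping: signed label -> directed-edge ID.
int directed_id(int label) {
    assert(label != 0);
    return label > 0 ? (label - 1) * 2 : (-label - 1) * 2 + 1;
}
```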

One vital thing is, the order of Adding-Edges is crucial

int a,b;
cin>>a>>b; //< problem inputs a edge `a-b`
G.Add_edge( a, b);
G.Add_edge( b, a);

Note that, (a,b) must be firstly added ahead of (b,a); that is to say, cin>>a>>b differs to cin>>b>>a;
When the problem gives a input a,b (the order is fixed), it shows a->b is the Positive-Orientation of the Undirected-Edge a-b;

四舍五入和取整

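Truncate to an integer: int( a.???) = a (for non-negative values);

Drop the integer part: a.??? - (int)( a.???) = 0.???;

Truncate to 3 decimals: int( a.??? * 1000), then place the decimal point separately

printf("%.2lf", a.125): do not rely on this printing a.13; printf rounds the stored binary value, and an exact tie such as .125 may be rounded to even (a.12) depending on the implementation

Round half up to an integer: int( x + 0.5) for x >= 0; to round at precision 0.1, use int( x * 10 + 0.5) and divide by 10 (for precision 0.01, scale by 100);

Round to 3 decimals when printing: printf("%.3lf", double)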

遍历一棵树的所有边

For a tree with $N$ points $[0, N)$ (rooted at $0$ ), every edge $(a - b)$ can be identified by the point $x$ that is the deeper of $a, b$ (i.e., $x = a$ if $depth[a] > depth[b]$ , otherwise $x = b$ )

We know this device is also the essence for the algorithm (Prefix-Sum or Differential) based on Edge in a tree;

For instance, now giving you a task that, after the process of Differential based on Edge on the tree (you get the array $V []$ which means the value of a edge), then you need to calculate the number of edges whose $V [] > 0$ ;

More speaking, $V [x]$ where the index $x$ actually is a point, although it essentially represents a edge $(a - b)$ whose deeper point equals $x$

One easy device is iterating all edges $(a, b)$ and find its deeper point $x$ , and that the value of this edge is $V [x]$ ;

for( edge `a-b` : all edges){
	if( depth[a] > depth[b]){ x = a;}
	else{ x = b;}
	if( V[x] > 0){ ++ ans;}
}

But in fact, this is not a good device; (cuz you need record $d e pt h []$ for every point)

A better device is:

for( x : [1, N)){
	if( V[x] > 0){ ++ ans;}
}

the $[1, N)$ in the code can be generalized to the set of all points except the root;

Proof:
For any two distinct edges $(a - b), (c - d)$ in a tree, let $x$ be the deeper point of $a, b$ and $y$ that of $c, d$ ; then $x \neq y$ always holds; this is a property of trees (a point may have multiple sons, but exactly one father, except that the root has none);
Also, the deeper point $x$ of an edge is never the root;
As a result, the set of deeper points ( $x$ ) over all edges of a tree is the set of all points except the root (denoted $[1, N)$ )

So you can just iterate all these points $[1, N)$ which is essentially equivalent to iterating all edges on a tree.

基于邻接矩阵存储的二分图

For a Bipartite-Graph, we can divide its points into two disjoint sets $L, R$ , and every edge connects a point of $L$ with a point of $R$ . (all edges are undirected and unweighted)

If we use a Boolean adjacency matrix to store the graph, the matrix represents undirected edges, which differs from the usual usage of an adjacency matrix.

For example, usually $M [a] [b]$ denotes a directed edge $a \to b$ , so an undirected graph also satisfies $M [a] [b] = M [b] [a]$ .
However, in the current sense, it denotes an undirected edge between two points, i.e., $M [a] [b]$ denotes an undirected edge between point $a$ of the $L$ -set and point $b$ of the $R$ -set.

Notice that an edge $(a, b)$ (an undirected edge between $a$ in the $L$ -set and $b$ in the $R$ -set) corresponds only to $M [a] [b] = true$
It does not involve $M [b] [a]$ ; it is vital to realize this. $M [b] [a]$ would mean an edge between $b \in L$ and $a \in R$

In other words, the first dimension always denotes a point of the $L$ -set.

区间的合并与消除

merge implication: If we have $[1, 5]$ and $[6, 10]$ , then the interval $[1, 10]$ is deduced
eliminate implication: If we have $[1, 10]$ and $[6, 10]$ , then the interval $[1, 5]$ is deduced

This is somewhat like the interval-merge, but they are different.
In interval-merge, if we get $[1, 10]$ , then the two sub-interval $[1, 5]$ and $[6, 10]$ will be removed. (But in this problem, we should retain the these sub-interval)
And, it can’t handle the eliminate in interval-merge.

For a interval $[l, r]$ , we use a pair to denote it $(l - 1, r)$ (also can be $(l, r + 1)$ ), and then, put $l - 1$ and $r$ into a same Disjoint-Set.

That is, for a set $\{ a, b, c, d \}$ , it implies $3 + 2 + 1$ intervals.
We choose two elements $a, c$ (if $a > c$ , then swap), this pair $(a, c)$ denotes a interval $[a + 1, c]$
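A sketch of this encoding with a plain union-find over the boundaries `0..max_r`; the struct `IntervalFacts` and its member names are invented for illustration. Interval `[l, r]` is stored by uniting `l - 1` with `r`, and an interval is deducible exactly when its two boundaries are in the same set.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

struct IntervalFacts {
    std::vector<int> parent;  // one node per boundary 0..max_r

    explicit IntervalFacts(int max_r) : parent(max_r + 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // Record that the interval [l, r] is known (1 <= l <= r <= max_r).
    void add(int l, int r) {
        assert(1 <= l && l <= r && r < static_cast<int>(parent.size()));
        parent[find(l - 1)] = find(r);
    }

    // Can [l, r] be deduced by merging/eliminating the known intervals?
    bool known(int l, int r) { return find(l - 1) == find(r); }
};
```

Both the merge implication and the eliminate implication fall out of the same transitivity: boundaries in one set can be paired up freely.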

For more detail information

处理日期问题 (月日)

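Given two dates A and B in the same year, each made of a month and a day;
(days per month are [31, 28, …]; leap years are ignored)
find how many days lie between the two dates;

The simplest way is: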

int Days[] = { 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

int Presum( int _month, int _day){
	int days = 0;
	for( int i = 1; i < _month; ++i){
		days += Days[ i];
	}
	days += _day;
	return days;
}

这样, 两个日期相减就可以得到中间的天数

多说一句, 判断日期(m1, d1) 是否是在日期(m0, d0) 的后面, 不是(m1 >= m0) && (d1 >= d0), 这是错的;
而是: (m1 > m0) || ( (m1 == m0) && (d1 >= d0))

多叉树的节点ID

对于二叉树, 经常用到的是ID设定方式是:
当前节点x, 左儿子是(x * 2), 右儿子是(x * 2 + 1)

1
2   |3
4 5 |6 7

这样, 所有节点都有一个唯一的ID号, 而且是从[1, 2, 3, …]递增的

那么, 对于一个N叉树, 该如何设定呢?

比如对于三叉树, 其规则是: x的三个儿子是(x * 3 - 1, x * 3, x * 3 + 1)

1
2     |3      |4
5 6 7 |8 9 10 |11 12 13

对于四叉树, x的四个儿子是(x * 4 - 2, x * 4 - 1, x * 4, x * 4 + 1)

1
2       |3           |4           |5
6 7 8 9 |10 11 12 13 |14 15 16 17 |18 19 20 21

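Summarizing the pattern for an N-ary tree:
1, the number of nodes on each level is $N^0, N^1, N^2, ...$
2, the root's ID must be 1
3, the smallest ID on level x (the leftmost node) minus 1 equals the largest ID on the previous level (the rightmost node)
4, for a node x, the node (x * N) is always one of its children;
… However, x / N (rounded either down or up) is not necessarily its parent! (this only holds for binary trees: $\lfloor \frac{x}{2} \rfloor$ is the parent)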
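The ternary and quaternary examples above fit one closed form: the children of x are N*x - (N-2), ..., N*x + 1, and the parent of y >= 2 is (y + N - 2) / N with integer division. These general formulas are derived here from the level-order numbering rather than stated in the original, so treat them as a sketch; the function names are hypothetical.

```cpp
#include <cassert>

// Level-order numbering of a complete N-ary tree with root ID 1.
long long first_child(long long x, long long N) {
    assert(x >= 1 && N >= 2);
    return N * x - (N - 2);
}

long long last_child(long long x, long long N) {
    assert(x >= 1 && N >= 2);
    return N * x + 1;
}

long long parent_of(long long y, long long N) {
    assert(y >= 2 && N >= 2);  // the root has no parent
    return (y + N - 2) / N;
}
```

For N = 2 these reduce to the familiar 2x, 2x + 1 and y / 2.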

最值元素的下标

对于一个数组A, 求出其最大元素的下标 (如果最大元素有多个, 则选择最小的)

这是常见的算法问题, 一般我们会ma, ans, ma来存储最大值, ans存答案的下标

其实有更简单的做法, 就用一个下标变量即可 (初始认为答案是在[0]位置)

int ans = 0;
for( int i = 0; i < n; ++i){
	if( A[ i] > A[ ans]){
		ans = i;
	}
}

Enumerating all subsets

Given N elements, enumerate all 2^N of their subsets (each element is either taken or skipped);
他可以对应两种二叉树, 参见

一种是DFS树, 他有2^N * 2个节点, 非叶子节点的点都是重复点;

另一种二叉树, 是2^N个节点, 没有重复节点;

巧用`auto`

An erroneous loop is for( int i = l; i <= r; ++i) where r has type int and r = INT32_MAX: i <= r is always true and ++i overflows;
To fix such a mistake, you must do two things:
. 1 Change int r to long long r (and the type of l likewise);
. 2 Change int i to long long i;
So, if you write auto i = l;, only step (1) is needed, since the type of i follows l;

Therefore, it is a good idea to declare the loop variable of a simple for(...) with auto, initialized from a variable of the intended type (e.g., for( auto i = l; i < r; ...)); note that auto i = 0 always deduces int, whatever the type of r;

----

void Func( int _a){
    for( int i = _a; i > 0; i /= 10){
        ...
    }
}

如果我们后期改进函数, 将_a修改为了Ll_类型 , 此时, int i = _a就出错了!
就必须再手动修改为: Ll_ i = _a; 非常容易遗忘;

好的办法是: for( auto i = _a; ...), 这多好!

DP的空状态(边界情况)

以线性DP为例子: 123
dp[123] 由 [3] 和 dp[12] 推出, dp[12] 由 [2] 和 dp[1]推出
… 但是, dp[1] 由 [1] 和 dp[0]推出, 而dp[ 0]是什么呢?
… 假如dp[ 0]是空是非法的, 那么, dp[ 1]也是非法的; 因为, dp[1] 由 [1] + dp[ 0] 推出, dp[ 0]非法, 则dp[ 1]也非法
… 因此, dp[ 0]并不是空, 他其实是代表一个()(空数, 空值), 这个元素是(空)的, 是非法的, 但是, dp[ 0]不是空的, 他里面有一个元素就是()
… 此时[1] 可以拆分为 [1] + ()的形式,
… 换句话说, 假如dp代表其集合里的元素个数; dp[ 0]必须为1(因为()空元素是他的唯一元素), 而不能是0
… … 否则, 你的dp[ 1]一定为0, 因为无法进行拆分, 导致1是非法的

从元素个数来看, 假如每个位都有0-9共10种选择,
在[1]时, 他有10^1种形态; 在[2]时, 他有10^2种形态; 即在[k]时, 他有10^k种形态
那么, 在[0]时, 他有10^0 = 1种形态; 那么, 问题来了: 这个形态是什么?
… 这个形态, 是一种不存在的状态, 简称为(空状态)! 但他是存在的, 是一种方案! 只是, 他无形…
这个(不存在的数), 也作为一个(元素), 在这个集合里…
… 这个(空状态), 可以 (和任意状态)组合; 比如: 123 = 123 + (), 其中()就表示这个(空状态), 这个(空状态) 是一种方案!

比如在进行数位DP时, 定义Dp[ a]为: [a个0 – a个9]之间的这10^a个数中, 合法数的个数; (即一定会有: Dp[ a] <= 10^a)
那么, 对于Dp[ 0]是代表那些数? 这个问题非常重要
… 按照定义看, 他代表有10^0 = 1个数; (即, 他的集合不是空的!!)
… 但有一点不同, 这个数, 无法像Dp[ >0]所代表的那些数, 可以显式的写成数的形式; 这个元素, 是无形的… 他是(空)的
… 因此, 这个元素一定是非法的; … 因为: (合法) 的前提是: 这个元素是一个 (数), 就可以显式表达; 而它就无法显式表达
… 即, Dp[ k], k>0 代表 10^k个状态, 每个状态就对应一个字符串 (0-9组成的字符串)
… … 但是, 当Dp[ k], k=0时, 它代表 1个状态 (即空状态), 但不代表 (任何字符串), 即空状态无法对应为字符串;
… … 而我们的(合法), 判定的是(字符串); 所以, (空状态) 是非法的; 但它确实是个(状态)
… So Dp[ 0] = 0 for certain, because the (empty state) is invalid;
… 这个(空状态), 可以与(任意状态)组合; 123 = 123 + (), 其中那个()表示(空状态)
… 但是, 如果涉及(元素个数) 不涉及合法性, 即10^k, Dp[ 0] = 1, 表示它有1个元素, 不可以设置成0个

如果你把Dp[ a]定义为: [0 – (10^a - 1)]这些数, 这和上面的 Dp定义, 是不同的!
上面定义的是(字符串数) , 即Dp[ 2] = [00 – 99] 不是[0 – 99], 但是这里, Dp[ 2] = [0 – 99];
此时, Dp[ 0] = [0 – 0], 这是一个状态, 这个状态同时也是合法的 (是0, 是个数字)
但之前的Dp[ 0] = (), 是一个状态, 但是不是合法的 (因为他不是个字符串数)