特殊卷积
- Convolution:
h k = ∑ i + j = k m o d n a i b j h_k = \sum_{i+j=k \mod n} a_i b_j hk=i+j=kmodn∑aibj
- XOR-Convolution:
h k = ∑ i ⊕ j = k a i b j h_k = \sum_{i \oplus j=k} a_i b_j hk=i⊕j=k∑aibj
- OR-Convolution:
h k = ∑ i ∨ j = k a i b j h_k = \sum_{i \vee j=k} a_i b_j hk=i∨j=k∑aibj
- AND-Convolution:
h k = ∑ i ∧ j = k a i b j h_k = \sum_{i \wedge j=k} a_i b_j hk=i∧j=k∑aibj
- 上述的 ⊕ , ∨ , ∧ \oplus,\vee,\wedge ⊕,∨,∧都是比特运算,序列 h , a , b h,a,b h,a,b长度为 n = 2 l n=2^l n=2l
特殊卷积的多项式环
作者的推导如有问题,还请读者积极指正! (* ̄︶ ̄)
一般结构
对于代数结构
R
=
(
Z
p
[
x
]
,
+
,
⋅
)
R=(Z_p[x],+,\cdot)
R=(Zp[x],+,⋅),元素表示为
f
(
x
)
=
∑
i
=
0
n
−
1
f
i
x
i
∈
Z
p
[
x
]
f(x) = \sum_{i=0}^{n-1} f_i x^i \in Z_p[x]
f(x)=∑i=0n−1fixi∈Zp[x],其中的
f
i
∈
Z
p
f_i \in \mathbb Z_p
fi∈Zp是素域元素。
R
R
R的运算定义如下:
f
(
x
)
+
g
(
x
)
=
∑
i
=
0
n
−
1
(
f
i
+
g
i
)
x
i
f(x)+g(x) = \sum_{i=0}^{n-1} (f_i+g_i) x^i
f(x)+g(x)=i=0∑n−1(fi+gi)xi
其中,左侧的运算符 + + + 是多项式加法,右侧的运算符 + + + 是素域加法,而运算符 ∑ \sum ∑是交换的单项式连接符(也就是多项式加法算符)。
f ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i ∘ j = k f i ⋅ g j ) x k f(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i \circ j=k} f_i \cdot g_j) x^k f(x)⋅g(x)=k=0∑n−1(i∘j=k∑fi⋅gj)xk
其中,左侧的运算符 ⋅ \cdot ⋅ 是多项式乘法,右侧的运算符 ⋅ \cdot ⋅ 是素域乘法,而运算符 ∘ \circ ∘ 是对于 Z \mathbb Z Z上元素的某种交换的二元运算。
那么, R R R是交换环:
-
因为素域 Z p Z_p Zp对加法构成交换加法群,易知 R R R是加群
-
验证 R R R满足结合律,
( f ⋅ g ) ⋅ h = ∑ k = 0 n − 1 ( ∑ i ∘ j = k f i ⋅ g j ) x k ⋅ ∑ l = 0 n − 1 h l x l = ∑ m = 0 n − 1 ( ∑ l ∘ k = m ( ∑ i ∘ j = k f i ⋅ g j ) ⋅ h l ) x m = ∑ m = 0 n − 1 ( ∑ l ∘ i ∘ j = m f i ⋅ g j ⋅ h l ) x m = f ⋅ ( g ⋅ h ) \begin{aligned} (f \cdot g) \cdot h &= \sum_{k=0}^{n-1} (\sum_{i \circ j=k} f_i \cdot g_j) x^k \cdot \sum_{l=0}^{n-1} h_l x^l\\ &= \sum_{m=0}^{n-1} (\sum_{l \circ k=m}(\sum_{i \circ j=k} f_i \cdot g_j) \cdot h_l) x^m\\ &= \sum_{m=0}^{n-1} (\sum_{l \circ i \circ j=m} f_i \cdot g_j \cdot h_l) x^m\\ &= f \cdot (g \cdot h)\\ \end{aligned} (f⋅g)⋅h=k=0∑n−1(i∘j=k∑fi⋅gj)xk⋅l=0∑n−1hlxl=m=0∑n−1(l∘k=m∑(i∘j=k∑fi⋅gj)⋅hl)xm=m=0∑n−1(l∘i∘j=m∑fi⋅gj⋅hl)xm=f⋅(g⋅h) -
验证 R R R满足左右分配律,由于 Z p Z_p Zp上的乘法是交换的,因此只需验证单边分配律:
( f + g ) ⋅ h = ∑ i = 0 n − 1 ( f i + g i ) x i ⋅ ∑ j = 0 n − 1 h j x j = ∑ k = 0 n − 1 ( ∑ i ∘ j = k ( f i + g i ) ⋅ h j ) x k = ∑ k = 0 n − 1 ( ∑ i ∘ j = k ( f i ⋅ h j + g i ⋅ h j ) ) x k = ∑ k = 0 n − 1 ( ∑ i ∘ j = k f i ⋅ h j ) x k + ∑ k = 0 n − 1 ( ∑ i ∘ j = k g i ⋅ h j ) x k = f ⋅ h + g ⋅ h \begin{aligned} (f+g) \cdot h &= \sum_{i=0}^{n-1} (f_i + g_i) x^i \cdot \sum_{j=0}^{n-1} h_j x^j\\ &= \sum_{k=0}^{n-1} (\sum_{i \circ j=k} (f_i + g_i) \cdot h_j) x^k\\ &= \sum_{k=0}^{n-1} (\sum_{i \circ j=k} (f_i\cdot h_j + g_i\cdot h_j) ) x^k\\ &= \sum_{k=0}^{n-1} (\sum_{i \circ j=k} f_i\cdot h_j)x^k + \sum_{k=0}^{n-1} (\sum_{i \circ j=k} g_i\cdot h_j)x^k\\ &= f \cdot h + g \cdot h \end{aligned} (f+g)⋅h=i=0∑n−1(fi+gi)xi⋅j=0∑n−1hjxj=k=0∑n−1(i∘j=k∑(fi+gi)⋅hj)xk=k=0∑n−1(i∘j=k∑(fi⋅hj+gi⋅hj))xk=k=0∑n−1(i∘j=k∑fi⋅hj)xk+k=0∑n−1(i∘j=k∑gi⋅hj)xk=f⋅h+g⋅h -
若包含幺元 e ( x ) = ∑ i = 0 n − 1 e i x i e(x) = \sum_{i=0}^{n-1} e_i x^i e(x)=∑i=0n−1eixi,它需要满足
g ( x ) ⋅ e ( x ) = e ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i ∘ j = k e i ⋅ g j ) x k = ∑ k = 0 n − 1 g k x k = g ( x ) g(x) \cdot e(x) = e(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i \circ j = k} e_i \cdot g_j) x^{k} = \sum_{k=0}^{n-1} g_k x^{k} = g(x) g(x)⋅e(x)=e(x)⋅g(x)=k=0∑n−1(i∘j=k∑ei⋅gj)xk=k=0∑n−1gkxk=g(x)
即要满足 g j = ∑ i ∘ j = k e i ⋅ g j g_j = \sum_{i \circ j = k} e_i \cdot g_j gj=∑i∘j=kei⋅gj
环的实例化
假设 R R R中的多项式长度最高是 n = 2 l , l ∈ N n=2^l,l \in \mathbb N n=2l,l∈N,那么:
-
选取运算符 ∘ \circ ∘为”模加“,那么为Convolution:
f ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i + j ≡ k m o d n f i ⋅ g j ) x k f(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i + j \equiv k \mod n} f_i \cdot g_j) x^k f(x)⋅g(x)=k=0∑n−1(i+j≡kmodn∑fi⋅gj)xk
此时,交换环 R R R含幺, e ( x ) = 1 ∈ R e(x) = 1 \in R e(x)=1∈R。环 R R R有零因子,例如 ( x n / 2 − 1 ) ⋅ ( x n / 2 + 1 ) = 0 (x^{n/2}-1)\cdot(x^{n/2}+1)=0 (xn/2−1)⋅(xn/2+1)=0。 -
选取运算符 ∘ \circ ∘为”按位或“,那么为Or Convolution:
f ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i ∨ j = k f i ⋅ g j ) x k f(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i \vee j=k} f_i \cdot g_j) x^k f(x)⋅g(x)=k=0∑n−1(i∨j=k∑fi⋅gj)xk
此时,交换环 R R R含幺, e ( x ) = 1 ∈ R e(x) = 1 \in R e(x)=1∈R。环 R R R有零因子,例如 ( x n − 1 + x n / 2 ) ⋅ ( x n / 2 − 1 ) = 0 (x^{n-1}+x^{n/2})\cdot(x^{n/2}-1)=0 (xn−1+xn/2)⋅(xn/2−1)=0。 -
选取运算符 ∘ \circ ∘为”按位与“,那么为And Convolution:
f ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i ∧ j = k f i ⋅ g j ) x k f(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i \wedge j=k} f_i \cdot g_j) x^k f(x)⋅g(x)=k=0∑n−1(i∧j=k∑fi⋅gj)xk
此时,交换环 R R R含幺, e ( x ) = x n − 1 ∈ R e(x) = x^{n-1} \in R e(x)=xn−1∈R。环 R R R有零因子,例如 ( x n / 2 + 1 ) ⋅ ( x n / 2 − 1 − 1 ) = 0 (x^{n/2}+1)\cdot(x^{n/2-1}-1)=0 (xn/2+1)⋅(xn/2−1−1)=0。 -
选取运算符 ∘ \circ ∘为”异或“,那么为Xor Convolution:
f ( x ) ⋅ g ( x ) = ∑ k = 0 n − 1 ( ∑ i ⊕ j = k f i ⋅ g j ) x k f(x) \cdot g(x) = \sum_{k=0}^{n-1} (\sum_{i \oplus j=k} f_i \cdot g_j) x^k f(x)⋅g(x)=k=0∑n−1(i⊕j=k∑fi⋅gj)xk
此时,交换环 R R R含幺, e ( x ) = 1 ∈ R e(x) = 1 \in R e(x)=1∈R。环 R R R有零因子,例如 ( x n − 1 + 1 ) ⋅ ( x n − 1 − 1 ) = 0 (x^{n-1}+1)\cdot(x^{n-1}-1)=0 (xn−1+1)⋅(xn−1−1)=0。
快速计算卷积
对于Convolution,使用快速数论变换(Number Theoretic Transform)来求解,复杂度 O ( n log n ) O(n \log n) O(nlogn)
对于Or Convolution、And Convolution、Xor Convolution,使用快速沃尔什变换(Walsh Transform)来求解,复杂度 O ( n log n ) O(n \log n) O(nlogn)
对于NTT,可以参考深入理解NTT,
-
假设 A A A是 n = 2 l n=2^l n=2l长的数组,令 A 0 A_0 A0表示上半段,令 A 1 A_1 A1表示下半段。
-
令 w w w是 n n n次单位根,正向变换为:
N T T ( A ) = { ( N T T ( A 0 + w A 1 ) , N T T ( A 0 − w A 1 ) ) , n > 1 A , n = 1 NTT(A) = \left\{ \begin{aligned} (NTT(A_0 + w A_1),\, NTT(A_0 - w A_1)) &,& n>1\\ A &,& n=1\\ \end{aligned} \right. NTT(A)={(NTT(A0+wA1),NTT(A0−wA1))A,,n>1n=1
对于FWT,可以参考快速沃尔什变换 FWT,
- 对于Or Convolution,正向变换为:
F W T ( A ) = { ( F W T ( A 0 ) , F W T ( A 0 + A 1 ) ) , n > 1 A , n = 1 FWT(A) = \left\{ \begin{aligned} (FWT(A_0),\, FWT(A_0 + A_1)) &,& n>1\\ A &,& n=1\\ \end{aligned} \right. FWT(A)={(FWT(A0),FWT(A0+A1))A,,n>1n=1
- 对于And Convolution,正向变换为:
F W T ( A ) = { ( F W T ( A 0 + A 1 ) , F W T ( A 1 ) ) , n > 1 A , n = 1 FWT(A) = \left\{ \begin{aligned} (FWT(A_0 + A_1),\, FWT(A_1)) &,& n>1\\ A &,& n=1\\ \end{aligned} \right. FWT(A)={(FWT(A0+A1),FWT(A1))A,,n>1n=1
- 对于Xor Convolution,正向变换为:
F W T ( A ) = { ( F W T ( A 0 + A 1 ) , F W T ( A 0 − A 1 ) ) , n > 1 A , n = 1 FWT(A) = \left\{ \begin{aligned} (FWT(A_0 + A_1),\, FWT(A_0 - A_1)) &,& n>1\\ A &,& n=1\\ \end{aligned} \right. FWT(A)={(FWT(A0+A1),FWT(A0−A1))A,,n>1n=1
上述各种变换的逆变换是容易推导的,略。
快速卷积的代码实现
NTT
FWT
/*
进一步加速方案:
1.将vector索引改为指针(这个索引花费了3/4的时间)
2.将取模运算用Barrett算法加速
3.循环展开
*/
#include <cstdio>
#include <iostream>
#include <algorithm>
#include <vector>
#include "tools.h"
using namespace std;
const int64 P = 998244353;
#define add(x, y) ((x += y) >= P && (x -= P))
#define sub(x, y) ((x -= y) < 0 && (x += P))
struct FWT {
int64 extend(int64 n) {
int64 N = 1;
for (; N < n; N <<= 1);
return N;
}
void FWTor(std::vector<int> &a, bool rev) {
int64 n = a.size();
auto p = a.data();
if (!rev)
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
add(p[i + j + m], p[i + j]);
}
}
else
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
sub(p[i + j + m], p[i + j]);
}
}
}
void FWTand(std::vector<int> &a, bool rev) {
int64 n = a.size();
auto p = a.data();
if (!rev)
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
add(p[i + j], p[i + j + m]);
}
}
else
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
sub(p[i + j], p[i + j + m]);
}
}
}
void FWTxor(std::vector<int> &a, bool rev) {
int64 n = a.size(), inv2 = (P + 1) >> 1;
auto p = a.data();
if (!rev)
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
int64 x = p[i + j], y = p[i + j + m];
p[i + j] = (x + y) % P;
p[i + j + m] = (x - y + P) % P;
}
}
else
for (int64 l = 2, m = 1; l <= n; l <<= 1, m <<= 1) {
for (int64 j = 0; j < n; j += l) for (int64 i = 0; i < m; i++) {
int64 x = p[i + j], y = p[i + j + m];
p[i + j] = (x + y) * inv2 % P;
p[i + j + m] = (x - y + P) * inv2 % P;
}
}
}
std::vector<int> Or(std::vector<int> a1, std::vector<int> a2) {
int64 n = std::max(a1.size(), a2.size()), N = extend(n);
a1.resize(N), FWTor(a1, false);
a2.resize(N), FWTor(a2, false);
std::vector<int> A(N);
auto p = A.data(), p1 = a1.data(), p2 = a2.data();
for (int64 i = 0; i < N; i++) p[i] = p1[i] * p2[i] % P;
FWTor(A, true);
return A;
}
std::vector<int> And(std::vector<int> a1, std::vector<int> a2) {
int64 n = std::max(a1.size(), a2.size()), N = extend(n);
a1.resize(N), FWTand(a1, false);
a2.resize(N), FWTand(a2, false);
std::vector<int> A(N);
auto p = A.data(), p1 = a1.data(), p2 = a2.data();
for (int64 i = 0; i < N; i++) p[i] = p1[i] * p2[i] % P;
FWTand(A, true);
return A;
}
std::vector<int> Xor(std::vector<int> a1, std::vector<int> a2) {
int64 n = std::max(a1.size(), a2.size()), N = extend(n);
a1.resize(N), FWTxor(a1, false);
a2.resize(N), FWTxor(a2, false);
std::vector<int> A(N);
auto p = A.data(), p1 = a1.data(), p2 = a2.data();
for (int64 i = 0; i < N; i++) p[i] = p1[i] * p2[i] % P;
FWTxor(A, true);
return A;
}
} fwt;
int main() {
int64 n = 8;
std::vector<int> A;
std::vector<int> a1(n), a2(n);
for (int64 i = 0; i < n; i++)
a1[i] = 0;
a1[0] = 1;
for (int64 i = 0; i < n; i++)
a2[i] = 1+i*2;
printf("a1:");
for (int64 i = 0; i < n; i++) {
printf("%d%c", a1[i], " \n"[i == n - 1]);
}
printf("a2:");
for (int64 i = 0; i < n; i++) {
printf("%d%c", a2[i], " \n"[i == n - 1]);
}
A = fwt.Or(a1, a2);
printf("Or Convolution:");
for (int64 i = 0; i < n; i++) {
printf("%d%c", A[i], " \n"[i == n - 1]);
}
A = fwt.And(a1, a2);
printf("And Convolution:");
for (int64 i = 0; i < n; i++) {
printf("%d%c", A[i], " \n"[i == n - 1]);
}
A = fwt.Xor(a1, a2);
printf("Xor Convolution:");
for (int64 i = 0; i < n; i++) {
printf("%d%c", A[i], " \n"[i == n - 1]);
}
printf("\n测试效率:\n");
n = 1024;
a1.resize(n);
a2.resize(n);
for (int64 i = 0; i < n; i++)
a1[i] = i;
for (int64 i = 0; i < n; i++)
a2[i] = 1 + i * 2;
Loop(1000, A = fwt.Or(a1, a2));
Loop(1000, A = fwt.And(a1, a2));
Loop(1000, A = fwt.Xor(a1, a2));
return 0;
}
执行结果
a1:1 0 0 0 0 0 0 0
a2:1 3 5 7 9 11 13 15
Or Convolution:1 3 5 7 9 11 13 15
And Convolution:64 0 0 0 0 0 0 0
Xor Convolution:1 3 5 7 9 11 13 15
测试效率:
0.0443006 s
0.0525024 s
0.119512 s