F2. Neko Rules the Catniverse (Large Version)
time limit per test 7 seconds
memory limit per test 256 megabytes
input standard input
output standard output
This problem is the same as the previous one, but has larger constraints.
Aki is playing a new video game. In the video game, he will control Neko, the giant cat, to fly between planets in the Catniverse.
There are n planets in the Catniverse, numbered from 1 to n. At the beginning of the game, Aki chooses the planet where Neko is initially located. Then Aki performs k−1 moves, where in each move Neko is moved from the current planet x to some other planet y such that:
Planet $y$ is not visited yet.
$1 \le y \le x + m$ (where $m$ is a fixed constant given in the input)
This way, Neko will visit exactly k different planets. Two ways of visiting planets are called different if there is some index i such that, the i-th planet visited in the first way is different from the i-th planet visited in the second way.
What is the total number of ways to visit k planets this way? Since the answer can be quite large, print it modulo $10^9+7$.
Input
The only line contains three integers n, k and m ($1 \le n \le 10^9$, $1 \le k \le \min(n, 12)$, $1 \le m \le 4$): the number of planets in the Catniverse, the number of planets Neko needs to visit, and the said constant m.
Output
Print exactly one integer: the number of different ways Neko can visit exactly k planets. Since the answer can be quite large, print it modulo $10^9+7$.
Examples
input
3 3 1
output
4
input
4 2 1
output
9
input
5 5 4
output
120
input
100 1 2
output
100
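As a sanity check on the statement, the samples above can be reproduced by direct enumeration. The following is a minimal sketch and not part of the original solution: the names `extend` and `bruteForce` are hypothetical, the count is returned without the modulus, and the search is only practical for small n and k.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts the ways to extend a partial route that
// currently ends at planet x and already contains visitedCount planets.
static std::uint64_t extend(int n, int k, int m, int x, int visitedCount,
                            std::vector<bool>& visited) {
    if (visitedCount == k) return 1;  // a complete route of k planets
    std::uint64_t ways = 0;
    const int hi = std::min(n, x + m);  // the move must satisfy 1 <= y <= x + m
    for (int y = 1; y <= hi; ++y) {
        if (visited[y]) continue;       // y must not be visited yet
        visited[y] = true;
        ways += extend(n, k, m, y, visitedCount + 1, visited);
        visited[y] = false;
    }
    return ways;
}

// Hypothetical entry point: tries every starting planet.
std::uint64_t bruteForce(int n, int k, int m) {
    std::vector<bool> visited(n + 1, false);
    std::uint64_t total = 0;
    for (int start = 1; start <= n; ++start) {
        visited[start] = true;
        total += extend(n, k, m, start, 1, visited);
        visited[start] = false;
    }
    return total;
}
```

Calling `bruteForce(3, 3, 1)` enumerates the routes 1 2 3, 2 3 1, 3 1 2 and 3 2 1, which matches the first sample.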
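Idea: suppose the planets visited so far form the sequence $v_1, v_2, v_3, \ldots, v_p$.
For a new, unvisited planet $x$, consider how many ways $x$ can be inserted into this sequence to produce a new valid sequence.
$x$ can be inserted immediately after some $v_i$ if and only if $x \le v_i + m$; if $x \ge v_1$, then $x$ can also be inserted in front of $v_1$. When planets are inserted in increasing order, $x$ is larger than every $v_i$, so the front position is always available and whatever follows $x$ is still within reach of $x$.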
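The insertion rule can be checked on small cases. The sketch below is illustrative only (all three function names are hypothetical) and assumes $x$ is larger than every planet already in the route: it counts valid insertion positions by trial and compares that with the closed form "one for the front, plus one for every $v_i$ with $x \le v_i + m$".

```cpp
#include <cstddef>
#include <vector>

// True if every move in the route satisfies next <= prev + m.
bool isValidRoute(const std::vector<int>& route, int m) {
    for (std::size_t i = 1; i < route.size(); ++i)
        if (route[i] > route[i - 1] + m) return false;
    return true;
}

// Tries every insertion position for x and counts those that keep the route valid.
int countInsertionsByTrial(const std::vector<int>& route, int x, int m) {
    int count = 0;
    for (std::size_t pos = 0; pos <= route.size(); ++pos) {
        std::vector<int> candidate = route;
        candidate.insert(candidate.begin() + static_cast<std::ptrdiff_t>(pos), x);
        if (isValidRoute(candidate, m)) ++count;
    }
    return count;
}

// Closed form, assuming x is larger than every planet in the route:
// the front position always works, and "after v" works exactly when x <= v + m.
int countInsertionsClosedForm(const std::vector<int>& route, int x, int m) {
    int count = 1;
    for (int v : route)
        if (x <= v + m) ++count;
    return count;
}
```

For the route 3 1 2 with m = 1 and x = 4, both functions count two positions: in front of 3 and directly after 3.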
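So we can enumerate the planets from $1$ to $n$ and let $d[i][j][tp]$ be the number of ways in which $j$ of the first $i$ planets have been visited and the visited status of the $m$ most recent planets (counting back from $i$) is the bitmask $tp$.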
$d[i][j][tp \ll 1] \mathrel{+}= d[i-1][j][tp]$ (planet $i$ is not visited)
$d[i][j+1][(tp \ll 1) + 1] \mathrel{+}= (\texttt{\_\_builtin\_popcount}(tp) + 1) \cdot d[i-1][j][tp]$ (planet $i$ is visited; it can be inserted in front of $v_1$ or directly after any visited planet that satisfies the condition above)
In both transitions $tp \ll 1$ is truncated to its low $m$ bits.
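The two transitions can be run directly as a rolling DP over the planets. This is a minimal sketch for small n only, not the author's final program; the function name `countWaysLinear` is hypothetical, and `__builtin_popcount` is the GCC/Clang builtin used in the formula.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical name: applies the two transitions once per planet, O(n * k * 2^m).
std::uint64_t countWaysLinear(int n, int k, int m) {
    const std::uint64_t MOD = 1000000007ULL;
    const int states = 1 << m;
    using Table = std::vector<std::vector<std::uint64_t>>;
    // d[j][tp]: ways with j planets visited so far and tp marking which of
    // the m most recent planets were visited (bit 0 = the latest planet).
    Table d(k + 1, std::vector<std::uint64_t>(states, 0));
    d[0][0] = 1;
    for (int i = 1; i <= n; ++i) {
        Table nd(k + 1, std::vector<std::uint64_t>(states, 0));
        for (int j = 0; j <= k; ++j) {
            for (int tp = 0; tp < states; ++tp) {
                const std::uint64_t cur = d[j][tp];
                if (cur == 0) continue;
                const int shifted = (tp << 1) & (states - 1);  // keep the low m bits
                // Planet i is not visited.
                nd[j][shifted] = (nd[j][shifted] + cur) % MOD;
                // Planet i is visited: popcount(tp) + 1 insertion positions.
                if (j < k) {
                    const std::uint64_t mult = __builtin_popcount(tp) + 1;
                    nd[j + 1][shifted | 1] = (nd[j + 1][shifted | 1] + mult * cur) % MOD;
                }
            }
        }
        d.swap(nd);
    }
    std::uint64_t ans = 0;
    for (int tp = 0; tp < states; ++tp) ans = (ans + d[k][tp]) % MOD;
    return ans;
}
```

The loop over `i` is the part that becomes too slow when n approaches $10^9$, since every planet costs one pass over all $(k+1) \cdot 2^m$ states.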
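Once the transition is known, it can be accelerated with fast matrix exponentiation: the transition does not depend on $i$, so the $n$ steps collapse into the $n$-th power of a single matrix.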
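One way to write this down, under the assumption that state $(j, tp)$ is flattened to the index $j \cdot 2^m + tp$ (the symbol $T$ for the transition matrix is introduced here only for illustration):

```latex
\begin{aligned}
T_{(j,\,tp),\;(j,\;(2\,tp) \bmod 2^m)} &= 1, \\
T_{(j,\,tp),\;(j+1,\;((2\,tp) \bmod 2^m) + 1)} &= \operatorname{popcount}(tp) + 1 \qquad (j < k), \\
\text{answer} &= \sum_{tp=0}^{2^m - 1} \left(T^{\,n}\right)_{(0,\,0),\;(k,\,tp)} \bmod \left(10^9 + 7\right).
\end{aligned}
```

$T$ has dimension $(k+1) \cdot 2^m \le 13 \cdot 16 = 208$, so each matrix product costs about $208^3$ multiply-adds and binary exponentiation needs $O(\log n)$ products.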
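#include<bits/stdc++.h>
using namespace std;
const int MOD=1e9+7;
typedef long long ll;
// Square matrix over the flattened states (visited count, mask): at most 13*16 of them.
struct lenka{int a[13*16][13*16];}res,a;
int len; // actual dimension in use: (k+1)*2^m
// Returns A*B modulo MOD, using only the top-left len x len block.
lenka cal(const lenka&A,const lenka&B)
{
    lenka C;
    memset(C.a,0,sizeof C.a);
    for(int i=0;i<len;i++)
        for(int j=0;j<len;j++)
            for(int k=0;k<len;k++)
            {
                C.a[i][j]+=1ll*A.a[i][k]*B.a[k][j]%MOD;
                C.a[i][j]%=MOD;
            }
    return C;
}
// Binary exponentiation. On return, row 0 of a holds row 0 of (transition matrix)^n.
// Declared void: the original returned ll but had no return statement.
void POW(int n)
{
    while(n)
    {
        if(n&1)res=cal(a,res);
        a=cal(a,a);
        n>>=1;
    }
    // Start state: no planet visited, empty mask (flattened index 0).
    memset(a.a,0,sizeof a.a);
    a.a[0][0]=1;
    a=cal(a,res);
}
int main()
{
    int n,k,m;
    cin>>n>>k>>m;
    len=(k+1)*(1<<m);
    memset(a.a,0,sizeof a.a);
    memset(res.a,0,sizeof res.a);
    for(int i=0;i<len;i++)res.a[i][i]=1; // res starts as the identity
    for(int i=0;i<=k;i++)                // i: number of planets visited so far
        for(int j=0;j<(1<<m);j++)        // j: mask of the m most recent planets
        {
            int tp=(j<<1)%(1<<m);        // shift the window and drop the oldest planet
            a.a[i*(1<<m)+j][i*(1<<m)+tp]=1; // skip the current planet
            // visit the current planet: popcount(j)+1 insertion positions
            if(i<k)a.a[i*(1<<m)+j][(i+1)*(1<<m)+tp+1]=__builtin_popcount(j)+1;
        }
    POW(n);
    ll ans=0;
    // Sum over all masks with exactly k planets visited.
    for(int i=0;i<(1<<m);i++)(ans+=a.a[0][k*(1<<m)+i])%=MOD;
    cout<<ans<<endl;
    return 0;
}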