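During the iterations of the PageRank algorithm, the sum of all vertex rank values converges to the number of vertices, and this property does not depend on the initial rank assigned to each vertex. The proof is as follows.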
Let $d_i$ denote the sum of all rank values in round $i$, and let $n$ be the number of vertices in the graph. Then
$$d_2 = d_1 \cdot 0.8 + 0.2 \cdot n$$
$$d_3 = d_2 \cdot 0.8 + 0.2 \cdot n$$
$$d_4 = d_3 \cdot 0.8 + 0.2 \cdot n$$
…
$$d_k = d_{k-1} \cdot 0.8 + 0.2 \cdot n$$
where $k$ is the round index and $n$ is still the number of vertices.
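The recurrence above can be checked numerically. The following is a minimal sketch, not part of the original proof: the function name `iterate_rank_sum`, the vertex count `n = 100`, and the starting sums are all illustrative assumptions. It only iterates the scalar recurrence for the rank sum, not PageRank itself.

```python
def iterate_rank_sum(d1, n, rounds):
    """Return [d_1, ..., d_rounds] under d_k = 0.8 * d_{k-1} + 0.2 * n.

    d1     -- rank sum in the first round (any starting value)
    n      -- number of vertices
    rounds -- how many rounds to produce, including the first
    """
    sums = [float(d1)]
    for _ in range(rounds - 1):
        sums.append(0.8 * sums[-1] + 0.2 * n)
    return sums


n = 100  # hypothetical vertex count
for d1 in (0.0, 1.0, 100.0, 5000.0):  # arbitrary starting sums
    final = iterate_rank_sum(d1, n, rounds=60)[-1]
    print(f"d_1 = {d1:>7}: d_60 = {final:.4f}")
```

Whatever starting sum is chosen, the printed values should end up close to `n`, which is the behaviour the rest of the proof derives in closed form.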
Substituting $d_2$ into $d_3$ gives
$$d_3 = (d_1 \cdot 0.8 + 0.2 \cdot n) \cdot 0.8 + 0.2 \cdot n$$
Substituting $d_3$ into $d_4$ gives
$$d_4 = ((d_1 \cdot 0.8 + 0.2 \cdot n) \cdot 0.8 + 0.2 \cdot n) \cdot 0.8 + 0.2 \cdot n$$
Expanding gives
$$d_4 = d_1 \cdot 0.8^3 + 0.2 \cdot n \cdot (1 + 0.8 + 0.8^2)$$
Continuing the recursion, for a general round $k$ (with $n$ still the number of vertices):
$$d_k = d_1 \cdot 0.8^{k-1} + 0.2 \cdot n \cdot (1 + 0.8 + 0.8^2 + \dots + 0.8^{k-2})$$
The bracketed term $1 + 0.8 + 0.8^2 + \dots + 0.8^{k-2}$ is a geometric series with $k-1$ terms, whose sum is $(1 - 0.8^{k-1})/0.2$.
Substituting this back gives
$$d_k = d_1 \cdot 0.8^{k-1} + n \cdot (1 - 0.8^{k-1})$$
In this expression, once $k$ is around 40, $0.8^{k-1}$ is on the order of $10^{-4}$ ($0.8^{39} \approx 1.7 \times 10^{-4}$), so the term containing $d_1$ becomes negligible and the expression can be approximated as
$$d_k \approx n$$
where $n$ is the number of vertices in the graph. Because the $d_1$ term vanishes, the limit does not depend on the initial ranks.
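The result can also be checked against an actual iteration on a small graph. The sketch below rests on assumptions that the text does not state explicitly: each vertex is updated as `0.2 + 0.8 * sum(rank[u] / outdeg(u))` over its in-neighbours `u`, and every vertex has at least one out-edge, so that no rank is lost. The graph, the initial ranks, and the names `pagerank_step` and `closed_form` are hypothetical.

```python
def pagerank_step(ranks, out_edges, damping=0.8):
    """One round of the assumed update: rank_v = (1 - damping) + damping * incoming share."""
    new_ranks = {v: 1.0 - damping for v in ranks}
    for u, targets in out_edges.items():
        # Each vertex splits its damped rank evenly among its out-neighbours.
        share = damping * ranks[u] / len(targets)
        for v in targets:
            new_ranks[v] += share
    return new_ranks


def closed_form(d1, n, k, damping=0.8):
    """Rank sum in round k: d_1 * 0.8^(k-1) + n * (1 - 0.8^(k-1))."""
    return d1 * damping ** (k - 1) + n * (1 - damping ** (k - 1))


# Hypothetical 4-vertex graph; every vertex has at least one out-edge.
out_edges = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a", "c"]}
n = len(out_edges)

ranks = {"a": 10.0, "b": 0.0, "c": 3.0, "d": 7.0}  # arbitrary initial ranks
d1 = sum(ranks.values())

sums = [d1]  # sums[k - 1] holds the rank sum of round k
for _ in range(39):
    ranks = pagerank_step(ranks, out_edges)
    sums.append(sum(ranks.values()))

for k in (1, 2, 4, 40):
    print(k, round(sums[k - 1], 6), round(closed_form(d1, n, k), 6))
```

Under these assumptions the iterated sum and the closed form should agree in every round, and by round 40 both should sit close to `n`. If some vertex had no out-edges, its rank would not be redistributed and the recurrence would no longer hold as written.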