PageRank算法

PageRank

PageRank assigns a score of importance to each node.Important nodes are those with many in-links from important pages.

——重要节点是具有来自重要页面的许多内链接的节点。

A node’s PageRank depends on the PageRank of other nodes.

  • PageRank can be used for any type of network,but it is mainly useful for directed networks.

Process

  • n=number of nodes in the network
  • k=number of steps
  1. 标记所有节点的PageRank 都是 1 n \frac{1}{n} n1

    因为我们需要所有结点PageRank之和为1

  2. 进行K次PageRank的更新

Basic PageRank Updata Rule: Each node give an equal share of its current PageRank to all nodes it links to.

The new PageRank of each node is the sum of all the PageRank it received from other nodes.

Example

TimeABCDE
01/51/51/51/51/5
1(1/3)*(1/5)+(1/5)=(D+E)1/5+1/5=(A+C)(1/2) * (1/5)+(1/3) * (1/5)=(B+D)(1/2)*(1/5)=B(1/3)*(1/5)=D
21/10=0.113/30=0.437/30=0.232/10=0.21/30=0.03

得到某点的权重,等于——该点权重*(1/出度)

可以看到B点拥有最高的PageRank(认为其是最重要的结点)

TimeABCDE
30.10.330.280.220.06
∞ \infty 0.120.380.250.190.06

随着次数的增加,值的变化会越来越小

Nodes with fewer in-degrees may have a high Page Rank when they are connected to a more important node.

Define

we’re going to talk about how to interpret it and identify a potential problem that it has and also a solution.

——我们将讨论如何解释它并确定它存在的潜在问题以及解决方案

So first of all, the PageRank value of a node after k steps can be interpreted as the probability that a random walker lands on that node after taking k random steps.

——一个节点经过 k 步后的 PageRank 值可以解释为随机游走者在经过 k 个随机步后到达该节点的概率

  • Random walk of K steps: Start on a random node.Then choose an outgoing edge at random and follow it to the next node.Repeat K times

PageRank Problem

image-20220814171122001

——Figure out what the PageRank of each node is after taking a lot of PageRank steps.

For a large enough K:

F and G have PageRank of 1 2 \frac{1}{2} 21 and all the other nodes have PageRank 0

原因——当我们随机降落在这个网络时,一旦下降到了F或G,就会永远在那里循环,而出不去

Imagine a random walk on this network.Whenever the walk lands on F or G,it is “stuck” on F and G

Solve

To fix this ,we introduce a “damping parameter” α \alpha α

Ramdom walk of K stpes with damping parameter α \alpha α:

  • Start on a random node
  • With probability α \alpha α:choose an outgoing edge at random and follow it to the next node.
  • With probability 1 − α 1-\alpha 1α:choose a node at random and go to it

Repeat K times

So again, at every step, what we used to do before was to always follow the edges.

What we’re going to do now is that at every step, we’re either going to follow the edges with probability alpha.

Or we are going to forget about the edges, and choose a random node, and go to it with probability one minus(-) alpha, and we’re going to repeat this k times.

当我们到达F和G这种结点时,我们依然拥有 1 − α 1-\alpha 1α的概率,跳去别的结点

The Scaled PageRank of k steps and damping factor α \alpha α of a node n is the probability that a random walk with damping factor α \alpha α lands on a n after k steps.

For most networks, as k gets larger, Scaled PageRank converges to a unique value, which depends on α \alpha α

一般情况下, α \alpha α取值为0.8-0.9

image-20220815151517277
import networkx as nx

G=nx.Graph()
a=nx.pagerank(G,alpha=0.8)
print(type(a))
#>>dict
print(a)
  • 7
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Caaaaaan

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值