A Simple Application of PageRank

丹丹是个有福蛋儿

Category: Algorithm Implementation

Article link: https://blog.csdn.net/kee_ha/article/details/84550997

This article uses the networkx module [1] to implement the PageRank algorithm.
networkx provides three ways to compute PageRank:

networkx.pagerank(): a pure-Python implementation of the power method to compute the largest eigenvalue/eigenvector of the Google matrix. It has two parameters that control the accuracy: tol and max_iter.
networkx.pagerank_scipy(): a SciPy sparse-matrix implementation of the power method. It has the same two accuracy parameters.
networkx.pagerank_numpy(): a NumPy (full) matrix implementation that calls the numpy.linalg.eig() function to compute the largest eigenvalue and eigenvector. That function is an interface to the LAPACK dgeev function, which uses a matrix decomposition (direct) method with no tunable parameters.
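If tol is small enough and max_iter is large enough, all three methods produce the same answer (within numerical rounding) for well-behaved graphs. Which one is faster depends on the size of the graph and on how well the power method converges on it [2].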
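To make the role of tol and max_iter concrete, here is a minimal pure-Python sketch of the power method on a column-stochastic link matrix. The function name power_iteration, the 3-page example graph, and the L1 stopping rule are assumptions chosen for illustration; this is not networkx's actual implementation, and it omits damping and dangling-node handling.

```python
def power_iteration(matrix, tol=1e-6, max_iter=100):
    """Approximate the dominant eigenvector of a column-stochastic matrix.

    Returns (rank, iterations). Raises RuntimeError if the iterates have not
    settled to within `tol` after `max_iter` multiplications.
    """
    n = len(matrix)
    rank = [1.0 / n] * n  # start from the uniform distribution
    for iteration in range(1, max_iter + 1):
        # One step of the power method: new_rank = matrix * rank
        new_rank = [sum(matrix[i][j] * rank[j] for j in range(n)) for i in range(n)]
        # L1 distance between successive iterates (assumed stopping rule)
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            return rank, iteration
    raise RuntimeError("power iteration did not converge within max_iter steps")


# Hypothetical 3-page graph: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1.
# Column j holds 1/outdegree(j) in the rows of the pages that page j links to.
example = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
rank, steps = power_iteration(example)
print(rank, steps)
```

A smaller tol asks for more accurate values at the cost of more iterations, while max_iter caps the work when the iteration converges slowly or not at all.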

The code for computing PageRank with networkx is as follows [3]:

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np

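# Build an empty directed graph
G = nx.DiGraph()

# Add nodes to the graph
pages = ["1", "2", "3", "4"]
G.add_nodes_from(pages)

# Add edges; adding the nodes first is optional because add_edges_from creates missing nodes
G.add_edges_from([('1', '2'), ('1', '4'), ('1', '3'), ('4', '1'), ('2', '3'), ('2', '4'), ('3', '1'), ('4', '3')])

# Draw the graph
nx.draw(G, with_labels=True)
plt.show()  # display the figure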

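# Compute PageRank, method 1: dominant eigenvector of the link matrix
def findPageRank(linkmatrix, pages):
    eigval, eigvector = np.linalg.eig(linkmatrix)  # eigenvalues and eigenvectors (one per column)
    dominant = np.argmax(np.abs(eigval))  # index of the eigenvalue with the largest magnitude
    rank = np.abs(np.real(eigvector[:, dominant]))  # its eigenvector holds the PageRank values
    rank = rank / rank.sum()  # normalize so the values sum to 1
    print("The most important node is %s" % pages[int(np.argmax(rank))])
    return rank

# Column-stochastic link matrix for the edges above:
# entry [i][j] is 1/outdegree(j) if page j+1 links to page i+1, so every column sums to 1
linkmatrix = np.array([[0, 0, 1, 0.5],
                       [1.0/3, 0, 0, 0],
                       [1.0/3, 0.5, 0, 0.5],
                       [1.0/3, 0.5, 0, 0]])
findPageRank(linkmatrix, pages)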
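# Method 2: networkx's built-in implementation. pagerank_scipy and pagerank_numpy were
# removed in NetworkX 3.0, so nx.pagerank is used here; on older versions either of the
# other two functions can be substituted. alpha=1 disables damping, so the result is
# comparable to the eigenvector of the raw link matrix.
result = nx.pagerank(G, alpha=1, personalization=None, max_iter=100, tol=1e-06, weight='weight', dangling=None)
print(result)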
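The link matrix used by the eigenvector approach does not have to be typed in by hand; it can be derived from the edge list. The sketch below is one way to do that in plain Python. The helper name build_link_matrix is hypothetical, and the code assumes there are no duplicate edges and that every page has at least one outgoing link (no dangling nodes).

```python
def build_link_matrix(pages, edges):
    """Return M with M[i][j] = 1/outdegree(pages[j]) if pages[j] links to pages[i], else 0."""
    index = {page: i for i, page in enumerate(pages)}
    # Count outgoing links per page
    out_degree = {page: 0 for page in pages}
    for source, _target in edges:
        out_degree[source] += 1
    n = len(pages)
    matrix = [[0.0] * n for _ in range(n)]
    # Each page splits its vote equally among the pages it links to
    for source, target in edges:
        matrix[index[target]][index[source]] = 1.0 / out_degree[source]
    return matrix


pages = ["1", "2", "3", "4"]
edges = [('1', '2'), ('1', '4'), ('1', '3'), ('4', '1'),
         ('2', '3'), ('2', '4'), ('3', '1'), ('4', '3')]
link_matrix = build_link_matrix(pages, edges)
for row in link_matrix:
    print(row)
```

Checking that every column sums to 1 is a quick sanity test for a hand-written matrix: a column that sums to 0 or 2 means a link was recorded under the wrong source page.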