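Table 1. Properties of the empirical networks: number of nodes N, number of edges M, average degree <k>, average shortest path length <d>, clustering coefficient C, and degree assortativity r.
| Network | N | M | <k> | <d> | C | r |
| --- | --- | --- | --- | --- | --- | --- |
| USAir | 332 | 2126 | 12.81 | 2.74 | 0.625 | -0.208 |
| Celegans | 297 | 2148 | 14.46 | 2.46 | 0.308 | -0.163 |
| CGScience | 6158 | 11898 | 3.86 | 5.32 | 0.486 | 0.243 |
| NetScience | 1461 | 2742 | 3.75 | 6.04 | 0.694 | 0.462 |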
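When working on complex networks or link prediction, we usually rely on a handful of network datasets; the four above are empirical, undirected, weighted, simple networks. We typically need to know their basic properties, such as the number of nodes, the number of edges, and the average shortest path length. Listing these figures also makes your paper look more professional, right?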
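For reference, the six columns of Table 1 are usually defined as below. This is a sketch that assumes the standard textbook definitions apply, since the post does not spell them out. The symbols are introduced here for illustration: $k_i$ is the degree of node $i$, $d_{ij}$ the shortest-path distance between nodes $i$ and $j$, $t_i$ the number of triangles through node $i$, and $(j_e, k_e)$ the degrees at the two ends of edge $e$.

```latex
\begin{align}
\langle k \rangle &= \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2M}{N}, \\
\langle d \rangle &= \frac{1}{N(N-1)}\sum_{i \neq j} d_{ij}, \\
C &= \frac{1}{N}\sum_{i=1}^{N} C_i, \qquad
C_i = \frac{2\,t_i}{k_i\,(k_i - 1)} \quad (C_i = 0 \text{ if } k_i < 2), \\
r &= \frac{M^{-1}\sum_{e} j_e k_e - \left[ M^{-1}\sum_{e} \tfrac{1}{2}(j_e + k_e) \right]^{2}}
          {M^{-1}\sum_{e} \tfrac{1}{2}\left(j_e^{2} + k_e^{2}\right) - \left[ M^{-1}\sum_{e} \tfrac{1}{2}(j_e + k_e) \right]^{2}}.
\end{align}
```

As a quick check against Table 1, USAir gives $2 \times 2126 / 332 \approx 12.81$, which matches the listed average degree. Note that $\langle d \rangle$ is only defined on a connected graph; for a disconnected network it is typically reported for the largest connected component.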
The code below computes the six properties listed above:
# -*- coding: utf-8 -*-
# Author: LeiHanhan
# Date: 2020/10/14, 8:50 PM
# Tool: PyCharm
# Python version: 3.8.0
import networkx as nx
import pandas as pd
import numpy as np
# Topological properties of a network: nodes, edges, average degree,
# average shortest path length, clustering coefficient, degree assortativity
fpaths = {
    'test': './undirected_weight_standard/Test.txt',
    'USAir': './undirected_weight_standard/USAir.txt',
    'Celegans': './undirected_weight_standard/Celegans.txt',
    'CGScience': './undirected_weight_standard/CGScience.txt',
    'NetScience': './undirected_weight_standard/NetScience.txt',
}
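# Collected network properties
result = {}
# Number of nodes
N = 0
# Number of edges
M = 0
# Average degree
k = 0
# Average shortest path length
d = 0
# Clustering coefficient
C = 0
# Degree assortativity
r = 0
# Dataset to analyze: one of test, Celegans, USAir, CGScience, NetScience
data_name = 'USAir'
fpath = fpaths[data_name]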
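# Each row is "v1<TAB>v2<TAB>w"; the weight column w is read but not used below
data_network = pd.read_csv(filepath_or_buffer=fpath, sep='\t', header=None, names=['v1', 'v2', 'w'])
# Sort each pair so that (u, v) and (v, u) coincide, then drop duplicate edges
tmp = np.unique(np.sort(data_network.get(['v1', 'v2']).values, axis=1), axis=0)
# Shift 0-based node ids to 1-based
if np.any(tmp == 0):
    edges = tmp.copy() + 1
else:
    edges = tmp.copy()
print('In network %s:' % (fpath))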
# Build the graph
graph = nx.Graph()
graph.add_edges_from(edges)
# Number of nodes
nodes = graph.nodes()
N = graph.number_of_nodes()
result['N'] = N
# Number of edges
M = len(edges)
result['M'] = M
# Average degree
k = np.mean([deg for _, deg in graph.degree()])
result['k'] = k
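# Average shortest path length
try:
    d = nx.average_shortest_path_length(graph)
    result['d'] = d
except nx.NetworkXError:
    # Disconnected graph: fall back to the largest connected component.
    # nx.connected_component_subgraphs was removed in networkx 2.4,
    # so the component is selected with nx.connected_components instead.
    print('The network is not connected')
    largest = max(nx.connected_components(graph), key=len)
    d = nx.average_shortest_path_length(graph.subgraph(largest))
    result['d'] = d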
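# Average clustering coefficient (mean of the per-node values)
C = np.mean(np.array(list(nx.clustering(graph).values())))
CC = nx.average_clustering(graph)  # same quantity via the built-in helper
result['C'] = C
# Degree assortativity coefficient
r = nx.degree_assortativity_coefficient(graph)
result['r'] = r
print(result)
df_result = pd.DataFrame([result])
print(df_result)
# The ./result directory must already exist
df_result.to_csv('./result/{fname}.txt'.format(fname=data_name), encoding='utf-8', index=False, sep=',')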
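
To sanity-check what the networkx calls in the script return, the same six quantities can be computed by hand on a toy graph. The sketch below is not part of the original script: `summarize_edges` and the four-edge example graph are hypothetical, introduced only for illustration. It assumes a connected, undirected, unweighted simple graph and uses only the standard library.

```python
from collections import deque
from itertools import combinations


def summarize_edges(edge_list):
    """Return N, M, <k>, <d>, C and r for a connected undirected simple graph.

    Hypothetical helper for illustration; it is not the original script's method.
    """
    # Adjacency sets: duplicate edges collapse, self-loops are dropped.
    adj = {}
    for u, v in edge_list:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    n = len(adj)
    if n < 2:
        raise ValueError("need at least one edge between two distinct nodes")
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    mean_degree = 2 * m / n

    # <d>: breadth-first search from every node, averaged over ordered pairs.
    total_distance = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        total_distance += sum(dist.values())
    mean_distance = total_distance / (n * (n - 1))

    # C: mean local clustering; nodes with degree < 2 contribute 0.
    local = []
    for u, nbrs in adj.items():
        deg = len(nbrs)
        if deg < 2:
            local.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        local.append(2 * links / (deg * (deg - 1)))
    clustering = sum(local) / n

    # r: Pearson correlation of the degrees at the two ends of every edge,
    # with each edge counted once in each direction.
    xs, ys = [], []
    for u, nbrs in adj.items():
        for v in nbrs:
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    mean_x = sum(xs) / len(xs)  # equals the mean of ys by symmetry
    cov = sum((x - mean_x) * (y - mean_x) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    assortativity = cov / var if var else float("nan")

    return {"N": n, "M": m, "k": mean_degree, "d": mean_distance,
            "C": clustering, "r": assortativity}


# Toy graph: a triangle (1, 2, 3) with one pendant node 4 attached to node 3.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(summarize_edges(toy_edges))
```

On a graph this small every value can be verified on paper, which makes it a convenient cross-check before trusting the numbers for the larger datasets. The all-pairs BFS is quadratic in the number of nodes, so for networks the size of CGScience the networkx routines remain the practical choice.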
If you need the datasets, leave a comment; I will reply when I see it.