Python处理Les Misérables network并利用networkx画图

25 篇文章 6 订阅
15 篇文章 9 订阅
这篇博客介绍了如何利用网络分析工具处理《悲惨世界》中的人物关系数据集。数据集包含77个节点和254条边,节点代表人物,边表示人物共同出现的章节及频率。通过使用networkx库,作者展示了如何从CSV文件加载数据,初始化无向图,以及进行图的绘制。最后,提到了networkx自带的相同数据集,简化了数据加载过程。
摘要由CSDN通过智能技术生成

数据集介绍

《悲惨世界》中的人物关系图,图中共77个节点、254条边。

数据集截图:
在这里插入图片描述
打开README文件:

Les Misérables network, part of the Koblenz Network Collection
===========================================================================

This directory contains the TSV and related files of the moreno_lesmis network: This undirected network contains co-occurances of characters in Victor Hugo's novel 'Les Misérables'. A node represents a character and an edge between two nodes shows that these two characters appeared in the same chapter of the the book. The weight of each link indicates how often such a co-appearance occured.


More information about the network is provided here: 
http://konect.cc/networks/moreno_lesmis

Files: 
    meta.moreno_lesmis -- Metadata about the network 
    out.moreno_lesmis -- The adjacency matrix of the network in whitespace-separated values format, with one edge per line
      The meaning of the columns in out.moreno_lesmis are: 
        First column: ID of from node 
        Second column: ID of to node
        Third column (if present): weight or multiplicity of edge
        Fourth column (if present):  timestamp of edges Unix time
        Third column: edge weight


Use the following References for citation:

@MISC{konect:2017:moreno_lesmis,
    title = {Les Misérables network dataset -- {KONECT}},
    month = oct,
    year = {2017},
    url = {http://konect.cc/networks/moreno_lesmis}
}

@book{konect:knuth1993,
	title = {The {Stanford} {GraphBase}: A Platform for Combinatorial Computing},
	author = {Knuth, Donald Ervin},
	volume = {37},
	year = {1993},
	publisher = {Addison-Wesley Reading},
}

@book{konect:knuth1993,
	title = {The {Stanford} {GraphBase}: A Platform for Combinatorial Computing},
	author = {Knuth, Donald Ervin},
	volume = {37},
	year = {1993},
	publisher = {Addison-Wesley Reading},
}


@inproceedings{konect,
	title = {{KONECT} -- {The} {Koblenz} {Network} {Collection}},
	author = {Jérôme Kunegis},
	year = {2013},
	booktitle = {Proc. Int. Conf. on World Wide Web Companion},
	pages = {1343--1350},
	url = {http://dl.acm.org/citation.cfm?id=2488173},
	url_presentation = {https://www.slideshare.net/kunegis/presentationwow},
	url_web = {http://konect.cc/},
	url_citations = {https://scholar.google.com/scholar?cites=7174338004474749050},
}

@inproceedings{konect,
	title = {{KONECT} -- {The} {Koblenz} {Network} {Collection}},
	author = {Jérôme Kunegis},
	year = {2013},
	booktitle = {Proc. Int. Conf. on World Wide Web Companion},
	pages = {1343--1350},
	url = {http://dl.acm.org/citation.cfm?id=2488173},
	url_presentation = {https://www.slideshare.net/kunegis/presentationwow},
	url_web = {http://konect.cc/},
	url_citations = {https://scholar.google.com/scholar?cites=7174338004474749050},
}

从中可以得知:该图是一个无向图,节点表示《悲惨世界》中的人物,两个节点之间的边表示这两个人物出现在书的同一章,边的权重表示两个人物(节点)出现在同一章中的频率。

真正的数据在out.moreno_lesmis_lesmis中,打开并另存为csv文件:
在这里插入图片描述

数据处理

networkx中对无向图的初始化代码为:

g = nx.Graph()
g.add_nodes_from([i for i in range(1, 78)])
g.add_edges_from([(1, 2, {'weight': 1})])

节点的初始化很容易解决,我们主要解决边的初始化:先将dataframe转为列表,然后将其中每个元素转为元组。

df = pd.read_csv('out.csv')
res = df.values.tolist()
for i in range(len(res)):
    res[i][2] = dict({'weight': res[i][2]})
res = [tuple(x) for x in res]
print(res)

res输出如下(部分):

[(1, 2, {'weight': 1}), (2, 3, {'weight': 8}), (2, 4, {'weight': 10}), (2, 5, {'weight': 1}), (2, 6, {'weight': 1}), (2, 7, {'weight': 1}), (2, 8, {'weight': 1})...]

因此图的初始化代码为:

g = nx.Graph()
g.add_nodes_from([i for i in range(1, 78)])
g.add_edges_from(res)

画图

nx.draw(g)
plt.show()

在这里插入图片描述

networkx自带的数据集

忙活了半天发现networkx有自带的数据集,其中就有悲惨世界的人物关系图:

g = nx.les_miserables_graph()
nx.draw(g, with_labels=True)
plt.show()

在这里插入图片描述

完整代码

# -*- coding: utf-8 -*-
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd

# 77 254
df = pd.read_csv('out.csv')
res = df.values.tolist()

for i in range(len(res)):
    res[i][2] = dict({'weight': res[i][2]})

res = [tuple(x) for x in res]
print(res)

# 初始化图
g = nx.Graph()
g.add_nodes_from([i for i in range(1, 78)])
g.add_edges_from(res)

g = nx.les_miserables_graph()
nx.draw(g, with_labels=True)
plt.show()
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Cyril_KI

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值