Python CSV data processing to generate a graph: Creating a NetworkX graph from a CSV file in Python 3

I am trying to build a NetworkX social network graph from a CSV file. I am using NetworkX 2.1 and Python 3.

I followed this post with no luck because I keep receiving the error: AttributeError: 'list' object has no attribute 'decode'.

My goal is to make the weights display thicker edges for the higher weights.

Here is my code so far:

import networkx as nx
import csv

Data = open('testest.csv', "r", encoding='utf8')
read = csv.reader(Data)
Graphtype=nx.Graph() # use net.Graph() for undirected graph
G = nx.read_edgelist(read, create_using=Graphtype, nodetype=int, data=(('weight',float),))

for x in G.nodes():
    print ("Node:", x, "has total #degree:",G.degree(x), " , In_degree: ", G.out_degree(x)," and out_degree: ", G.in_degree(x))

for u,v in G.edges():
    print ("Weight of Edge ("+str(u)+","+str(v)+")", G.get_edge_data(u,v))

nx.draw(G)
plt.show()

Is there a more simplified way to approach this? The data is relatively simple.

Thank you for your help!

Solution

You are misusing the function read_edgelist. It expects a file name or a file handle opened in binary mode and decodes each line itself, while csv.reader parses the lines in the input file into lists of strings (for example, 202,237,1 -> ['202', '237', '1']). Therefore, AttributeError is raised because read_edgelist calls decode on the lists provided by csv.reader, and a list has no such method.
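The mismatch is easy to see on a small in-memory sample. The sketch below is only illustrative: the second data row is made up, and io.StringIO stands in for the real file. It shows what csv.reader yields and what parse_edgelist accepts instead.

```python
import csv
import io

import networkx as nx

# Hypothetical sample rows; only "202,237,1" appears in the original post.
sample = "202,237,1\n202,300,2.5\n"

# csv.reader yields one list of strings per line ...
rows = list(csv.reader(io.StringIO(sample)))
print(rows[0])

# ... whereas parse_edgelist wants the raw lines as plain strings.
lines = sample.splitlines()
G = nx.parse_edgelist(lines, delimiter=",", nodetype=int,
                      data=(("weight", float),))
print(G[202][237]["weight"])
```

Passing the strings directly avoids the csv module altogether, which is the approach taken in the rest of this answer.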

We can correctly parse the graph from the input file without using the csv module. However, we still need to deal with the first line (the headers) of the input file, which should not be parsed. There are two methods. The first method skips the first line using next:

Data = open('test.csv', "r")
next(Data, None) # skip the first line in the input file
Graphtype = nx.Graph()
G = nx.parse_edgelist(Data, delimiter=',', create_using=Graphtype,
                      nodetype=int, data=(('weight', float),))

The second method is a bit "hacky": since the first line starts with target, we mark the character t as the start of a comment in the input file.

Data = open('test.csv', "r")
Graphtype = nx.Graph()
G = nx.parse_edgelist(Data, comments='t', delimiter=',', create_using=Graphtype,
                      nodetype=int, data=(('weight', float),))

In both methods, we have to use parse_edgelist instead of read_edgelist because the input file uses \r for newlines. To use read_edgelist, the file needs to be opened in binary mode, in which lines are split only at \r\n or \n. Thus the input file with \r newlines cannot be split into lines, and therefore cannot be parsed correctly.
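The difference between the two modes can be checked with a throwaway file. This is a minimal sketch with made-up data written to a temporary path; it only demonstrates how Python splits lines, not anything specific to NetworkX.

```python
import os
import tempfile

# Write a tiny file that uses bare "\r" line endings (hypothetical data).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"202,237,1\r202,300,2\r")
    path = tmp.name

# Text mode applies universal newlines, so "\r" ends a line.
with open(path, "r") as fh:
    text_lines = fh.readlines()

# Binary mode only splits on b"\n", so the whole file comes back as one line.
with open(path, "rb") as fh:
    binary_lines = fh.readlines()

os.remove(path)

print(len(text_lines), len(binary_lines))
```

The text-mode handle gives one entry per edge, which is why it can be handed straight to parse_edgelist.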

Also, since you want to find the in-degrees and out-degrees, the graph should be created using DiGraph, not Graph.
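As a rough illustration, the same parsing call works with a directed graph. The edge strings below are invented and assumed to be in source,target,weight order, which may not match the column order of the real file.

```python
import networkx as nx

# Hypothetical edges, assumed to be in "source,target,weight" order.
lines = ["1,2,1.0", "1,3,2.0", "3,2,0.5"]

G = nx.parse_edgelist(lines, delimiter=",", create_using=nx.DiGraph(),
                      nodetype=int, data=(("weight", float),))

for node in G.nodes():
    print("Node:", node,
          "degree:", G.degree(node),
          "in_degree:", G.in_degree(node),
          "out_degree:", G.out_degree(node))
```

With a plain Graph, in_degree and out_degree are not available, so the loop from the question would fail even after the parsing problem is fixed.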

Edit

The key point here is to skip the header in the input file. We can achieve this by first reading the input file into a pandas.DataFrame, then we convert it to a graph.

import networkx as nx
import pandas as pd

df = pd.read_csv('test.csv')
Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, edge_attr='weight', create_using=Graphtype)
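For the stated goal of drawing thicker edges for higher weights, one possible approach is to pass a list of widths to nx.draw. The sketch below is not part of the original answer: edge_widths and draw_weighted are hypothetical helper names, the scale factor is an arbitrary choice, and the sample edges stand in for the CSV contents.

```python
import networkx as nx


def edge_widths(G, scale=1.0, default=1.0):
    """Return one line width per edge, proportional to its 'weight'."""
    return [scale * data.get("weight", default)
            for _, _, data in G.edges(data=True)]


def draw_weighted(G, scale=1.0):
    """Draw G with edge thickness driven by the 'weight' attribute."""
    # Imported lazily so edge_widths() can be used without matplotlib.
    import matplotlib.pyplot as plt

    pos = nx.spring_layout(G)
    nx.draw(G, pos, with_labels=True, width=edge_widths(G, scale))
    plt.show()


# Hypothetical weighted edges standing in for the CSV contents.
G = nx.Graph()
G.add_weighted_edges_from([(202, 237, 1.0), (202, 300, 3.0), (237, 300, 5.0)])
print(edge_widths(G))
```

Calling draw_weighted(G) opens the plot. If the weights span a large range, a smaller scale (or a logarithm of the weight) may give a more readable picture.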
