I have a CSV file that represents the adjacency matrix of a graph. However, both the first row and the first column of the file contain the node labels. How can I read this file into a networkx graph object? Is there a neat, Pythonic way to do it without hacking around?
My attempt so far:
x = np.loadtxt('file.mtx', delimiter='\t', dtype=str)
row_headers = x[0,:]
col_headers = x[:,0]
A = x[1:, 1:]
A = np.array(A, dtype='int')
But of course this doesn't solve the problem, since I need the node labels when creating the graph.
Example of the data:
Attribute,A,B,C
A,0,1,1
B,1,0,0
C,1,0,0
Note that the delimiter in the actual file is a tab, not a comma.
Solution
You could read the data into a structured array. The labels are then available from x.dtype.names, and the networkx graph can be generated with nx.from_numpy_matrix (removed in NetworkX 3.0; use nx.from_numpy_array on current versions):
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# read the first line to determine the number of columns
# (text mode, so the line can be split on a str tab under Python 3)
with open('file.mtx', 'r') as f:
    ncols = len(next(f).split('\t'))
x = np.genfromtxt('file.mtx', delimiter='\t', dtype=None, names=True,
                  usecols=range(1, ncols)  # skip the first column
                  )
labels = x.dtype.names
# y is a view of x, so it will not require much additional memory
y = x.view(dtype=('int', len(x.dtype)))
# from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
G = nx.from_numpy_array(y)
G = nx.relabel_nodes(G, dict(zip(range(ncols-1), labels)))
print(G.edges(data=True))
# e.g. [('A', 'B', {'weight': 1}), ('A', 'C', {'weight': 1})]  (edge order may vary by version)
nx.from_numpy_array (like the older nx.from_numpy_matrix, which was removed in NetworkX 3.0) has a create_using parameter that specifies the type of networkx graph to create. For example,
G = nx.from_numpy_array(y, create_using=nx.DiGraph())
makes G a DiGraph.
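To make the label handling and the Graph versus DiGraph distinction concrete, here is a minimal standard-library sketch that turns the sample data from the question into a labelled edge list. It is not the approach above: it swaps NumPy for the csv module, and labelled_edges and SAMPLE are hypothetical names introduced only for this illustration. It also assumes that an undirected matrix is symmetric, so only the upper triangle is read.

```python
import csv
import io

# Sample data from the question, tab-delimited as in the real file.
SAMPLE = "Attribute\tA\tB\tC\nA\t0\t1\t1\nB\t1\t0\t0\nC\t1\t0\t0\n"


def labelled_edges(handle, delimiter="\t", directed=False):
    """Return (labels, edges) from a labelled adjacency matrix.

    Hypothetical helper, not part of NumPy or networkx.
    Each edge is a (source, target, weight) tuple for a nonzero cell.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    labels = header[1:]  # drop the corner cell ("Attribute")
    edges = []
    for i, row in enumerate(reader):
        source, cells = row[0], row[1:]  # first column is the row label
        for j, cell in enumerate(cells):
            weight = int(cell)
            if weight == 0:
                continue  # zero means "no edge"
            # Undirected: assume a symmetric matrix and keep each pair once
            # by skipping the lower triangle.
            if not directed and j < i:
                continue
            edges.append((source, labels[j], weight))
    return labels, edges


labels, edges = labelled_edges(io.StringIO(SAMPLE))
print(labels)
print(edges)
```

With networkx installed, the result can be loaded with G.add_nodes_from(labels) followed by G.add_weighted_edges_from(edges), where G is an nx.Graph() or, for directed=True, an nx.DiGraph(). Adding the nodes first keeps isolated nodes that have no edges.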