[Python Data Analysis] Social networks: node list and edge list of a co-occurrence/collaboration network (undirected weighted graph)

Article link: https://blog.csdn.net/Parzival_/article/details/107501973

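One way to store a collaboration network is as a node list plus an edge list.

The code below builds both lists from raw data.

Example of the data format used:

Column F holds the author information; the other columns hold other fields.

Input format:

# Data as a 2-D list (one row per record)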
co_list = [ ["AA | BB | CC | DD",2019],
                ["EE | BB | FF ",2018],
                ["AA | GG | FF | HH | KK",2019],
                ["CC | DD | FF | LL | AA",2020],
                ["AA | BB | FF ",2017],
                ["EE | BB | GG ",2018],
                ["DD | GG | LL | HH | EE",2019],
                ["AA | GG | CC | DD",2018]]

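Node list:

# 1. Node list (input: raw data rows in the co_list format)
def get_nodes(co_list, col=0):
    '''Build the node list.
       co_list: 2-D list (rows) or 1-D list (author strings)
       col: index of the column holding the nodes; defaults to the first column (only used for 2-D input)
    '''
    nodes_list = []
    for authors in co_list:
        if isinstance(authors, list):  # 2-D input: pick the author column
            raw = authors[col]
        else:  # 1-D input: the item is the author string itself
            raw = authors
        # strip stray whitespace so that "FF " and "FF" count as the same node
        auths = [a.strip() for a in raw.split("|")]
        for auth in auths:
            if auth not in nodes_list:
                nodes_list.append(auth)
    return nodes_list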

Result:

["AA","BB","CC","DD","EE","FF","GG","HH","KK","LL"]
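The `auth not in nodes_list` check rescans the whole list for every name, which can get slow on large datasets. A minimal sketch of an alternative, assuming Python 3.7+ (where dicts keep insertion order) and 1-D input; `unique_nodes` is a hypothetical helper name, not part of the original code:

```python
def unique_nodes(author_strings, sep="|"):
    """Return the distinct names in first-appearance order."""
    seen = {}  # dict keys act as an ordered set
    for line in author_strings:
        for name in line.split(sep):
            seen.setdefault(name.strip(), None)
    return list(seen)

rows = ["AA | BB | CC", "EE | BB | FF ", "AA | FF"]
print(unique_nodes(rows))  # ['AA', 'BB', 'CC', 'EE', 'FF']
```

The node order is the same as with the list-based check; only the membership test gets cheaper.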

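Edge list:

It can be passed to G.add_edges_from() in networkx. Some network visualization tools, such as Gephi, can import it as an adjacency list.
(For an adjacency list in head node (parent node) to child node form, see get_adjacency_list().)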
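One way to hand an edge list to Gephi is to write it out as a CSV file. The sketch below rests on assumptions: the `Source`/`Target` header names follow the column names Gephi's spreadsheet importer conventionally expects for edge tables, and `edges_to_csv` is a hypothetical helper. Check the import dialog of your Gephi version before relying on it.

```python
import csv
import io

def edges_to_csv(edge_list, out):
    """Write [[u, v], ...] as CSV rows under a Source,Target header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # assumed header names
    writer.writerows(edge_list)

buf = io.StringIO()  # stands in for open("edges.csv", "w", newline="")
edges_to_csv([["AA", "BB"], ["AA", "CC"], ["BB", "CC"]], buf)
print(buf.getvalue())
```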

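# 2. Edge list (input: raw data rows)
def get_edges(co_list, col=0):
    '''co_list: 2-D list (rows) or 1-D list (author strings)
       col: index of the column holding the nodes; defaults to the first column (only used for 2-D input)
       Returns: edge list, [[node1, node2], ...]
    '''
    edge_list = []
    for authors in co_list:
        if isinstance(authors, list):  # 2-D input
            raw = authors[col]
        else:  # 1-D input
            raw = authors
        # strip stray whitespace, then sort so that an undirected edge
        # is always written the same way round ([A, B], never [B, A])
        auths = sorted(a.strip() for a in raw.split("|"))
        # edges: every unordered pair of co-authors in this row
        length = len(auths)
        for i in range(length - 1):
            for j in range(i + 1, length):
                edge_list.append([auths[i], auths[j]])

    return edge_list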

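Result: a list of two-element lists, one per pair of co-authors per row, so a pair that co-occurs in several rows appears several times.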
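Since the graph is weighted, a pair that co-occurs in several rows should end up as one edge with a weight rather than as several identical edges. A minimal sketch, assuming the weight is simply the number of rows in which the pair appears together; `weight_edges` is a hypothetical helper name:

```python
from collections import Counter

def weight_edges(edge_list):
    """Collapse repeated undirected edges into (u, v, weight) triples."""
    # sort each pair so that [A, B] and [B, A] are counted together
    counts = Counter(tuple(sorted(edge)) for edge in edge_list)
    return [(u, v, w) for (u, v), w in counts.items()]

edges = [["AA", "BB"], ["BB", "AA"], ["AA", "CC"], ["AA", "BB"]]
print(weight_edges(edges))  # [('AA', 'BB', 3), ('AA', 'CC', 1)]
```

Triples in this shape are what networkx's `G.add_weighted_edges_from()` accepts.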

Node list + edge list

The two functions above combined, returning the node list and the edge list in a single pass.

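# 3. [node list, edge list] (1. and 2. combined, saving one pass over the data)
def get_nodes_edges(co_list, col=0):
    '''Build the node list and the edge list in one pass.
       co_list: 2-D list (rows) or 1-D list (author strings)
       col: index of the column holding the nodes; defaults to the first column (only used for 2-D input)
       Returns: [node list, edge list]
    '''
    nodes_list = []
    edge_list = []
    for authors in co_list:
        if isinstance(authors, list):  # 2-D input
            raw = authors[col]
        else:  # 1-D input
            raw = authors
        # strip stray whitespace so that "FF " and "FF" are the same node
        auths = [a.strip() for a in raw.split("|")]
        # nodes, in order of first appearance
        for auth in auths:
            if auth not in nodes_list:
                nodes_list.append(auth)
        # edges: sort first so each undirected pair has a single orientation
        auths = sorted(auths)
        length = len(auths)
        for i in range(length - 1):
            for j in range(i + 1, length):
                edge_list.append([auths[i], auths[j]])

    return [nodes_list, edge_list]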
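The nested index loops enumerate every unordered pair in a row, which is what `itertools.combinations` does. A compact sketch of the same single-pass idea, assuming 1-D input (a list of author strings) and Python 3.7+; `nodes_and_edges` is a hypothetical name, not a replacement endorsed by the original post:

```python
from itertools import combinations

def nodes_and_edges(author_strings, sep="|"):
    """One pass over the rows; returns (nodes, edges)."""
    nodes, edges = {}, []
    for line in author_strings:
        names = [n.strip() for n in line.split(sep)]
        for name in names:
            nodes.setdefault(name, None)  # keeps first-appearance order
        # combinations() of the sorted names yields each unordered pair once
        edges.extend([u, v] for u, v in combinations(sorted(names), 2))
    return list(nodes), edges

nodes, edges = nodes_and_edges(["CC | AA | BB", "AA | DD "])
print(nodes)  # ['CC', 'AA', 'BB', 'DD']
print(edges)  # [['AA', 'BB'], ['AA', 'CC'], ['BB', 'CC'], ['AA', 'DD']]
```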