Drawing a Weibo Social Network Graph and Detecting Communities with Python

asdfg7766

Last modified 2024-05-07 18:50:39


Tags: python, programming languages

First published 2024-05-06 23:54:56

Article link: https://blog.csdn.net/asdfg7766/article/details/138512455


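This article shows how to build a social network graph with Python's NetworkX and Matplotlib libraries. Using the following and fan relationships between users, it visualises who is connected to whom and how the community is structured, and it applies the Louvain method for community detection. The tutorial walks through data preparation, graph construction, and degree centrality analysis.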


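Fandom girls love "checking affiliations" (查成分, digging into which camp someone belongs to), so here is a painless one-click way to do it.

Disclaimer: if you are the ultimate troll who enjoys checking affiliations, likes stirring things up, happily takes aim at everyone without distinction, and does not mind getting yelled at, this tutorial is for you.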

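This tutorial produces:

1. Who follows whom in your fandom circle

2. How your circle splits into groups

3. Who the "celebrities" of your circle are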

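This article looks at how to create a social network graph with the NetworkX and Matplotlib libraries in Python and how to detect communities with the Louvain method. In short, it uses each user's "following" and "fans" lists to draw that user's connections and social circle.

Example figures (at left):

Figure 1 shows who is connected to whom. Each point is a user, a blue arrow is a one-way follow, and a red arrow is a mutual follow. The points come in two sizes because I wanted certain users that I picked to stand out. (I removed the IDs for now.)

Figure 2 shows the social groups. It uses the Louvain method for community detection, which gives a rough idea of who hangs out with whom. It also goes a long way toward identifying the users you are, or are not, interested in. The camp colours come from the library's colormap; I did not pick them on purpose. Nodes with the same colour belong to the same camp, which in plain terms means a group of people who are relatively close. For details of the algorithm, see:

(Figure 3)

Figure 3 shows degree centrality, which indicates how "popular" a user (a node) is, that is, how many links the node has. In a directed graph this counts incoming and outgoing links together.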

For an explanation of degree centrality and how it is computed, see this article:

[Algorithm Series 04] Centrality Algorithms (中心性算法) - Zhihu (zhihu.com)
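As a small worked example of the degree centrality idea, the sketch below computes it by hand in plain Python for a directed edge list: each node's in-degree plus out-degree, divided by the number of other nodes. This is the normalisation I understand `nx.degree_centrality` to apply to a directed graph; the function name and the toy users A to D are made up for illustration.

```python
def degree_centrality(edges):
    """Degree centrality for a directed edge list: (in + out degree) / (n - 1)."""
    degree = {}
    for src, dst in set(edges):  # set() ignores repeated edges
        degree[src] = degree.get(src, 0) + 1  # outgoing link of src
        degree[dst] = degree.get(dst, 0) + 1  # incoming link of dst
    n = len(degree)
    if n <= 1:
        return {node: 0.0 for node in degree}
    return {node: d / (n - 1) for node, d in degree.items()}


# Toy follow relations (hypothetical users): A and B follow each other,
# while C and D each follow A.
toy_edges = [("A", "B"), ("B", "A"), ("C", "A"), ("D", "A")]
toy_centrality = degree_centrality(toy_edges)
print(toy_centrality)
```

Here A has four links among three other users, so its value is 4/3. In a directed graph the value can exceed 1 because a mutual follow counts twice.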

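Drawing the social network graph and detecting communities

1. Prepare the data

First, prepare the data. Save each user's "following" and "fans" lists to a CSV file. I am skipping how to collect them for now; if you really do not know how, you can save them by hand. The data should be a CSV file that contains each user ID together with that user's following list and fans list. You can read the CSV file with the pandas library.

Although I am skipping this step, it is quite important. I recommend collecting data for at least 30 users, which makes it easier to spot the key users and the relationships between people. The more, the better.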

import ast

import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# A CSV file that stores, for each chosen ID, the accounts it follows and the accounts that follow it
df = pd.read_csv('dct.csv', header=None, encoding='utf-8')
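The post does not show what `dct.csv` looks like. One layout that is consistent with `header=None` and with parsing column 1 through `ast.literal_eval` is one row per user, with the ID in column 0 and the nested list `[following, fans]` stored as text in column 1. The sketch below writes and reads back such a file using only the standard library; the sample IDs and the file name `dct_sample.csv` are hypothetical.

```python
import ast
import csv
import os
import tempfile

# Hypothetical sample: user ID -> [following list, fans list]
sample = {
    "ID1": [["userA", "userB"], ["userB", "userC"]],
    "ID2": [["userB"], ["userA"]],
}

csv_path = os.path.join(tempfile.mkdtemp(), "dct_sample.csv")

# Write one row per user: column 0 is the ID, column 1 is the nested list as text.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for user_id, lists in sample.items():
        writer.writerow([user_id, repr(lists)])

# Read the file back and turn column 1 into real Python lists again.
with open(csv_path, newline="", encoding="utf-8") as f:
    loaded = {row[0]: ast.literal_eval(row[1]) for row in csv.reader(f)}

print(loaded)
```

A file written this way has no header row, which is why it would be read with `header=None`.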

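2. Preprocess the data

Next, preprocess the data into a format suitable for building a network graph. Here the data is converted into a dictionary whose keys are user IDs and whose values hold that user's following list and fans list.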

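id_1 = ["ＯＯＯ", "XXX"]  # the main accounts whose affiliations you want to check

def parse_list(s):
    """Parse text such as "[[...], [...]]" into a Python list."""
    try:
        return ast.literal_eval(s)
    except (ValueError, SyntaxError):
        return []  # fall back to an empty list if the cell cannot be parsed

dic = {}
for i in range(len(df)):
    lst = parse_list(df[1][i])
    if len(lst) != 2:
        continue  # skip rows that do not hold [following, fans]

    following = [x for x in lst[0] if x]  # drop empty entries
    fans = [x for x in lst[1] if x]

    dic[df[0][i]] = [following, fans]

'''The dictionary should look like this:
dic={"ID1":[[followee A, followee B, followee C],[fan A, fan B, fan C]],
"ID2":[[followee B, followee C, followee D],[fan A, fan B]], ......
}'''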

3. Build the network graph

Now use the NetworkX library to create a directed graph and add the nodes and edges to it.

# Create a directed graph
G = nx.DiGraph()

# Add the edges (NetworkX creates the nodes automatically).
# `dic` is the dictionary built in the preprocessing step.
for user_id, (following, fans) in dic.items():
    for target_id in following:
        if target_id in fans:
            G.add_edge(user_id, target_id, color="red")  # mark mutual follows
        else:
            G.add_edge(user_id, target_id, color="blue")  # one-way follow only
    for fan_id in fans:
        if fan_id in following:
            G.add_edge(fan_id, user_id, color="red")
        else:
            G.add_edge(fan_id, user_id, color="blue")  # one-way follow only

# Filter: remove nodes whose number of connections is below a threshold
threshold = 30  # choose your own threshold
nodes_to_remove = [node for node, degree in G.degree() if degree < threshold]
nodes_to_remove.append('新手指南')  # IDs that show up too often can be filtered out as well
G.remove_nodes_from(nodes_to_remove)

plt.figure(figsize=(20, 20))
# Make the accounts listed in id_1 larger and give them a different colour
node_size = {node: 1000 if node in id_1 else 100 for node in G.nodes()}
node_color = ['thistle' if node in id_1 else 'lightblue' for node in G.nodes()]
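As a sanity check on the red and blue colouring, it can help to list mutual and one-way follows directly from the dictionary. The sketch below does this in plain Python, without NetworkX. Note that it is a slightly different rule from the loop in this section: it treats a pair as mutual whenever both directions appear anywhere in the data, not only within a single user's two lists. The function name `classify_follows` and the toy IDs are made up for illustration.

```python
def classify_follows(data):
    """Split follow relations into mutual pairs and one-way (follower, followee) edges."""
    edges = set()
    for user, (following, fans) in data.items():
        edges.update((user, target) for target in following)  # user -> followee
        edges.update((fan, user) for fan in fans)  # fan -> user
    mutual = {
        frozenset(edge)
        for edge in edges
        if edge[0] != edge[1] and (edge[1], edge[0]) in edges
    }
    one_way = {edge for edge in edges if (edge[1], edge[0]) not in edges}
    return mutual, one_way


# Toy data in the same shape as the dictionary from the preprocessing step
toy = {
    "ID1": [["A", "B"], ["A", "C"]],
    "ID2": [["ID1"], []],
}
mutual, one_way = classify_follows(toy)
print("mutual:", mutual)
print("one-way:", one_way)
```

In the toy data only ID1 and A follow each other; the other three relations are one-way.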

4. Draw the network graph

pos = nx.spring_layout(G)

nx.draw(
    G,
    pos,
    with_labels=False,  # labels are drawn separately below
    node_color=node_color,
    node_size=[node_size[node] for node in G.nodes()],
    edge_color=[
        edata.get("color", "blue") for _, _, edata in G.edges(data=True)
    ],  # red for mutual follows, blue for one-way follows
    arrowstyle="-|>",
    arrowsize=10,
)

# Draw every node label separately so the font size can differ per node
for node, (x, y) in pos.items():
    if node in id_1:  # larger font for the highlighted accounts
        plt.text(x, y, node, fontsize=12, ha='center', va='center')
    else:
        plt.text(x, y, node, fontsize=8, ha='center', va='center')

plt.title('Social network graph')
plt.savefig('no_label_1.png', dpi=300)

5. Community detection

Finally, use the Louvain method to detect communities and visualise the result.

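import community as community_louvain  # provided by the python-louvain package

G_undirected = G.to_undirected()

# Detect communities with the Louvain method.
# resolution defaults to 1.0 and can be tuned.
partition = community_louvain.best_partition(G_undirected, resolution=0.9)

print("Community assignment:", partition)

# Compute the modularity of the partition
mod = community_louvain.modularity(partition, G_undirected)
print("Modularity:", mod)

# Visualise the communities with matplotlib
plt.figure(figsize=(20, 20))
pos = nx.spring_layout(G_undirected)  # layout for the undirected graph
cmap = plt.get_cmap('viridis', max(partition.values()) + 1)

# Draw the nodes, coloured by community in the same order as G_undirected.nodes()
node_sizes = [150 if node in id_1 else 100 for node in G_undirected.nodes()]
node_colors = [partition[node] for node in G_undirected.nodes()]
nx.draw_networkx_nodes(G_undirected, pos, node_size=node_sizes, cmap=cmap, node_color=node_colors)

# Draw the edges
nx.draw_networkx_edges(G_undirected, pos, alpha=0.5)

# Build a dictionary that labels every node with its own ID
labels = {node: node for node in G_undirected.nodes()}

# Draw the labels with the chosen font size
nx.draw_networkx_labels(G_undirected, pos, labels, font_size=12)

name = 'community.png'  # placeholder output file name; the original leaves `name` undefined
plt.savefig(name, dpi=300)

# Degree centrality
degree_centrality = nx.degree_centrality(G)
print("Degree centrality:", degree_centrality)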

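The last lines compute the degree centrality of every node.

And with that, you have accomplished the great feat of checking the affiliations of your whole circle.

The biggest benefit of learning this is that nobody can accuse you of assigning affiliations at random any more. A fandom girl's opinions can be very subjective, but Weibo following lists and fans lists are objective, and they even let you rank the circle's celebrities objectively. Next time a fight breaks out, you will never be short of evidence, hehe.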
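To make the modularity value printed in the community detection step less of a black box, here is a small hand calculation in plain Python. It uses the standard definition for an undirected, unweighted graph, Q = sum over communities of L_c / m - (d_c / 2m)^2, where m is the number of edges, L_c the number of edges inside community c, and d_c the sum of degrees in c. It is a sketch of the quantity that Louvain tries to maximise, not the python-louvain implementation; the function and the toy graph are made up for illustration.

```python
def modularity(edges, partition):
    """Modularity of a partition for an undirected, unweighted edge list."""
    m = len(edges)
    internal = {}  # L_c: edges with both ends in community c
    degree_sum = {}  # d_c: total degree of the nodes in community c
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by one edge c-d
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}

q_split = modularity(toy_edges, split)  # 5/14, about 0.357
q_single = modularity(toy_edges, single)  # 0.0
print(q_split, q_single)
```

Putting each triangle in its own community scores higher than lumping everyone together, which matches the intuition that a good partition has many links inside groups and few between them.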
