Developer Log: Tree Structure Handling

Yifanan

Last modified 2022-12-06 21:19:16

Tags: open source

First published 2022-11-14 21:28:16

Article link: https://blog.csdn.net/nana7mii/article/details/127855616


When it comes to package dependency analysis, the first thing that comes to mind is a top-down tree structure.

This project ultimately uses the skill-tree-parser package to maintain the tree structure and networkx to build the visualization.

Introduction to skill-tree-parser

This package is the one used by the "open source skill tree" project recommended in our open source software course.

Its main features are:

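Simple to operate
You only need to create files in the required directory layout and run the program to produce the tree structure; no code has to be written. This lets more open source contributors take part in our analysis project and lays a good foundation for the project's future ecosystem.
Easy to refactor
The package was tailor-made for the skill tree project, so many of its modules are unnecessary for package analysis. That code is organized into well-structured modules in the source, so it can be refactored easily to meet specific needs.
Standard structure
The tree structure the package produces follows a standard format, so it can serve as a low-level module for many upstream projects. The tree can later be visualized accordingly.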

Introduction to networkx

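networkx is a Python package for analyzing complex networks. It can build directed graphs and has built-in support for trees, which makes it a convenient tool for visualizing tree structures.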
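
As a minimal sketch of what this looks like in practice, the snippet below builds a small directed graph with networkx and checks that it is a tree. The package names are made up for illustration and do not come from the project.

```python
import networkx as nx

# Hypothetical dependency edges: parent package -> dependency
edges = [("app", "libA"), ("app", "libB"), ("libA", "libC")]

g = nx.DiGraph()
g.add_edges_from(edges)

# For a directed graph, nx.is_tree checks that the underlying
# undirected graph is connected and has no cycles.
print(nx.is_tree(g))

# Direct dependencies of "app"
print(list(g.successors("app")))
```

Using a DiGraph keeps the parent-to-child direction, which is what the in-degree and out-degree analysis later in this post relies on.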

Project details

skill-tree-parser

A node is defined by the following structure:

——[index].[package name]--[version]
————[package name]--[version].md
————[package name]--[version].json
————config.json

Of these, [package name]--[version].md is the file developers must create themselves, with the following content:

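 # {package name goes here}
 {package description goes here}
 ## Version
 {version number goes here}
 ## Author information
 ### Author A
 {description of author A goes here}
 ### Author B
 {description of author B goes here}
 ### Author C
 {description of author C goes here}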
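
The layout and file naming described above can be scaffolded with a few lines of Python. This is only a sketch: `create_node`, the shortened template, and the example package `requests` 2.28.1 are hypothetical, and the English heading text is an assumption, so use whatever headings your copy of the parser expects.

```python
import tempfile
from pathlib import Path

# Shortened template; the heading text is an assumed English rendering.
TEMPLATE = """# {package}
{description}
## Version
{version}
"""


def create_node(parent: Path, index: int, package: str, version: str,
                description: str = "") -> Path:
    """Create '[index].[package]--[version]/' with its Markdown file inside."""
    node_dir = parent / f"{index}.{package}--{version}"
    node_dir.mkdir(parents=True, exist_ok=True)
    md_file = node_dir / f"{package}--{version}.md"
    md_file.write_text(
        TEMPLATE.format(package=package, description=description, version=version),
        encoding="utf-8",
    )
    return md_file


root = Path(tempfile.mkdtemp())  # throwaway directory for the demo
md_path = create_node(root, 1, "requests", "2.28.1", "HTTP client library")
print(md_path.relative_to(root))
```

Only the Markdown file is written here; the .json and config.json files are left to the tool.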

If more information later needs to go into a node, modify the parse function in markdown.py.
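
To give a feel for what such a change involves, here is a rough, standalone sketch of parsing the node template into a dictionary. It is not the actual parse function from markdown.py: the function name `parse_node_markdown`, the returned keys, and the rule that the first `##` section holds the version are all assumptions made for illustration.

```python
import re

HEADING = re.compile(r"^(#{1,3})\s+(.+)$")


def parse_node_markdown(text: str) -> dict:
    """Hypothetical stand-in for a template parser, not the real parse()."""
    description, version, authors = [], [], {}
    name = None
    h2_seen = 0
    target = description  # list that plain body lines are appended to
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match is None:
            target.append(line)
            continue
        level, title = len(match.group(1)), match.group(2)
        if level == 1:
            name = title
            target = description
        elif level == 2:
            h2_seen += 1
            # Assumption: the first "##" section is the version;
            # text directly under later "##" headings is ignored.
            target = version if h2_seen == 1 else []
        else:
            # Each "###" heading starts a new author entry
            target = authors.setdefault(title, [])
    return {
        "name": name,
        "description": "\n".join(description),
        "version": "\n".join(version),
        "authors": {k: "\n".join(v) for k, v in authors.items()},
    }


sample = """# requests
HTTP client library
## Version
2.28.1
## Author information
### Author A
Maintainer
"""
info = parse_node_markdown(sample)
print(info)
```

Adding a new field to a node would then mean adding another section to the template and another branch to the parser.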

The remaining two files are generated automatically.

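With this in place, the tree structure is easy to build. Running the code generates the corresponding tree.json file for use by upstream applications.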

networkx

The tree structure is read from the tree.json file, and a depth-first traversal is then used to build the tree and draw it.
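
A depth-first build of this kind might look like the sketch below. The JSON layout is an assumption: each node is taken to be an object with a `name` and a list of `children`, and node names are assumed to be unique. The real tree.json produced by skill-tree-parser may use different keys, so adapt the field names accordingly.

```python
import json

import networkx as nx

# Assumed layout only; not the confirmed tree.json schema.
TREE_JSON = """
{"name": "root", "children": [
    {"name": "libA--1.0", "children": [{"name": "libC--0.3", "children": []}]},
    {"name": "libB--2.1", "children": []}
]}
"""


def build_graph(tree: dict) -> nx.DiGraph:
    """Iterative depth-first traversal adding one edge per parent/child pair."""
    g = nx.DiGraph()
    g.add_node(tree["name"], attr=tree["name"])
    stack = [tree]
    while stack:
        node = stack.pop()
        for child in node.get("children", []):
            # "attr" is the label later shown on the node in the figure
            g.add_node(child["name"], attr=child["name"])
            g.add_edge(node["name"], child["name"])
            stack.append(child)
    return g


g = build_graph(json.loads(TREE_JSON))
print(sorted(g.edges))
```

An explicit stack is used instead of recursion so that deep trees do not hit Python's recursion limit.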

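Because networkx's automatic layout is random and not very attractive, I borrowed Joel's answer at https://stackoverflow.com/a/29597209/2966723, which produces a clean hierarchical tree layout for the final figure.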

If you want to use this tree layout too, I have pasted the code here in the hope that it helps:

import random

import networkx as nx


def hierarchy_pos_ugly(G, root, levels=None, width=1., height=1.):
    """If there is a cycle that is reachable from root, then this will see infinite recursion.
       G: the graph
       root: the root node
       levels: a dictionary
               key: level number (starting from 0)
               value: number of nodes in this level
       width: horizontal space allocated for drawing
       height: vertical space allocated for drawing"""
    TOTAL = "total"
    CURRENT = "current"

    def make_levels(levels, node=root, currentLevel=0, parent=None):
        """Compute the number of nodes for each level
        """
        if currentLevel not in levels:
            levels[currentLevel] = {TOTAL: 0, CURRENT: 0}
        levels[currentLevel][TOTAL] += 1
        neighbors = G.neighbors(node)
        for neighbor in neighbors:
            if neighbor != parent:
                levels = make_levels(levels, neighbor, currentLevel + 1, node)
        return levels

    def make_pos(pos, node=root, currentLevel=0, parent=None, vert_loc=0):
        dx = 1 / levels[currentLevel][TOTAL]
        left = dx / 2
        pos[node] = ((left + dx * levels[currentLevel][CURRENT]) * width, vert_loc)
        levels[currentLevel][CURRENT] += 1
        neighbors = G.neighbors(node)
        for neighbor in neighbors:
            if neighbor != parent:
                pos = make_pos(pos, neighbor, currentLevel + 1, node, vert_loc - vert_gap)
        return pos

    if levels is None:
        levels = make_levels({})
    else:
        levels = {l: {TOTAL: levels[l], CURRENT: 0} for l in levels}
    vert_gap = height / (max([l for l in levels]) + 1)
    return make_pos({})


def hierarchy_pos_beautiful(G, root=None, width=1., vert_gap=0.2, vert_loc=0, xcenter=0.5):
    '''
    From Joel's answer at https://stackoverflow.com/a/29597209/2966723.
    Licensed under Creative Commons Attribution-Share Alike

    If the graph is a tree this will return the positions to plot this in a
    hierarchical layout.

    G: the graph (must be a tree)

    root: the root node of current branch
    - if the tree is directed and this is not given,
      the root will be found and used
    - if the tree is directed and this is given, then
      the positions will be just for the descendants of this node.
    - if the tree is undirected and not given,
      then a random choice will be used.

    width: horizontal space allocated for this branch - avoids overlap with other branches

    vert_gap: gap between levels of hierarchy

    vert_loc: vertical location of root

    xcenter: horizontal location of root
    '''
    if not nx.is_tree(G):
        raise TypeError('cannot use hierarchy_pos on a graph that is not a tree')

    if root is None:
        if isinstance(G, nx.DiGraph):
            root = next(iter(nx.topological_sort(G)))  # allows back compatibility with nx version 1.11
        else:
            root = random.choice(list(G.nodes))

    def _hierarchy_pos(G, root, width=1., vert_gap=0.2, vert_loc=0, xcenter=0.5, pos=None, parent=None):
        '''
        see hierarchy_pos docstring for most arguments

        pos: a dict saying where all nodes go if they have been assigned
        parent: parent of this branch. - only affects it if non-directed

        '''

        if pos is None:
            pos = {root: (xcenter, vert_loc)}
        else:
            pos[root] = (xcenter, vert_loc)
        children = list(G.neighbors(root))
        if not isinstance(G, nx.DiGraph) and parent is not None:
            children.remove(parent)
        if len(children) != 0:
            dx = width / len(children)
            nextx = xcenter - width / 2 - dx / 2
            for child in children:
                nextx += dx
                pos = _hierarchy_pos(G, child, width=dx, vert_gap=vert_gap,
                                     vert_loc=vert_loc - vert_gap, xcenter=nextx,
                                     pos=pos, parent=root)
        return pos

    return _hierarchy_pos(G, root, width, vert_gap, vert_loc, xcenter)

To use it, add the following code:

pos = hierarchy_pos_beautiful(g, root)  # g is the networkx graph, root is the root node
node_labels = nx.get_node_attributes(g, "attr")
nx.draw(g, pos, with_labels=True, labels=node_labels)

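Note that when adding nodes you should set the attr attribute, which is the value displayed on each node in the figure; otherwise an error may be raised.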
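
The sketch below shows one way to set attr when adding nodes and to check for nodes that lack it before drawing. The node names are made up, and the check itself is my own suggestion rather than part of the project.

```python
import networkx as nx

g = nx.DiGraph()
# Every node added explicitly gets an "attr" value to use as its label
g.add_node("root", attr="root")
g.add_node("libA", attr="libA 1.0")
g.add_edge("root", "libA")

# add_edge creates "libB" implicitly, so it has no "attr"
g.add_edge("root", "libB")

node_labels = nx.get_node_attributes(g, "attr")
# Nodes without "attr" are absent from the dict returned above
missing = [n for n in g.nodes if n not in node_labels]
print(missing)
```

Nodes created implicitly by add_edge are the easiest way to end up without a label, so it helps to add every node explicitly first.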

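To make it easy to use the tree structure for package importance analysis, the in-degree and out-degree of every node in the tree should be recorded.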

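Here I used networkx's out_degree and in_degree functions to read each node's in-degree and out-degree automatically and save them to a local CSV file. Later development can load this CSV file directly to analyze the nodes and draw conclusions.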

for node in g.nodes:
    writer.writerow([node, g.out_degree(node), g.in_degree(node)])
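
For completeness, here is a self-contained version of that loop including the writer setup. The graph is a made-up example, the header row is my own addition, and an in-memory buffer stands in for the real output file so the snippet runs anywhere.

```python
import csv
import io

import networkx as nx

# Made-up example graph: parent package -> dependency
g = nx.DiGraph([("root", "libA"), ("root", "libB"), ("libA", "libC")])

# Stand-in for: open("degrees.csv", "w", newline="") (file name is hypothetical)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["node", "out_degree", "in_degree"])  # header row is an addition
for node in g.nodes:
    writer.writerow([node, g.out_degree(node), g.in_degree(node)])

# Read the data back the way a later analysis step might
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

When writing to a real file, passing newline="" to open() avoids blank lines between rows on Windows.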

This article introduces the OpenEuler upstream/downstream composition analysis project.