Find Duplicate Subtrees

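Taking the recursion tree below as an example, find its similar subtrees:

The recursion tree above contains the following duplicate subtrees: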

1. roofplane: roofplane1, roofplane2, roofplane3, roofplane4

2. roofplane: roofplane5, roofplane6, roofplane7, roofplane8

3. rooftop: rooftop1, rooftop2

Note: a leaf node also counts as a tree, so the subtrees that duplicate it must be found as well.

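Method:

Serialize the whole tree using pre-order, post-order, or level-order traversal. During the traversal, place each visited node into a group according to the class labels (label) of the subtree rooted at it, so that subtrees with the same label sequence end up in the same group.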
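The idea above can be sketched in a few lines. This is a minimal, self-contained illustration rather than the article's class: `Node`, `make_rooftop`, and `group_by_serialization` are hypothetical names, and the post-order label string is assumed to be the grouping key.

```python
from collections import defaultdict


class Node:
    """Hypothetical n-ary tree node used only for this illustration."""

    def __init__(self, name, label, children=None):
        self.name = name
        self.label = label
        self.children = children or []


def group_by_serialization(root):
    """Group every subtree (leaves included) by its post-order label string."""
    groups = defaultdict(list)

    def visit(node):
        # Children first, then the node's own label (post-order).
        parts = [visit(child) for child in node.children]
        parts.append(node.label)
        key = ",".join(parts)
        groups[key].append(node)
        return key

    visit(root)
    # Keep only the keys shared by at least two subtrees.
    return {key: nodes for key, nodes in groups.items() if len(nodes) > 1}


def make_rooftop(name, first):
    """Build a rooftop with four roofplane leaves numbered from `first`."""
    planes = [Node(f"roofplane{first + i}", "roofplane") for i in range(4)]
    return Node(name, "rooftop", planes)


building = Node("building", "building",
                [make_rooftop("rooftop1", 1), make_rooftop("rooftop2", 5)])
duplicates = group_by_serialization(building)
for key, nodes in duplicates.items():
    print(key, [n.name for n in nodes])
```

In this small tree the eight leaves share the key `roofplane`, and the two rooftops share a longer key, so both groups are reported.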

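1. Tree traversal and serialization

Pre-order, in-order, post-order, and level-order traversal

The following uses post-order traversal as an example: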
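For reference, three of these orders can be sketched for an n-ary tree as follows. In-order is left out because it is only naturally defined for binary trees. The `(name, [children])` tuple layout is an assumption made for this example and is not the article's tree class.

```python
from collections import deque

# A tree is represented here as (name, [children]); this layout is an
# assumption for the example only.
tree = ("building", [
    ("rooftop1", [("roofplane1", []), ("roofplane2", [])]),
    ("railing1", [("pillar1", [])]),
])


def pre_order(node):
    """Visit the node first, then each child subtree from left to right."""
    name, children = node
    result = [name]
    for child in children:
        result.extend(pre_order(child))
    return result


def post_order(node):
    """Visit each child subtree from left to right, then the node itself."""
    name, children = node
    result = []
    for child in children:
        result.extend(post_order(child))
    result.append(name)
    return result


def level_order(node):
    """Visit nodes level by level using a FIFO queue."""
    result, queue = [], deque([node])
    while queue:
        name, children = queue.popleft()
        result.append(name)
        queue.extend(children)
    return result


print(pre_order(tree))
print(post_order(tree))
print(level_order(tree))
```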

    def serialize(self):
        """Encode the tree as a single string using post-order traversal.

        :rtype: str
        """

        def postOrder(tree):
            # An empty subtree contributes an empty string (not None),
            # so the concatenation below never fails.
            if not tree:
                return ''

            cur = ''
            for child in tree.children:
                # Children come first, each followed by a space separator.
                cur += postOrder(child) + ' '
            # The node's own name comes last (post-order).
            cur += tree.data.componentName
            return cur

        cur = postOrder(self)
        print('cur: ', cur)
        return cur

tree.serialize()

2. Finding similar subtrees

    def findDiplicateSubtrees(self):
        '''1. Encode every subtree as a string.
           2. Group subtrees by that string to find duplicates.
        '''

        subtrees = {}

        def traverse(tree):
            if not tree:
                return '#'

            cur = ''
            # Post-order traversal is used here. Pre-order or post-order is
            # chosen to keep the sequence unique; in-order cannot do that.
            for child in tree.children:
                cur += traverse(child) + ','
            # str() so that non-string labels work for leaves as well.
            cur += str(tree.data.label)

            # Subtrees with the same string land in the same group.
            subtrees.setdefault(cur, []).append(tree)
            return cur

        # Serialize the tree and collect every subtree along the way.
        traverse(self)
        # Keep only the strings shared by more than one subtree.
        repeatedSubtrees = {}
        for cur, treeList in subtrees.items():
            if len(treeList) > 1:
                repeatedSubtrees[cur] = treeList
        return repeatedSubtrees


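    def treeDisplay(tree, labels, saveFile=None):
        # randomColor, get_colour_name and show_Tree are helpers defined
        # elsewhere in the project; they are not shown in this post.
        colors = randomColor(labels)
        colorDict = {key: get_colour_name(value)[0] for key, value in zip(labels, colors)}
        p = show_Tree(tree, colorDict)
        # saveFile is accepted because the caller passes it; how show_Tree
        # writes an image is not shown here, so it is not forwarded.
        p.showTree()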


    ### Copy the code above into the recursionTree class, then call it with the code below.
    #### This assumes the tree has already been built with the recursionTree class.
    repeatedSubtrees = tree.findDiplicateSubtrees()
    ### Display the duplicate subtrees
    i = 0
    for stringConsequence, treeList in repeatedSubtrees.items():
        print('\n', stringConsequence, 'parentNodeName: ')
        # The distinct labels are the same for every subtree in this group.
        labels = list(set(stringConsequence.split(',')))
        for repeatedSubtree in treeList:
            print(repeatedSubtree.data.componentName)
            saveFile = 'repeatedTree' + str(i + 1) + '.png'
            treeDisplay(repeatedSubtree, labels, saveFile)
            i += 1
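One caveat about the comma-joined post-order key built by `traverse`: it does not record where one child's subtree ends, so two differently shaped n-ary trees can produce the same string. The sketch below shows such a collision and one possible remedy, wrapping each node's children in parentheses. The `Node` class and both function names are hypothetical, and labels are assumed to contain no commas or parentheses.

```python
class Node:
    """Hypothetical node with a label and a list of children."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def flat_key(node):
    """Post-order labels joined by commas, as in the approach above."""
    parts = [flat_key(child) for child in node.children] + [node.label]
    return ",".join(parts)


def bracketed_key(node):
    """Post-order labels with each node's children wrapped in parentheses."""
    inner = ",".join(bracketed_key(child) for child in node.children)
    return f"({inner}){node.label}"


# 'wide' is a with two leaf children b and c;
# 'deep' is a with a single child c, which has a single child b.
wide = Node("a", [Node("b"), Node("c")])
deep = Node("a", [Node("c", [Node("b")])])

print(flat_key(wide), flat_key(deep))            # identical strings
print(bracketed_key(wide), bracketed_key(deep))  # different strings
```

For the example tree in this post the flat key is enough, because subtrees with the same key happen to have the same shape; the bracketed form only matters when that cannot be guaranteed.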

3. Results

1. Serialization result:

Besides the 18 leaf nodes (8 roofplane and 10 pillar nodes), the tree contains 4 non-leaf subtrees:

'roofplane1,roofplane2,roofplane3,roofplane4,rooftop1',

'roofplane8,roofplane5,roofplane7,roofplane6,rooftop2',

'pillar4,pillar6,pillar10,pillar3,pillar7,pillar8,pillar9,pillar5,pillar1,pillar2,railing1',

'roofplane1,roofplane2,roofplane3,roofplane4,rooftop1,roofplane8,roofplane5,roofplane7,roofplane6,rooftop2,pillar4,pillar6,pillar10,pillar3,pillar7,pillar8,pillar9,pillar5,pillar1,pillar2,railing1,building',

2. Grouping by repeated serialization strings yields three groups of duplicate subtrees:

stringConsequence: 'roofplane'
parentNodeOfTheTree:
roofplane1
roofplane2
roofplane3
roofplane4
roofplane8
roofplane5
roofplane7
roofplane6

stringConsequence: 'pillar'
parentNodeOfTheTree:
pillar4
pillar6
pillar10
pillar3
pillar7
pillar8
pillar9
pillar5
pillar1
pillar2

stringConsequence: 'roofplane,roofplane,roofplane,roofplane,rooftop'
parentNodeOfTheTree:
rooftop1
rooftop2

The last group is displayed as follows:

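4. Grouping duplicate subtrees by parent node

Search top-down for the parent node of each subtree, and put subtrees with different parents into different groups. Place the code below in the Recursion_Tree class to find the parent node of each subtree.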

    def getParentNode(self, curNode):
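        '''
        Search top-down (depth-first) for the parent of curNode.
        Nodes are matched by componentName, so names must be unique.
        '''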
        if len(self.children) == 0:
            return None

        name = curNode.data.componentName
        parentNode = None
        for child in self.children:
            if child.data.componentName == name:
                parentNode=self
                break
        
        if not parentNode:
            for child in self.children:
                parentNode = child.getParentNode(curNode)
                if parentNode:
                    break

        return parentNode
   
parentNode = tree.getParentNode(repeatedSubtree)
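Since `getParentNode` matches nodes by `componentName` and searches again from the root on every call, one alternative is to record each node's parent in a single pass and then group by that record. The sketch below is an illustration under assumptions: `Node`, `build_parent_map`, and `group_by_parent` are hypothetical names that do not belong to the article's class, and parent names are assumed to be unique.

```python
from collections import defaultdict


class Node:
    """Hypothetical node with a name and a list of children."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def build_parent_map(root):
    """Map id(node) -> parent node for every node; the root maps to None."""
    parent_of = {id(root): None}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            parent_of[id(child)] = node
            stack.append(child)
    return parent_of


def group_by_parent(nodes, parent_of):
    """Group node names by the name of their parent."""
    groups = defaultdict(list)
    for node in nodes:
        parent = parent_of[id(node)]
        groups[parent.name if parent else None].append(node.name)
    return dict(groups)


rooftop1 = Node("rooftop1", [Node("roofplane1"), Node("roofplane2")])
rooftop2 = Node("rooftop2", [Node("roofplane5"), Node("roofplane6")])
building = Node("building", [rooftop1, rooftop2])

parent_of = build_parent_map(building)
planes = rooftop1.children + rooftop2.children
grouped = group_by_parent(planes, parent_of)
print(grouped)
```

Keying the map by `id(node)` identifies nodes by object identity, so the lookup does not depend on names; only the final grouping uses the parent's name.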

Results:

stringConsequence: roofplane
parentNodeOfTheTree: rooftop1
roofplane1
roofplane2
roofplane3
roofplane4
parentNodeOfTheTree: rooftop2
roofplane8
roofplane5
roofplane7
roofplane6


stringConsequence: roofplane,roofplane,roofplane,roofplane,rooftop
parentNodeOfTheTree: building
rooftop1
rooftop2


stringConsequence: pillar
parentNodeOfTheTree: railing1
pillar4
pillar6
pillar10
pillar3
pillar7
pillar8
pillar9
pillar5
pillar1
pillar2

References:

Serialization: https://blog.csdn.net/qq_17550379/article/details/102624426

Finding similar subtrees: https://blog.csdn.net/CsWarmSun/article/details/112410914

String matching: https://blog.csdn.net/yifan403/article/details/4272793 and https://blog.csdn.net/wdsmao/article/details/79225290
