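Take the following recursion tree as an example and find its similar subtrees:
The recursion tree above contains these repeated subtrees: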
1. roofplane: roofplane1, roofplane2, roofplane3, roofplane4
2. roofplane: roofplane5, roofplane6, roofplane7, roofplane8
3. rooftop: rooftop1, rooftop2
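Note: a leaf node also counts as a tree, so the subtrees that duplicate it must be found as well.
Method:
Serialize the whole tree. The traversal can be preorder, postorder, or level order. While traversing, place each visited node into a group according to its category information (label).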
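The grouping step can be sketched as follows. This is a minimal illustration that assumes a hypothetical `Node` class with `name`, `label`, and `children` fields; it is not the recursionTree class used in this post.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for a tree node (an assumption for illustration)."""

    def __init__(self, name, label, children=None):
        self.name = name
        self.label = label
        self.children = children or []


def group_by_label(root):
    """Visit every node in postorder and bucket node names by label."""
    groups = defaultdict(list)

    def visit(node):
        for child in node.children:
            visit(child)
        # The node itself is recorded after all of its children.
        groups[node.label].append(node.name)

    visit(root)
    return dict(groups)


rooftop = Node('rooftop1', 'rooftop', [
    Node('roofplane1', 'roofplane'),
    Node('roofplane2', 'roofplane'),
])
groups = group_by_label(rooftop)
print(groups)
```

Grouping by label alone only finds nodes of the same category; the serialization described next is what lets whole subtrees be compared.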
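1. Tree traversal and serialization
Preorder, inorder, postorder, and level-order traversal
The following uses postorder traversal as an example: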
def serialize(self):
    """Encode the tree rooted at this node as a single string (postorder).

    :rtype: str
    """
    def postOrder(tree):
        if not tree:
            # Return an empty string (not None) so callers can concatenate.
            return ''
        children = tree.children
        if not children:
            # Leaf: the serialization is just the component name.
            return tree.data.componentName
        # Internal node: children first, then the node itself.
        cur = ''
        for child in children:
            cur += postOrder(child)
            cur += ' '
        cur += tree.data.componentName
        return cur

    cur = postOrder(self)
    print('cur: ', cur)
    return cur

tree.serialize()
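To compare the traversal orders side by side, here is a small self-contained sketch. The `Node` class is a hypothetical stand-in, not the class used in this post. Inorder is left out because it has no standard definition for nodes with an arbitrary number of children.

```python
from collections import deque


class Node:
    """Hypothetical n-ary tree node (an assumption for illustration)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def preorder(node):
    """Node first, then each child subtree from left to right."""
    out = [node.name]
    for child in node.children:
        out.extend(preorder(child))
    return out


def postorder(node):
    """Each child subtree from left to right, then the node itself."""
    out = []
    for child in node.children:
        out.extend(postorder(child))
    out.append(node.name)
    return out


def levelorder(root):
    """Breadth-first: visit nodes level by level using a queue."""
    out, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        out.append(node.name)
        queue.extend(node.children)
    return out


demo = Node('building', [
    Node('rooftop1', [Node('roofplane1'), Node('roofplane2')]),
    Node('railing1', [Node('pillar1')]),
])
print(' '.join(preorder(demo)))
print(' '.join(postorder(demo)))
print(' '.join(levelorder(demo)))
```

In postorder, every subtree occupies one contiguous run that ends at its own root, which is why the serialized string of a subtree can be built directly from the strings of its children.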
2. Finding similar subtrees
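def findDiplicateSubtrees(self):
    '''1. Encode every subtree as a string of labels.
       2. Group subtrees by that string and keep the groups that repeat.
    '''
    subtrees = {}

    def traverse(tree):
        if not tree:
            return '#'
        children = tree.children
        if children:
            # Postorder is used here: children first, then the node itself.
            # Preorder would work equally well; inorder would not. Note that
            # without markers for where a child list starts and ends, two
            # differently shaped n-ary subtrees can in principle share a key.
            cur = ''
            for child in children:
                cur += traverse(child)
                cur += ','
            cur += str(tree.data.label)
        else:
            # Leaf: the key is just its label (as a string, like above).
            cur = str(tree.data.label)
        # Record this subtree under its serialized key.
        subtrees.setdefault(cur, []).append(tree)
        return cur

    # Serialize the whole tree, filling `subtrees` along the way.
    traverse(self)
    # Keep only the keys shared by more than one subtree.
    repeatedSubtrees = {}
    for cur, treeList in subtrees.items():
        if len(treeList) > 1:
            repeatedSubtrees[cur] = treeList
    return repeatedSubtrees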
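def treeDisplay(tree, labels, saveFile=None):
    # randomColor, get_colour_name and show_Tree are helpers defined elsewhere.
    # saveFile is accepted because the caller passes an output file name; how it
    # is forwarded to show_Tree depends on that helper and is not shown here.
    colors = randomColor(labels)
    colorDict = {key: get_colour_name(value)[0] for key, value in zip(labels, colors)}
    p = show_Tree(tree, colorDict)
    p.showTree()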
# Copy the code above into the recursionTree class, then call it as shown below.
# This assumes the tree has already been built with the recursionTree class.
repeatedSubtrees = tree.findDiplicateSubtrees()
# Display the repeated subtrees
i = 0
for stringConsequence, treeList in repeatedSubtrees.items():
    print('\n', stringConsequence, 'parentNodeName: ')
    for repeatedSubtree in treeList:
        print(repeatedSubtree.data.componentName)
    # Distinct labels in this group, used to assign colours.
    labelName = stringConsequence.split(',')
    labels = list(set(labelName))
    saveFile = 'repeatedTree' + str(i+1) + '.png'
    # Draw one representative of the group (the last subtree printed above).
    treeDisplay(repeatedSubtree, labels, saveFile)
    i += 1
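One caveat worth checking: a comma-joined postorder label string does not record where each child list begins and ends, so two differently shaped subtrees can in principle produce the same key. The sketch below uses a hypothetical `Node` class and hypothetical labels `a`, `b`, `c` to show such a collision, together with one possible fix (wrapping each child list in parentheses). It assumes labels never contain `(`, `)`, or `,`, and it is an illustration only, not part of the original code.

```python
class Node:
    """Hypothetical labelled tree node (an assumption for illustration)."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def flat_key(node):
    """Comma-joined postorder labels, the same format as the traversal above."""
    parts = [flat_key(child) for child in node.children]
    parts.append(str(node.label))
    return ','.join(parts)


def bracketed_key(node):
    """Postorder labels with each child list wrapped in parentheses."""
    if not node.children:
        return str(node.label)
    inner = ','.join(bracketed_key(child) for child in node.children)
    return '(' + inner + ')' + str(node.label)


# A chain a -> b -> c versus a node a with two leaf children c and b.
chain = Node('a', [Node('b', [Node('c')])])
fan = Node('a', [Node('c'), Node('b')])

print(flat_key(chain), flat_key(fan))
print(bracketed_key(chain), bracketed_key(fan))
```

For the example tree in this post the flat key happens to be sufficient, because every repeated group has the same shape. The bracketed form is only needed when subtrees of different shape can carry the same label sequence.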
3. Results
1. Serialization result:
Besides the 18 leaf nodes (8 roofplane and 10 pillar nodes), the tree contains 4 subtrees:
'roofplane1,roofplane2,roofplane3,roofplane4,rooftop1',
'roofplane8,roofplane5,roofplane7,roofplane6,rooftop2',
'pillar4,pillar6,pillar10,pillar3,pillar7,pillar8,pillar9,pillar5,pillar1,pillar2,railing1',
'roofplane1,roofplane2,roofplane3,roofplane4,rooftop1,roofplane8,roofplane5,roofplane7,roofplane6,rooftop2,pillar4,pillar6,pillar10,pillar3,pillar7,pillar8,pillar9,pillar5,pillar1,pillar2,railing1,building',
2. Grouping by repeated serialization patterns yields three groups of repeated subtrees:
stringConsequence: 'roofplane'
parentNodeOfTheTree:
roofplane1
roofplane2
roofplane3
roofplane4
roofplane8
roofplane5
roofplane7
roofplane6
stringConsequence: 'pillar'
parentNodeOfTheTree:
pillar4
pillar6
pillar10
pillar3
pillar7
pillar8
pillar9
pillar5
pillar1
pillar2
stringConsequence: 'roofplane,roofplane,roofplane,roofplane,rooftop'
parentNodeOfTheTree:
rooftop1
rooftop2
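The last group is displayed as follows:
4. Grouping repeated subtrees by parent node
Search from the top down for the parent node of each subtree, and put subtrees with different parents into different groups. Place the code below in the Recursion_Tree class to find the parent node of each subtree.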
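def getParentNode(self, curNode):
    '''
    Return the parent of curNode in the tree rooted at self, or None.
    This node's direct children are checked first, then the search recurses
    into each child (depth-first). Nodes are matched by componentName.
    '''
    if len(self.children) == 0:
        return None
    name = curNode.data.componentName
    parentNode = None
    # Is curNode one of this node's direct children?
    for child in self.children:
        if child.data.componentName == name:
            parentNode = self
            break
    # Otherwise search each child's subtree in turn.
    if parentNode is None:
        for child in self.children:
            parentNode = child.getParentNode(curNode)
            if parentNode is not None:
                break
    return parentNode

parentNode = tree.getParentNode(repeatedSubtree)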
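Calling a parent lookup once per repeated subtree searches the tree again each time. As an alternative sketch, a parent map can be built in a single traversal and then used to group the subtrees. The `Node` class and the function names `build_parent_map` and `group_by_parent` are hypothetical, and the sketch assumes that node names are unique, just as the lookup above matches nodes by componentName.

```python
from collections import defaultdict


class Node:
    """Hypothetical node; `name` plays the role of data.componentName."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def build_parent_map(root):
    """Map each node's name to its parent node in one traversal."""
    parents = {root.name: None}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            parents[child.name] = node
            stack.append(child)
    return parents


def group_by_parent(subtrees, parents):
    """Bucket subtree roots by the name of their parent (None for the root)."""
    groups = defaultdict(list)
    for sub in subtrees:
        parent = parents[sub.name]
        groups[parent.name if parent else None].append(sub.name)
    return dict(groups)


planes1 = [Node('roofplane1'), Node('roofplane2')]
planes2 = [Node('roofplane5'), Node('roofplane6')]
building = Node('building', [Node('rooftop1', planes1), Node('rooftop2', planes2)])

parents = build_parent_map(building)
grouped = group_by_parent(planes1 + planes2, parents)
print(grouped)
```

The map costs one pass over the tree, after which each parent lookup is a dictionary access.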
Results:
stringConsequence: roofplane
parentNodeOfTheTree: rooftop1
roofplane1
roofplane2
roofplane3
roofplane4
parentNodeOfTheTree: rooftop2
roofplane8
roofplane5
roofplane7
roofplane6
stringConsequence: roofplane,roofplane,roofplane,roofplane,rooftop
parentNodeOfTheTree: building
rooftop1
rooftop2
stringConsequence: pillar
parentNodeOfTheTree: railing1
pillar4
pillar6
pillar10
pillar3
pillar7
pillar8
pillar9
pillar5
pillar1
pillar2
References:
Serialization: https://blog.csdn.net/qq_17550379/article/details/102624426
Finding similar subtrees: https://blog.csdn.net/CsWarmSun/article/details/112410914
String matching: https://blog.csdn.net/yifan403/article/details/4272793, https://blog.csdn.net/wdsmao/article/details/79225290