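Python Data Structures 11: Tree Implementations, Tree Applications, Preorder/Inorder/Postorder Traversal, Binary Search Trees (BST), Balanced Binary Trees (AVL Trees), Huffman Trees and Huffman Coding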

YangStudent

Last modified: 2023-02-11 11:02:18


First published: 2022-03-14 11:13:04

Article link: https://blog.csdn.net/YangStudent/article/details/123461918


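1. Concepts

A tree is a fundamental non-linear data structure.

2. Representing trees as data structures

2.1 Nested lists

A tree can be represented with nested lists, where every node has the form:

[root, left subtree, right subtree]

For example, a tree with root a, whose left child b has the children d and e and whose right child c has the single left child f (the figure is omitted here), is written as: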

[a, [b, [d, [], []], [e, [], [] ]], [c, [f, [], []], []]]
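The nested-list form can be written directly as a Python literal. The following is a minimal sketch, assuming the node labels are stored as strings; `count_nodes` is a hypothetical helper added for illustration and is not part of the original article.

```python
# The example tree above, with node labels written as strings (an assumption).
tree = ['a',
        ['b', ['d', [], []], ['e', [], []]],
        ['c', ['f', [], []], []]]


def count_nodes(t):
    """Count nodes recursively; an empty list stands for an empty subtree."""
    if not t:
        return 0
    return 1 + count_nodes(t[1]) + count_nodes(t[2])


print(tree[0])            # label of the root
print(tree[1][0])         # label of the root of the left subtree
print(count_nodes(tree))  # total number of nodes
```

Index 0 always holds the node's label, index 1 the left subtree, and index 2 the right subtree, so any recursive function over this form only needs those three positions.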

2.2 Nested lists: inserting a node, reading the root, and getting the subtrees

def binary_tree(root):
    # Create a binary tree that has only a root node
    return [root, [], []]


def insert_left(root, new_branch):
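    # Insert the new node as the left child of the root; it becomes the root of the left subtree
    # Note: the new node is not appended at the bottom of the left subtree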
    temp = root.pop(1)
    if len(temp) > 1:
        root.insert(1, [new_branch, temp, []])
    else:
        root.insert(1, [new_branch, [], []])


def insert_right(root, new_branch):
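    # Insert the new node as the right child of the root; it becomes the root of the right subtree
    # Note: the new node is not appended at the bottom of the right subtree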
    temp = root.pop(2)
    if len(temp) > 1:
        root.insert(2, [new_branch, [], temp])
    else:
        root.insert(2, [new_branch, [], []])


def get_root_val(root):
    return root[0]


def set_root_val(root, new_val):
    root[0] = new_val


def get_left_child(root):
    return root[1]


def get_right_child(root):
    return root[2]


r = binary_tree(3)
insert_left(r, 4)
insert_left(r, 5)
insert_right(r, 6)
insert_right(r, 7)
l = get_left_child(r)
print(l)

set_root_val(l, 9)
print(r)
insert_left(l, 11)
print(r)
print(get_right_child(get_right_child(r)))

[5, [4, [], []], []]
[3, [9, [4, [], []], []], [7, [], [6, [], []]]]
[3, [9, [11, [4, [], []], []], []], [7, [], [6, [], []]]]
[6, [], []]

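2.3 Linked implementation: nodes and references

Each node stores the data item of its root together with links to its left and right subtrees.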

class BinaryTree:
    def __init__(self, root_obj):
        self.key = root_obj
        self.left_child = None
        self.right_child = None

    def insert_left(self, new_node):
        if self.left_child is None:
            self.left_child = BinaryTree(new_node)
            # Same behavior as before: the new node becomes the left child of the root;
            # an existing left subtree becomes the left subtree of the new node
        else:
            t = BinaryTree(new_node)
            t.left_child = self.left_child
            self.left_child = t

    def insert_right(self, new_node):
        if self.right_child is None:
            self.right_child = BinaryTree(new_node)
        else:
            t = BinaryTree(new_node)
            t.right_child = self.right_child
            self.right_child = t

    def set_root_val(self, obj):
        self.key = obj

    def get_root_val(self):
        return self.key

    def get_left_child(self):
        return self.left_child

    def get_right_child(self):
        return self.right_child


r = BinaryTree('a')
r.insert_left('b')
r.insert_right('c')
r.get_right_child().set_root_val('hello')
r.get_left_child().insert_right('d')
print(r.get_root_val())
print(r.get_right_child().get_root_val())
print(r.get_left_child().get_root_val())
print(r.get_left_child().get_right_child().get_root_val())

a
hello
b
d

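The operations above produce a tree with a at the root, b as its left child (with d as the right child of b), and hello as its right child (the figure is omitted here).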
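To inspect a linked tree without a drawing, it can be converted back to the nested-list form of section 2.1. This is only a sketch: `Node` is a hypothetical stand-in that uses the same attribute names as `BinaryTree`, and `to_nested_list` is a helper introduced here for illustration.

```python
class Node:
    """Hypothetical stand-in with the same attributes as BinaryTree above."""

    def __init__(self, key, left_child=None, right_child=None):
        self.key = key
        self.left_child = left_child
        self.right_child = right_child


def to_nested_list(node):
    """Convert a linked tree to the [root, left, right] form of section 2.1."""
    if node is None:
        return []
    return [node.key,
            to_nested_list(node.left_child),
            to_nested_list(node.right_child)]


# Same shape as the tree built above: a has children b and hello, b has right child d.
r = Node('a', Node('b', None, Node('d')), Node('hello'))
print(to_nested_list(r))
```

Because the helper only reads `key`, `left_child`, and `right_child`, it should work unchanged on a `BinaryTree` instance as well.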

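3. Tree applications: parse trees

Trees are used in natural language processing (machine translation, semantic understanding) to analyze the grammatical constituents of a sentence, so that each constituent can then be processed.

A syntactic parse tree identifies constituents such as:
subject, predicate, object; attributive, adverbial, complement

Syntax trees are also used when compiling programming languages:
lexical and syntax checking
generating target code from the syntax tree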
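Python exposes its own syntax tree through the standard `ast` module, which gives a quick way to see what a compiler-style syntax tree looks like. The sketch below is an illustration chosen for this article, not something the original post uses; it parses a small arithmetic expression and inspects the root of the resulting tree.

```python
import ast

# Parse an arithmetic expression into Python's abstract syntax tree.
tree = ast.parse("3 * (4 + 5)", mode="eval")

root = tree.body                # the top-level expression node
print(type(root).__name__)      # BinOp: a binary operation
print(type(root.op).__name__)   # Mult: the operator stored at the root
print(ast.dump(root.right))     # the subtree for (4 + 5)
```

The operator sits at the root and the operands sit in the subtrees, which is the same layout as the expression trees discussed in the next section.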

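4. Tree applications: expression parsing

A tree can represent an expression:

Leaf nodes store operands
Internal nodes store operators

For example, ((7 + 3) * (5 - 2)) is represented by a tree whose root is *, whose left subtree is + with the leaves 7 and 3, and whose right subtree is - with the leaves 5 and 2.

Because of the parentheses, 7 + 3 and 5 - 2 must be computed before *.
The level of a subexpression in the tree determines its evaluation order:
the deeper the subexpression, the earlier it is evaluated.

Every subtree represents a subexpression.
Replacing a subtree with a node that holds the value of its subexpression evaluates that part of the expression.
For example, the left subtree 7 + 3 can be replaced by a single leaf 10 as the left child of the root.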
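The substitution step can be sketched with the nested-list form from section 2.1. The names `OPS` and `collapse` are assumptions made for this example; `collapse` only handles an operator node whose two children are already leaves.

```python
import operator

OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

# ((7 + 3) * (5 - 2)) in the [root, left, right] form of section 2.1
expr = ['*',
        ['+', [7, [], []], [3, [], []]],
        ['-', [5, [], []], [2, [], []]]]


def collapse(subtree):
    """Replace an operator node whose children are both leaves by a value leaf."""
    op, left, right = subtree
    return [OPS[op](left[0], right[0]), [], []]


expr[1] = collapse(expr[1])   # 7 + 3 becomes the leaf 10
print(expr)
expr[2] = collapse(expr[2])   # 5 - 2 becomes the leaf 3
expr = collapse(expr)         # 10 * 3
print(expr[0])                # 30
```

Each call shrinks the tree by one operator node, so repeating it from the bottom up leaves a single leaf that holds the value of the whole expression.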

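Next, we use the tree structure to:

build a parse tree from a fully parenthesized expression
evaluate the expression using the parse tree
recover the original expression string from the parse tree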

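Example:

Split the fully parenthesized expression into a list of tokens.
The tokens fall into these classes:

parentheses "(" and ")"
operators "+ - * /"
operands "0" to "9"

A left parenthesis starts a subexpression and a right parenthesis ends it.

For example, the fully parenthesized expression (3 + (4 * 5)) is split into the token list:

['(', '3', '+', '(', '4', '*', '5', ')', ')']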
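A token list like this can be produced with a regular expression. The following is a minimal sketch; `tokenize` is a hypothetical helper, and it keeps multi-digit integers as one token, which goes slightly beyond the single-digit operands described above.

```python
import re


def tokenize(expression):
    """Split a fully parenthesized expression into tokens.

    Runs of digits form one operand token; parentheses and the four
    operators are single-character tokens. Whitespace is skipped.
    """
    return re.findall(r"\d+|[()+\-*/]", expression)


print(tokenize("(3 + (4 * 5))"))
# ['(', '3', '+', '(', '4', '*', '5', ')', ')']
```

Unlike splitting on whitespace, this approach does not require spaces between the tokens of the input string.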

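Building the parse tree, step by step

(3 + (4 * 5))

Create an empty tree; the current node is the root
Read '(': create a left child and descend to it
Read '3': set the current node to 3 and ascend to the parent
Read '+': set the current node to +, create a right child, and descend to it

Read '(': create a left child and descend to it
Read '4': set the current node to 4 and ascend to the parent
Read '*': set the current node to *, create a right child, and descend to it

Read '5': set the current node to 5 and ascend to the parent
Read ')': ascend to the parent
Read ')': ascend to the parent again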

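The parse tree is built as follows:
scan the tokens of the fully parenthesized expression from left to right and apply these rules

If the token is "(": add a new node as the left child of the current node, and descend to that new node
If the token is an operator "+, -, /, *": set the value of the current node to the operator, add a new node as its right child, and descend to that new node
If the token is an operand: set the value of the current node to the number, and ascend to the parent
If the token is ")": ascend to the parent

For the fully parenthesized expression (3 + (4 * 5)), the tree is built by applying these rules token by token (the step-by-step figure is omitted here).

The key to building the tree is keeping track of the current node:

insert_left/insert_right create the left and right subtrees
set_root_val sets the value of the current node
get_left_child/get_right_child descend into a subtree
However, no method supports ascending to the parent!

We can use a stack to keep track of the parent nodes.
Whenever the current node descends, push the node it descends from onto the stack.
Whenever the current node needs to ascend, pop the stack and move to the popped node.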

# A tree implemented with nodes and references
class Tree:
    def __init__(self, root_obj):
        self.key = root_obj
        self.left_child = None
        self.right_child = None

    def insert_left(self, new_node):
        if self.left_child is None:
            self.left_child = Tree(new_node)
        else:
            temp_tree = Tree(new_node)
            temp_tree.left_child = self.left_child
            self.left_child = temp_tree

    def insert_right(self, new_node):
        if self.right_child is None:
            self.right_child = Tree(new_node)
        else:
            temp_tree = Tree(new_node)
            temp_tree.right_child = self.right_child
            self.right_child = temp_tree

    def get_root_val(self):
        return self.key

    def set_root_val(self, new_node):
        self.key = new_node

    def get_left_child(self):
        return self.left_child

    def get_right_child(self):
        return self.right_child


# Build the expression parse tree
def build_parse_tree(expression):
    # Split the expression into tokens (tokens must be separated by spaces)
    expression_list = expression.split()
    # Stack of parent nodes, used to move back up the tree
    father_node_stack = []

    parse_tree = Tree('')
    father_node_stack.append(parse_tree)
    cur_node = parse_tree

    for token in expression_list:
        if token == '(':
            cur_node.insert_left('')
            father_node_stack.append(cur_node)
            # Descend to the left child
            cur_node = cur_node.get_left_child()

        elif token in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']:
            cur_node.set_root_val(token)
            # Ascend to the parent
            cur_node = father_node_stack.pop()

        elif token in ['+', '-', '*', '/']:
            cur_node.set_root_val(token)
            cur_node.insert_right('')
            father_node_stack.append(cur_node)
            # Descend to the right child
            cur_node = cur_node.get_right_child()

        elif token == ')':
            # Ascend to the parent; an empty list is falsy, so this guards the pop
            if father_node_stack:
                cur_node = father_node_stack.pop()

    return parse_tree


ex_tree = build_parse_tree('( 3 * ( 4 + 5 ) )')
print(ex_tree.get_root_val())
print(ex_tree.get_left_child().get_root_val())
print(ex_tree.get_right_child().get_root_val())
print(ex_tree.get_right_child().get_left_child().get_root_val())
print(ex_tree.get_right_child().get_right_child().get_root_val())

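5. Evaluating with the expression parse tree

The parse tree can be used to evaluate a fully parenthesized expression. A binary tree is a recursive data structure, so it can be processed with a recursive algorithm.

The recursive function evaluate:
following the earlier description of subexpressions, evaluation starts from the subtrees at the bottom and works upward until the value of the whole expression is obtained.

The three elements of recursion in evaluate:

Base case: a leaf is the simplest subtree; it has no children, and its data item is the value of that subexpression
Reducing the problem: split the expression tree into its left and right subtrees
Calling itself: call evaluate on the left and right subtrees, then combine the two values with the operator stored in the root to get the value of the expression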

import operator


def evaluate(parseTree):
    operators = {'+': operator.add,
                 '-': operator.sub,
                 '*': operator.mul,
                 '/': operator.truediv}
    # Reduce the problem to the two subtrees
    leftC = parseTree.get_left_child()  # left subtree: the left subexpression
    rightC = parseTree.get_right_child()  # right subtree: the right subexpression

    if leftC and rightC:
        fn = operators[parseTree.get_root_val()]  # the root of each subtree stores an operator
        return fn(evaluate(leftC), evaluate(rightC))  # recursive calls
    else:
        return int(parseTree.get_root_val())  # base case: a leaf returns its value directly


the_result = evaluate(ex_tree)
print("The result of the expression: ", the_result)


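6. Tree traversal

The example figure comes from the article 一文搞懂二叉树的前序遍历，中序遍历，后序遍历 ("Understanding preorder, inorder, and postorder traversal of a binary tree"). It shows a tree with root 4, whose left child 2 has the children 1 and 3, and whose right child 6 has the children 5 and 7.

6.1 Preorder traversal (preorder)

Order: root -> left subtree -> right subtree

For the tree in the figure: 4->2->1->3->6->5->7

Python code:

def preorder(tree):
    if tree:
        print(tree.get_root_val())
        preorder(tree.get_left_child())
        preorder(tree.get_right_child())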

6.2 Inorder traversal (inorder)

Order: left subtree -> root -> right subtree

For the tree in the figure: 1->2->3->4->5->6->7

Python code:

def inorder(tree):
    if tree is not None:
        inorder(tree.get_left_child())
        print(tree.get_root_val())
        inorder(tree.get_right_child())

6.3 Postorder traversal (postorder)

Order: left subtree -> right subtree -> root

For the tree in the figure: 1->3->2->5->7->6->4

Python code:

def postorder(tree):
    if tree is not None:
        postorder(tree.get_left_child())
        postorder(tree.get_right_child())
        print(tree.get_root_val())
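The three orders can be checked on a tree whose preorder and inorder sequences match those listed above. This is a minimal sketch: `Node` is a hypothetical minimal class whose attribute names follow the `Tree` class from section 4, and the functions return lists instead of printing so that the results are easy to compare.

```python
class Node:
    """Hypothetical minimal node; attribute names follow the Tree class above."""

    def __init__(self, key, left_child=None, right_child=None):
        self.key = key
        self.left_child = left_child
        self.right_child = right_child


def preorder_list(node):
    """root -> left subtree -> right subtree"""
    if node is None:
        return []
    return ([node.key]
            + preorder_list(node.left_child)
            + preorder_list(node.right_child))


def inorder_list(node):
    """left subtree -> root -> right subtree"""
    if node is None:
        return []
    return (inorder_list(node.left_child)
            + [node.key]
            + inorder_list(node.right_child))


def postorder_list(node):
    """left subtree -> right subtree -> root"""
    if node is None:
        return []
    return (postorder_list(node.left_child)
            + postorder_list(node.right_child)
            + [node.key])


# Root 4; left child 2 with children 1 and 3; right child 6 with children 5 and 7.
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(preorder_list(root))   # [4, 2, 1, 3, 6, 5, 7]
print(inorder_list(root))    # [1, 2, 3, 4, 5, 6, 7]
print(postorder_list(root))  # [1, 3, 2, 5, 7, 6, 4]
```

The three functions differ only in where `[node.key]` is placed relative to the two recursive calls.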

6.4 Preorder traversal can also be written as a method of the tree class

def preorder(self):
    print(self.key)
    if self.left_child:
        self.left_child.preorder()
    if self.right_child:
        self.right_child.preorder()

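6.5 Postorder traversal: expression evaluation

Looking back at section 5, evaluating an expression parse tree is itself a postorder traversal: both subtrees are evaluated before the operator at the root is applied.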

import operator


def post_order_evaluate(tree):
    opers = {
        "+": operator.add,
        "-": operator.sub,
        "*": operator.mul,
        "/": operator.truediv
    }

    res1 = None
    res2 = None

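    if tree:
        res1 = post_order_evaluate(tree.left_child)
        res2 = post_order_evaluate(tree.right_child)
        # Compare with None so that a subresult of 0 is not mistaken for a missing child
        if res1 is not None and res2 is not None:
            return opers[tree.get_root_val()](res1, res2)
        else:
            # Leaf: convert the stored token to a number
            return int(tree.get_root_val())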

6.6 Inorder traversal: rebuilding the fully parenthesized infix expression

def print_exp(tree):
    the_exp = ""
    if tree:
        if tree.get_left_child():
            the_exp = '(' + print_exp(tree.get_left_child())
        else:
            the_exp = print_exp(tree.get_left_child())
        the_exp = the_exp + str(tree.get_root_val())
        if tree.get_right_child():
            the_exp = the_exp + print_exp(tree.get_right_child()) + ')'
        else:
            the_exp = the_exp + print_exp(tree.get_right_child())
    return the_exp

print(print_exp(ex_tree))  # continues from section 5; the result is (3*(4+5))