Data Structures Made Easy (3): Python Implementation: Tree Structures

冷风的云

Published 2020-05-16 20:55:44


Article link: https://blog.csdn.net/weixin_42796152/article/details/106164993


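Concept

A tree is a finite set of n (n ≥ 0) nodes. When n = 0 it is the empty tree. Any non-empty tree has exactly one root node; when n > 1, the remaining nodes can be partitioned into m (m > 0) mutually disjoint finite sets, each of which is itself a tree and is called a subtree of the root.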

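Terminology

Edge / branch

A tree is made up of nodes and edges. An edge is the line connecting a parent node to one of its child nodes. Looking at the tree from the top down, that edge is an outgoing edge of the upper node and an incoming edge of the lower node.

Parent node

A node is the parent of every node reached by its outgoing edges. Every node except the root has exactly one parent.

Child node

The nodes whose incoming edges all come from the same node are called that node's children.

Sibling nodes

Nodes that share the same parent are siblings.

Leaf node

A node with no children is a leaf.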

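Depth / level

The depth (or level) of a node is the length of the path connecting it to the root. The root has depth 0.

Height

The height of a tree is the length of its longest root-to-leaf path, that is, the largest level number among its leaves. A tree that consists of only the root has height 0.

Path

A sequence of nodes joined one after another by edges; it can be viewed as an ordered list.

Descendants

A node's children, their children, and so on.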
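
To make the terms above concrete, here is a minimal sketch that stores a small tree as a dictionary mapping each node to its list of children. The tree shape, the `CHILDREN` name, and all helper functions are illustrative assumptions and do not come from the article.

```python
# Hypothetical example tree (illustrative only):
#         A
#       / | \
#      B  C  D
#     / \    |
#    E   F   G
CHILDREN = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": [],
    "D": ["G"],
    "E": [],
    "F": [],
    "G": [],
}

def parent_of(node):
    """Return the parent of node, or None for the root."""
    for parent, kids in CHILDREN.items():
        if node in kids:
            return parent
    return None

def depth(node):
    """Number of edges on the path from the root down to node."""
    d = 0
    while parent_of(node) is not None:
        node = parent_of(node)
        d += 1
    return d

def height(node):
    """Number of edges on the longest downward path from node to a leaf."""
    kids = CHILDREN[node]
    if not kids:
        return 0
    return 1 + max(height(k) for k in kids)

def descendants(node):
    """All children, grandchildren, and so on, of node."""
    result = []
    for k in CHILDREN[node]:
        result.append(k)
        result.extend(descendants(k))
    return result

leaves = [n for n, kids in CHILDREN.items() if not kids]
siblings_of_e = [n for n in CHILDREN[parent_of("E")] if n != "E"]

print(depth("A"), depth("G"))   # 0 2
print(height("A"))              # 2
print(leaves)                   # ['C', 'E', 'F', 'G']
print(siblings_of_e)            # ['F']
print(descendants("B"))         # ['E', 'F']
```

Looking up a parent by scanning the whole dictionary is slow on large trees; it is used here only to keep the sketch short.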

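Characteristics of trees

A tree is made up of nodes and edges
- Each node holds one data element plus some number of branches pointing to its subtrees. The number of subtrees a node has is its degree; a node of degree 0 is a leaf node, and a node of nonzero degree is a branch node, also called an internal node
- Each edge connects two nodes and expresses a relationship between them
A tree is a hierarchical data structure. As a rule, the closer a level is to the top, the more general it is (it covers a broader range of data); lower levels are increasingly specific
The children of different nodes in the same tree never overlap
Every leaf is reached from the root by exactly one path (a leaf is a node with no child branches)
Starting from the root, every node connected directly below a node is called a child of that node
The biggest difference between linear structures and trees is that a linear structure is one-to-one, while a tree is one-to-many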
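
As a small illustration of degree, leaf nodes, and internal nodes, the sketch below again assumes a tree stored as a dictionary of child lists. The node names are made up for the example.

```python
# Hypothetical tree stored as {node: [children]}; the names are illustrative only.
tree = {
    "root": ["a", "b", "c"],
    "a": ["a1", "a2"],
    "b": [],
    "c": ["c1"],
    "a1": [],
    "a2": [],
    "c1": [],
}

# The degree of a node is the number of subtrees (children) it has.
degree = {node: len(children) for node, children in tree.items()}
leaf_nodes = sorted(n for n, d in degree.items() if d == 0)
internal_nodes = sorted(n for n, d in degree.items() if d > 0)
edge_count = sum(degree.values())

print(degree["root"])                # 3
print(leaf_nodes)                    # ['a1', 'a2', 'b', 'c1']
print(internal_nodes)                # ['a', 'c', 'root']
print(edge_count == len(tree) - 1)   # True: a tree with n nodes has n - 1 edges
```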

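Real-world examples

File systems
- Directory hierarchies on Linux and Windows are designed as trees
The Domain Name System
XML/HTML document structure

<!-- html contains two main blocks, head and body; body in turn contains other elements, so the document forms a tree -->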
<!DOCTYPE html>
<html>
<head>
	<title></title>
</head>
<body>
	<h1>hello,world</h1>
</body>
</html>
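
One way to see this tree shape in code is to parse the markup and walk it recursively. The sketch below uses the standard-library `xml.etree.ElementTree` module and assumes the document is well-formed XML, so the DOCTYPE line is left out; the `outline` helper is a made-up name for this example.

```python
import xml.etree.ElementTree as ET

# The same document as above, minus the DOCTYPE line, so that it parses as plain XML.
DOC = """
<html>
<head>
    <title></title>
</head>
<body>
    <h1>hello,world</h1>
</body>
</html>
"""

def outline(element, depth=0):
    """Return (depth, tag) pairs for element and all of its descendants, in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(DOC.strip())
for depth, tag in outline(root):
    # Indent each tag by its depth to make the hierarchy visible
    print("    " * depth + tag)
```

Real-world HTML is often not well-formed XML, so a dedicated HTML parser would be needed for arbitrary pages.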

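Binary trees

Concept

A tree in which every node has at most two subtrees is called a binary tree. More precisely, a binary tree is a finite set of n nodes that is either empty (the empty binary tree) or consists of a root node together with two disjoint binary trees, called the left subtree and the right subtree of the root.

Characteristics

Every node has at most two subtrees, so a binary tree contains no node of degree greater than 2. A node may have no subtree or only one.
The left and right subtrees are ordered and cannot be swapped.
Even when a node has only one subtree, it must still be identified as a left or a right subtree.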

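Special types of binary trees

Balanced binary tree

Every level except the last is completely filled. (Many references instead use the height-balanced definition: at every node, the heights of the left and right subtrees differ by at most 1.)

Full binary tree

Every branch node has both a left and a right subtree, and all leaves are on the same level. (English-language texts often call this a perfect binary tree.)
Among binary trees of the same depth, a full binary tree has the most nodes.

Complete binary tree

Every level except possibly the last is completely filled, and the nodes on the last level (leaves) are filled in from left to right.
A complete binary tree can be stored level by level in a linear list. If a node is at index p, its left child is at index 2p, its right child at 2p+1, and its parent at p//2, with the root at index 1.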
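
The index arithmetic can be tried out directly. The following is a minimal sketch assuming a six-node complete binary tree stored in a list with an unused slot at index 0; the node labels and helper names are illustrative.

```python
# Hypothetical complete binary tree stored level by level.
# Index 0 is an unused placeholder so that the root sits at index 1.
nodes = [None, "A", "B", "C", "D", "E", "F"]

def left(p):
    return 2 * p

def right(p):
    return 2 * p + 1

def parent(p):
    return p // 2

def value_at(index):
    """Return the node stored at index, or None if that position is empty."""
    return nodes[index] if 1 <= index < len(nodes) else None

print(value_at(left(2)), value_at(right(2)))   # D E
print(value_at(parent(6)))                     # C
print(value_at(right(3)))                      # None (C has only a left child)
```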

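Differences between full and complete binary trees

Every full binary tree is a complete binary tree, but not every complete binary tree is full; the full binary tree is a special case
In a complete binary tree, leaves can appear only on the bottom two levels
The leaves on the bottom level occupy contiguous positions at the left
Any leaves on the second-to-last level occupy contiguous positions at the right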
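
One way to check these two properties, sketched here under the assumption that a tree is described by the set of 1-based level-order positions its nodes occupy (root at 1, children of p at 2p and 2p+1). The function names are hypothetical.

```python
def is_complete(positions):
    """True if the occupied positions are exactly 1..n, i.e. there are no gaps."""
    n = len(positions)
    return positions == set(range(1, n + 1))

def is_full(positions):
    """True if the tree is complete and every level, including the last, is filled."""
    n = len(positions)
    # n + 1 is a power of two exactly when n = 2^k - 1 for some k >= 1
    return n > 0 and is_complete(positions) and ((n + 1) & n) == 0

seven_nodes = {1, 2, 3, 4, 5, 6, 7}   # three completely filled levels
five_nodes = {1, 2, 3, 4, 5}          # last level filled from the left
gap_tree = {1, 2, 3, 4, 6}            # position 5 is missing

print(is_complete(seven_nodes), is_full(seven_nodes))   # True True
print(is_complete(five_nodes), is_full(five_nodes))     # True False
print(is_complete(gap_tree), is_full(gap_tree))         # False False
```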

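Properties of binary trees

Numbering levels from 1 (root on level 1), level k holds at most 2^(k-1) nodes (the maximum is reached in a full binary tree). Conversely, a full binary tree with N nodes has height h = log2(N+1) - 1, where height is counted in edges so that a lone root has height 0
A binary tree with k levels has at most 2^k - 1 nodes
For any binary tree, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2, then n0 = n2 + 1
A complete binary tree with n nodes has floor(log2 n) + 1 levels
In general, the more balanced a binary tree is, the faster insertion, lookup, and deletion are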
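
These counting properties can be checked on a small example. The sketch below assumes a binary tree written as nested tuples `(value, left, right)`, with `None` for a missing child; the particular tree and the helper name are arbitrary choices for illustration.

```python
# A binary tree as nested tuples (value, left, right); None marks a missing child.
tree = (1,
        (2, (4, None, None), (5, (7, None, None), None)),
        (3, None, (6, None, None)))

def degree_counts(node):
    """Return (n0, n1, n2): how many nodes have 0, 1, and 2 children."""
    if node is None:
        return (0, 0, 0)
    _, left, right = node
    degree = (left is not None) + (right is not None)
    left_counts = degree_counts(left)
    right_counts = degree_counts(right)
    totals = [left_counts[i] + right_counts[i] for i in range(3)]
    totals[degree] += 1
    return tuple(totals)

n0, n1, n2 = degree_counts(tree)
print(n0, n1, n2)        # 3 2 2
print(n0 == n2 + 1)      # True

# Maximum number of nodes in a binary tree with k levels: 2^k - 1
print([2 ** k - 1 for k in (1, 2, 3, 4)])              # [1, 3, 7, 15]

# Levels in a complete binary tree with n nodes: floor(log2 n) + 1,
# which for positive integers equals n.bit_length()
print([n.bit_length() for n in (1, 2, 3, 4, 7, 8)])    # [1, 2, 2, 3, 3, 4]
```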

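Implementing a binary tree (in Python)

Nested-list approach

# A binary tree is a logical structure; physically it can be stored in arrays or in linked nodes.
# This version uses arrays (nested lists). It is not the recommended representation, although it has some merits of its own.
def binary_tree(r):
    return [r, [], []]

# Insert a new left child; any existing left subtree becomes the left subtree of the new node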
def insert_left(root, new_branch):
    t = root.pop(1)
    if len(t) > 1:
        root.insert(1, [new_branch, t, []])
    else:
        root.insert(1,[new_branch, [], []])
    return root
        
# Insert a new right child; any existing right subtree becomes the right subtree of the new node
def insert_right(root, new_branch):
    t = root.pop(2)
    if len(t) > 1:
        root.insert(2, [new_branch, [], t])
    else:
        root.insert(2,[new_branch, [], []])
    return root

def get_root(root):
    return root[0]

def set_root(root, data):
    root[0] = data

def get_left_child(root):
    return root[1]

def get_right_child(root):
    return root[2]


r = binary_tree(3)
insert_left(r,4)
insert_left(r,5)
insert_right(r,6)
insert_right(r,7)
l = get_left_child(r)
print(l)  # [5, [4, [], []], []]

Linked-node approach

# A binary tree built from linked nodes
class BinaryTree(object):
    def __init__(self, root):
        self.key = root
        self.left_child = None
        self.right_child = None
        
    def insert_left(self, data):
        if self.left_child is None:
            self.left_child = BinaryTree(data)
        else:
            # Push the existing left subtree down one level
            t = BinaryTree(data)
            t.left_child = self.left_child
            self.left_child = t
            
    def insert_right(self, data):
        if self.right_child is None:
            self.right_child = BinaryTree(data)
        else:
            # Push the existing right subtree down one level
            t = BinaryTree(data)
            t.right_child = self.right_child
            self.right_child = t 
            
    def get_right_child(self):
        return self.right_child
    
    def get_left_child(self):
        return self.left_child
    
    # Return the data stored in the current node
    def get_root(self):
        return self.key
    
    # Set the value of the current node
    def set_root(self, value):
        self.key = value

# Build a sample binary tree
tree = BinaryTree(0)
tree.insert_left(3)
tree.insert_left(1)
tree.get_left_child().insert_right(4)
tree.insert_right(6)
tree.insert_right(2)
# The deepest level of this tree is 2: the root (level 0) holds 0, level 1 holds 1 and 2, and level 2 holds 3, 4, and 6
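
As a quick check of the shape described in that last comment, the sketch below computes the node count and height recursively. To stay self-contained it uses a small stand-in `Node` class with the same attribute names as `BinaryTree`; `Node`, `size`, and `height` are names introduced only for this example.

```python
class Node:
    """Minimal stand-in for the BinaryTree class above (illustrative only)."""
    def __init__(self, key, left_child=None, right_child=None):
        self.key = key
        self.left_child = left_child
        self.right_child = right_child

def size(node):
    """Number of nodes in the subtree rooted at node."""
    if node is None:
        return 0
    return 1 + size(node.left_child) + size(node.right_child)

def height(node):
    """Edges on the longest root-to-leaf path; an empty tree gets -1 so a single node gets 0."""
    if node is None:
        return -1
    return 1 + max(height(node.left_child), height(node.right_child))

# Same shape as the sample tree: 0 at the root, 1 and 2 below it, then 3, 4, and 6
sample = Node(0, Node(1, Node(3), Node(4)), Node(2, None, Node(6)))

print(size(sample))     # 6
print(height(sample))   # 2
```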

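Binary tree traversal (explanation and code)

All traversals work on the same principle. Take inorder traversal (left, root, right): start at the leftmost node, then visit the root of that small subtree, then its right node. That left subtree is now finished; treat it as a unit, find the node whose left child it is, visit that node, and continue in the same way.
By the position at which the root is visited relative to its subtrees, there are four traversals: preorder, inorder, postorder, and level-order.
From a broader view, traversals are either depth-first (preorder, inorder, postorder) or breadth-first (level-order).
Unless noted otherwise, the traversal code below is recursive.

Preorder traversal

Put simply, preorder traversal visits the root first, then traverses the left subtree and the right subtree in the same way (root of the subtree first, then left, then right).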

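# Recursive implementation
def pre_order(tree):
    if tree:
        print(tree.get_root())
        pre_order(tree.get_left_child())
        pre_order(tree.get_right_child())
        
# Iterative implementation using an explicit stack (a plain Python list stands in for a Stack class)
def pre_order_iterative(tree_node):
    s = []
    while tree_node or s:
        # Visit nodes while walking down the left side, remembering each one
        while tree_node is not None:
            print(tree_node.get_root())
            s.append(tree_node)
            tree_node = tree_node.get_left_child()
        # Left side exhausted: back up one node and continue with its right subtree
        if s:
            tree_node = s.pop()
            tree_node = tree_node.get_right_child()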

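Inorder traversal

Traverse the left subtree first, then visit the root, and finally traverse the right subtree, applying the same rule at every subtree (when a subtree is finished, visit the node it hangs from, then go right).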

def in_order(tree):
    if tree != None:
        in_order(tree.get_left_child())
        print(tree.get_root())
        in_order(tree.get_right_child())

Postorder traversal

Traverse the left subtree first, then the right subtree, and visit the root last

def post_order(tree):
    if tree is not None:
        post_order(tree.get_left_child())
        post_order(tree.get_right_child())
        print(tree.get_root())
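
The three depth-first orders differ only in where the root is visited. As a side-by-side comparison, the sketch below collects the keys into lists instead of printing them. It assumes a stand-in `Node` class with the same attribute names as `BinaryTree` and rebuilds the sample tree by hand; all names here are illustrative.

```python
class Node:
    """Minimal stand-in for the BinaryTree class (illustrative only)."""
    def __init__(self, key, left_child=None, right_child=None):
        self.key = key
        self.left_child = left_child
        self.right_child = right_child

def pre_order_keys(node):
    if node is not None:
        yield node.key                                  # root first
        yield from pre_order_keys(node.left_child)
        yield from pre_order_keys(node.right_child)

def in_order_keys(node):
    if node is not None:
        yield from in_order_keys(node.left_child)
        yield node.key                                  # root in the middle
        yield from in_order_keys(node.right_child)

def post_order_keys(node):
    if node is not None:
        yield from post_order_keys(node.left_child)
        yield from post_order_keys(node.right_child)
        yield node.key                                  # root last

# Same shape as the sample tree: 0 at the root, 1 and 2 below it, then 3, 4, and 6
sample = Node(0, Node(1, Node(3), Node(4)), Node(2, None, Node(6)))

print(list(pre_order_keys(sample)))    # [0, 1, 3, 4, 2, 6]
print(list(in_order_keys(sample)))     # [3, 1, 4, 0, 2, 6]
print(list(post_order_keys(sample)))   # [3, 4, 1, 6, 2, 0]
```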

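Level-order traversal

Starting from the root, visit the nodes one level at a time, going from left to right within each level

# Uses a queue (collections.deque from the standard library stands in for a custom queue class)
from collections import deque

def level_order(root):
    q = deque([root])
    while q:
        current_root = q.popleft()
        print(current_root.get_root())
        if current_root.left_child is not None:
            q.append(current_root.left_child)
        if current_root.right_child is not None:
            q.append(current_root.right_child)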
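
A common variation groups the keys by level, which makes the breadth-first order easy to see. This is a minimal sketch assuming a stand-in `Node` class with the same attribute names as `BinaryTree`; the `levels` function is a name introduced for this example.

```python
from collections import deque

class Node:
    """Minimal stand-in for the BinaryTree class (illustrative only)."""
    def __init__(self, key, left_child=None, right_child=None):
        self.key = key
        self.left_child = left_child
        self.right_child = right_child

def levels(root):
    """Return a list of lists: the keys on level 0, level 1, and so on."""
    result = []
    if root is None:
        return result
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == len(result):
            # First node seen on this level
            result.append([])
        result[depth].append(node.key)
        for child in (node.left_child, node.right_child):
            if child is not None:
                queue.append((child, depth + 1))
    return result

# Same shape as the sample tree: 0 at the root, 1 and 2 below it, then 3, 4, and 6
sample = Node(0, Node(1, Node(3), Node(4)), Node(2, None, Node(6)))
print(levels(sample))   # [[0], [1, 2], [3, 4, 6]]
```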

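Common applications of binary trees

Heap

A heap has the heap-order property: the values along any path from the root down to a leaf are already in sorted order

Min-heap
- Every node's value is less than or equal to the values of its two children; the smallest item is at the root
Max-heap
- Every node's value is greater than or equal to the values of its two children; the largest item is at the root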

# Min-heap implementation (the Python standard library also ships the heapq module; this is a hand-written version)
class BinHeap(object):

    def __init__(self):
        # Index 0 holds a placeholder so the index arithmetic stays simple
        # (left_child_index = 2 * parent_index, right_child_index = 2 * parent_index + 1)
        self._heap = [0]
        self.current_size = 0

    # Add an item while keeping heap order
    def heappush(self, data):
        self._heap.append(data)
        self.current_size += 1
        # Work on a separate variable so current_size itself is not changed
        child_index = self.current_size
        while child_index // 2 > 0:
            if self._heap[child_index] < self._heap[child_index // 2]:
                self._heap[child_index], self._heap[child_index // 2] = self._heap[child_index // 2], self._heap[child_index]
            child_index = child_index // 2
    
    # Remove and return the smallest item in the heap
    def heappop(self):
        if self.current_size == 0:
            raise IndexError("pop from an empty heap")
        # Take out the minimum and move the last item into the vacated root position
        rm_data = self._heap[1]
        self._heap[1] = self._heap[self.current_size]
        self.current_size -= 1
        self._heap.pop()
        current_index = 1
        # Sift the moved item down until heap order is restored
        while current_index * 2 <= self.current_size:
            min_index = self.min_child(current_index)
            if self._heap[current_index] <= self._heap[min_index]:
                break
            self._heap[current_index], self._heap[min_index] = self._heap[min_index], self._heap[current_index]
            current_index = min_index
        return rm_data

    # Helper for heappop: return the index of the smaller child
    def min_child(self, current_index):
        # Only one child: because the heap is a complete binary tree, it must be the left child
        if 2 * current_index + 1 > self.current_size:
            return 2 * current_index
        if self._heap[2 * current_index] > self._heap[2 * current_index + 1]:
            return 2 * current_index + 1
        return 2 * current_index
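
For comparison, here is a short sketch of the same operations with the standard-library `heapq` module. Note that `heapq` works on an ordinary list with the root at index 0, so the children of index i are at 2i+1 and 2i+2, unlike the 1-based layout above. The sample data is arbitrary.

```python
import heapq

data = [9, 4, 7, 1, 8, 2]

# Build a min-heap by pushing items one at a time
heap = []
for x in data:
    heapq.heappush(heap, x)

smallest = heap[0]   # the minimum is always at index 0
print(smallest)      # 1

# Popping repeatedly yields the items in ascending order
drained = [heapq.heappop(heap) for _ in range(len(data))]
print(drained)       # [1, 2, 4, 7, 8, 9]

# heapq only provides a min-heap; a common trick for a max-heap is to store negated values
max_heap = [-x for x in data]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)
print(largest)       # 9
```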

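Binary search tree (BST)

Concept: for every node, all values in its left subtree are smaller than the node's value, and all values in its right subtree are larger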
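
The article gives no code for this structure, so the following is only a minimal sketch of insertion and lookup under that ordering rule. `BSTNode`, `bst_insert`, `bst_contains`, and `in_order_keys` are hypothetical names, and duplicate keys are simply ignored.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(node, key):
    """Insert key into the subtree rooted at node and return the subtree's root."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = bst_insert(node.left, key)
    elif key > node.key:
        node.right = bst_insert(node.right, key)
    # Equal keys are ignored in this sketch
    return node

def bst_contains(node, key):
    """Walk down from node, going left or right by comparison, until key is found or we fall off."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False

def in_order_keys(node):
    """Inorder traversal of a BST yields its keys in ascending order."""
    if node is None:
        return []
    return in_order_keys(node.left) + [node.key] + in_order_keys(node.right)

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = bst_insert(root, k)

print(in_order_keys(root))                               # [20, 30, 40, 50, 60, 70, 80]
print(bst_contains(root, 60), bst_contains(root, 65))    # True False
```

This sketch does no rebalancing, so inserting already-sorted keys degrades the tree into a linked list.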

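Expression trees (explanation and code)

Concept: represent an infix expression as a parse tree

Algorithm:

Convert the infix expression into a fully parenthesized expression, e.g. 3 + 5 * 3 - 2 -> ( ( 3 + ( 5 * 3 ) ) - 2 )
Create an empty tree; its root is the current node
Scan the fully parenthesized expression token by token from left to right
If the token is "(": add a new left child to the current node and descend to it
If the token is one of "+", "-", "*", "/": set the current node's value to that operator, add a new right child, and descend to it
If the token is an operand: set the current node's value to that number and move back up to the parent
If the token is ")": move back up to the parent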

# Build a parse tree from a fully parenthesized infix expression whose tokens are separated by spaces
def build_parse_tree(tokens):
    token_list = tokens.split()
    # The stack (a plain list) remembers which parent to return to
    s = []
    # Start with an empty tree
    root = BinaryTree("")
    s.append(root)
    current_root = root
    for i in token_list:
        if i == "(":
            current_root.insert_left("")
            s.append(current_root)
            current_root = current_root.get_left_child()
        elif i.isdigit():
            # Operand: store it and move back up to the parent
            current_root.set_root(int(i))
            parent = s.pop()
            current_root = parent
        elif i in ["+", "-", "*", "/"]:
            current_root.set_root(i)
            current_root.insert_right("")
            s.append(current_root)
            current_root = current_root.get_right_child()
        elif i == ")":
            current_root = s.pop()
        else:
            raise ValueError("unexpected token: " + i)
    return root
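
Once a parse tree exists, it can be evaluated with a postorder-style recursion: evaluate both subtrees, then apply the operator at the root. The sketch below is self-contained, so it builds the tree for ( ( 3 + ( 5 * 3 ) ) - 2 ) by hand using a hypothetical `ExprNode` class rather than the output of `build_parse_tree`.

```python
import operator

class ExprNode:
    """Illustrative expression-tree node: leaves hold numbers, inner nodes hold operator symbols."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node):
    """Evaluate both subtrees first, then combine them with the operator stored at the root."""
    if node.left is None and node.right is None:
        return node.value
    return OPS[node.value](evaluate(node.left), evaluate(node.right))

# Hand-built tree for ( ( 3 + ( 5 * 3 ) ) - 2 )
expr = ExprNode("-",
                ExprNode("+", ExprNode(3), ExprNode("*", ExprNode(5), ExprNode(3))),
                ExprNode(2))

print(evaluate(expr))   # 16
```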
