Mathematical Foundations of Machine Learning 4: N Versions of the Binary Tree Definition

Latest recommended article published on 2024-02-02 11:17:00

闵帆

Category: Mathematical Foundations of Computing. Tags: data structures, binary tree

Article link: https://blog.csdn.net/minfanphd/article/details/117120456



Defining a binary tree precisely is harder than it looks. This post describes in detail how to arrive, step by step, at a polished version.

1. Basic descriptive definition

In Data Structures textbooks, the binary tree is defined descriptively.
Definition 1. A set of nodes forms a binary tree if it is empty (no node) or:
a) Any node has at most one left child and one right child. A node is called the parent of its left or right child;
b) Any node except the root should have exactly one parent; and
c) There is exactly one node without a parent, which is called the root.

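Analysis:

The empty binary tree has no nodes at all;
Condition a) captures "binary" and introduces the left child, the right child, and the parent;
Condition b) rules out a node having two parents, which is intended to keep the structure acyclic. This is a basic requirement of a tree;
Condition c) says there is one and only one root. Note that the empty binary tree was already handled separately above. The condition is also intended to ensure connectivity: from any node one can follow parents all the way up to the root.
This definition is written entirely in words, without a single symbol, so it is easy to follow for readers who get a headache at the sight of mathematical notation.
My first draft read "A tree is a binary tree if …". That wording requires defining a tree first, but the children of a tree node are unordered, which raises a new problem. So I changed it to the current version.
The completeness principle for definitions. When writing this definition, draw many different binary trees on paper and argue whether the definition covers every case. It is the same idea as testing a program: leave no loophole. The same goes for any other definition, of course.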
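
The "test the definition like a program" idea can be tried out literally. The following is a minimal sketch, not part of the original post: it assumes a hypothetical encoding in which `left` and `right` are dictionaries mapping a parent to its child, and the function names are made up for illustration. The first function checks conditions a) to c) of Definition 1 exactly as written. The second adds a reachability check that Definition 1 does not state; it is included because, on a finite node set, a root plus two detached nodes that are each other's child appears to pass a) to c), which is the kind of case that drawing examples on paper is meant to catch.

```python
def satisfies_definition_1(nodes, left, right):
    """Check conditions a)-c) of Definition 1 literally.

    nodes: set of node labels; left/right: dicts mapping a parent to its
    left/right child (hypothetical encoding chosen for this sketch).
    """
    if not nodes:
        return True  # the empty binary tree
    # a) holds by construction: a dict stores at most one child per parent.
    parent_count = {v: 0 for v in nodes}
    for mapping in (left, right):
        for parent, child in mapping.items():
            if parent not in nodes or child not in nodes:
                return False
            parent_count[child] += 1
    roots = [v for v in nodes if parent_count[v] == 0]
    # c) exactly one node without a parent; b) every other node has exactly one.
    return len(roots) == 1 and all(
        parent_count[v] == 1 for v in nodes if v != roots[0]
    )


def all_reachable_from_root(nodes, left, right):
    """Extra check (not part of Definition 1): every node hangs below the root."""
    if not nodes:
        return True
    children = set(left.values()) | set(right.values())
    roots = [v for v in nodes if v not in children]
    if len(roots) != 1:
        return False
    seen, stack = set(), [roots[0]]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(m[v] for m in (left, right) if v in m)
    return seen == set(nodes)
```

For example, with nodes `{"r", "a", "b"}` and `left = {"a": "b", "b": "a"}`, the first check accepts the input while the second rejects it.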

2. Recursive descriptive definition

A recursive definition needs a base case and an inductive step.
Definition 2. A set of nodes forms a binary tree if it satisfies one of the following conditions:
a) It is empty;
b) It has only one node, which is called the root of the binary tree;
c) It has one left child, which is a root of another binary tree;
d) It has one right child, which is a root of another binary tree;
e) It has one left child and one right child, which are roots of binary trees, and the nodes of the left sub-tree and right sub-tree have no intersection.

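Analysis:

Conditions a) and b) are the base cases, and c) to e) are the inductive steps;
Five conditions are too many. One could drop c) and d), because an absent child can be explained by condition a) as an empty tree. But this introduces a new problem: an empty tree has no root.

To solve this problem, one extreme (read: elegant) approach is to refuse to admit the empty tree. Readers may ask: is that allowed? It is! When we write a definition, we are creating our own system; we write it however is convenient and however we like. As long as it has no internal contradiction, it is a sound system 😃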

Definition 3. A set of nodes forms a binary tree if it satisfies one of the following conditions:
a) It has only one node, which is called the root of the binary tree;
b) It has one left child and/or one right child, which are roots of binary trees, and the nodes of the left sub-tree and right sub-tree have no intersection.

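Analysis:

The minimality principle for definitions. Moving from Definition 2 to Definition 3 is all about minimality.
Definitions 2 and 3 use the notions of left subtree and right subtree directly, which is not rigorous. I have not yet worked out how to fix this; descriptive wording has its limits, which is exactly why people use mathematical notation.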
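
Definition 3 maps fairly directly onto a recursive data type. The sketch below is one possible reading, not the author's code: the `Node` class, its `label` field, and the function names are assumptions made for illustration. An absent child is modelled as `None` (there is no empty tree, in line with Definition 3), and the "no intersection" requirement is checked by refusing to visit any node object twice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def _visit(node, seen):
    """Return False as soon as some node object is reached a second time."""
    if id(node) in seen:
        return False  # shared between subtrees, or part of a cycle
    seen.add(id(node))
    for child in (node.left, node.right):
        if child is not None and not _visit(child, seen):
            return False
    return True


def is_binary_tree(root):
    """Base case: a single node. Inductive case: children are roots of
    binary trees whose node sets do not intersect."""
    return _visit(root, set())
```

A root whose left and right fields point to the same `Node` object is rejected, because the two subtrees would then intersect.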

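Since we have already defined (described) the binary tree recursively, can we go one step further and express it with symbols? After all, there is a successful precedent:
Definition 4 (Positive integer):
a) 1 is a positive integer;
b) if $n$ is a positive integer, then $n + 1$ is also a positive integer.

Let us try: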

Definition 5 (binary tree):
a) A single node $v$ is a binary tree with root $v$ ;
b) Let $T_i$ and $T_j$ be two binary trees with roots $v_i$ and $v_j$ , respectively. A binary tree with root $v$ can have $T_i$ as its left child and/or $T_j$ as its right child.

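This messy definition is painful to write. The root cause is that recursive definitions only suit simple sets and their elements. In particular, when a superset is already defined, defining one of its subsets is relatively easy.
Using recursion to define a structure this complex is like representing a complex number with a single floating-point number in a program, or trying to ride a bicycle to the moon: neither is feasible.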

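3. Definition via tuples

Tuples were made for defining models, and they suit the binary tree perfectly. A tuple definition corresponds to the notion of a "class" in object-oriented programming; a concrete tuple is then an instance, that is, an "object".
Let us analyze the aspects a binary tree involves:
a) The set of nodes, including the root.
b) The root. As stated above, our binary tree cannot be empty, so it must have a root.
c) The relations between nodes: left child and right child.
We first write a simple version: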

Definition 6: A binary tree is a quadruple $(\mathbf{V}, r, \mathbf{L}, \mathbf{R})$, where
a) $\mathbf{V}$ is the set of nodes;
b) $r \in \mathbf{V}$ is the root;
c) $\mathbf{L} \subseteq \mathbf{V} \times \mathbf{V}$ is the left child relation, where $\langle v_i, v_j \rangle \in \mathbf{L}$ indicates that $v_i$ is the left child of $v_j$;
d) $\mathbf{R} \subseteq \mathbf{V} \times \mathbf{V}$ is the right child relation, where $\langle v_i, v_j \rangle \in \mathbf{R}$ indicates that $v_i$ is the right child of $v_j$.

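Advantage: this definition spells out all the ingredients of a binary tree. In particular, although $r \in \mathbf{V}$, $r$ must be specified separately in the tuple.
Disadvantage: it does not express the constraint that the left and right subtrees of a binary tree are disjoint.

To express the "left and right subtrees are disjoint" constraint, we first need to express the node sets of the left and right subtrees. This involves transitivity: "the left child of the left child" belongs to the left subtree's node set, and so does "the right child of the left child". How can all cases be covered?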
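
In a program, the transitivity question has a routine answer: follow the child relations downward until nothing new turns up. The sketch below illustrates that idea on the quadruple $(\mathbf{V}, r, \mathbf{L}, \mathbf{R})$; it is an assumption-laden illustration rather than the author's construction. The `BinaryTree` class and both function names are hypothetical, and pairs are stored as `(child, parent)` to match the convention that $\langle v_i, v_j \rangle$ means $v_i$ is the child of $v_j$.

```python
from typing import NamedTuple


class BinaryTree(NamedTuple):
    V: frozenset  # set of nodes
    r: str        # root, an element of V
    L: frozenset  # pairs (child, parent): child is the left child of parent
    R: frozenset  # pairs (child, parent): child is the right child of parent


def subtree_nodes(tree, top):
    """All nodes reachable from `top` by following L or R downward, `top` included."""
    children = {}
    for child, parent in tree.L | tree.R:
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [top]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(children.get(v, []))
    return seen


def left_right_disjoint(tree, v):
    """Check that the left and right subtrees hanging off node v share no node."""
    left_set, right_set = set(), set()
    for child, parent in tree.L:
        if parent == v:
            left_set |= subtree_nodes(tree, child)
    for child, parent in tree.R:
        if parent == v:
            right_set |= subtree_nodes(tree, child)
    return left_set.isdisjoint(right_set)
```

This computes the subtree node sets procedurally; it does not by itself give a compact symbolic statement of the constraint inside the definition.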

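4. Definition via automata

We bring in something fancier: the finite-state automaton. A binary tree matches several features of a deterministic finite automaton:
a) It has a number of nodes, which serve as the states;
b) It has a start state, namely the root;
c) Having no accepting (final) state does not matter, since we are not recognizing a language here;
d) It has an alphabet, namely {l, r};
e) It has state transitions: from a node, reading the letter l moves to the left subtree, and reading r moves to the right subtree.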

Definition 7: Let $\Sigma = \{\mathrm{l}, \mathrm{r}\}$ be the alphabet and $\phi$ be a null node. A binary tree is a triple $(\mathbf{V}, r, c)$, where
a) $\mathbf{V} = \{v_1, v_2, \dots, v_n\}$ is the set of nodes;
b) $r \in \mathbf{V}$ is the root; and
c) $c: (\mathbf{V} \cup \{\phi\}) \times \Sigma^* \rightarrow \mathbf{V} \cup \{\phi\}$ is the mapping function satisfying: $\forall v \in \mathbf{V}$, $\exists! s \in \Sigma^*$ s.t. $c(r, s) = v$.

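Analysis:
a) The function $c$ describes the relations between nodes by processing strings;
b) The empty string is supported, so the path from $r$ to itself is also unique;
c) The existence of a path guarantees that the nodes are connected;
d) The uniqueness of the path prevents cycles;
e) There are infinitely many paths from any node to the null node; condition c) excludes $\phi$ when it requires uniqueness, because it quantifies only over $\mathbf{V}$;
f) The node $\phi$ is a black hole: its signature move is to swallow every character that comes after it.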
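
To make the triple $(\mathbf{V}, r, c)$ concrete, here is a small sketch under stated assumptions. It is not taken from the original post: the one-step table `delta`, the constant `PHI`, and the function names are all hypothetical. The function `c` extends single-letter transitions to whole strings, with $\phi$ absorbing everything after it. The function `unique_paths` tests the uniqueness condition only for strings of length at most $|\mathbf{V}|$, which is a bounded check chosen to keep the example finite.

```python
from itertools import product

PHI = None  # stands in for the null node φ


def c(delta, v, s):
    """Follow string s from node v, where delta[(node, letter)] is one step."""
    for letter in s:
        if v is PHI:
            return PHI  # φ absorbs every remaining letter
        v = delta.get((v, letter), PHI)  # a missing transition leads to φ
    return v


def unique_paths(nodes, root, delta):
    """Bounded check: each node is the image of exactly one string over
    {l, r} of length at most len(nodes), starting from the root."""
    counts = {v: 0 for v in nodes}
    for length in range(len(nodes) + 1):
        for letters in product("lr", repeat=length):
            target = c(delta, root, "".join(letters))
            if target is not PHI:
                counts[target] += 1
    return all(count == 1 for count in counts.values())
```

With `delta = {("r", "l"): "a", ("r", "r"): "b", ("a", "r"): "d"}`, the string `"lr"` leads from `"r"` to `"d"`, the empty string leaves `"r"` in place, and `"rll"` falls into `PHI`. Adding the transition `("b", "l"): "d"` gives `"d"` a second path, so `unique_paths` then reports a violation.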