"Data Structures and Algorithm Analysis in C++" Reading Notes, Part 4

Chapter 4 Trees

This chapter discusses a data structure for which the average running time of most operations is O(logN): the binary search tree.
In the STL, std::set and std::map are built on a balanced variant of it (typically a red-black tree). In this chapter we will:
* See how trees are used to implement the file system of several popular operating systems.
* See how trees can be used to evaluate arithmetic expressions.
* Show how to use trees to support searching operations in O(logN) average time and how to refine these ideas to obtain O(logN) worst-case bounds. We will see how to implement these operations when the data are stored on the disk.
* Discuss and use the std::set and std::map classes.

4.1 Preliminaries

From the recursive definition of a tree, we find that a tree is a collection of N nodes, one of which is the root, and N - 1 edges. That there are N - 1 edges follows from the fact that each edge connects some node to its parent, and every node except the root has exactly one parent. (The N - 1 edge count can also be proved by mathematical induction.)

    A path from node n1 to nk is defined as a sequence of nodes n1, n2, ..., nk such that ni is the parent of ni+1 for 1 <= i < k.

    The length of this path is the number of edges on the path, namely, k-1. There is a path of length zero from every node to itself. Notice that in a tree there is exactly one path from the root to each node.

    For any node ni, the depth of ni is the length of the unique path from the root to ni. The root is at depth 0. 

    The height of ni is the length of the longest path from ni to a leaf. Thus all leaves are at height 0. The height of a tree is equal to the height of the root. For the tree in Figure 4.2, E is at depth 1 and height 2 (the longest path from E to a leaf ends at P or Q); F is at depth 1 (one edge from the root) and height 1 (its children K, L, and M are leaves); the height of the tree is 3 (the longest path from the root ends at a deepest leaf such as P or Q). The depth of a tree (not to be confused with the depth of the root, which is always 0) is the depth of its deepest leaf; this is always equal to the height of the tree.

    If there is a path from n1 to n2, then n1 is an ancestor of n2 and n2 is a descendant of n1. If n1 != n2, then n1 is a proper ancestor of n2 and n2 is a proper descendant of n1.

4.1.1 Implementation of Trees

template <typename Object>
struct TreeNode
{
    Object element;
    TreeNode *firstChild;   // first (leftmost) child, or nullptr for a leaf
    TreeNode *nextSibling;  // next child of the same parent, or nullptr
};
Keep the children of each node in a linked list of tree nodes: firstChild points to the first child of a node, and nextSibling links each child to the next child of the same parent.
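
To make the first-child/next-sibling layout concrete, here is a minimal sketch that walks the sibling list to count children and to compute the height defined in 4.1. CharNode, addChild, childCount, and height are hypothetical names chosen for this illustration, and the element type is fixed to char for brevity.

```cpp
#include <algorithm>
#include <cassert>

// Same first-child/next-sibling layout as TreeNode, fixed to char elements.
struct CharNode
{
    char element;
    CharNode *firstChild;   // first (leftmost) child, or nullptr for a leaf
    CharNode *nextSibling;  // next child of the same parent, or nullptr

    explicit CharNode(char e) : element{e}, firstChild{nullptr}, nextSibling{nullptr} {}
};

// Appends child to the end of parent's child list.
void addChild(CharNode &parent, CharNode &child)
{
    assert(child.nextSibling == nullptr);  // child must not already have a sibling link
    if (parent.firstChild == nullptr)
    {
        parent.firstChild = &child;
        return;
    }
    CharNode *last = parent.firstChild;
    while (last->nextSibling != nullptr)
        last = last->nextSibling;
    last->nextSibling = &child;
}

// Counts the children of node by walking its sibling list.
int childCount(const CharNode &node)
{
    int count = 0;
    for (const CharNode *c = node.firstChild; c != nullptr; c = c->nextSibling)
        ++count;
    return count;
}

// Height: number of edges on the longest path from node down to a leaf.
int height(const CharNode &node)
{
    int tallestChild = -1;  // stays -1 when node has no children, so a leaf gets 0
    for (const CharNode *c = node.firstChild; c != nullptr; c = c->nextSibling)
        tallestChild = std::max(tallestChild, height(*c));
    return tallestChild + 1;
}
```

A node with any number of children still needs only two pointers, which is the point of this representation; the cost is that reaching the k-th child takes k steps along the sibling list.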

4.1.2 Tree Traversals with an Application

One of the popular uses for trees is the directory structure in many common operating systems, including UNIX and DOS. Figure 4.5 shows a typical directory in the UNIX file system.

    Printing the names of all the files in a directory handles each directory before its contents. This traversal strategy is known as a preorder traversal. In a preorder traversal, work at a node is performed before (pre) its children are processed.

    Time complexity analysis (line numbers refer to the book's pseudocode for the listing routine): lines 1 and 2 are executed exactly once per node, and line 4 can be executed at most once for each child of each node. Since every node except the root is a child of exactly one node, the number of children is exactly one less than the number of nodes. Finally, the for loop iterates once per execution of line 4 plus once each time the loop ends. Thus, the total amount of work done by this traversal strategy is constant per node. If there are N file names to be output, then the running time is O(N).
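
A minimal sketch of such a preorder listing follows. It is not the book's pseudocode: FileNode and listAll are hypothetical names, the children are kept in a std::vector for brevity, and the output is collected in a string with four spaces of indentation per level (an arbitrary choice).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical directory entry: a plain file is a node with no children.
struct FileNode
{
    std::string name;
    std::vector<FileNode> children;  // empty for a plain file
};

// Preorder listing: the node's own name is emitted first (the "pre" work),
// then each child is listed recursively one level deeper.
void listAll(const FileNode &node, int depth, std::string &out)
{
    assert(depth >= 0);
    out += std::string(depth * 4, ' ');  // indent by nesting depth
    out += node.name;
    out += '\n';
    for (const FileNode &child : node.children)
        listAll(child, depth + 1, out);
}
```

Each node contributes one line of output and one loop iteration in its parent, which matches the constant-work-per-node argument above.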

    Another common method of traversing a tree is the postorder traversal. The work at a node is performed after (post) its children are evaluated. Figure 4.8 represents the same directory structure as before, with the numbers in parentheses representing the number of disk blocks taken up by each file.

    Since the directories are themselves files, they have sizes too. Suppose we would like to calculate the total number of blocks used by all the files in the tree. The most natural way to do this is to find the number of blocks contained in the subdirectories /usr/mark (30), /usr/alex (9), and /usr/bill (32). The total number of blocks is then the total in the subdirectories (71) plus the one block used by /usr, for a total of 72. See the pseudocode method size in Figure 4.9.

If the current object is not a directory, then size merely returns the number of blocks it uses. Otherwise, the number of blocks used by the directory itself is added to the number of blocks (recursively) found in all of its children. Figure 4.10 traces the size function (postorder).
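
The postorder size computation can be sketched as follows. This is an illustration under assumptions rather than the book's Figure 4.9: SizedNode and totalBlocks are hypothetical names, and children are stored in a std::vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical directory entry carrying its own block count.
struct SizedNode
{
    std::string name;
    int blocks;                       // blocks used by this file or directory itself
    std::vector<SizedNode> children;  // empty for a plain file
};

// Postorder size: every child subtree is totalled first, and only then is
// the node's own block count added (the "post" work).
int totalBlocks(const SizedNode &node)
{
    assert(node.blocks >= 0);
    int total = 0;
    for (const SizedNode &child : node.children)
        total += totalBlocks(child);
    return total + node.blocks;
}
```

If /usr is modelled as a 1-block directory whose three subdirectories are collapsed into leaves of 30, 9, and 32 blocks, totalBlocks returns 72, the total worked out above.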

4.2 Binary Trees

4.2.1 Implementation

A binary tree node can be declared as follows:

template <typename Object>
struct BinaryNode
{
    Object element;     // the data in the node
    BinaryNode *left;   // left child
    BinaryNode *right;  // right child
};
We do not explicitly draw nullptr links when referring to trees, because every binary tree with N nodes would require N + 1 nullptr links (this can be shown by mathematical induction).
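
The N + 1 count also follows from a direct counting argument: N nodes hold 2N child pointers, and exactly N - 1 of them point to a node (one per non-root node), leaving 2N - (N - 1) = N + 1 that are nullptr. A small sketch that checks this on a concrete tree is given here; Node, countNodes, and countNullLinks are hypothetical names used only for this illustration.

```cpp
// Hypothetical int-only binary tree node.
struct Node
{
    int element;
    Node *left;
    Node *right;
};

// Number of nodes in the subtree rooted at t.
int countNodes(const Node *t)
{
    return t == nullptr ? 0 : 1 + countNodes(t->left) + countNodes(t->right);
}

// Number of nullptr links in the subtree rooted at t. An empty subtree is
// itself one nullptr link (the pointer that would have led to it).
int countNullLinks(const Node *t)
{
    return t == nullptr ? 1 : countNullLinks(t->left) + countNullLinks(t->right);
}
```

For any tree, countNullLinks(root) should equal countNodes(root) + 1.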
Binary trees are used not only for searching but also in compiler design, as the following example shows.

4.2.2 An Example: Expression Trees

Figure 4.14 shows an example of an expression tree. The leaves of an expression tree are operands, such as constants or variable names, and the other nodes contain operators.
This particular tree happens to be binary, because all the operators are binary, and although this is the simplest case, it is possible for nodes to have more than two children. It is also possible for a node to have only one child, as is the case with the unary minus operator.
    We can evaluate an expression tree, T, by applying the operator at the root to the values obtained by recursively evaluating the left and right subtrees. In our example, the left subtree evaluates to a + (b * c) and the right subtree evaluates to ((d * e) + f) * g. The entire tree therefore represents (a + (b * c)) + (((d * e) + f) * g).
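
The recursive evaluation can be sketched as below, assuming numeric operands and only the four binary operators + - * /. ExprNode, leaf, binary, and evaluate are hypothetical names for this illustration; the nodes refer to each other by raw pointer and do not own their children.

```cpp
#include <cassert>

// Hypothetical expression-tree node for numeric operands.
struct ExprNode
{
    char op;                // '+', '-', '*' or '/'; unused for a leaf
    double value;           // operand value; meaningful only for a leaf
    const ExprNode *left;   // nullptr for a leaf
    const ExprNode *right;  // nullptr for a leaf
};

// Builds an operand node.
ExprNode leaf(double v) { return ExprNode{'\0', v, nullptr, nullptr}; }

// Builds an operator node. l and r must outlive the returned node,
// because only their addresses are stored.
ExprNode binary(char op, const ExprNode &l, const ExprNode &r)
{
    return ExprNode{op, 0.0, &l, &r};
}

// Applies the operator at the root to the recursively evaluated subtrees.
double evaluate(const ExprNode &n)
{
    if (n.left == nullptr && n.right == nullptr)
        return n.value;  // a leaf is its own value
    assert(n.left != nullptr && n.right != nullptr);  // binary operators only
    const double l = evaluate(*n.left);
    const double r = evaluate(*n.right);
    switch (n.op)
    {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    case '/': return l / r;
    }
    assert(false && "unknown operator");
    return 0.0;
}
```

Supporting unary minus would need a node with a single child, which this sketch deliberately leaves out.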
    An infix expression can be produced from the expression tree by recursively producing the parenthesized left expression, then printing the operator at the root, and finally producing the parenthesized right expression. This general strategy (left, node, right) is known as an inorder traversal.
    If we change the strategy to a postorder traversal (seen in Section 4.1 when calculating the size of a directory) and print the left subtree, then the right subtree, then the operator, we obtain the postfix expression a b c * + d e * f + g * +.
    A third traversal strategy is to print the operator first and then the left and right subtrees. The resulting expression, + + a * b c * + * d e f g, is in prefix notation, and the traversal strategy is a preorder traversal (seen in Section 4.1 when printing the structure of the directory tree).
    In general, prefix notation is obtained by a preorder traversal (node, left, right), postfix notation by a postorder traversal (left, right, node), and infix notation by an inorder traversal (left, node, right).
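
The three traversals differ only in where the node's own symbol is emitted relative to its subtrees. A minimal sketch, assuming single-character symbols and strictly binary operators; SymbolNode, prefix, infix, and postfix are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical expression-tree node holding a one-character symbol.
struct SymbolNode
{
    char symbol;
    const SymbolNode *left;   // nullptr for an operand
    const SymbolNode *right;  // nullptr for an operand
};

bool isLeaf(const SymbolNode &t) { return t.left == nullptr && t.right == nullptr; }

// Preorder (node, left, right) yields prefix notation.
std::string prefix(const SymbolNode &t)
{
    if (isLeaf(t)) return std::string(1, t.symbol);
    assert(t.left != nullptr && t.right != nullptr);
    return std::string(1, t.symbol) + " " + prefix(*t.left) + " " + prefix(*t.right);
}

// Inorder (left, node, right) yields fully parenthesized infix notation.
std::string infix(const SymbolNode &t)
{
    if (isLeaf(t)) return std::string(1, t.symbol);
    assert(t.left != nullptr && t.right != nullptr);
    return "(" + infix(*t.left) + " " + std::string(1, t.symbol) + " " + infix(*t.right) + ")";
}

// Postorder (left, right, node) yields postfix notation.
std::string postfix(const SymbolNode &t)
{
    if (isLeaf(t)) return std::string(1, t.symbol);
    assert(t.left != nullptr && t.right != nullptr);
    return postfix(*t.left) + " " + postfix(*t.right) + " " + std::string(1, t.symbol);
}
```

Applied to the tree for (a + (b * c)) + (((d * e) + f) * g), prefix and postfix reproduce the two expressions quoted above, and infix returns the same expression wrapped in one extra outer pair of parentheses.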

Constructing an Expression Tree
We now consider an algorithm that converts a postfix expression (the result of a postorder traversal) into an expression tree. Since we already have an algorithm to convert infix to postfix, we can generate expression trees from the two common types of input.
Sketch of the procedure:
1) We read the postfix expression one symbol at a time. If the symbol is an operand, we create a one-node tree and push a pointer to it onto a stack.
2) If the symbol is an operator, we pop two trees T1 and T2 from the stack (T1 is popped first) and form a new tree whose root is the operator and whose left and right children point to T2 and T1, respectively. A pointer to this new tree is then pushed onto the stack.
3) After all the operands and operators of the input postfix expression have been processed, we pop the single remaining tree from the stack; this is the required expression tree.
As an example:
input: a b + c d e + * *
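
The three steps above can be sketched as follows. This is a minimal illustration that assumes well-formed input with whitespace-separated tokens and treats only + - * / as operators; PostfixNode, isOperator, buildFromPostfix, and toInfix are hypothetical names, and std::unique_ptr is used so the tree frees itself.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <stack>
#include <string>
#include <utility>

// Hypothetical expression-tree node that owns its children.
struct PostfixNode
{
    std::string symbol;
    std::unique_ptr<PostfixNode> left;
    std::unique_ptr<PostfixNode> right;

    explicit PostfixNode(std::string s) : symbol(std::move(s)) {}
};

bool isOperator(const std::string &token)
{
    return token == "+" || token == "-" || token == "*" || token == "/";
}

// Builds an expression tree from a whitespace-separated postfix string.
std::unique_ptr<PostfixNode> buildFromPostfix(const std::string &postfix)
{
    std::stack<std::unique_ptr<PostfixNode>> trees;
    std::istringstream in(postfix);
    std::string token;
    while (in >> token)
    {
        std::unique_ptr<PostfixNode> node(new PostfixNode(token));
        if (isOperator(token))
        {
            assert(trees.size() >= 2);  // malformed input otherwise
            node->right = std::move(trees.top());  // T1, popped first
            trees.pop();
            node->left = std::move(trees.top());   // T2, popped second
            trees.pop();
        }
        trees.push(std::move(node));  // operands are pushed as one-node trees
    }
    assert(trees.size() == 1);  // exactly one tree remains for valid input
    std::unique_ptr<PostfixNode> result = std::move(trees.top());
    trees.pop();
    return result;
}

// Fully parenthesized infix form, used here only to inspect the result.
std::string toInfix(const PostfixNode &t)
{
    if (!t.left && !t.right) return t.symbol;
    return "(" + toInfix(*t.left) + " " + t.symbol + " " + toInfix(*t.right) + ")";
}
```

For the input a b + c d e + * *, the tree built this way reads back in infix as ((a + b) * (c * (d + e))).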

4.3 The Search Tree ADT--Binary Search Trees

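An important application of binary trees is their use in searching. Assume that each node in the tree stores an item. In our examples we will assume, for simplicity, that these are integers, although arbitrarily complex items are easily handled in C++. We also assume that all the items are distinct (duplicates are discussed later). What makes a binary tree a binary search tree is an ordering property: for every node X, all the items in the left subtree of X are smaller than the item in X, and all the items in its right subtree are larger.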

The binary search tree implementation requires a Comparable element type (one that supports operator<); a function object can supply the comparison instead (see Section 1.6.3).
The only data member is a pointer to the root node; this pointer is nullptr for empty trees.
The public functions, such as contains, insert, and remove, use the general technique of calling private recursive functions.
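
The public-calls-private-recursive technique can be sketched on a deliberately small tree. IntSearchTree is a hypothetical, int-only class written for this illustration; it covers only contains and insert, ignores duplicate insertions, and disables copying to stay short.

```cpp
// Hypothetical int-only search tree showing the public/private split.
class IntSearchTree
{
  public:
    IntSearchTree() = default;
    ~IntSearchTree() { makeEmpty(root); }
    IntSearchTree(const IntSearchTree &) = delete;
    IntSearchTree &operator=(const IntSearchTree &) = delete;

    // Public interface: each call starts the recursion at the root.
    bool contains(int x) const { return contains(x, root); }
    void insert(int x) { insert(x, root); }

  private:
    struct Node
    {
        int element;
        Node *left;
        Node *right;
    };

    Node *root = nullptr;  // nullptr for an empty tree

    // Smaller items are searched for on the left, larger ones on the right.
    bool contains(int x, const Node *t) const
    {
        if (t == nullptr) return false;
        if (x < t->element) return contains(x, t->left);
        if (t->element < x) return contains(x, t->right);
        return true;  // neither smaller nor larger: found
    }

    // t is a reference to the parent's link, so assigning to it attaches
    // the new node at the spot where the search fell off the tree.
    void insert(int x, Node *&t)
    {
        if (t == nullptr)
            t = new Node{x, nullptr, nullptr};
        else if (x < t->element)
            insert(x, t->left);
        else if (t->element < x)
            insert(x, t->right);
        // otherwise x is already present; nothing to do
    }

    // Frees the subtree in postorder and resets the link.
    void makeEmpty(Node *&t)
    {
        if (t != nullptr)
        {
            makeEmpty(t->left);
            makeEmpty(t->right);
            delete t;
            t = nullptr;
        }
    }
};
```

Only operator< is used for comparisons, which mirrors the Comparable requirement; callers never see a node pointer, because the root is supplied by the public wrappers.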

A binary search tree class skeleton:
#include <iostream>   // std::ostream, std::cout
#include <utility>    // std::move

template <typename Comparable>
class BinarySearchTree
{
  public:
    BinarySearchTree(const BinarySearchTree &rhs);
    BinarySearchTree(BinarySearchTree &&rhs);

    const Comparable &findMin() const;
    const Comparable &findMax() const;
    bool contains(const Comparable &x) const;
    bool isEmpty() const;
    void printTree(std::ostream &out = std::cout) const;

    void makeEmpty();
    void insert(const Comparable &x);
    void insert(Comparable &&x);
    void remove(const Comparable &x);

    BinarySearchTree &operator=(const BinarySearchTree &rhs);
    BinarySearchTree &operator=(BinarySearchTree &&rhs);

  private:
    struct BinaryNode
    {
        Comparable element;
        BinaryNode *left;
        BinaryNode *right;

        BinaryNode(const Comparable &theElement, BinaryNode *lt, BinaryNode *rt)
            : element{theElement}, left{lt}, right{rt} { }

        BinaryNode(Comparable &&theElement, BinaryNode *lt, BinaryNode *rt)
            : element{std::move(theElement)}, left{lt}, right{rt} { }
    };

    BinaryNode *root;   // pointer to the root node; nullptr for an empty tree

    // private member functions for the recursive calls go here
};
