【DSA_Fall2020】2. Trees (Templates in C)

Tree structures: basic concepts of trees, N-ary trees, binary trees, binary search trees, AVL trees (balanced trees), splay trees, B-trees, max/min heaps (priority queues), Huffman coding, general lists.

C implementation templates: N-ary tree, binary tree, binary search tree, AVL tree (balanced tree), max/min heap.
These are study notes compiled during my own learning, written in English to practice the language along the way. If you find any mistakes, please point them out. Discussion is welcome.

Please do not repost without permission.

Trees

Definition

(Recursive)

A tree is a collection of nodes. The collection can be empty; otherwise, a tree consists of
(1) a distinguished node $r$, called the root;
(2) zero or more nonempty (sub)trees $T_1, T_2, \dots, T_k$, each of whose roots is connected by a directed edge from $r$.

Basic Concepts

  • Degree of a node

    number of subtrees of the node.

  • Degree of a tree
    $\text{Degree}_{tree} = \max_{node_i \in tree}\{\text{Degree}_{node_i}\}$

  • Parent

  • Child

  • Sibling(s)

  • Leaf (Terminal node)

    node with 0 degree

  • Path from $n_1$ to $n_k$

    a (unique) sequence of nodes $n_1, n_2, \dots, n_k$ such that $n_i$ is the parent of $n_{i+1}$ for $1 \le i < k$.

  • Length of path

    number of edges on the path

  • Depth of $n_i$

    length of the unique path from the root to $n_i$. Depth(root) = 0.

  • Height of $n_i$

    length of the longest path from $n_i$ to a leaf. Height(leaf) = 0.

  • Height (Depth) of a tree

    height(root) = depth(deepest leaf).

  • Ancestor of a node

    all the nodes along the path from the node up to the root.

  • Descendants of a node

    all the nodes in its subtrees.

Average Search Time

The depth $D_i$ of the root node equals 0.

The search time of a node is defined as $T_i = D_i + 1$ (so the search time of the root is 1).

Then the average search time (AST) of a tree is defined as follows:
$$\text{AST} = \frac{1}{N}\sum_{node_i \in tree} T_i$$
image-20201113173439686
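
For instance, in a hypothetical three-node tree consisting of a root and its two children, the search times are $1$, $2$ and $2$, so
$$\text{AST} = \frac{1}{3}(1 + 2 + 2) = \frac{5}{3}.$$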

N-ary Tree

Implementation

image-20201103133014659
  • Properties

    typedef struct _tnode{
        ElementType val;
        struct _tnode *next_sibling;
        struct _tnode *first_child;
    } TNode, *Tree;
    
  • Constructor

    DO WE NEED A SENTINEL IN FIRST-CHILD & NEXT-SIBLING LINKED LISTS?

    – NO. If we did, we would need as many dummy (sentinel) nodes as there are leaves in the tree, which costs too much space just for coding convenience. (A linked list only needs one sentinel.)

    Tree create_tree(ElementType val){
       	Tree root = (Tree)malloc(sizeof(TNode));
        root->val = val;
        root->next_sibling = NULL;
        root->first_child = NULL;
        return root;
    }
    
  • Destructor

    void delete_tree(Tree root){
        // free every node after its subtrees, i.e., a postorder visit
        postorder_traversal(root, (void (*)(TNode*))free);
    }
    
  • Insertion (Building Tree)

    // Add a new node to the root as a child, return a pointer to the new node
    TNode* add_node(ElementType val, Tree root){
        if(root == NULL)
            return NULL;
        
        TNode *node = create_tree(val);
        // consider whether the root has child or not
        if(root->first_child){
            TNode *pos;
            for(pos = root->first_child; pos->next_sibling; pos = pos->next_sibling)
                ;
           	pos->next_sibling = node;
        }
        else{
            root->first_child = node;
        }
        return node;
    }
    
  • Removal
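
    A minimal removal sketch, assuming that "removal" here means detaching the subtree rooted at a given child from its parent and freeing it with delete_tree (remove_subtree is a hypothetical name, not part of the original template):

    void remove_subtree(Tree parent, TNode *target){
        if(parent == NULL || target == NULL)
            return ;
        
        // unlink target from the parent's child list
        if(parent->first_child == target){
            parent->first_child = target->next_sibling;
        }
        else{
            TNode *pos;
            for(pos = parent->first_child; pos && pos->next_sibling != target; pos = pos->next_sibling)
                ;
            if(pos == NULL) // target is not a child of parent
                return ;
            pos->next_sibling = target->next_sibling;
        }
        
        target->next_sibling = NULL; // isolate the subtree before freeing it
        delete_tree(target);
    }
    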

  • Preorder Traversal

    The same loop structure works for in/post-order traversal; only the position of the visit call changes.

    void preorder_traversal(Tree root, void (*visit)(TNode*)){
        if(root == NULL)
            return ;
        
        (*visit)(root); // any operation
        
        // recursion on every child of the root
        TNode *pos;
        for(pos = root->first_child; pos; pos = pos->next_sibling){
            preorder_traversal(pos, visit);
        }
    }
    
  • Level-order Traversal (BFS)

    // Pseudocode-style: assumes a FIFO queue ADT for TNode* pointers
    // (Queue, create_queue, enqueue, dequeue, is_empty, delete_queue are assumed, not part of this template).
    void bfs(Tree root, void (*visit)(TNode*)){
        if(root == NULL)
            return ;
        
        Queue q = create_queue();
        enqueue(q, root);
        
        TNode *pos, *node;
        while(!is_empty(q)){
            node = dequeue(q);
            for(pos = node->first_child; pos; pos = pos->next_sibling)
                enqueue(q, pos);
            (*visit)(node); // any operation
        }
        delete_queue(q);
    }
    

Binary Tree

Traversal

Pre/in/post/level-order traversals of a binary tree share the same routines as those of n-ary trees, except that they visit only the left and right subtrees instead of scanning all children linearly.

Height & Depth

If height/depth is stored and maintained in each node, get_height can be maintained with a postorder traversal, while get_depth can be maintained with a preorder traversal (a sketch of the postorder idea is given below).
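
A minimal sketch of the postorder idea, assuming heights are recomputed on the fly rather than cached in the nodes (compute_height is a hypothetical name, distinct from the get_height used by the AVL template later):

int compute_height(BiTree root){
    if(root == NULL)
        return -1;                       // the height of an empty tree is -1
    int hl = compute_height(root->left); // heights of both subtrees first ...
    int hr = compute_height(root->right);
    return (hl > hr ? hl : hr) + 1;      // ... then the node itself (plus one)
}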

Implementation

  • Properties

    typedef struct _tnode{
        ElementType val;
        struct _tnode *left, *right;
    } TNode, *BiTree;
    
  • Constructor

    Alternative: add an allocation error check; an example is shown in the comments.

    BiTree create_bitree(ElementType root_val){
        TNode *node = (TNode*)malloc(sizeof(TNode));
        
        // if(node == NULL){
        //     AllocationError();
        //     return NULL;
        // }
        
        node->val = root_val;
        node->left = NULL;
        node->right = NULL;
        return node;
    }
    
  • Destructor

    void delete_bitree(BiTree root){
        if(root == NULL)
           	return ;
       	delete_bitree(root->left);
        delete_bitree(root->right);
        free(root);
    }
    
  • Print Tree in 90 Degree Rotation

    #define INDENTATION 4
    
    void rotate_90(BiTree root, int depth){
    	if(root == NULL)
    		return ;
    	
    	rotate_90(root->right, depth+1);
    	
    	int i;
        int num_space = depth * INDENTATION;
    	for(i = 0; i < num_space; i++)
    		printf(" ");
    	
        PRINT(root->val); // print the node
    	
    	rotate_90(root->left, depth+1);
        
    	return ;
    }
    
    void print_bitree(BiTree root){
        // package interface (hide depth)
        
    	rotate_90(root, 0);
    	return ;
    }
    

Binary Search Tree (BST)

Max & Min

The leftmost node in a BST has the smallest value while the rightmost one has the largest value.

Removal

  • Delete a leaf node : Reset its parent link to NULL.
  • Delete a degree 1 node : Replace the node by its single child.
  • Delete a degree 2 node :
    • Replace the node by the largest one in its left subtree or the smallest one in its right subtree.
    • Delete the replacing node from the subtree.
image-20201106101653572

Implementation

  • Properties

    The members of the BST structure are the same as those of the vanilla binary tree.

    typedef BiTree BST;
    
  • Constructor / Destructor

    Same as binary tree’s.

    BST create_bst(ElementType val){
        return create_bitree(val);
    }
    
    void delete_bst(BST root){
        delete_bitree(root);
    }
    
  • Insertion

    Alternative: add an allocation error check; an example is shown in the comments.

    It's easy to implement an iterative version with a moving pointer (see the sketch after the recursive version below).

    Recursive Version

    // Return the root of the new tree after insertion
    // call: root = add_node(val, root);
    BST add_node(ElementType val, BST root){
        if(root == NULL){
            root = create_bst(val);
            // if(root == NULL)
            //     AllocationError();
        }
        else{
            if(val > root->val)
                root->right = add_node(val, root->right);
            else if (val < root->val)
                root->left = add_node(val, root->left);
            // do nothing if val == root->val
            else
                ;
        }
        return root;
    }
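
    A minimal iterative sketch with a moving pointer (add_node_iter is a hypothetical name; duplicates are ignored, as in the recursive version):

    TNode* add_node_iter(ElementType val, BST root){
        TNode *node = create_bst(val);
        if(root == NULL)
            return node;
        
        TNode *pos = root;
        while(1){
            if(val > pos->val){
                if(pos->right == NULL){ pos->right = node; break; }
                pos = pos->right;
            }
            else if(val < pos->val){
                if(pos->left == NULL){ pos->left = node; break; }
                pos = pos->left;
            }
            else{ // duplicate: discard the new node and keep the tree unchanged
                free(node);
                break;
            }
        }
        return root;
    }
    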
    
  • Finding Min & Max

    $d$ : depth of the binary tree

    Iterative Example

    time complexity: $O(d)$, space complexity: $O(1)$

    // Return null for empty tree (root == null).
    TNode* find_max(BST root){
        TNode *pos_max = root;
        while(pos_max && pos_max->right)
            pos_max = pos_max->right;
       	return pos_max;
    }
    

    Recursive Example

    time complexity: $O(d)$, space complexity: $O(d)$

    // Return null for empty tree (root == null).
    TNode* find_min(BST root){
        if(root == NULL)
            return NULL;
        
        return (root->left) ? find_min(root->left) : root;
    }
    
  • Removal

    Recursive Version

    The code shown is inefficient because it makes two passes down the right subtree, using find_min and then remove to delete the smallest node, each taking $O(d)$. It is easy to remove this inefficiency by writing a function that completes both operations in one pass down (a sketch follows the code below).

    Meanwhile, the recursive implementation takes $O(d)$ extra space, which you can also reduce by turning the recursion into iteration with a little more work.

    // Return the root of the new tree after removal.
    // If the target is not found, the tree is returned unchanged.
    // call: root = remove(target, root);
    BST remove(ElementType target, BST root)
    {
        if(root == NULL){
            // Error("Element not found");
            return NULL;
        }
        
        if(target < root->val)
            root->left = remove(target, root->left);
        else if(target > root->val)
            root->right = remove(target, root->right);
        else{
            // two children: replace with the smallest in the right subtree
            if(root->left && root->right){
                TNode *pos_min = find_min(root->right);
                root->val = pos_min->val;
                root->right = remove(root->val, root->right);
            }
            // one or zero child
            else{
                TNode *to_del = root;
                if(root->left == NULL) // also handles the 0-child case
                    root = root->right;
                else // root->right == NULL
                    root = root->left;
                free(to_del); // free the unlinked node, not the new subtree root
            }
        }
        
        return root;
    }
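
    A minimal sketch of the one-pass idea (detach_min is a hypothetical helper, not part of the original template): it unlinks and returns the minimum node of a subtree in a single descent, so the two-children case becomes "detach the min of the right subtree, copy its value, free it".

    // Unlink and return the minimum node of *rootp (NULL for an empty subtree).
    TNode* detach_min(BST *rootp){
        TNode *pos = *rootp;
        if(pos == NULL)
            return NULL;
        if(pos->left == NULL){      // pos is the minimum: splice it out
            *rootp = pos->right;
            return pos;
        }
        return detach_min(&pos->left);
    }
    
    // Inside remove, the two-children branch could then read:
    //     TNode *m = detach_min(&root->right);
    //     root->val = m->val;
    //     free(m);
    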
    

AVL Tree

An AVL tree is identical to a binary search tree, except that for every node in the tree, the height of the left and right subtrees can differ by at most 1.

  • The height of an empty tree is defined to be -1.

  • Height information is kept for each node.

image-20201104162106712

It is easy to show that the height of an AVL tree is at most roughly $1.44\log(n + 2) - 1.328$, but in practice it is about $\log(n + 1) + 0.25$ (although the latter claim has not been proven). (proof of the former one?)

Thus, all the tree operations can be performed in $O(\log N)$ time, except possibly insertion and deletion (lazy deletion will be $O(\log N)$).

Maintain Operation

Single Rotation
  • Maintaining the BST property: $X < k_1 < Y < k_2 < Z$

  • Fixing the AVL property: left tree $\leftrightarrow$ right tree

image-20201104163800502

If an insertion causes a node ($k_2$ in the left $\to$ right figure) in an AVL tree to lose the balance property, do a rotation at that node to fix it up.

The basic algorithm is to start at the inserted node and travel up the tree, updating the balance information at every node on the path. Do a rotation at the first bad node found to adjust its balance, and then stop travelling up. Essentially, what the rotation does is exchange subtrees so that the taller subtree of the imbalanced node moves up to the layer where that node itself is.

  1. RR Rotation (right $\to$ left)

    The rotation should be done after an RR insertion, i.e., the insertion of a trouble node into the right subtree of the right subtree of the node that discovers the imbalance.

    In the right figure, $k_1$ discovers the imbalance when a trouble node is inserted into $Z$ (the right-right subtree of $k_1$). The operation focuses on modifying the positions of $k_1, k_2$.

    image-20201110195406468
  2. LL Rotation

    The rotation should be done after an LL insertion, i.e., the insertion of a trouble node into the left subtree of the left subtree of the node that discovers the imbalance.

    In the left figure, $k_2$ discovers the imbalance when a trouble node is inserted into $X$ (the left-left subtree of $k_2$). The operation focuses on modifying the positions of $k_1, k_2$.

    image-20201110195447865
Double Rotation

The problem occurs when the new node is inserted into the subtree containing the middle elements ($Y$) while that subtree has the same height as the outer subtree ($Z$); a single rotation cannot fix this case.

Therefore we need a double rotation.

Assume $k_2$ loses balance:

image-20201104201009980
  1. RL Rotation

    The rotation should be done after an RL insertion, i.e., the insertion of a trouble node into the left subtree of the right subtree of the node that discovers the imbalance.

    In the right figure, $k_3$ discovers the imbalance when a trouble node is inserted into the subtree rooted at $k_2$ (the right-left subtree of $k_3$), on $B$ or $C$. The operation focuses on modifying the positions of $k_1, k_2, k_3$.

    image-20201110195548799
  2. LR Rotation

    The rotation should be done after an LR insertion, i.e., the insertion of a trouble node into the right subtree of the left subtree of the node that discovers the imbalance.

    In the right figure, $k_3$ discovers the imbalance when a trouble node is inserted into the subtree rooted at $k_2$ (the left-right subtree of $k_3$), on $B$ or $C$. The operation focuses on modifying the positions of $k_1, k_2, k_3$.

    image-20201110195515801

Implementation

NOTICE:

Whenever you update the height of a node, don't forget to PLUS ONE:
$$\text{updated height} = \max\{\text{subtree heights}\} + 1$$
OPTION:

You can choose to store either the height or the balance factor ($f_b = h_L - h_R$) of each node.

If you store the balance factor, you check $|f_b| = 2$ before doing a rotation; but you still need a get_height-style computation to calculate and update the balance factors (a small sketch follows this note).
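
A minimal sketch of the balance-factor option, assuming heights are still available via get_height (get_balance_factor is a hypothetical helper; a more refined version would update the factor incrementally instead of recomputing it):

int get_balance_factor(AVLTree root){
    // f_b = h_L - h_R; defined as 0 for an empty subtree
    return root ? get_height(root->left) - get_height(root->right) : 0;
}

// usage inside add_node: rotate when the factor reaches +2 or -2, e.g.
//     if(get_balance_factor(root) == -2){ /* rr or rl rotation */ }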

  • Properties

    typedef struct _tnode{
        ElementType val;
        int height;
        struct _tnode *left, *right;
    } TNode, *AVLTree;
    
  • Constructor

    AVLTree create_avl(ElementType root_val){
        TNode *node = (TNode*)malloc(sizeof(TNode));
        node->val = root_val;
        node->height = 0;
        node->left = NULL;
        node->right = NULL;
        return node;
    }
    
  • Destructor

    void delete_avl(AVLTree root){
        if(root == NULL)
           	return ;
       	delete_avl(root->left);
        delete_avl(root->right);
        free(root);
    }
    
  • Height

    // the height of an empty tree (root == null) is defined to be -1
    int get_height(AVLTree root){
        return root ? root->height : -1;
    }
    
  • Single Rotation

    #define max(a,b) (((a)>(b))?(a):(b))
    
    // Return the new root
    // call: root = rotation(root);
    AVLTree rr_rotation(AVLTree root){
        // Rotate, then maintain heights of two nodes(new & old roots).
        // Caller should guarantee that root has a right subtree.
        
        AVLTree new_root = root->right;
        root->right = new_root->left;
        new_root->left = root;
        
        root->height = max(get_height(root->left), get_height(root->right)) + 1;
        new_root->height = max(get_height(new_root->left), get_height(new_root->right)) + 1;
        
        return new_root;
    }
    
    AVLTree ll_rotation(AVLTree root){
        AVLTree new_root = root->left;
        root->left = new_root->right;
        new_root->right = root;
        
        root->height = max(get_height(root->left), get_height(root->right)) + 1;
        new_root->height = max(get_height(new_root->left), get_height(new_root->right)) + 1;
        
        return new_root;
    }
    
  • Double Rotation

    a. Simple Version

    AVLTree rl_rotation(AVLTree root){
        root->right = ll_rotation(root->right);
        return rr_rotation(root);
    }
    
    AVLTree lr_rotation(AVLTree root){
        root->left = rr_rotation(root->left);
        return ll_rotation(root);
    }
    

    b. Efficient Version

    A version without the overhead of composing two single rotations. Note that the heights of the three nodes involved still have to be maintained (the two children of the new root first, then the new root itself).

    AVLTree RL_rotation(AVLTree root){
        TNode *new_root = root->right->left;
        
        root->right->left = new_root->right;
        new_root->right = root->right;
        root->right = new_root->left;
        new_root->left = root;
        
        // maintain heights: children of new_root first, then new_root
        new_root->left->height = max(get_height(new_root->left->left), get_height(new_root->left->right)) + 1;
        new_root->right->height = max(get_height(new_root->right->left), get_height(new_root->right->right)) + 1;
        new_root->height = max(get_height(new_root->left), get_height(new_root->right)) + 1;
        
        return new_root;
    }
    
    AVLTree LR_rotation(AVLTree root){
        TNode *new_root = root->left->right;
        
        root->left->right = new_root->left;
        new_root->left = root->left;
        root->left = new_root->right;
        new_root->right = root;
        
        // maintain heights: children of new_root first, then new_root
        new_root->left->height = max(get_height(new_root->left->left), get_height(new_root->left->right)) + 1;
        new_root->right->height = max(get_height(new_root->right->left), get_height(new_root->right->right)) + 1;
        new_root->height = max(get_height(new_root->left), get_height(new_root->right)) + 1;
        
        return new_root;
    }
    
  • Insertion

    Notice: when recursing, we don't need an extra if check to decide whether to call create_avl at the current level or add_node to dive into the next level; calling add_node at the current level covers both situations (the NULL case simply returns a new node).

    // call: root = add_node(val, root);
    // Skip allocation error check for clarity of avl related algorithm
    
    AVLTree add_node(ElementType val, AVLTree root){
        // In each branch, do insertion firstly,
        // then check for rotation need.
        // Return the new tree root (considering the null input root).
        
        if(root == NULL)
            return create_avl(val);
        
       	if(val > root->val){
            root->right = add_node(val, root->right);
            if(get_height(root->right) - get_height(root->left) == 2){
                if(val > root->right->val)
                    root = rr_rotation(root);
                else
                    root = rl_rotation(root);
            }
        }
        else if(val < root->val){
            root->left = add_node(val, root->left);
            if(get_height(root->right) - get_height(root->left) == -2){
                if(val < root->left->val)
                    root = ll_rotation(root);
                else
                    root = lr_rotation(root);
            }
        }
        // do nothing if val is already in avltree
        else
            ;
        
        // DON'T FORGET TO PLUS ONE ON THE MAXIMUM
        root->height = max(get_height(root->left), get_height(root->right)) + 1;
        
        return root;
    }
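
A minimal usage sketch (a hypothetical main, assuming ElementType is int): build a small AVL tree with the template above, then release it.

int main(void){
    int keys[] = {5, 2, 8, 1, 3, 4}; // the final insertion (4) triggers an LR double rotation at the root
    int n = (int)(sizeof(keys) / sizeof(keys[0]));
    int i;
    
    AVLTree root = NULL;
    for(i = 0; i < n; i++)
        root = add_node(keys[i], root); // call pattern: root = add_node(val, root)
    
    // ... query or print the tree here ...
    
    delete_avl(root);
    return 0;
}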
    

Splay Tree

Any $M$ consecutive tree operations take at most $O(M\log N)$ time, which means the amortized time per operation is $O(\log N)$.

Idea: After a node is accessed, it will be pushed to the root by a series of AVL rotations.

image-20201113163553605

As the figure shows, in case 2, when $X$ is its grandparent's LR (left-right) or RL (right-left) grandchild, a double rotation (LR or RL) is applied at $G$. But when $X$ is its grandparent's LL (left-left) or RR (right-right) grandchild, two single rotations (LL or RR) are applied: one at $G$ first, then another at $P$.

What the splay tree does is keep applying these rotations until $X$ becomes the root of the tree. Each rotation moves $X$ two layers upwards, and the amortized cost of an operation works out to roughly $\log N$.

image-20201113170221282

Informally, splaying not only moves the accessed node to the root, but also roughly halves the depth of most nodes on the access path (this may be hard to observe in an already balanced BST), while some shallow nodes are pushed down by at most two levels. This crucial property of splay trees means that when access paths are long, leading to a longer-than-normal search time, the rotations tend to be good for future operations; when accesses are cheap, the rotations are less helpful and can even be harmful.

Compared with an AVL tree, a splay tree is easier to code (fewer special cases to consider) and has no need to store a balance factor (or height) in each tree node.

Implementation

  • Building and Maintaining

    Because the rotations for splay trees are performed in pairs from the bottom up, a recursive implementation DOES NOT work: the pairs of nodes to consider are not known until the length of the path is determined to be even or odd. Thus, splay trees are coded non-recursively and work in two passes. The first pass goes down the tree and the second goes back up, performing rotations. This requires the path to be saved (for example, with a stack, or by adding an extra field to the node record that points to the parent).

  • Removal

    The deletion is performed by accessing the node to be removed, which puts that node at the root. If it is deleted, we get two subtrees $T_L$ and $T_R$ (left and right). If we find the largest node in $T_L$, this node is rotated to the root of $T_L$, and $T_L$ then has a root with no right child. We can finish the deletion by making $T_R$ the right child (a sketch follows).
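
    A minimal sketch of this deletion, assuming a splay tree reuses the plain binary-tree node and that a bottom-up splay(key, root) routine (not shown here) moves the node containing key, or the last node on the search path, to the root:

    typedef BiTree SplayTree;
    
    SplayTree splay(ElementType key, SplayTree root); // assumed, implemented elsewhere
    
    SplayTree splay_remove(ElementType target, SplayTree root){
        if(root == NULL)
            return NULL;
        
        root = splay(target, root);   // 1. access: target (if present) comes to the root
        if(root->val != target)
            return root;              // not found: tree unchanged apart from the splay
        
        TNode *left = root->left, *right = root->right;
        free(root);
        
        if(left == NULL)              // no left subtree: T_R becomes the whole tree
            return right;
        
        left = splay(target, left);   // 2. the largest node of T_L is splayed to its root
        left->right = right;          // 3. it has no right child, so attach T_R there
        return left;
    }
    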

B Tree

A B-tree of order m is a tree with the following structural properties:

  • The root is either a leaf or has between $2$ and $m$ children.
  • All non-leaf nodes (except the root) have between $\lceil m/2 \rceil$ and $m$ children.
  • All leaves are at the same depth.

A B-tree of order 4 is more popularly known as a 2-3-4 tree, and a B-tree of order 3 is known as a 2-3 tree.

image-20201114193602197

All the actual data are stored in leaves in ascending order.

The illustration can be drawn as follows.

image-20201114194008243

Insertion

With general B-trees of order $m$, a key may be inserted into a node that already has $m$ keys. This gives the node $m + 1$ keys, which we can split into two nodes with $\lceil (m + 1) / 2 \rceil$ and $\lfloor (m + 1) / 2 \rfloor$ keys respectively. As this gives the parent an extra child, we have to check whether the parent can accept it, and split the parent if it already has $m$ children. We repeat this until we find a parent with fewer than $m$ children. If we split the root, we create a new root with two children.
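
For example (hypothetical numbers), with $m = 5$ a node that receives a sixth key is split into two nodes holding $\lceil 6/2 \rceil = 3$ and $\lfloor 6/2 \rfloor = 3$ keys; with $m = 4$, the five keys split into $3$ and $2$.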

Removal

If the key to be deleted was one of only two keys in a node, then its removal leaves only one key (which breaks the B-tree property). We can fix this by combining the node with a sibling. If the sibling has three keys, we can steal one so that both nodes have two keys. If the sibling has only two keys, we combine the two nodes into a single node with three keys. The parent of this node now loses a child, so we might have to percolate this strategy all the way up to the root. If the root loses its second child, then the root is also deleted and the tree becomes one level shallower. As we combine nodes, we must remember to update the information kept at the internal nodes.

Heap

Priority Queue

image-20201127101210373
Limitation of Simple Implementation
image-20201127081130659

If a binary search tree is used, search efficiency drops steeply because a sequence of deletions makes the tree unbalanced.

If a balanced tree is used, the implementation includes extra operations to maintain the balance properties, which are unrelated to the requirements of a priority queue.

Binary Heap

  • Structure Properties

    A heap is a binary tree that is completely filled, with the possible exception of the bottom level, which is filled from left to right. Such a tree is known as a complete binary tree.
    Assuming the height of a tree with only the root node is 0, it is easy to show that a complete binary tree of height $h$ has a number of nodes in $[2^h, 2^{h+1} - 1]$.

    This implies that the height of a complete binary tree with $n$ nodes is $\lfloor \log n \rfloor$, which is clearly $O(\log n)$.

    An important observation is that because a complete binary tree is so regular, it can be represented in an array and no pointers are necessary.

    image-20201127082731562 image-20201127083741110
  • Order Properties

    In a max (min) heap, the root is the maximum (minimum) of all elements and its two subtrees are themselves max (min) heaps, which means every node has a larger (smaller) key than all of its descendants.

    image-20201127083959980
Implementation
  • Properties

    typedef struct{
        ElementType *data;
        int size, capacity;
        int (*cmp)(ElementType, ElementType);
    } _heap, *Heap;
    
    // Comparison functions defining the relative order of elements; pass one to create_heap.
    // Notice: simply returning x - y may overflow, so compare explicitly.
    int cmp_max(ElementType a, ElementType b){
        return (a > b) ? 1 : (a < b) ? -1 : 0;
    }
    
    int cmp_min(ElementType a, ElementType b){
        return (a > b) ? -1 : (a < b) ? 1 : 0;
    }
    
  • Constructor

    Heap create_heap(int capacity, int (*cmp_func)(ElementType, ElementType)){
        Heap h = (Heap)malloc(sizeof(_heap));
        // Remember the capacity should be 1 larger than the input because the index 0 cell is left empty or used as a sentinel.
        h->data = (ElementType*)calloc(capacity + 1, sizeof(ElementType));
        h->capacity = capacity;
        h->size = 0;
        h->cmp = cmp_func;
        return h;
    }
    
  • Destructor

    void delete_heap(Heap h){
        free(h->data);
        free(h);
    }
    
  • Size

    int get_size(Heap h){
        return h->size;
    }
    
    int is_full(Heap h){
        return get_size(h) == h->capacity;
    }
    
    int is_empty(Heap h){
        return get_size(h) == 0;
    }
    
  • Built (in place)

    Building the heap in place takes $T(N) = O(N)$ (the sum of the node heights is $O(N)$, tighter than the obvious $O(N \log N)$ bound).

    Heap Sort: build the heap, then repeatedly pop the top; this sorts in $O(N \log N)$ (a sketch using this template follows the Pop operation below).

    // percolate (the cell with index p) down
    void perc_down(Heap h, int p){
        int parent, child;
        ElementType x;
        
        x = h->data[p];
        for(parent = p; parent*2 <= h->size; parent = child){
            child = parent * 2;
            if(child != h->size && h->cmp(h->data[child+1], h->data[child]) > 0)
                child++;
           	if(h->cmp(x, h->data[child]) >= 0)
                break;
           	else
                h->data[parent] = h->data[child];
        }
        h->data[parent] = x;
    }
    
    void build_heap(Heap h){
        int i;
        for(i = h->size / 2; i > 0; i--)
            perc_down(h, i);
    }
    
  • Insertion

    T ( N ) = O ( log ⁡ N ) T(N) = O(\log N) T(N)=O(logN)

    “Put” the new element into the end of the array (tree), then percolate it up to the correct position.

    In order to reach the parent via i/2 (and the children via 2i and 2i+1), the cell at index 0 doesn't store data; the root is at index 1.

    Additionally, we can use cell 0 as a sentinel by storing a bound value in it (e.g., the largest possible key for a max heap), which lets us drop the p > 1 check from the loop.

    void insert(ElementType val, Heap h){
        if(is_full(h)){
            printf("insert Error: Heap is full.\n");
            return ;
        }
        // cell with index 0 does not store data
        int p;
        for(p = ++h->size; p > 1 && h->cmp(val, h->data[p/2]) > 0; p /= 2)
            h->data[p] = h->data[p/2];
        h->data[p] = val;
    }
    
  • Peek Top

    ElementType get_top(Heap h){
        if(is_empty(h)){
            printf("get_top Error: Heap is empty.\n");
            return h->data[0];
    	}	
    	return h->data[1];
    }
    
  • Pop (Max/Min)

    T ( N ) = O ( log ⁡ N ) T(N) = O(\log N) T(N)=O(logN)

    Shift the element in the last cell to the top, then percolate it down.

    ElementType del_top(Heap h){
        if(is_empty(h)){
            printf("del_top Error: Heap is empty.\n");
            return h->data[0]; // return the sentinel
        }
        
        ElementType top_elem = get_top(h);
        h->data[1] = h->data[h->size--];
        perc_down(h, 1);
        
        return top_elem;
    }
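
A minimal heap-sort sketch using this template (heap_sort is a hypothetical helper; it sorts into descending order when given cmp_max by repeatedly popping the top, rather than sorting in place inside the original array):

void heap_sort(ElementType *arr, int n){
    Heap h = create_heap(n, cmp_max);
    int i;
    
    for(i = 0; i < n; i++)   // copy the input into the 1-based heap array
        h->data[i + 1] = arr[i];
    h->size = n;
    
    build_heap(h);           // O(N) bottom-up build
    for(i = 0; i < n; i++)   // each del_top costs O(log N)
        arr[i] = del_top(h);
    
    delete_heap(h);
}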
    

d-Heap

image-20201127101003440

Huffman Coding

Weighted Path Length (WPL)

The WPL of a tree is defined to be the sum of the weighted depths of all its leaves:
$$WPL = \sum_{node_i \in leaves} w_i d_i$$
image-20201117160832910

Optimal Tree and Huffman Codes

An optimal tree is a full tree with the smallest WPL for a specific set of characters to be coded.

(Full tree: every node either is a leaf or has two children. An optimal code always has this property.)

All characters are represented at leaves, which means any sequence of bits can always be decoded unambiguously. Thus, it does not matter if the character codes have different lengths, as long as no character code is a prefix of another character code. Such an encoding is known as a prefix code.

For one set of characters, there can be more than one optimal Huffman coding tree.

Huffman’s Algorithm

Maintain a forest of trees. The weight of a tree is equal to the sum of the frequencies of its leaves. $C - 1$ times, select the two trees $T_1$ and $T_2$ of smallest weight and merge them into a new tree whose weight is the sum of the two subtree weights.

At the beginning of the algorithm, there are $C$ single-node trees, one for each character. At the end of the algorithm there is one tree, which is the optimal Huffman coding tree.

When building the Huffman tree, we merely select the two smallest trees in each round. If we maintain the trees in a priority queue (min heap) ordered by weight, then the running time is $O(C \log C)$, since there will be one build_heap, $2C - 2$ delete_mins, and $C - 2$ inserts, on a priority queue that never has more than $C$ elements.
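
A minimal sketch of the merging loop (HNode, new_hnode and the naive $O(C^2)$ selection of the two smallest trees are illustrative assumptions; maintaining the forest in the min-heap template above, keyed by weight, gives the $O(C \log C)$ bound from the text):

typedef struct _hnode{
    int weight;                    // sum of the leaf frequencies in this tree
    struct _hnode *left, *right;
} HNode;

HNode* new_hnode(int w, HNode *l, HNode *r){
    HNode *n = (HNode*)malloc(sizeof(HNode));
    n->weight = w;
    n->left = l;
    n->right = r;
    return n;
}

// forest[0..c-1] initially holds the C single-node trees; returns the Huffman tree root.
HNode* build_huffman(HNode **forest, int c){
    while(c > 1){
        // find the indices a, b of the two smallest-weight trees
        int i, a = 0, b = 1, t;
        if(forest[b]->weight < forest[a]->weight){ t = a; a = b; b = t; }
        for(i = 2; i < c; i++){
            if(forest[i]->weight < forest[a]->weight){ b = a; a = i; }
            else if(forest[i]->weight < forest[b]->weight){ b = i; }
        }
        // merge them; the merged tree replaces slot a, the last tree fills slot b
        forest[a] = new_hnode(forest[a]->weight + forest[b]->weight, forest[a], forest[b]);
        forest[b] = forest[--c];
    }
    return forest[0];
}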

image-20201117160457567 image-20201117160517370
  • Decoding: FSA (Finite State Automaton)

General List

Definition

$$LS = (\alpha_1, \alpha_2, \dots, \alpha_n)$$

where each $\alpha_i$ is an atom (element) or a sub-list (general list).

  • List name: $LS$

  • List head: $\alpha_1$, which is a single element.

  • List tail: $(\alpha_2, \dots, \alpha_n)$, which is defined as the general list of the remaining elements.

  • List length: $n$

  • Depth: the depth of parenthesis nesting. The depth of an atom is defined as $0$ and the depth of an empty general list is defined to be $1$.

image-20201120084100497

When every element in the general list is an atom of the same type, the general list degenerates into a linear list.

Representation

image-20201120085351273

Pseudocode

void create_tree(Tree t, GList L){
    // Create the root node from the list head
    create_root(t, head(L));
    if(tail(L) is not empty){
        // Create the first child
        cur_head = head(tail(L));
        create_tree(t->first_child, cur_head);
        // Iteratively create the remaining children as siblings
        cur_tail = tail(tail(L));
        p = t->first_child;
        while(cur_tail is not empty){
            cur_head = head(cur_tail);
            create_tree(p->next_sibling, cur_head);
            cur_tail = tail(cur_tail);
            p = p->next_sibling;
        }
        p->next_sibling = NULL;
    }
}