Study Notes of Data Structures

Arrays

A static array is a fixed-length container holding n elements indexable from the range [0, n-1]. Indexable means that each slot/index in the array can be referenced with a number.

When and where is a static array used?

  • Storing and accessing sequential data

  • Temporarily storing objects

  • Used by IO routines as buffers

  • Lookup tables and inverse lookup tables

  • Can be used to return multiple values

  • Used in dynamic programming to cache answers to subproblems

Complexity

Operation    Static Array    Dynamic Array
Access       O(1)            O(1)
Search       O(n)            O(n)
Insertion    N/A             O(n)
Appending    N/A             O(1)
Deletion     N/A             O(n)

Elements in an Array are referenced by their index. There is no other way to access elements in an array. Array indexing is zero-based, meaning the first element is found in position zero.

Operations on Dynamic Array

The dynamic array can grow and shrink in size.

How can we implement a dynamic array?

One way is to use a static array!

1) Create a static array with an initial capacity.

2) Add elements to the underlying static array, keeping track of the number of elements.

3) If adding another element will exceed the capacity, then create a new static array with twice the capacity and copy the original elements into it.
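To see why doubling keeps appends cheap on average, the following sketch counts how many element copies the resizes cause over n appends. The class and method names are hypothetical, and it resizes only when the array is completely full, which is a simplification of the steps above.

```java
// Hypothetical helper, not part of the notes: counts how many element copies
// the doubling strategy performs while appending n elements one at a time.
class DoublingCost {

    // Returns the total number of element copies caused by resizes.
    static long copiesForAppends(int n, int initialCapacity) {
        int capacity = Math.max(1, initialCapacity);
        int len = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (len == capacity) { // array is full: double it and copy everything over
                copies += len;
                capacity *= 2;
            }
            len++;
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long copies = copiesForAppends(n, 1);
        System.out.println("appends=" + n + " copies=" + copies);
    }
}
```

Starting from capacity 1, the total number of copies stays below 2n, so the cost per append is constant when averaged over all appends, even though an individual resize is O(n).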

Implementation of a dynamic array

@SuppressWarnings("unchecked")
public class Array<T> implements Iterable<T> {
    
    private T[] arr;
    private int len = 0;//length user thinks array is
    private int capacity = 0;//actual array size
    
    public Array() { this(16); }
    
    public Array(int capacity) {
        if (capacity < 0) throw new IllegalArgumentException("Illegal Capacity:  " + capacity);
        this.capacity = capacity;
        arr = (T[]) new Object[capacity];
    }
    
    public int size() { return len; }
    public boolean isEmpty() { return size() == 0; }
    
    public T get(int index) { return arr[index]; }
    public void set(int index, T elem) { arr[index] = elem; }
    
    public void clear() {
        for (int i = 0; i < capacity; i++)
            arr[i] = null;
        len = 0;
    }
    
    public void add(T elem) {
        //Time to resize!
        if (len + 1 >= capacity) {
            if (capacity == 0) capacity = 1;
            else capacity *= 2; //double the size
            T[] new_arr = (T[]) new Object[capacity];
            for (int i = 0; i < len; i++) 
                new_arr[i] = arr[i]; //copy values from old array to new array
            arr = new_arr; //arr has extra nulls padded
        }
        arr[len++] = elem;
    }
    
    //Removes the element at the specified index in this list.
    public T removeAt(int rm_index) {
        if (rm_index >= len || rm_index < 0) throw new IndexOutOfBoundsException();
        T data = arr[rm_index];
        T[] new_arr = (T[]) new Object[len - 1];
        for (int i = 0, j = 0; i < len; i++, j++) {
            if (i == rm_index) j--; //Skip over rm_index by fixing j temporarily
            else new_arr[j] = arr[i];
        }
        arr = new_arr;
        capacity = --len;
        return data;
    }
    
    public boolean remove(Object obj) {
        for (int i = 0; i < len; i++) {
            if (arr[i].equals(obj)) {
                removeAt(i);
                return true;
            }
        }
        return false;
    }
    
    public int indexOf(Object obj) {
        for (int i = 0; i < len; i++)
            if (arr[i].equals(obj))
                return i;
        return -1;
    }
    
    public boolean contains(Object obj) {
        return indexOf(obj) != -1;
    }
    
    //Iterator is still fast but not as fast as iterative for loop
    @Override
    public java.util.Iterator <T> iterator() {
        return new java.util.Iterator<T>() {
            int index = 0;
            public boolean hasNext() { return index < len; }
            public T next() { return arr[index++]; }
        };
    }
    
    @Override
    public String toString() {
        if (len == 0) return "[]";
        else {
            StringBuilder sb = new StringBuilder(len).append("[");
            for (int i = 0; i < len - 1; i++)
                sb.append(arr[i] + ", ");
            return sb.append(arr[len - 1] + "]").toString();
        }
    }
}

Singly and Doubly Linked Lists

What is a linked list?

A linked list is a sequential list of nodes, each of which holds data and points to another node that also contains data.

Data -> Data -> Data -> Data -> null

Where are linked lists used?

  • Used in many List, Queue & Stack implementations.

  • Great for creating circular lists.

  • Can easily model real world objects such as trains.

  • Used in separate chaining, which is present in certain HashTable implementations to deal with hashing collisions.

  • Often used in the implementation of adjacency lists for graphs.

Terminology

Head: The first node in a linked list

Tail: The last node in a linked list

Pointer: Reference to another node

Node: An object containing data and pointer(s)

Singly vs Doubly Linked Lists

Singly linked lists only hold a reference to the next node. In the implementation you always maintain a reference to the head of the linked list and a reference to the tail node for quick additions/removals.

With a doubly linked list each node holds a reference to the next and previous node. In the implementation you always maintain a reference to the head and the tail of the doubly linked list to do quick additions/removals from both ends of your list.

List type        Pros                                         Cons
Singly Linked    Uses less memory. Simpler implementation.    Cannot easily access previous elements
Doubly Linked    Can be traversed backwards                   Takes 2x memory

Complexity Analysis

Operation           Singly Linked    Doubly Linked
Search              O(n)             O(n)
Insert at head      O(1)             O(1)
Insert at tail      O(1)             O(1)
Remove at head      O(1)             O(1)
Remove at tail      O(n)             O(1)
Remove in middle    O(n)             O(n)
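The one row where the two variants differ is removal at the tail. A minimal sketch of a singly linked list of ints (all names here are illustrative, not from the notes) shows where the O(n) cost comes from: without a prev pointer, the node just before the tail can only be found by walking from the head.

```java
// Minimal singly linked list of ints; class and method names are illustrative only.
class SinglyLinkedSketch {
    private static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    private Node head, tail;
    private int size;

    // O(1): the tail reference lets us append without traversing.
    void addLast(int value) {
        Node node = new Node(value);
        if (tail == null) head = tail = node;
        else { tail.next = node; tail = node; }
        size++;
    }

    // O(n): there is no prev pointer, so we walk from the head
    // to find the node just before the tail.
    int removeLast() {
        if (size == 0) throw new RuntimeException("Empty list");
        int data = tail.data;
        if (head == tail) {
            head = tail = null; // the list had a single node
        } else {
            Node trav = head;
            while (trav.next != tail) trav = trav.next;
            trav.next = null;
            tail = trav;
        }
        size--;
        return data;
    }

    int size() { return size; }
}
```

In a doubly linked list the same operation only needs tail = tail.prev, which is why it drops to O(1).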

Implementation of a Doubly Linked List

public class DoublyLinkedList<T> implements Iterable<T> {
    
    private int size = 0;
    private Node<T> head = null;
    private Node<T> tail = null;
    
    //Internal node class to represent data
    private class Node<T> {
        T data;
        Node<T> prev;
        Node<T> next;
        public Node(T data, Node<T> prev, Node<T> next) {
            this.data = data;
            this.prev = prev;
            this.next = next;
        }
        @Override
        public String toString() {
            return data.toString();
        }
    }
    
    //Empty this linked list, O(n)
    public void clear() {
        Node<T> trav = head;
        while (trav != null) {
            Node<T> next = trav.next; //tracking the next node
            trav.prev = trav.next = null; //remove two pointers of node trav
            trav.data = null; //remove data of node trav
            trav = next; // next node to be removed
        }
        head = tail = trav = null;
        size = 0;
    }
    
    //Return the size of this linked list
    public int size() {
        return size;
    }
    
    //Is this linked list empty?
    public boolean isEmpty() {
        return size() == 0;
    }
    
    //Add an element to the tail of the linked list, O(1)
    public void add(T elem) {
        addLast(elem);
    }
    
    //Add an element to the beginning of this linked list, O(1)
    public void addFirst(T elem) {
        //The linked list is empty
        if (isEmpty()) {
            head = tail = new Node<T>(elem, null, null);
        } else {
            head.prev = new Node<T>(elem, null, head/* set the next pointer of this new node points to the current head node */);
            head = head.prev; //reset the head node
        }
        size++;
    }
    
    //Add a node to the tail of the linked list, O(1)
    public void addLast(T elem) {
        //The linked list is empty
        if (isEmpty()) {
            head = tail = new Node<T>(elem, null, null);
        } else {
            tail.next = new Node<T>(elem, tail, null);
            tail = tail.next;
        }
        size++;
    }
    
    //Check the value of the first node if it exists, O(1)
    public T peekFirst() {
        if (isEmpty()) throw new RuntimeException("Empty list");
        return head.data;
    }
    
    //Check the value of the last node if it exists, O(1)
    public T peekLast() {
        if (isEmpty()) throw new RuntimeException("Empty list");
        return tail.data;
    }
    
    //Remove the first value at the head of the linked list, O(1)
    public T removeFirst() {
        //Can't remove data from an empty list -_-
        if (isEmpty()) throw new RuntimeException("Empty list");
        
        //Extract the data at the head and move
        //the head pointer forwards one node
        T data = head.data;
        head = head.next;
        --size;
        
        //If the list is empty set the tail to null as well
        if (isEmpty()) tail = null;
        //Do a memory clean of the previous node
        else head.prev = null;
        
        //Return the data that was at the first node we just removed
        return data;
    }
    
    //Remove the last value at the tail of the linked list, O(1)
    public T removeLast() {
        //Can't remove data from an empty list -_-
        if (isEmpty()) throw new RuntimeException("Empty list");
        
        //Extract the data at the tail and move
        //the tail pointer backwards one node
        T data = tail.data;
        tail = tail.prev;
        --size;
        
        //If the list is now empty set the head to null as well
        if (isEmpty()) head = null;
        //Do a memory clean of the node that was just removed
        else tail.next = null;
        
        //Return the data that was at the last node we just removed
        return data;
    }
    
    //Remove an arbitrary node from the linked list, O(1)
    private T remove(Node<T> node) {/* the Node is private, so don't let user access it */
        //If the node to remove is somewhere either at the
        //head or the tail handle those independently
        if (node.prev == null) return removeFirst();
        if (node.next == null) return removeLast();
        
        //Make the pointers of adjacent nodes skip over 'node'
        node.next.prev = node.prev;
        node.prev.next = node.next;
        
        //Temporary store the data we want to return
        T data = node.data;
        
        //Memory cleanup
        node.data = null;
        node = node.prev = node.next = null;
        --size;
        
        //Return the data at the node we just removed
        return data;
    }
    
    //Remove a node at a particular index, O(n)
    public T removeAt(int index) {
        //Make sure the index provided is valid -_-
        if (index < 0 || index >= size) throw new IllegalArgumentException();
        
        int i;
        Node<T> trav;
        
        //Search from the front of the list
        if (index < size/2) {
            for (i = 0, trav = head; i != index; i++)
                trav = trav.next;
        } else {
            for (i = size - 1, trav = tail; i != index; i--)
                trav = trav.prev;
        }
        return remove(trav);
    }
    
    //Remove a particular value in the linked list, O(n)
    public boolean remove(Object obj) {
        Node<T> trav = head;
        
        //Support searching for null
        if (obj == null) {
            for (trav = head; trav != null; trav = trav.next) {
                if (trav.data == null) {
                    remove(trav);
                    return true;
                }
            }
        } else { /* search for non null object */
            for (trav = head; trav != null; trav = trav.next) {
                if (obj.equals(trav.data)) {
                    remove(trav);
                    return true;
                }
            }
        }
        return false;
    }
    
    //Find the index of a particular value in the linked list, O(n)
    public int indexOf(Object obj) {
        int index = 0;
        Node<T> trav = head;
        
        //Support searching for null
        if (obj == null) {
            for (trav = head; trav != null; trav = trav.next, index++)
                if (trav.data == null)
                    return index;
        } else { /* search for non null object */
            for (trav = head; trav != null; trav = trav.next, index++)
                if (obj.equals(trav.data))
                    return index;
        }
        return -1;
    }
    
    //Check if a value is contained within the linked list
    public boolean contains(Object obj) {
        return indexOf(obj) != -1;
    }
    
    @Override 
    public java.util.Iterator <T> iterator() {
        return new java.util.Iterator<T>() {
            private Node<T> trav = head;
            @Override
            public boolean hasNext() {
                return trav != null;
            }
            @Override
            public T next() {
                T data = trav.data;
                trav = trav.next;
                return data;
            }
        };
    }
    
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("[ ");
        Node<T> trav = head;
        while (trav != null) {
            sb.append(trav.data);
            if (trav.next != null) sb.append(", "); //no separator after the last element
            trav = trav.next;
        }
        sb.append(" ]");
        return sb.toString();
    }
}

Stack

What is a Stack?

A stack is a one-ended linear data structure which models a real world stack by having two primary operations, namely push and pop.

When and where is a Stack used?

  • Used by undo mechanisms in text editors.

  • Used in compiler syntax checking for matching brackets and braces.

  • Can be used to model a pile of books or plates.

  • Used behind the scenes to support recursion by keeping track of previous function calls.

  • Can be used to do a Depth First Search (DFS) on a graph.

Complexity

Pushing      O(1)
Popping      O(1)
Peeking      O(1)
Searching    O(n)
Size         O(1)

Example of Stack's usage - Brackets matching

Let S be a stack
For bracket in bracket_string:
    reversed = getReversedBracket(bracket) //get the reversed bracket of the 'bracket'
    If isLeftBracket(bracket):
        S.push(bracket)
    Else If S.isEmpty() or S.pop() != reversed:
        return false //Invalid
return S.isEmpty() //Valid if S is empty
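The pseudocode above could be written in Java roughly as follows. This is a sketch that uses java.util.ArrayDeque as the stack and assumes the input contains only bracket characters; the class and helper names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BracketMatcher {

    // Returns the opening bracket that pairs with a closing bracket,
    // or 0 if c is not a closing bracket.
    private static char openingFor(char c) {
        switch (c) {
            case ')': return '(';
            case ']': return '[';
            case '}': return '{';
            default:  return 0;
        }
    }

    // Assumes the input contains only bracket characters, as in the pseudocode.
    static boolean isBalanced(String brackets) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : brackets.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);
            } else if (stack.isEmpty() || stack.pop() != openingFor(c)) {
                return false; // nothing to match, or the wrong kind of bracket
            }
        }
        return stack.isEmpty(); // leftover opening brackets are unmatched
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("[{()}]")); // true
        System.out.println(isBalanced("[(])"));   // false
    }
}
```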

Implementation of a Stack

public class Stack<T> implements Iterable<T> {
    
    private java.util.LinkedList<T> list = new java.util.LinkedList<T>();
    
    //Create an empty stack
    public Stack() {}
    
    //Create a Stack with an initial element
    public Stack(T firstElem) {
        push(firstElem);
    }
    
    //Return the number of elements in the stack
    public int size() {
        return list.size();
    }
    
    //Check if the stack is empty
    public boolean isEmpty() {
        return size() == 0;
    }
    
    //Push an element on the stack
    public void push(T elem) {
        list.addLast(elem);
    }
    
    //Pop an element off the stack
    //Throws an error if the stack is empty
    public T pop() {
        if (isEmpty())
            throw new java.util.EmptyStackException();
        return list.removeLast();
    }
    
    //Peek the top of the stack without removing an element
    //Throws an exception if the stack is empty
    public T peek() {
        if (isEmpty())
            throw new java.util.EmptyStackException();
        return list.peekLast();
    }
    
    //Allow users to iterate through the stack using an iterator
    @Override
    public java.util.Iterator<T> iterator() {
        return list.iterator();
    }
}

Queues

What is a queue?

A queue is a linear data structure which models real world queues by having two primary operations, namely enqueue and dequeue.

Queue Terminology

There does not seem to be consistent terminology for inserting and removing elements from queues.

Enqueue = Adding = Offering

Dequeue = Polling

When and where is a Queue used?

  • Any waiting line models a queue, for example a lineup at a movie theater.

  • Can be used to efficiently keep track of the X most recently added elements.

  • Web server request management where you want first come first serve.

  • Breadth first search (BFS) graph traversal.

Complexity

Enqueue     O(1)
Dequeue     O(1)
Peeking     O(1)
Contains    O(n)
Removal     O(n)
Is Empty    O(1)

Queue Example - BFS

Let Q be a Queue
Q.enqueue(starting_node)
starting_node.visited = true

while Q is not empty Do
    node = Q.dequeue()
    For neighbour in neighbours(node):
        If neighbour has not been visited:
            neighbour.visited = true
            Q.enqueue(neighbour) //for visiting next layer neighbours
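As a concrete version of the pseudocode, here is a sketch in Java. It assumes the graph is given as an adjacency list with nodes numbered 0..n-1 and uses java.util.ArrayDeque as the queue; the class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class BfsSketch {

    // graph.get(u) lists the neighbours of node u; nodes are 0..n-1 (an assumption).
    // Returns the nodes in the order BFS visits them from 'start'.
    static List<Integer> bfsOrder(List<List<Integer>> graph, int start) {
        boolean[] visited = new boolean[graph.size()];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();

        queue.offer(start);
        visited[start] = true;

        while (!queue.isEmpty()) {
            int node = queue.poll();
            order.add(node);
            for (int neighbour : graph.get(node)) {
                if (!visited[neighbour]) {
                    visited[neighbour] = true; // mark on enqueue so each node is queued once
                    queue.offer(neighbour);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Undirected edges: 0-1, 0-2, 1-3, 2-3
        List<List<Integer>> graph = List.of(
            List.of(1, 2), List.of(0, 3), List.of(0, 3), List.of(1, 2));
        System.out.println(bfsOrder(graph, 0)); // [0, 1, 2, 3]
    }
}
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, prevents the same node from entering the queue twice.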

Queue Implementation Details

public class Queue<T> implements Iterable<T> {
    
    private java.util.LinkedList<T> list = new java.util.LinkedList<T>();
    
    public Queue() {}
    
    public Queue(T firstElem) {
        offer(firstElem);
    }
    
    //Return the size of the queue
    public int size() {
        return list.size();
    }
    
    //Returns whether or not the queue is empty
    public boolean isEmpty() {
        return size() == 0;
    }
    
    //Peek the element at the front of the queue
    //The method throws an error if the queue is empty
    public T peek() {
        if (isEmpty())
            throw new RuntimeException("Queue Empty");
        return list.peekFirst();
    }
    
    //Poll an element from the front of the queue
    //The method throws an error if the queue is empty
    public T poll() {
        if (isEmpty())
            throw new RuntimeException("Queue Empty");
        return list.removeFirst();
    }
    
    //Add an element to the back of the queue
    public void offer(T elem) {
        list.addLast(elem);
    }
    
    //Return an iterator to allow the user to traverse
    //through the elements found inside the queue
    @Override
    public java.util.Iterator<T> iterator() {
        return list.iterator();
    }
}

Priority Queue

What is a Priority Queue?

A priority queue is an abstract data type that operates similarly to a normal queue except that each element has a certain priority. The priority of the elements in the priority queue determines the order in which elements are removed from the PQ.

NOTE: Priority queues only support comparable data, meaning the data inserted into the priority queue must be able to be ordered in some way, either from least to greatest or greatest to least. This is so that we are able to assign relative priorities to each element.

What is a Heap?

A heap is a tree based DS that satisfies the heap invariant (also called heap property): If A is a parent node of B then A is ordered with respect to B for all nodes A, B in the heap.

When and where is a PQ used?

  • Used in certain implementations of Dijkstra's Shortest Path algorithm.

  • Anytime you need to dynamically fetch the 'next best' or 'next worst' element.

  • Used in Huffman coding (which is often used for lossless data compression).

  • Best First Search algorithms such as A* use PQs to continuously grab the next most promising node.

  • Used by Minimum Spanning Tree (MST) algorithms.

Complexity PQ with binary heap

Binary Heap construction                           O(n)
Polling                                            O(log(n))
Peeking                                            O(1)
Adding                                             O(log(n))
Naive Removing                                     O(n)
Advanced removing with help from a hash table *    O(log(n))
Naive contains                                     O(n)
Contains check with help of a hash table *         O(1)

* Using a hash table to help optimize these operations does take up linear space and also adds some overhead to the binary heap implementation.

Turning Min PQ into Max PQ

Problem: Often the standard library of a programming language only provides a min PQ, which sorts by smallest elements first, but sometimes we need a Max PQ.

Since elements in a priority queue are comparable, they implement some sort of comparable interface which we can simply negate to achieve a Max heap.

Let x, y be numbers in the PQ. For a min PQ, if x <= y then x comes out of the PQ before y, so the negation of this is if x >= y then y comes out before x.

An alternative method for numbers is to negate the numbers as you insert them into the PQ and negate them again when they are taken out. This has the same effect as negating the comparator.
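Both approaches can be sketched with Java's standard java.util.PriorityQueue, which is a min PQ by default. The class and method names below are illustrative, and both methods assume a non-empty input.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class MaxPqSketch {

    // Approach 1: reverse the comparator so the largest element is polled first.
    static int largestViaComparator(int[] values) {
        PriorityQueue<Integer> maxPq = new PriorityQueue<>(Comparator.reverseOrder());
        for (int v : values) maxPq.add(v);
        return maxPq.poll();
    }

    // Approach 2: negate on the way in and again on the way out.
    // Caveat: negating Integer.MIN_VALUE overflows, so this assumes it never appears.
    static int largestViaNegation(int[] values) {
        PriorityQueue<Integer> minPq = new PriorityQueue<>();
        for (int v : values) minPq.add(-v);
        return -minPq.poll();
    }

    public static void main(String[] args) {
        int[] values = {3, 9, 1, 7};
        System.out.println(largestViaComparator(values)); // 9
        System.out.println(largestViaNegation(values));   // 9
    }
}
```

Reversing the comparator works for any comparable type, while the negation trick only applies to numbers.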

Ways of Implementing a Priority Queue

Priority queues are usually implemented with heaps since this gives them the best possible time complexity.

The Priority Queue is an Abstract Data Type, hence heaps are not the only way to implement PQs. As an example, we could use an unsorted list, but this would not give us the best possible time complexity.

Priority Queue With Binary Heap

A binary heap is a binary tree that supports the heap invariant. In a binary tree every node has at most two children.

A complete binary tree is a tree in which every level, except possibly the last, is completely filled and all the nodes are as far left as possible.

Binary Heap Representation

Using Array!

Let i be the parent node index
Left child index: 2i + 1
Right child index: 2i + 2
(Zero based)
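A small sketch of the index arithmetic, using hypothetical helper names. The parent formula (i - 1) / 2 is the inverse of the two child formulas under integer division, and the check below walks the array to confirm the min-heap invariant.

```java
class HeapIndexSketch {
    static int parent(int i) { return (i - 1) / 2; } // valid for i > 0
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }

    // Checks the min-heap invariant on an array laid out level by level:
    // every node must be >= its parent.
    static boolean isMinHeap(int[] heap) {
        for (int i = 1; i < heap.length; i++)
            if (heap[parent(i)] > heap[i]) return false;
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {1, 3, 2, 7, 4, 5};
        System.out.println(isMinHeap(heap)); // true
    }
}
```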

Removing Elements From Binary Heap in O(log(n))

The inefficiency of the removal algorithm comes from the fact that we have to perform a linear search to find out where an element is indexed. What if instead we did a lookup using a hash table to find out where a node is indexed?

A hash table provides constant time lookup and update for a mapping from a key (the node value) to a value (the index).

Caveat: What if there are two or more nodes with the same value? What problems would that cause?

Dealing with the multiple value problem:

Instead of mapping one value to one position we will map one value to multiple positions. We can maintain a set or tree set of indexes to which a particular node value (key) maps.

Implementation of Priority Queue

/**
 * A min priority queue implementation using a binary heap.
 **/

import java.util.*;

public class PQueue<T extends Comparable<T>> {
    
    //The number of elements currently inside the heap
    private int heapSize = 0;
    
    //The internal capacity of the heap
    private int heapCapacity = 0;
    
    //A dynamic list to track the elements inside the heap
    private List<T> heap = null;
    
    //This map keeps track of the possible indices a particular
    //node value is found in the heap. Having this mapping lets
    //us have O(log(n)) removals and O(1) element containment check
    //at the cost of some additional space and minor overhead
    private Map<T, TreeSet<Integer>> map = new HashMap<>();
    
    //Construct an initially empty priority queue
    public PQueue() {
        this(1);
    }
    
    //Construct a priority queue with an initial capacity
    public PQueue(int sz) {
        heap = new ArrayList<>(sz);
    }
    
    //Construct a priority queue using heapify in O(n) time
    public PQueue(T[] elems) {
        heapSize = heapCapacity = elems.length;
        heap = new ArrayList<T>(heapCapacity);
        
        //Place all elements in the heap
        for (int i = 0; i < heapSize; i++) {
            mapAdd(elems[i], i);
            heap.add(elems[i]);
        }
        
        //Heapify process, O(n)
        for (int i = Math.max(0, (heapSize/2)-1); i >= 0; i--)
            sink(i);
    }
    
    //Priority queue construction, O(nlog(n))
    public PQueue(Collection<T> elems) {
        this(elems.size());
        for (T elem : elems)
            add(elem);
    }
    
    //Returns true/false depending on if the priority queue is empty
    public boolean isEmpty() {
        return heapSize == 0;
    }
    
    //Clears everything inside the heap, O(n)
    public void clear() {
        for (int i = 0; i < heapCapacity; i++) 
            heap.set(i, null);
        heapSize = 0;
        map.clear();
    }
    
    //Return the size of the heap
    public int size() {
        return heapSize;
    }
    
    //Returns the value of the element with the lowest
    //Priority in this priority queue. If the priority
    //queue is empty null is returned.
    public T peek() {
        if (isEmpty()) return null;
        return heap.get(0);
    }
    
    //Removes the root of the heap. O(log(n))
    public T poll() {
        return removeAt(0);
    }
    
    //Test if an element is in heap, O(1)
    public boolean contains(T elem) {
        //Map lookup to check containment, O(1)
        if (elem == null) return false;
        return map.containsKey(elem);
        
        //Linear scan to check containment, O(n)
        // for (int i = 0; i < heapSize; i++)
        //     if (heap.get(i).equals(elem))
        //         return true;
        // return false;
    }
    
    //Adds an element to the priority queue, the
    //element must not be null, O(log(n))
    public void add(T elem) {
        if (elem == null) throw new IllegalArgumentException();
        
        if (heapSize < heapCapacity) {
            heap.set(heapSize, elem);
        } else {
            heap.add(elem);
            heapCapacity++;
        }
        
        mapAdd(elem, heapSize);
        swim(heapSize);
        heapSize++;
    }
    
    //Tests if the value of node i <= node j
    //This method assumes i & j are valid indices, O(1)
    private boolean less(int i, int j) {
        T node1 = heap.get(i);
        T node2 = heap.get(j);
        return node1.compareTo(node2) <= 0;
    }
    
    //Bottom up node swim, O(log(n))
    private void swim(int k) {
        //Grab the index of the next parent node WRT to k
        int parent = (k-1) / 2;
        
        //Keep swimming while we have not reached the 
        //root and while we're less than our parent
        while (k > 0 && less(k, parent)) {
            //Exchange k with the parent
            swap(parent, k);
            k = parent;
            //Grab the index of the next parent node WRT to k
            parent = (k -1) / 2;
        }
    }
    
    //Top down node sink, O(log(n))
    private void sink(int k) {
        while (true) {
            int left = 2 * k + 1;//left node
            int right = 2 * k + 2;//right node
            int smallest = left; //assume left is the smallest node of the two children
            
            //Find which is smaller left or right
            //If right is smaller set smallest to be right
            if (right < heapSize && less(right, left))
                smallest = right;
            
            //Stop if we're outside the bounds of the tree
            //or stop early if we cannot sink k anymore
            if (left >= heapSize || less(k, smallest)) break;
            
            //Move down the tree following the smallest node
            swap(smallest, k);
            k = smallest;
        }
    }
    
    //Swap two nodes. Assumes i & j are valid, O(1)
    private void swap(int i, int j) {
        T i_elem = heap.get(i);
        T j_elem = heap.get(j);
        
        heap.set(i, j_elem);
        heap.set(j, i_elem);
        
        mapSwap(i_elem, j_elem, i, j);
    }
    
    //Removes a particular element in the heap, O(log(n))
    public boolean remove(T element) {
        if (element == null) return false;
        
        //Linear removal via search, O(n)
        //for (int i = 0; i < heapSize; i++) {
        //    if (element.equals(heap.get(i))) {
        //        removeAt(i);
        //        return true;
        //    }
        //}
        
        //Logarithmic removal with map, O(log(n))
        Integer index = mapGet(element);
        if (index != null) removeAt(index);
        return index != null;
    }
    
    //Removes a node at particular index, O(log(n))
    private T removeAt(int i) {
        if (isEmpty()) return null;
        
        heapSize--;
        T removed_data = heap.get(i);
        swap(i, heapSize/*now the 'heapSize' is the largest index of the tree(array)*/);
        
        //Obliterate the value
        heap.set(heapSize, null);
        mapRemove(removed_data, heapSize);
        
        //Removed last element
        if (i == heapSize) return removed_data;
        
        T elem = heap.get(i);
        
        //Try sinking element
        sink(i);
        //If sinking did not work try swimming
        if (heap.get(i).equals(elem))
            swim(i);
        return removed_data;
    }
    
    //Recursively checks if this heap is a min heap
    //This method is just for testing purposes to make
    //sure the heap invariant is still being maintained
    //Call this method with k=0 to start at the root
    public boolean isMinHeap(int k) {
        //If we are outside the bounds of the heap return true
        if (k >= heapSize) return true;
        
        int left = 2 * k + 1;
        int right = 2 * k + 2;
        
        //Make sure that the current node k is less than
        //both of its children left, and right if they exist
        //return false otherwise to indicate an invalid heap
        if (left < heapSize && !less(k, left)) return false;
        if (right < heapSize && !less(k, right)) return false;
        
        //Recurse on both children to make sure they're also valid heaps
        return isMinHeap(left) && isMinHeap(right);
    }
    
    //Add a node value and its index to the map
    private void mapAdd(T value, int index) {
        TreeSet<Integer> set = map.get(value);
        
        //New value being inserted in map
        if (set == null) {
            set = new TreeSet<>();
            set.add(index);
            map.put(value, set);
        } else { /* value already exists in map */
            set.add(index);
        }
    }
    
    //Removes the index at a given value, O(log(n))
    private void mapRemove(T value, int index) {
        TreeSet<Integer> set = map.get(value);
        set.remove(index); //TreeSets take O(log(n)) removal time
        if (set.size() == 0) map.remove(value);
    }
    
    //Extract an index position for the given value
    //NOTE: If a value exists multiple times in the heap the highest
    //index is returned (this has arbitrarily been chosen)
    private Integer mapGet(T value) {
        TreeSet<Integer> set = map.get(value);
        if (set != null) return set.last();
        return null;
    }
    
    //Exchange the index of two nodes internally within the map
    private void mapSwap(T val_1, T val_2, int val_1Index, int val_2Index) {
        Set<Integer> set_1 = map.get(val_1);
        Set<Integer> set_2 = map.get(val_2);
        
        set_1.remove(val_1Index);
        set_2.remove(val_2Index);
        
        set_1.add(val_2Index);
        set_2.add(val_1Index);
    }
    
    @Override public String toString() {
        return heap.toString();
    }
}

Union Find (Disjoint Set)

What is Union Find?

Union Find is a data structure that keeps track of elements which are split into one or more disjoint sets. It has two primary operations: find and union.

When and where is a Union Find used?

  • Kruskal's minimum spanning tree algorithm.

  • Grid percolation.

  • Network connectivity

  • Least common ancestor in trees

  • Image processing

Complexity

Construction          O(n)
Union                 alpha(n)
Find                  alpha(n)
Get component size    alpha(n)
Check if connected    alpha(n)
Count components      O(1)

alpha(n) - Amortized constant time

Kruskal's Minimum Spanning Tree

Given a graph G = (V, E) we want to find a Minimum Spanning Tree in the graph (it may not be unique).

A minimum spanning tree is a subset of the edges which connects all vertices in the graph with the minimal total edge cost.

1) Sort edges by ascending edge weight.

2) Walk through the sorted edges and look at the two nodes the edge belongs to. If the nodes are already unified we don't include this edge; otherwise we include it and unify the nodes.

3) The algorithm terminates when every edge has been processed or all the vertices have been unified.
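The three steps above can be sketched as follows. To keep the example short it uses a bare-bones array-based union find without path compression or size tracking, so it is not the optimized structure these notes build later. The class name, the edge layout {from, to, weight}, and the -1 return value for a disconnected graph are all assumptions made for illustration.

```java
import java.util.Arrays;

class KruskalSketch {

    // Minimal array-based find without path compression, to keep the sketch short.
    private static int find(int[] parent, int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    // edges[k] = {from, to, weight}; nodes are numbered 0..n-1 (an assumption).
    // Returns the total MST weight, or -1 if the graph is not connected.
    static long mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone(); // copy so the caller's edge order is untouched
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2])); // step 1

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // every node starts as its own root

        long total = 0;
        int used = 0;
        for (int[] e : sorted) { // step 2
            int rootA = find(parent, e[0]);
            int rootB = find(parent, e[1]);
            if (rootA == rootB) continue; // already unified: this edge would form a cycle
            parent[rootA] = rootB;        // unify the two components
            total += e[2];
            if (++used == n - 1) break;   // step 3: all vertices are unified
        }
        return used == n - 1 ? total : -1;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 1}, {1, 2, 2}, {0, 2, 3}, {2, 3, 4}, {1, 3, 5}};
        System.out.println(mstWeight(4, edges)); // 7
    }
}
```

In the example, edges 0-1, 1-2 and 2-3 are taken, while 0-2 is skipped because nodes 0 and 2 are already in the same component.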

Union Find Operation

To begin using Union Find, first construct a bijection (a mapping) between your objects and the integers in the range [0, n).

NOTE: This step is not necessary in general, but it will allow us to construct an array-based union find.

Store Union Find information in an array. Each index has an associated object (letter in this example) we can look up through our mapping.
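
One possible way to build such a mapping is to hand out the next free integer to every new object. The sketch below assumes the objects have sensible equals/hashCode methods; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the UnionFind class in these notes.
class BijectionSketch {

    // Assign each distinct object the next free integer in [0, n).
    static <T> Map<T, Integer> indexOf(T[] objects) {
        Map<T, Integer> index = new HashMap<>();
        for (T obj : objects) {
            // index.size() is the smallest integer not handed out yet
            if (!index.containsKey(obj)) index.put(obj, index.size());
        }
        return index;
    }
}
```

The resulting integers can then be used directly as positions in the union find array.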

Find Operation

To find which component a particular element belongs to, find the root of that component by following the parent nodes until a self loop is reached (a node whose parent is itself).

Union Operation

To unify two elements, find the root nodes of each component and, if the root nodes are different, make one of the root nodes the parent of the other.

Remarks

In this data structure, we do not "un-union" elements. In general, this would be very inefficient to do since we would have to update all the children of a node.

The number of components is equal to the number of roots remaining. Also note that the number of root nodes never increases.
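
Before looking at the full implementation, the find and union operations described above can be sketched on a bare parent array. This is a deliberately naive version (no path compression, no merging by size), and the class name is an assumption made for illustration.

```java
// Naive sketch: correct, but without the optimizations of the full implementation.
class NaiveUnionFindSketch {
    final int[] parent;

    NaiveUnionFindSketch(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // every element starts as its own root
    }

    // Follow parent links until a self loop is reached.
    int find(int p) {
        while (parent[p] != p) p = parent[p];
        return p;
    }

    // Make one root the parent of the other if the roots differ.
    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP != rootQ) parent[rootQ] = rootP;
    }

    // The number of components equals the number of remaining roots.
    int countComponents() {
        int roots = 0;
        for (int i = 0; i < parent.length; i++)
            if (parent[i] == i) roots++;
        return roots;
    }
}
```

Counting the roots takes O(n) here; the implementation below keeps a running counter instead so that the count is available in O(1).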

Implementation of Union Find

public class UnionFind {
    
    //The number of elements in this union find
    private int size;
    
    //Used to track the sizes of each of the components
    private int[] sz;
    
    //id[i] points to the parent of i, if id[i] == i then i is root node
    private int[] id;
    
    //Track the number of components in the union find
    private int numComponents;
    
    public UnionFind(int size) {
        if (size <= 0)
            throw new IllegalArgumentException("Size <= 0 is not allowed");
        this.size = numComponents = size;
        sz = new int[size];
        id = new int[size];
        for (int i = 0; i < size; i++) {
            id[i] = i;//Link to itself (self root)
            sz[i] = 1;//Each component is originally of size one
        }
    }
    
    //Find which component/set 'p' belongs to, takes amortized constant time.
    public int find(int p) {
        //Find the root of the component/set
        int root = p;
        while (root != id[root])
            root = id[root];
        
        //Compress the path leading back to the root.
        //Doing this operation is called "path compression"
        //and is what gives us amortized constant time complexity
        while (p != root) {
            int next = id[p];
            id[p] = root;
            p = next;
        }
        return root;
    }
    
    //Return whether or not the elements 'p' and
    //'q' are in the same components/set
    public boolean connected(int p, int q) {
        return find(p) == find(q);
    }
    
    //Return the size of the components/set 'p' belongs to
    public int componentSize(int p) {
        return sz[find(p)];
    }
    
    //Return the number of elements in this UnionFind/Disjoint set
    public int size() {
        return size;
    }
    
    //Returns the number of remaining components/sets
    public int components() {
        return numComponents;
    }
    
    //Unify the components/sets containing elements 'p' and 'q'
    public void unify(int p, int q) {
        int root1 = find(p);
        int root2 = find(q);
        
        //These elements are already in the same group!
        if (root1 == root2) return;
        
        //Merge two components/sets together.
        //Merge smaller component/set into the larger one.
        if (sz[root1] < sz[root2]) {
            sz[root2] += sz[root1];
            id[root1] = root2;
        } else {
            sz[root1] += sz[root2];
            id[root2] = root1;
        }
        //Since the roots found are different we know that the
        //number of components/sets has decreased by one
        numComponents--;
    }
}

Binary Trees and Binary Search Trees (BST)

Quick terminology crash course

A tree is an undirected graph which satisfies any of the following definitions:

  • An acyclic connected graph

  • A connected graph with N nodes and N - 1 edges.

  • A graph in which any two vertices are connected by exactly one path.

If we have a rooted tree then we will want to have a reference to the root node of our tree.

It does not always matter which node is selected to be the root node because any node can root the tree!

A child is a node extending from another node. A parent is the inverse of this.

Q: What is the parent of the root node?

A: It has no parent, although it may be useful to assign the parent of the root node to be itself (e.g. filesystem tree).

A leaf node has no children.

A subtree is a tree entirely contained within another. They are usually denoted using triangles.

NOTE: Subtrees may consist of a single node.

What is a Binary Tree?

A binary tree is a tree for which every node has at most two child nodes.

What is a Binary Search Tree?

A binary search tree is a binary tree that satisfies the BST invariant: left subtree has smaller elements and right subtree has larger elements.
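
To make the invariant concrete, here is a minimal sketch of a checker that verifies it for a tree of ints. The Node class and the method names are assumptions for illustration, and the check assumes duplicates are not allowed (strict inequalities).

```java
// Hypothetical checker; the Node type here is a simplified stand-in.
class BstCheckSketch {

    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Every value in this subtree must lie strictly between lo and hi.
    static boolean isBst(Node node, long lo, long hi) {
        if (node == null) return true;
        if (node.value <= lo || node.value >= hi) return false;
        // Left subtree must stay below node.value, right subtree above it.
        return isBst(node.left, lo, node.value)
            && isBst(node.right, node.value, hi);
    }

    static boolean isBst(Node root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }
}
```

Note that comparing each node only with its direct children is not enough: the bounds have to be carried down so that, for instance, a 6 hidden deep in the left subtree of a 5 is rejected.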

When and where are Binary Trees used?

  • Binary Search Trees

    • Implementation of some map and set ADTs

    • Red Black Trees

    • AVL Trees

    • Splay Trees

    • etc...

  • Used in the implementation of binary heaps

  • Syntax trees (used by compilers and calculators)

  • Treap - a probabilistic DS (uses a randomized BST)

Complexity of BSTs

Operation    Average      Worst
Insert       O(log(n))    O(n)
Delete       O(log(n))    O(n)
Search       O(log(n))    O(n)

Adding elements to a BST

Binary Search Tree elements must be comparable so that we can order them inside the tree.

When inserting an element we want to compare its value to the value stored in the current node we're considering to decide on one of the following:

  • Recurse down left subtree (< case)

  • Recurse down right subtree (> case)

  • Handle finding a duplicate value (= case)

  • Create a new node (found a null leaf)

Removing elements from a BST

Removing elements from a Binary Search Tree can be seen as a two step process.

1) Find the element we wish to remove (if it exists)

2) Replace the node we want to remove with its successor (if any) to maintain the BST invariant.

Recall the BST invariant: left subtree has smaller elements and right subtree has larger elements.

Find Phase

When searching our BST for a node with a particular value one of four things will happen:

1) We hit a null node, at which point we know the value does not exist within our BST

2) Comparator value equal to 0 (found it!)

3) Comparator value less than 0 (the value, if it exists, is in the left subtree)

4) Comparator value greater than 0 (the value, if it exists, is in the right subtree)

Remove phase

Four cases:

  • Node to remove is leaf node

  • Node to remove has a right subtree but no left subtree

  • Node to remove has a left subtree but no right subtree

  • Node to remove has both a left subtree and a right subtree

Case 1: Leaf node

If the node we wish to remove is a leaf node then we may do so without side effect :)

Cases 2 & 3: the node has only a left subtree or only a right subtree

The successor of the node we are trying to remove in these cases will be the root node of the left/right subtree.

It may be the case that you are removing the root node of the BST in which case its immediate child becomes the new root as you would expect.

Case 4: Node to remove has both a left subtree and right subtree

Q: In which subtree will the successor of the node we are trying to remove be?

A: The answer is both! The successor can either be the largest value in the left subtree or the smallest value in the right subtree.

A justification for why there could be more than one successor is:

The largest value in the left subtree satisfies the BST invariant since it:

1) Is larger than everything in left subtree. This follows immediately from the definition of being the largest.

2) Is smaller than everything in right subtree because it was found in the left subtree.

The smallest value in the right subtree satisfies the BST invariant since it:

1) Is smaller than everything in right subtree. This follows immediately from the definition of being the smallest.

2) Is larger than everything in left subtree because it was found in the right subtree

So there are two possible successors!
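
As a small illustration, both candidates can be located with a short walk. This sketch assumes the node passed in really has two children; the Node class and the method names are hypothetical.

```java
// Hypothetical helper showing the two successor candidates of a node with two children.
class SuccessorSketch {

    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Largest value in the left subtree: go left once, then as far right as possible.
    static int largestInLeft(Node node) {
        Node cur = node.left;
        while (cur.right != null) cur = cur.right;
        return cur.value;
    }

    // Smallest value in the right subtree: go right once, then as far left as possible.
    static int smallestInRight(Node node) {
        Node cur = node.right;
        while (cur.left != null) cur = cur.left;
        return cur.value;
    }
}
```

Either value can replace the removed node's value without breaking the BST invariant.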

Tree Traversals (Preorder, Inorder, Postorder & Level order)

preorder(node):
    if node == null: return
    print(node.value)          # preorder prints before the recursive calls
    preorder(node.left)
    preorder(node.right)

inorder(node):
    if node == null: return
    inorder(node.left)
    print(node.value)          # inorder prints between the recursive calls
    inorder(node.right)

postorder(node):
    if node == null: return
    postorder(node.left)
    postorder(node.right)
    print(node.value)          # postorder prints after the recursive calls
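
The same three traversals could look like this in Java. This is a sketch under assumptions: it collects values into a list instead of printing them, and the Node class is a simplified stand-in rather than the one used in the implementation further down.

```java
import java.util.List;

// Hypothetical recursive traversals that append visited values to a list.
class TraversalSketch {

    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static void preorder(Node node, List<Integer> out) {
        if (node == null) return;
        out.add(node.value);          // visit before the recursive calls
        preorder(node.left, out);
        preorder(node.right, out);
    }

    static void inorder(Node node, List<Integer> out) {
        if (node == null) return;
        inorder(node.left, out);
        out.add(node.value);          // visit between the recursive calls
        inorder(node.right, out);
    }

    static void postorder(Node node, List<Integer> out) {
        if (node == null) return;
        postorder(node.left, out);
        postorder(node.right, out);
        out.add(node.value);          // visit after the recursive calls
    }
}
```

On a binary search tree the inorder traversal visits the values in increasing order, which follows directly from the BST invariant.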

Level order Traversal

In a level order traversal we want to print the nodes as they appear one layer at a time.

To obtain this ordering we want to do a Breadth First Search (BFS) from the root node down to the leaf nodes.

To do a BFS we will need to maintain a Queue of the nodes left to explore.

Begin with the root inside of the queue and finish when the queue is empty.

At each iteration we add the left child and then the right child of the current node to our Queue.
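
A minimal sketch of this queue-based loop is shown below. It returns the visited values as a list rather than printing them, and the Node class and method name are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical level order traversal using a queue (BFS).
class LevelOrderSketch {

    static class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static List<Integer> levelOrder(Node root) {
        List<Integer> out = new ArrayList<>();
        if (root == null) return out;

        Queue<Node> queue = new ArrayDeque<>();
        queue.offer(root);                 // begin with the root inside the queue
        while (!queue.isEmpty()) {         // finish when the queue is empty
            Node node = queue.poll();
            out.add(node.value);
            // Left child first, then right child, so each layer reads left to right.
            if (node.left != null) queue.offer(node.left);
            if (node.right != null) queue.offer(node.right);
        }
        return out;
    }
}
```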

Implementation of Binary Search Tree

public class BinarySearchTree<T extends Comparable<T>> {
    
    //Tracks the number of nodes in this BST
    private int nodeCount = 0;
    
    //This BST is a rooted tree so we maintain a handle on the root node
    private Node root = null;
    
    //Internal node containing node references
    //and the actual node data
    private class Node {
        T data;
        Node left;
        Node right;
        public Node(Node left, Node right, T elem) {
            this.data = elem;
            this.left = left;
            this.right = right;
        }
    }
    
    //Check if this binary tree is empty
    public boolean isEmpty() {
        return size() == 0;
    }
    
    //Get the number of nodes in this binary tree
    public int size() {
        return nodeCount;
    }
    
    //Add an element to this binary tree. Returns true if we successfully perform an insertion
    public boolean add(T elem) {
        //Check if the value already exists in this 
        //binary tree, if it does ignore adding it
        if (contains(elem)) {
            return false;
        } else { /*Otherwise add this element to the binary tree*/
            root = add(root, elem);
            nodeCount++;
            return true;
        }
    }
    
    //Private method to recursively add a value in the binary tree
    private Node add(Node node, T elem) {
        //Base case: found a leaf node
        if (node == null) {
            node = new Node(null, null, elem);
        } else {
            //Place lower elements values in left subtree
            if (elem.compareTo(node.data) < 0) {
                node.left = add(node.left, elem);
            } else {
                node.right = add(node.right, elem);
            }
        }
        return node;
    }
    
    //Remove a value from this binary tree, if it exists
    public boolean remove(T elem) {
        //Make sure the node we want to remove
        //actually exists before we remove it
        if (contains(elem)) {
            root = remove(root, elem);
            nodeCount--;
            return true;
        }
        return false;
    }
    
    private Node remove(Node node, T elem) {
        if (node == null) return null;
        
        int cmp = elem.compareTo(node.data);
        
        //Dig into left subtree, the value we're looking
        //for is smaller than the current node
        if (cmp < 0) {
            node.left = remove(node.left, elem);
        } else if (cmp > 0) {
            //Dig into right subtree, the value we're looking
            //for is greater than the current value
            node.right = remove(node.right, elem);
        } else {/* Found the node that we want to remove */
            //This is the case with only a right subtree or 
            //no subtree at all. In this situation just
            //swap the node we wish to remove with its right child
            if (node.left == null) {
                Node rightChild = node.right;
                node.data = null;
                node = null;
                return rightChild;
            } else if (node.right == null) {
                //This is the case with only a left subtree or 
                //no subtree at all. In this situation just
                //swap the node we wish to remove with its left child.
                Node leftChild = node.left;
                node.data = null;
                node = null;
                return leftChild;
            } else {
                //When removing a node from a binary tree with two links the 
                //successor of the node being removed can either be the largest
                //value in the left subtree or the smallest value in the right
                //subtree. In this implementation I have decided to find the
                //smallest value in the right subtree which can be found by
                //traversing as far left as possible in the right subtree
                
                //Find the leftmost node in the right subtree
                Node tmp = digLeft(node.right);
                
                //Swap the data
                node.data = tmp.data;
                
                //Go into the right subtree and remove the leftmost node we
                //found and swapped data with. This prevents us from having
                //two nodes in our tree with the same value.
                node.right = remove(node.right, tmp.data);
                
                //If instead we wanted to find the largest node in the left
                //subtree as opposed to smallest node in the right subtree
                //here is what we would do:
                
                //Node tmp = digRight(node.left);
                //node.data = tmp.data;
                //node.left = remove(node.left, tmp.data);
            }
        }
        return node;
    }
    
    //Helper method to find the leftmost node
    private Node digLeft(Node node) {
        Node cur = node;
        while (cur.left != null)
            cur = cur.left;
        return cur;
    }
    
    //Helper method to find the rightmost node
    private Node digRight(Node node) {
        Node cur = node;
        while (cur.right != null)
            cur = cur.right;
        return cur;
    }
    
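    //Returns true if the element exists in the tree
    public boolean contains(T elem) {
        return contains(root, elem);
    }
    
    //Private recursive method to find an element in the tree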
    private boolean contains(Node node, T elem) {
        //Base case: reached bottom, value not found
        if (node == null) return false;
        
        int cmp = elem.compareTo(node.data);
        
        //Dig into the left subtree because the value we're
        //looking for is smaller than the current value
        if (cmp < 0) return contains(node.left, elem);
        
        //Dig into the right subtree because the value we're
        //looking for is greater than the current value
        else if (cmp > 0) return contains(node.right, elem);
        
        //We found the value we were looking for
        else return true;
    }
    
    //Computes the height of the tree, O(n)
    public int height() {
        return height(root);
    }
    
    //Recursive helper method to compute the height of the tree
    private int height(Node node) {
        if (node == null) return 0;
        return Math.max(height(node.left), height(node.right)) + 1;
    }
    
    //This method returns an iterator for a given TreeTraversalOrder.
    //The tree can be traversed in four different ways:
    //preorder, inorder, postorder and levelorder.
    public java.util.Iterator<T> traverse(TreeTraversalOrder order) {
        switch (order) {
            case PRE_ORDER: return preOrderTraversal();
            case IN_ORDER: return inOrderTraversal();
            case POST_ORDER: return postOrderTraversal();
            case LEVEL_ORDER: return levelOrderTraversal();
            default: return null;
        }
    }
    
  // Returns an iterator to traverse the tree in pre order
  private java.util.Iterator<T> preOrderTraversal() {

    final int expectedNodeCount = nodeCount;
    final java.util.Stack<Node> stack = new java.util.Stack<>();
    stack.push(root);

    return new java.util.Iterator<T>() {
      @Override
      public boolean hasNext() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        return root != null && !stack.isEmpty();
      }

      @Override
      public T next() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        Node node = stack.pop();
        if (node.right != null) stack.push(node.right);
        if (node.left != null) stack.push(node.left);
        return node.data;
      }

      @Override
      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }

  // Returns an iterator to traverse the tree in order
  private java.util.Iterator<T> inOrderTraversal() {

    final int expectedNodeCount = nodeCount;
    final java.util.Stack<Node> stack = new java.util.Stack<>();
    stack.push(root);

    return new java.util.Iterator<T>() {
      Node trav = root;

      @Override
      public boolean hasNext() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        return root != null && !stack.isEmpty();
      }

      @Override
      public T next() {

        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();

        // Dig left
        while (trav != null && trav.left != null) {
          stack.push(trav.left);
          trav = trav.left;
        }

        Node node = stack.pop();

        // Try moving down right once
        if (node.right != null) {
          stack.push(node.right);
          trav = node.right;
        }

        return node.data;
      }

      @Override
      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }

  // Returns an iterator to traverse the tree in post order
  private java.util.Iterator<T> postOrderTraversal() {
    final int expectedNodeCount = nodeCount;
    final java.util.Stack<Node> stack1 = new java.util.Stack<>();
    final java.util.Stack<Node> stack2 = new java.util.Stack<>();
    stack1.push(root);
    while (!stack1.isEmpty()) {
      Node node = stack1.pop();
      if (node != null) {
        stack2.push(node);
        if (node.left != null) stack1.push(node.left);
        if (node.right != null) stack1.push(node.right);
      }
    }
    return new java.util.Iterator<T>() {
      @Override
      public boolean hasNext() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        return root != null && !stack2.isEmpty();
      }

      @Override
      public T next() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        return stack2.pop().data;
      }

      @Override
      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }

  // Returns an iterator to traverse the tree in level order
  private java.util.Iterator<T> levelOrderTraversal() {

    final int expectedNodeCount = nodeCount;
    final java.util.Queue<Node> queue = new java.util.LinkedList<>();
    queue.offer(root);

    return new java.util.Iterator<T>() {
      @Override
      public boolean hasNext() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        return root != null && !queue.isEmpty();
      }

      @Override
      public T next() {
        if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
        Node node = queue.poll();
        if (node.left != null) queue.offer(node.left);
        if (node.right != null) queue.offer(node.right);
        return node.data;
      }

      @Override
      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }
}

public enum TreeTraversalOrder {
    PRE_ORDER,
    IN_ORDER,
    POST_ORDER,
    LEVEL_ORDER
}

Hash Table

What is a Hash table?

A Hash Table is a data structure that provides a mapping from keys to values using a technique called hashing.

We refer to these as key-value pairs. Keys must be unique, but values can be repeated.

To be able to understand how a mapping is constructed between key-value pairs we first need to talk about hash functions.

A hash function H(x) is a function that maps a key 'x' to a whole number in a fixed range.

We can also define hash functions for arbitrary objects such as strings, lists, tuples, multi data objects, etc...

For a string s let H(s) be a hash function defined below where ASCII(x) returns the ASCII value of the character x

function H(s):
	sum := 0
	for char in s:
		sum = sum + ASCII(char)
	return sum mod 50
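
In Java the same function could be written as follows. This is a sketch that assumes the input contains only ASCII characters, in which case charAt returns the ASCII value; the class and method names are illustrative.

```java
// Hypothetical Java version of H(s); assumes ASCII-only input.
class StringHashSketch {

    static int hash(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i); // for ASCII characters this is the ASCII value
        }
        return sum % 50;        // result lies in the fixed range [0, 50)
    }
}
```

Because only the sum of the characters matters, any two strings with the same letters in a different order (for example "ab" and "ba") hash to the same value.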

Properties of Hash functions

If H(x) == H(y) then objects x and y might be equal, but if H(x) != H(y) then x and y are certainly not equal.

A hash function H(x) must be deterministic. This means that if H(x) == y then H(x) must always produce y and never another value. This may seem obvious, but it is critical to the functionality of a hash function.

We try very hard to make uniform hash functions to minimize the number of hash collisions.

A hash collision is when two objects x, y hash to the same value (i.e. H(x) == H(y)).

Q: What makes a key of type T hashable?

A: Since we are going to use hash functions in the implementation of our hash table we need our hash functions to be deterministic. To enforce this behavior, we demand that the keys used in our hash table are immutable data types. Hence, if a key of type T is immutable, and we have a hash function H(k) defined for all keys of type T, then we say a key of type T is hashable.
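
The following sketch illustrates why mutable keys are a problem, using java.util.HashMap with an ArrayList as the key. The class and method names are hypothetical; the point is only that the hash code of the key changes after it has been stored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demonstration of a mutable key getting "lost" in a hash table.
class MutableKeySketch {

    // Returns true if the entry can still be found after its key is mutated.
    static boolean foundAfterMutation() {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>();
        key.add(1);
        key.add(2);
        map.put(key, "value"); // stored under the hash code of [1, 2]

        key.add(3);            // mutating the key changes its hashCode()

        // The lookup now uses the hash code of [1, 2, 3], which no longer
        // matches the hash recorded when the entry was inserted.
        return map.containsKey(key);
    }
}
```

In this example the lookup fails even though the very same key object is still stored in the map, which is exactly the non-deterministic behavior immutable keys are meant to rule out.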

How does a hash table work?

Ideally we would like to have a very fast insertion, lookup and removal time for the data we are placing within our hash table.

Remarkably, we can achieve all this in O(1)* time using a hash function as a way to index into a hash table.

*The constant time behavior attributed to hash tables is only true if you have a good uniform hash function!

Think of the hash table as an indexable block of memory (an array) and we can only access its entries using the value given to us by our hash function H(x).

Separate chaining deals with hash collisions by maintaining a data structure (usually a linked list) to hold all the different values which hashed to a particular value.

Open addressing deals with hash collisions by finding another place within the hash table for the object to go by offsetting it from the position to which it hashed.

Complexity

Operation    Average    Worst
Insertion    O(1)*      O(n)
Removal      O(1)*      O(n)
Search       O(1)*      O(n)

*The constant time behavior attributed to hash tables is only true if you have a good uniform hash function!

What is Separate Chaining?

Separate chaining is one of many strategies to deal with hash collisions by maintaining a data structure (usually a linked list) to hold all the different values which hashed to a particular value.

In the hash table (indexable memory) every slot is a linked list. When we want to find an object in one of these lists, we compare the keys of its entries with the key we are looking for.

Q: How do I maintain O(1) insertion and lookup time complexity once my HT gets really full and I have long linked list chains?

A: Once the HT contains a lot of elements you should create a new HT with a larger capacity and rehash all the items inside the old HT and disperse them throughout the new HT at different locations.

Q: How do I remove key-value pairs from my HT?

A: Apply the same procedure as doing a lookup for a key, but this time instead of returning the value associated with the key remove the node in the linked list data structure.

Implementation of Separate Chaining Hash Table

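import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class Entry<K, V> {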
    
    int hash;
    K key;
    V value;
    
    public Entry(K key, V value) {
        this.key = key;
        this.value = value;
        this.hash = key.hashCode();
    }
    
    //We are not overriding the Object equals method
    //No casting is required with this method.
    public boolean equals(Entry<K, V> other) {
        if (hash != other.hash) return false;
        return key.equals(other.key);
    }
    
    @Override
    public String toString() {
        return key + " => " + value;
    }
}

public class HashTableSeparateChaining<K, V> implements Iterable<K> {
    
    private static final int DEFAULT_CAPACITY = 3;
    private static final double DEFAULT_LOAD_FACTOR = 0.75;
    
    private double maxLoadFactor;
    private int capacity = 0;
    private int threshold = 0;
    private int size = 0;
    private LinkedList<Entry<K, V>>[] table;
    
    public HashTableSeparateChaining() {
        this(DEFAULT_CAPACITY, DEFAULT_LOAD_FACTOR);
    }
    
    public HashTableSeparateChaining(int capacity) {
        this(capacity, DEFAULT_LOAD_FACTOR);
    }
    
    public HashTableSeparateChaining(int capacity, double maxLoadFactor) {
        if (capacity < 0)
            throw new IllegalArgumentException("Illegal capacity");
        if (maxLoadFactor <= 0 || Double.isNaN(maxLoadFactor) || Double.isInfinite(maxLoadFactor))
            throw new IllegalArgumentException("Illegal maxLoadFactor");
        this.maxLoadFactor = maxLoadFactor;
        this.capacity = Math.max(DEFAULT_CAPACITY, capacity);
        threshold = (int) (this.capacity * maxLoadFactor);
        table = new LinkedList[this.capacity];
    }
    
    //Returns the number of elements currently inside the hash-table
    public int size() { return size; }
    
    //Returns true/false depending on whether the hash table is empty
    public boolean isEmpty() { return size == 0; }
    
    //Converts a hash value to an index. Essentially, this strips the negative
    //sign and places the hash value in the domain [0, capacity)
    private int normalizeIndex(int keyHash) {
        return (keyHash & 0x7FFFFFFF) % capacity;
    }
    
    //Clears all the contents of the hash-table
    public void clear() {
        Arrays.fill(table, null);
        size = 0;
    }
    
    public boolean containsKey(K key) { return hasKey(key); }
    
    //Returns true/false depending on whether a key is in the hash table
    public boolean hasKey(K key) {
        int bucketIndex = normalizeIndex(key.hashCode());
        return bucketSeekEntry(bucketIndex, key) != null;
    }
    
    //Insert, put and add all place a value in the hash-table
    public V put(K key, V value) { return insert(key, value); }
    public V add(K key, V value) { return insert(key, value); }
    
    public V insert(K key, V value) {
        if (key == null) throw new IllegalArgumentException("Null key");
        Entry<K, V> newEntry = new Entry<>(key, value);
        int bucketIndex = normalizeIndex(newEntry.hash);
        return bucketInsertEntry(bucketIndex, newEntry);
    }
    
    //Gets a key's value from the map and returns it.
    //NOTE: returns null if the value is null AND also returns
    //null if the key does not exist, so watch out
    public V get(K key) {
        if (key == null) return null;
        int bucketIndex = normalizeIndex(key.hashCode());
        Entry<K, V> entry = bucketSeekEntry(bucketIndex, key);
        if (entry != null) return entry.value;
        return null;
    }
    
    //Removes a key from the map and returns the value.
    //NOTE: returns null if the value is null AND also returns
    //null if the key does not exist.
    public V remove(K key) {
        if (key == null) return null;
        int bucketIndex = normalizeIndex(key.hashCode());
        return bucketRemoveEntry(bucketIndex, key);
    }
    
    //Removes an entry from a given bucket if it exists
    private V bucketRemoveEntry(int bucketIndex, K key) {
        Entry<K, V> entry = bucketSeekEntry(bucketIndex, key);
        if (entry != null) {
            LinkedList<Entry<K, V>> links = table[bucketIndex];
            links.remove(entry);
            --size;
            return entry.value;
        }
        return null;
    }
    
    //Inserts an entry in given bucket only if the entry does not already
    //exist in the given bucket, but if it does then update the entry value
    private V bucketInsertEntry(int bucketIndex, Entry<K, V> entry) {
        LinkedList<Entry<K, V>> bucket = table[bucketIndex];
        if (bucket == null) table[bucketIndex] = bucket = new LinkedList<>();
        
        Entry<K, V> existentEntry = bucketSeekEntry(bucketIndex, entry.key);
        if (existentEntry == null) {
            bucket.add(entry);
            if (++size > threshold) resizeTable();
            return null; //Use null to indicate that there was no previous entry
        } else {
            V oldValue = existentEntry.value;
            existentEntry.value = entry.value;
            return oldValue;
        }
    }
    
    //Finds and returns a particular entry in a given bucket if it exists, returns null otherwise
    private Entry<K, V> bucketSeekEntry(int bucketIndex, K key) {
        if (key == null) return null;
        LinkedList <Entry<K, V>> bucket = table[bucketIndex];
        if (bucket == null) return null;
        for (Entry<K, V> entry : bucket)
            if (entry.key.equals(key))
                return entry;
        return null;
    }
    
    //Resizes the internal table holding buckets of entries
    private void resizeTable() {
        capacity *= 2;
        threshold = (int) (capacity * maxLoadFactor);
        
        LinkedList<Entry<K, V>>[] newTable = new LinkedList[capacity];
        
        for (int i = 0; i < table.length; i++) {
            if (table[i] != null) {
                for (Entry<K, V> entry : table[i]) {
                    int bucketIndex = normalizeIndex(entry.hash);
                    LinkedList<Entry<K, V>> bucket = newTable[bucketIndex];
                    if (bucket == null) newTable[bucketIndex] = bucket = new LinkedList<>();
                    bucket.add(entry);
                }
                
                //Avoid memory leak, Help the GC
                table[i].clear();
                table[i] = null;
            }
        }
        
        table = newTable;
    }
    
    //Returns the list of keys found within the hash table
    public List<K> keys() {
        List<K> keys = new ArrayList<>(size());
        for (LinkedList<Entry<K, V>> bucket : table)
            if (bucket != null)
                for (Entry<K, V> entry : bucket)
                    keys.add(entry.key);
        return keys;
    }
    
    //Returns the list of values found within the hash table
    public List<V> values() {
        List<V> values = new ArrayList<>(size());
        for (LinkedList<Entry<K, V>> bucket : table)
            if (bucket != null)
                for (Entry<K, V> entry : bucket)
                    values.add(entry.value);
        return values;
    }
    
    //Return an iterator to iterate over all the keys in this map
    @Override public java.util.Iterator<K> iterator() {
        final int elementCount = size();
        return new java.util.Iterator<K>() {
            int bucketIndex = 0;
            java.util.Iterator<Entry<K, V>> bucketIter = (table[0] == null) ? null : table[0].iterator();
            
            @Override
            public boolean hasNext() {
                //An item was added or removed while iterating
                if (elementCount != size) throw new java.util.ConcurrentModificationException();
                //No iterator or the current iterator is empty
                if (bucketIter == null || !bucketIter.hasNext()) {
                    //Search next buckets until a valid iterator is found
                    while (++bucketIndex < capacity) {
                        if (table[bucketIndex] != null) {
                            //Make sure this iterator actually has elements -_-
                            java.util.Iterator<Entry<K,V>> nextIter = table[bucketIndex].iterator();
                            if (nextIter.hasNext()) {
                                bucketIter = nextIter;
                                break;
                            }
                        }
                    }
                }
                return bucketIndex < capacity;
            }
            
            @Override
            public K next() {
                return bucketIter.next().key;
            }
            
            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
    
    //Returns a string representation of this hash table
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("{");
        for (int i = 0; i < capacity; i++) {
            if (table[i] == null) continue;
            for (Entry<K, V> entry : table[i])
                sb.append(entry + ", ");
        }
        sb.append("}");
        return sb.toString();
    }
}

Open addressing basics

The goal of the Hash Table is to construct a mapping from keys to values.

Keys must be hashable and we need a hash function that converts keys to whole numbers.

We use the hash function defined on our key set to index into an array (the hash table).

Hash functions are not perfect, so sometimes two distinct keys k1 and k2 (k1 != k2) hash to the same value. When this happens we have a hash collision (i.e. H(k1) == H(k2)).

Open addressing is a way to solve this issue.

When using open addressing as a collision resolution technique, the key-value pairs are stored in the table (array) itself rather than in an auxiliary data structure, as is done in separate chaining.

This means we need to care a great deal about the size of our hash table and how many elements are currently in the table.

Load factor (alpha) = (items in table) / (size of table)

The O(1) constant time behavior attributed to hash tables assumes the load factor (alpha) is kept below a certain fixed value. This means once alpha > threshold we need to grow the table size (ideally exponentially, e.g. by doubling).
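
A minimal sketch of this resize rule. The class LoadFactorDemo, its method names and the maxAlpha parameter are hypothetical and only meant to illustrate the formula; the implementations in these notes track a precomputed threshold instead.

```java
// Hypothetical helper illustrating the load factor rule.
// It is not part of the hash table classes in these notes.
class LoadFactorDemo {

    // alpha = (items in table) / (size of table)
    static double loadFactor(int items, int tableSize) {
        return (double) items / tableSize;
    }

    // Capacity to use when one more item is about to be inserted:
    // the table doubles once alpha would exceed maxAlpha.
    static int capacityForNextInsert(int items, int tableSize, double maxAlpha) {
        return loadFactor(items + 1, tableSize) > maxAlpha ? tableSize * 2 : tableSize;
    }

    public static void main(String[] args) {
        System.out.println(loadFactor(6, 16));                   // 0.375
        System.out.println(capacityForNextInsert(6, 16, 0.45));  // 16, since 7/16 = 0.4375
        System.out.println(capacityForNextInsert(7, 16, 0.45));  // 32, since 8/16 = 0.5
    }
}
```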

When we want to insert a key-value pair (k, v) into the hash table, we hash the key and obtain an original position for where this key-value pair belongs, i.e. H(k).

If the position our key hashed to is occupied we try another position in the hash table by offsetting the current position subject to a probing sequence P(x). We keep doing this until an unoccupied slot is found.

There are infinitely many probing sequences you can come up with. Here are a few:

Linear probing: P(x) = ax + b where a, b are constants

Quadratic probing: P(x) = ax^2 + bx + c where a, b, c are constants

Double hashing: P(k, x) = x * H2(k) where H2(k) is a secondary hash function

Pseudo random number generator: P(k, x) = x * RNG(H(k), x) where RNG is a random number generator function seeded with H(k).

General insertion method for open addressing on a table of size N goes as follows:

x := 1
keyHash := H(k)
index := keyHash

while table[index] != null:
    index = (keyHash + P(k, x)) mod N
    x = x + 1
insert (k, v) at table[index]
        
Where H(k) is the hash for the key k and P(k, x) is the probing function
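
The pseudocode above could be turned into Java roughly as follows. This is only a sketch under some assumptions: it uses linear probing with P(x) = x, String keys and int values, never resizes (so the table must never fill up), and additionally stops on an equal key so that re-inserting a key overwrites its value, a case the pseudocode does not cover. The class name LinearProbingSketch is made up for illustration.

```java
// Minimal sketch of open-addressing insertion with linear probing, P(x) = x.
// Assumes the table is never completely full (no resizing, no deletion).
class LinearProbingSketch {
    final String[] keys;
    final int[] values;

    LinearProbingSketch(int n) {
        keys = new String[n];
        values = new int[n];
    }

    // Probing function; with a = 1 every slot is eventually visited.
    static int P(int x) { return x; }

    // Maps the key's hash code into the range [0, N).
    int H(String k) { return (k.hashCode() & 0x7fffffff) % keys.length; }

    // Inserts (k, v) and returns the slot index that was used.
    int insert(String k, int v) {
        int n = keys.length;
        int keyHash = H(k);
        int index = keyHash;
        int x = 1;
        // Keep probing until an empty slot or the same key is found.
        while (keys[index] != null && !keys[index].equals(k)) {
            index = (keyHash + P(x)) % n;
            x++;
        }
        keys[index] = k;
        values[index] = v;
        return index;
    }

    public static void main(String[] args) {
        LinearProbingSketch t = new LinearProbingSketch(8);
        // "Aa" and "BB" have the same hashCode in Java, so they collide.
        System.out.println(t.insert("Aa", 1)); // 0
        System.out.println(t.insert("BB", 2)); // 1
    }
}
```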

Chaos with cycles

Most randomly selected probing sequences modulo N will produce a cycle shorter than the table size.

This becomes problematic when you are trying to insert a key-value pair and all the buckets on the cycle are occupied because you will get stuck in an infinite loop!

Q: So that's concerning, how do we handle probing functions which produce cycles shorter than the table size?

A: In general the consensus is that we don't handle this issue; instead we avoid it altogether by restricting our domain of probing functions to those which produce a cycle of exactly length N*

*There are a few exceptions with special properties that can produce shorter cycles.

Q: Which value(s) of the constant a in P(x) = ax produce a full cycle modulo N?

A: This happens when a and N are relatively prime. Two numbers are relatively prime if their Greatest Common Divisor (GCD) is equal to one. Hence, when GCD(a, N) = 1 the probing function P(x) will be able to generate a complete cycle and we will always be able to find an empty bucket!

A common choice for P(x) is P(x) = 1x since GCD(N, 1) = 1 no matter the choice of N (table size).
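
To check this claim experimentally, the sketch below counts how many distinct slots P(x) = ax reaches modulo N. The class ProbeCycleDemo and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: how many slots does P(x) = a * x reach modulo n?
class ProbeCycleDemo {

    // Euclid's algorithm for the greatest common divisor.
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Number of distinct slots visited by P(x) = a * x (mod n)
    // over n consecutive probes starting at slot 0.
    static int distinctSlots(int a, int n) {
        Set<Integer> seen = new HashSet<>();
        for (int x = 0; x < n; x++) seen.add((a * x) % n);
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 9;
        for (int a = 1; a < n; a++) {
            System.out.println("a=" + a + " gcd=" + gcd(a, n)
                + " slots reached=" + distinctSlots(a, n));
        }
    }
}
```

For N = 9, a = 4 reaches all nine slots because GCD(4, 9) = 1, while a = 3 only ever reaches slots 0, 3 and 6.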

Solution to removing

The solution is to place a unique marker called a tombstone instead of null to indicate that a (k, v) pair has been deleted and that the bucket should be skipped during a search.

Q: I have a lot of tombstones cluttering my HT. How do I get rid of them?

A: Tombstones count as filled slots in the HT so they increase the load factor and will be removed when the table is resized. Additionally, when inserting a new (k, v) pair you can replace buckets with tombstones with the new key-value pair.

An optimization we can do is to replace the earliest tombstone encountered during a lookup with the key-value pair we were looking for. The next time we look up the key, it will be found much faster! We call this lazy deletion (or lazy relocation).
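
As a rough illustration of why tombstones are needed, here is a small linear-probing set of int keys. It is a sketch only: the class TombstoneSketch is hypothetical, it never resizes, and it assumes at least one slot always stays null so that every probe loop terminates.

```java
// Sketch of deletion with tombstones in a linear-probing table.
// Assumes the table always keeps at least one null slot (no resizing shown).
class TombstoneSketch {
    // Unique marker object; compared by reference, never equal to a real key.
    private static final Object TOMBSTONE = new Object();
    private final Object[] slots;

    TombstoneSketch(int n) { slots = new Object[n]; }

    private int home(int key) { return Math.floorMod(key, slots.length); }

    void add(int key) {
        int i = home(key);
        int firstTombstone = -1;
        while (slots[i] != null) {
            if (slots[i] == TOMBSTONE) {
                if (firstTombstone == -1) firstTombstone = i;
            } else if (slots[i].equals(key)) {
                return; // key already present
            }
            i = (i + 1) % slots.length;
        }
        // Reuse the earliest tombstone if one was seen, else the null slot.
        slots[firstTombstone != -1 ? firstTombstone : i] = key;
    }

    boolean contains(int key) {
        // Tombstones are skipped; only a null slot ends the search.
        for (int i = home(key); slots[i] != null; i = (i + 1) % slots.length)
            if (slots[i] != TOMBSTONE && slots[i].equals(key)) return true;
        return false;
    }

    boolean remove(int key) {
        for (int i = home(key); slots[i] != null; i = (i + 1) % slots.length) {
            if (slots[i] != TOMBSTONE && slots[i].equals(key)) {
                slots[i] = TOMBSTONE; // writing null here would cut the probe chain
                return true;
            }
        }
        return false;
    }
}
```

With a table of size 8, the keys 1, 9 and 17 all hash to slot 1 and end up in slots 1, 2 and 3. After removing 9, a search for 17 still has to walk past slot 2, which is exactly what the tombstone allows.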

Implementation of Hash Table with Quadratic Probing

import java.util.*;

@SuppressWarnings("unchecked")
public class HashTableQuadraticProbing<K, V> implements Iterable<K> {
    
    private double loadFactor;
    private int capacity;
    private int threshold;
    private int modificationCount = 0;
    
    //'usedBuckets' counts the total number of used buckets inside the
    //hash table (includes cells marked as deleted). While 'keyCount'
    //tracks the number of unique keys currently inside the hash table
    private int usedBuckets = 0;
    private int keyCount = 0;
    
    //These arrays store the key-value pairs
    private K[] keyTable;
    private V[] valueTable;
    
    //flag used to indicate whether an item was found in the hash table
    private boolean containsFlag = false;
    
    //Special marker token used to indicate the deletion of a key-value pair
    private final K TOMBSTONE = (K) (new Object());
    
    private static final int DEFAULT_CAPACITY = 8;
    private static final double DEFAULT_LOAD_FACTOR = 0.45;
    
    public HashTableQuadraticProbing() {
        this(DEFAULT_CAPACITY, DEFAULT_LOAD_FACTOR);
    }
    
    public HashTableQuadraticProbing(int capacity) {
        this(capacity, DEFAULT_LOAD_FACTOR);
    }
    
    public HashTableQuadraticProbing(int capacity, double loadFactor) {
        if (capacity <= 0)
            throw new IllegalArgumentException("Illegal capacity: " + capacity);
        if (loadFactor <= 0 || Double.isNaN(loadFactor) || Double.isInfinite(loadFactor))
            throw new IllegalArgumentException("Illegal loadFactor: " + loadFactor);
        this.loadFactor = loadFactor;
        this.capacity = Math.max(DEFAULT_CAPACITY, next2Power(capacity));
        threshold = (int) (this.capacity * loadFactor);
        
        keyTable = (K[]) new Object[this.capacity];
        valueTable = (V[]) new Object[this.capacity];
    }
    
    //Given a number this method finds the next 
    //Power of two above this value
    private static int next2Power(int n) {
        return Integer.highestOneBit(n) << 1;
    }
    
    //Quadratic probing function (x^2 + x)/2
    private static int P(int x) {
        return (x*x + x) >> 1;
    }
    
    //Converts a hash value to an index. Essentially, this strips the 
    //negative sign and places the hash value in the domain [0, capacity)
    private int normalizeIndex(int keyHash) {
        return (keyHash & 0x7fffffff) % capacity;
    }
    
    //Clears all the contents of the hash-table
    public void clear() {
        for (int i = 0; i < capacity; i++) {
            keyTable[i] = null;
            valueTable[i] = null;
        }
        keyCount = usedBuckets = 0;
        modificationCount++;
    }
    
    //Returns the number of keys currently inside the hash table
    public int size() { return keyCount; }
    
    //Returns true/false depending on whether the hash table is empty
    public boolean isEmpty() { return keyCount == 0; }
    
    //Insert, put and add all place a value in the hash table
    public V put(K key, V value) { return insert(key, value); }
    public V add(K key, V value) { return insert(key, value); }
    
    //Place a key-value pair into the hash-table. If the key already
    //exists inside the hash table then its value is updated
    public V insert(K key, V val) {
        if (key == null) throw new IllegalArgumentException("Null key");
        if (usedBuckets >= threshold) resizeTable();
        
        final int hash = normalizeIndex(key.hashCode());
        int i = hash; //current probe index, updated on every probe
        int j = -1; //the index of the first tombstone that we encounter
        int x = 1;
        
        do {
            //The current slot was previously deleted
            if (keyTable[i] == TOMBSTONE) {
                if (j == -1) j = i;
            } else if (keyTable[i] != null) {
                //The current cell already contains a key
                
                //The key we're trying to insert already exists in the hash table
                //so update its value with the most recent value
                if (keyTable[i].equals(key)) {
                    V oldValue = valueTable[i];
                    if (j == -1) {
                        valueTable[i] = val;
                    } else {
                        keyTable[i] = TOMBSTONE;
                        valueTable[i] = null;
                        keyTable[j] = key;
                        valueTable[j] = val;
                    }
                    modificationCount++;
                    return oldValue;
                }
            } else {
                //Current cell is null so an insertion/update can occur
                
                //No previously encountered deleted buckets
                if (j == -1) {
                    usedBuckets++;
                    keyCount++;
                    keyTable[i] = key;
                    valueTable[i] = val;
                } else {
                    //Previously seen deleted bucket. Instead of inserting
                    //the new element at i where the null is,
                    //insert it where the deleted token was found
                    keyCount++;
                    keyTable[j] = key;
                    valueTable[j] = val;
                }
                modificationCount++;
                return null;
            }
            i = normalizeIndex(hash + P(x++));
        } while (true);
    }
    
    //Returns true/false on whether a given key exists within the hash table
    public boolean containsKey(K key) {
        return hasKey(key);
    }
    
    //Returns true/false on whether a given key exists within the hash table
    public boolean hasKey(K key) {
        //sets the 'containsFlag'
        get(key);
        return containsFlag;
    }
    
    //Get the value associated with the input key.
    //NOTE: returns null if the value is null AND also returns
    //null if the key does not exist.
    public V get(K key) {
        if (key == null) throw new IllegalArgumentException("Null key");
        final int hash = normalizeIndex(key.hashCode());
        int i = hash;
        int j = -1; 
        int x = 1;
        
        //Starting at the original hash index quadratically probe until we find a spot where
        //our key is or we hit a null element in which case our element does not exist.
        do {
            //Ignore deleted cells, but record where the first index
            //of a deleted cell is found to perform lazy relocation later.
            if (keyTable[i] == TOMBSTONE) {
                if (j == -1) j = i;
            } else if (keyTable[i] != null) {
                //We hit a non-null key, perhaps it's the one we're looking for.
                //The key we want is in the hash table!
                if (keyTable[i].equals(key)) {
                    containsFlag = true;
                    
                    //If j != -1 this means we previously encountered a deleted cell.
                    //We can perform an optimization by swapping the entries in cells
                    //i and j so that the next time we search for this key it will be
                    //found faster. This is called lazy deletion/relocation.
                    if (j != -1) {
                        //Copy values to where deleted bucket is
                        keyTable[j] = keyTable[i];
                        valueTable[j] = valueTable[i];
                        //Clear the contents in bucket i and mark it as deleted
                        keyTable[i] = TOMBSTONE;
                        valueTable[i] = null;
                        return valueTable[j];
                    } else {
                        return valueTable[i];
                    }
                }
            } else {
                //Element was not found in the hash table
                containsFlag = false;
                return null;
            }
            i = normalizeIndex(hash + P(x++));
        } while (true);
    }
    
    //Removes a key from the map and returns the value.
    //NOTE: returns null if the value is null AND also returns
    //null if the key does not exist.
    public V remove(K key) {
        if (key == null) throw new IllegalArgumentException("Null key");
        final int hash = normalizeIndex(key.hashCode());
        int i = hash;
        int x = 1;
        
        //Starting at the hash index quadratically probe until we find a spot
        //where our key is or we hit a null element in which case our element does not exist.
        for (; ; i = normalizeIndex(hash + P(x++))) {
            //Ignore deleted cells
            if (keyTable[i] == TOMBSTONE) continue;
            
            //Key was not found in hash table.
            if (keyTable[i] == null) return null;
            
            //The key we want to remove is in the hash table!
            if (keyTable[i].equals(key)) {
                keyCount--;
                modificationCount++;
                V oldValue = valueTable[i];
                keyTable[i] = TOMBSTONE;
                valueTable[i] = null;
                return oldValue;
            }
        }
    }
    
    //Returns a list of keys found in the hash table
    public List<K> keys() {
        List<K> keys = new ArrayList<>(size());
        for (int i = 0; i < capacity; i++)
            if (keyTable[i] != null && keyTable[i] != TOMBSTONE)
                keys.add(keyTable[i]);
        return keys;
    }
    
    //Returns a list of non-unique values found in the hash table
    public List<V> values() {
        List<V> values = new ArrayList<>(size());
        for (int i = 0; i < capacity; i++)
            if (keyTable[i] != null && keyTable[i] != TOMBSTONE)
                values.add(valueTable[i]);
        return values;
    }
    
    //Double the size of the hash table
    private void resizeTable() {
        capacity *= 2;
        threshold = (int) (capacity * loadFactor);
        
        K[] oldKeyTable = (K[]) new Object[capacity];
        V[] oldValueTable = (V[]) new Object[capacity];
        
        //Perform key table pointer swap
        K[] keyTableTmp = keyTable;
        keyTable = oldKeyTable;
        oldKeyTable = keyTableTmp;
        
        //Perform value table pointer swap
        V[] valueTableTmp = valueTable;
        valueTable = oldValueTable;
        oldValueTable = valueTableTmp;
        
        //Reset the key count and buckets used since we are about to
        //re-insert all the keys into the hash table
        keyCount = usedBuckets = 0;
        
        for (int i = 0; i < oldKeyTable.length; i++) {
            if (oldKeyTable[i] != null && oldKeyTable[i] != TOMBSTONE)
                insert(oldKeyTable[i], oldValueTable[i]);
            oldValueTable[i] = null;
            oldKeyTable[i] = null;
        }
    }
    
    @Override
    public java.util.Iterator<K> iterator() {
        //Before the iteration begins record the number of modifications
        //done to the hash table. This value should not change as we iterate
        //otherwise a concurrent modification has occurred.
        final int MODIFICATION_COUNT = modificationCount;
        return new java.util.Iterator<K>() {
            int keysLeft = keyCount, index = 0;
            
            @Override
            public boolean hasNext() {
                //The contents of the table have been altered
                if (MODIFICATION_COUNT != modificationCount) throw new java.util.ConcurrentModificationException();
                return keysLeft != 0;
            }
            
            //Find the next element and return it 
            @Override
            public K next() {
                while (keyTable[index] == null || keyTable[index] == TOMBSTONE)
                    index++;
                keysLeft--;
                return keyTable[index++];
            }
            
            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
    
    //Return a String version of this hash table
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("{");
        for (int i = 0; i < capacity; i++)
            if (keyTable[i] != null && keyTable[i] != TOMBSTONE)
                sb.append(keyTable[i] + " => " + valueTable[i] + ", ");
        sb.append("}");
        return sb.toString();
    }
}
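
The class above combines the probing function P(x) = (x^2 + x)/2 with a capacity that is always a power of two. With that combination the probe sequence reaches every slot of the table, which the following sketch checks by brute force. TriangularProbeDemo and coversAllSlots are hypothetical names used only for this illustration.

```java
// Hypothetical check that P(x) = (x^2 + x) / 2 visits every slot
// when the table capacity is a power of two.
class TriangularProbeDemo {

    // Same probing function as in the class above.
    static int P(int x) {
        return (x * x + x) >> 1;
    }

    // Returns true if the first 'capacity' probes starting at 'start'
    // land on 'capacity' different slots.
    static boolean coversAllSlots(int start, int capacity) {
        boolean[] seen = new boolean[capacity];
        int count = 0;
        for (int x = 0; x < capacity; x++) {
            int i = (start + P(x)) % capacity;
            if (!seen[i]) {
                seen[i] = true;
                count++;
            }
        }
        return count == capacity;
    }

    public static void main(String[] args) {
        System.out.println(coversAllSlots(0, 8));  // true
        System.out.println(coversAllSlots(0, 12)); // false, 12 is not a power of two
    }
}
```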

Fenwick Tree (Binary Indexed Tree)

A Fenwick Tree (also called a Binary Indexed Tree) is a data structure that efficiently supports range sum queries, setting values in a static array, and getting the value of the prefix sum up to some index.

Complexity

Operation        Complexity
Construction     O(n)
Point Update     O(log(n))
Range Sum        O(log(n))
Range Update     O(log(n))
Adding Index     N/A
Removing Index   N/A

Range queries

Unlike a regular array, in a Fenwick tree a specific cell is responsible for other cells as well.

The position of the least significant bit (LSB) determines the range of responsibility that cell has to the cells below itself.

In a Fenwick tree we may compute the prefix sum up to a certain index, which ultimately lets us perform range sum queries.

Idea: Suppose you want to find the prefix sum of [1, i], then you start at i and cascade downwards until you reach zero adding the value at each of the indices you encounter.

To do a range query over [i, j] (both inclusive) on a Fenwick tree of size N:

function prefixSum(i):
    sum := 0
    while i != 0:
        sum = sum + tree[i]
        i = i - LSB(i)
    return sum

function rangeQuery(i, j):
    return prefixSum(j) - prefixSum(i - 1)
        
Where LSB returns the value of the least significant bit.
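
To make the cascade concrete, this sketch lists which cells a prefix sum reads and which range each cell is responsible for. The class FenwickQueryPathDemo and its methods are hypothetical helpers, separate from the implementation further down.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that traces a Fenwick tree prefix sum query.
class FenwickQueryPathDemo {

    // Value of the least significant set bit of i.
    static int lsb(int i) {
        return i & -i;
    }

    // Cell i stores the sum of the one-based range [i - LSB(i) + 1, i].
    static int[] responsibility(int i) {
        return new int[] { i - lsb(i) + 1, i };
    }

    // Indices read by prefixSum(i): i, i - LSB(i), ... until zero.
    static List<Integer> queryPath(int i) {
        List<Integer> path = new ArrayList<>();
        while (i != 0) {
            path.add(i);
            i -= lsb(i);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(queryPath(7)); // [7, 6, 4]
    }
}
```

For i = 7 the query reads cells 7, 6 and 4, which are responsible for [7, 7], [5, 6] and [1, 4]. Together these ranges cover [1, 7] exactly once.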

Fenwick tree point updates

Point updates are the opposite of prefix sums: we add the LSB to the index in order to propagate the value up to the cells responsible for us.

Point update algorithm
    
To update the cell at index i in a Fenwick tree of size N:

function add(i, x):
    while i < N:
        tree[i] = tree[i] + x
        i = i + LSB(i)
            
Where LSB returns the value of the least significant bit. For example:
LSB(12) = 4 because 12 = 1100 in binary and the least significant set bit of 1100 is 100, or 4 in base ten.
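
The same kind of trace works for point updates. In this sketch treeLength is assumed to be the length of the one-based tree array (index 0 unused), matching the bound used in the add pseudocode; FenwickUpdatePathDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that traces a Fenwick tree point update.
class FenwickUpdatePathDemo {

    // Value of the least significant set bit of i.
    static int lsb(int i) {
        return i & -i;
    }

    // Indices written by add(i, x): i, i + LSB(i), ... while inside the array.
    static List<Integer> updatePath(int i, int treeLength) {
        List<Integer> path = new ArrayList<>();
        while (i < treeLength) {
            path.add(i);
            i += lsb(i);
        }
        return path;
    }

    public static void main(String[] args) {
        // A tree over 16 elements uses an array of length 17.
        System.out.println(updatePath(3, 17)); // [3, 4, 8, 16]
    }
}
```

Each index on the path is a cell whose range of responsibility contains index 3, so those are the only cells that need to change.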

Naive Construction of Fenwick tree

Let A be an array of values. For each element in A at index i, do a point update on the Fenwick tree with a value of A[i]. There are n elements and each point update takes O(log(n)), for a total of O(n log(n)). Can we do better?

Linear Construction

Idea: Add the value in the current cell to the immediate cell that is responsible for us. This resembles what we did for point updates but only one cell at a time.

This will make the 'cascading' effect in range queries possible by propagating the value in each cell throughout the tree.

Let i be the current index

The immediate cell above us is at position j given by:

j := i + LSB(i) Where LSB is the Least Significant Bit of i. Ignore updating j if index is out of bounds.

Construction Algorithm
    
#Make sure values is 1-based!
function construct(values):
    N := length(values)

    #Clone the values array since we're
    #doing in place operations
    tree = deepCopy(values)

    for i = 1, 2, 3, ... N:
        j := i + LSB(i)
        if j < N:
            tree[j] = tree[j] + tree[i]

    return tree

Implementation of Fenwick Tree

public class FenwickTree {
    
    //This array contains the Fenwick tree ranges
    private long[] tree;
    
    //Create an empty Fenwick Tree
    public FenwickTree(int sz) {
        tree = new long[sz + 1];
    }
    
    //Make sure the 'values' array is one based meaning
    //values[0] does not get used, O(n) construction
    public FenwickTree(long[] values) {
        if (values == null)
            throw new IllegalArgumentException("Values array cannot be null!");
        
        //Make a clone of the values array since we manipulate
        //the array in place destroying all its original content
        this.tree = values.clone();
        
        for (int i = 1; i < tree.length; i++) {
            int j = i + lsb(i);
            if (j < tree.length) tree[j] += tree[i];
        }
    }
    
    //Returns the value of the least significant bit (LSB)
    //lsb(108) = lsb(0b1101100) = 0b100 = 4
    //lsb(104) = lsb(0b1101000) = 0b1000 = 8
    //lsb(96) = lsb(0b1100000) = 0b100000 = 32
    //lsb(64) = lsb(0b1000000) = 0b1000000 = 64
    private int lsb(int i) {
        //Isolates the lowest one bit value
        return i & -i;
        
        //An alternative is to use Java's built-in method:
        //return Integer.lowestOneBit(i);
    }
    
    //Computes the prefix sum from [1, i], one based
    public long prefixSum(int i) {
        long sum = 0L;
        while (i != 0) {
            sum += tree[i];
            i &= ~lsb(i);//equivalently, i -= lsb(i)
        }
        return sum;
    }
    
    //Returns the sum of the interval [i, j], one based
    public long sum(int i, int j) {
        if (j < i) throw new IllegalArgumentException("Make sure j >= i");
        return prefixSum(j) - prefixSum(i - 1);
    }
    
    //Add 'k' to index 'i', one based
    public void add(int i, long k) {
        while (i < tree.length) {
            tree[i] += k;
            i += lsb(i);
        }
    }
    
    //Set index i to be equal to k, one based
    public void set(int i, long k) {
        long value = sum(i, i);
        add(i, k - value);
    }
    
    @Override
    public String toString() {
        return java.util.Arrays.toString(tree);
    }
}

Suffix Array

What is a suffix?

A suffix is a substring at the end of a string of characters. For our purposes suffixes are non empty.

What is a suffix array (SA)?

A suffix array is an array which contains all the sorted suffixes of a string.

The actual 'suffix array' is the array of sorted indices.

This provides a compressed representation of the sorted suffixes without actually needing to store the suffixes.
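
A naive way to obtain the sorted indices is to sort the suffix start positions by comparing the suffixes themselves. This is only a sketch to show what the array contains; it is far slower than dedicated suffix array algorithms, and the class name NaiveSuffixArray is hypothetical.

```java
import java.util.Arrays;

// Naive suffix array construction by sorting suffix start indices.
// Illustrative only: each comparison may copy and scan whole suffixes.
class NaiveSuffixArray {

    // Returns the start indices of all suffixes of s in sorted order.
    static int[] build(String s) {
        Integer[] idx = new Integer[s.length()];
        for (int i = 0; i < idx.length; i++) idx[i] = i;

        // Order indices by the lexicographic order of their suffixes.
        Arrays.sort(idx, (a, b) -> s.substring(a).compareTo(s.substring(b)));

        int[] sa = new int[idx.length];
        for (int i = 0; i < sa.length; i++) sa[i] = idx[i];
        return sa;
    }

    public static void main(String[] args) {
        // Sorted suffixes of "camel": amel, camel, el, l, mel
        System.out.println(Arrays.toString(build("camel"))); // [1, 0, 3, 4, 2]
    }
}
```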

(skip this part...)

Balanced Binary Search Trees (BBSTs)

A Balanced Binary Search Tree (BBST) is a self-balancing binary search tree. This type of tree will adjust itself in order to maintain a low (logarithmic) height allowing for faster operations such as insertions and deletions.

Tree rotations

The secret ingredient to most BBST algorithms is the clever usage of a tree invariant and tree rotations.

A tree invariant is a property/rule you impose on your tree that it must meet after every operation. To ensure that the invariant is always satisfied a series of tree rotations are normally applied.

It does not matter what the structure of the tree looks like; all we care about is that the BST invariant holds. This means we can shuffle/transform/rotate the values and nodes in the tree as we please as long as the BST invariant remains satisfied!

function rightRotate(A):
    B := A.left
    A.left = B.right
    B.right = A
    return B

function leftRotate(B):
    A := B.right
    B.right = A.left
    A.left = B
    return A

NOTE: It's possible that before the rotation node A had a parent whose left/right pointer referenced it. It's very important that this link be updated to reference B. This is usually done on the recursive callback using the return value of rightRotate.
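
A small Java sketch of the two rotations, showing that the in-order sequence (and therefore the BST invariant) is unchanged and that the caller is the one who re-links the returned node. RotationDemo and its Node class are hypothetical and store plain int values.

```java
// Hypothetical demo of tree rotations on a plain binary search tree.
class RotationDemo {

    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // A's left child B becomes the new root of this subtree.
    static Node rightRotate(Node a) {
        Node b = a.left;
        a.left = b.right;
        b.right = a;
        return b;
    }

    // B's right child A becomes the new root of this subtree.
    static Node leftRotate(Node b) {
        Node a = b.right;
        b.right = a.left;
        a.left = b;
        return a;
    }

    // In-order traversal as a string, used to compare trees.
    static String inorder(Node n) {
        return n == null ? "" : inorder(n.left) + n.value + " " + inorder(n.right);
    }

    public static void main(String[] args) {
        Node root = new Node(4);
        root.left = new Node(2);
        root.right = new Node(5);
        root.left.left = new Node(1);
        root.left.right = new Node(3);

        String before = inorder(root);
        // The caller stores the returned node, which updates the link to the subtree.
        root = rightRotate(root);
        System.out.println(root.value);                   // 2
        System.out.println(before.equals(inorder(root))); // true
    }
}
```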

rotate method of a node when it has a reference to its parent

function rightRotate(A):
    P := A.parent
    B := A.left
    A.left = B.right
    if B.right != null:
        B.right.parent = A
    B.right = A
    A.parent = B
    B.parent = P
    # Update parent down link
    if P != null:
        if P.left == A:
            P.left = B
        else:
            P.right = B
    return B

An AVL tree is one of many types of Balanced Binary Search Trees which allow for logarithmic O(log(n)) insertion, deletion and search operations.

In fact, it was the first type of BBST to be discovered. Soon after, many other types of BBSTs started to emerge including the 2-3 tree, the AA tree, the scapegoat tree, and its main rival, the red-black tree.

The property which keeps an AVL tree balanced is called the Balance Factor (BF).

BF(node) = H(node.right) - H(node.left)

Where H(x) is the height of node x. Recall that H(x) is calculated as the number of edges between x and the furthest leaf.

The invariant in the AVL which forces it to remain balanced is the requirement that the balance factor is always -1, 0 or +1.
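
As a quick check of these definitions, the sketch below computes heights and balance factors recursively and tests the invariant on a whole tree. It recomputes heights on every call for clarity, whereas the implementation later in these notes caches them in each node. BalanceFactorDemo is a hypothetical name.

```java
// Hypothetical demo of the AVL balance factor and invariant.
class BalanceFactorDemo {

    static class Node {
        Node left, right;
    }

    // Height counted in edges; an empty subtree has height -1
    // so that a single leaf has height 0.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    // BF(node) = H(node.right) - H(node.left)
    static int balanceFactor(Node n) {
        return height(n.right) - height(n.left);
    }

    // True if every node has a balance factor of -1, 0 or +1.
    static boolean isAvlBalanced(Node n) {
        if (n == null) return true;
        int bf = balanceFactor(n);
        return bf >= -1 && bf <= 1 && isAvlBalanced(n.left) && isAvlBalanced(n.right);
    }

    public static void main(String[] args) {
        // A chain of three nodes leaning left violates the invariant.
        Node root = new Node();
        root.left = new Node();
        root.left.left = new Node();
        System.out.println(balanceFactor(root)); // -2
        System.out.println(isAvlBalanced(root)); // false
    }
}
```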

Node Information to Store

  • The actual value we're storing in the node. NOTE: This value must be comparable so we know how to insert it.

  • A value storing this node's balance factor.

  • The height of this node in the tree.

  • Pointers to the left/right child nodes.

#Public facing insert method. Returns true 
#on successful insert and false otherwise.

function insert(value):
	if value == null:
		return false
		
	#Only insert unique values
	if !contains(root, value):
		root = insert(root, value)
		nodeCount = nodeCount + 1
		return true
		
	#Value already exists in tree
	return false
            
#private insert method
function insert(node, value):
	if node == null: return Node(value)
	
	#Invoke the comparator function in whatever
	#programming language you're using
	cmp := compare(value, node.value)
	
	if cmp < 0:
		node.left = insert(node.left, value)
	else:
		node.right = insert(node.right, value)
		
	#update balance factor and height values.
	update(node)
	
	#Rebalance tree
	return balance(node)
	
function update(node):
	
	#Variables for left/right subtree heights
	lh := -1
	rh := -1
	if node.left != null: lh = node.left.height
	if node.right != null: rh = node.right.height
	
	#update this node's height
	node.height = 1 + max(lh, rh)
	
	#update balance factor
	node.bf = rh - lh
	
function balance(node):
	#Left heavy subtree.
	if node.bf == -2:
		if node.left.bf <= 0:
			return leftLeftCase(node)
		else:
			return leftRightCase(node)
			
	#Right heavy subtree
	else if node.bf == +2:
		if node.right.bf >= 0:
			return rightRightCase(node)
		else:
			return rightLeftCase(node)
			
	#Node has balance factor of -1, 0 or +1
	#which we do not need to balance.
	return node
	
function leftLeftCase(node):
	return rightRotation(node)
	
function leftRightCase(node):
	node.left = leftRotation(node.left)
	return leftLeftCase(node)
	
function rightRightCase(node):
	return leftRotation(node)
	
function rightLeftCase(node):
	node.right = rightRotation(node.right)
	return rightRightCase(node)
	
function rightRotation(A):
	B := A.left
	A.left = B.right
	B.right = A
	#After rotation update balance
	#factor and height values
	update(A)
	update(B)
	return B
	
AVL tree rotations require you to call the update method! The left rotation is symmetric.

Implementation of AVL tree

public class AVLTreeRecursive<T extends Comparable<T>> implements Iterable<T> {
    
    class Node implements TreePrinter.PrintableNode {
        
        //'bf' is short for balance factor
        int bf;
        
        //The value/data contained within the node.
        T value;
        
        //The height of this node in the tree.
        int height;
        
        //The left and the right children of this node.
        Node left;
        Node right;
        
        public Node(T value) {
            this.value = value;
        }
        
        @Override
        public Node getLeft() {
            return left;
        }
        
        @Override
        public Node getRight() {
            return right;
        }
        
        @Override
        public String getText() {
            return String.valueOf(value);
        }
    }
    
    //The root node of the AVL tree
    /* private */Node root;
    
    //Tracks the number of nodes inside the tree
    private int nodeCount = 0;
    
    //The height of a rooted tree is the number of edges between the tree's
    //root and its furthest leaf. This means that a tree containing a single
    //node has a height of 0
    public int height() {
        if (root == null) return 0;
        return root.height;
    }
    
    //Returns the number of nodes in the tree
    public int size() {
        return nodeCount;
    }
    
    //Returns whether or not the tree is empty.
    public boolean isEmpty() {
        return size() == 0;
    }
    
    //Prints a visual representation of the tree to the console
    public void display() {
        TreePrinter.print(root);
    }
    
    //Returns true/false depending on whether a value exists in the tree.
    public boolean contains(T value) {
        return contains(root, value);
    }
    
    //Recursive contains helper method.
    private boolean contains(Node node, T value) {
        if (node == null) return false;
        
        //Compare current value to the value in the node
        int cmp = value.compareTo(node.value);
        
        //Dig into left subtree
        if (cmp < 0) return contains(node.left, value);
        
        //Dig into right subtree
        if (cmp > 0) return contains(node.right, value);
        
        //Found value in tree
        return true;
    }
    
    //Insert/add a value to the AVL tree. The value must not be null, O(log(n))
    public boolean insert(T value) {
        if (value == null) return false;
        if (!contains(root, value)) {
            root = insert(root, value);
            nodeCount++;
            return true;
        }
        return false;
    }
    
    //Inserts a value inside the AVL tree.
    private Node insert(Node node, T value) {
        //Base case.
        if (node == null) return new Node(value);
        
        //compare current value to the value in the node
        int cmp = value.compareTo(node.value);
        
        //insert node in left subtree
        if (cmp < 0) {
            node.left = insert(node.left, value);
        } else {
            //Insert node in right subtree
            node.right = insert(node.right, value);
        }
        
        //Update balance factor and height values.
        update(node);
        
        //Re-balance tree.
        return balance(node);
    }
    
    //Update a node's height and balance factor
    private void update(Node node) {
        int leftNodeHeight = (node.left == null) ? -1 : node.left.height;
        int rightNodeHeight = (node.right == null) ? -1 : node.right.height;
        
        //Update this node's height.
        node.height = 1 + Math.max(leftNodeHeight, rightNodeHeight);
        
        //Update balance factor.
        node.bf = rightNodeHeight - leftNodeHeight;
    }
    
    //Re-balance a node if its balance factor is +2 or -2
    private Node balance(Node node) {
        //Left heavy subtree
        if (node.bf == -2) {
            //Left-Left case.
            if (node.left.bf <= 0) {
                return leftLeftCase(node);
                
            //Left-Right case
            } else {
                return leftRightCase(node);
            }
            
        //Right heavy subtree needs balancing.
        } else if (node.bf == 2) {
            //Right-Right case
            if (node.right.bf >= 0) {
                return rightRightCase(node);
                
            //Right-Left case
            } else {
                return rightLeftCase(node);
            }
        }
        
        //Node has a balance factor of 0, +1 or -1, which is fine.
        return node;
    }
    
    private Node leftLeftCase(Node node) {
        return rightRotation(node);
    }
    
    private Node leftRightCase(Node node) {
        node.left = leftRotation(node.left);
        return leftLeftCase(node);
    }
    
    private Node rightRightCase(Node node) {
        return leftRotation(node);
    }
    
    private Node rightLeftCase(Node node) {
        node.right = rightRotation(node.right);
        return rightRightCase(node);
    }
    
    private Node leftRotation(Node node) {
        Node newParent = node.right;
        node.right = newParent.left;
        newParent.left = node;
        update(node);
        update(newParent);
        return newParent;
    }
    
    private Node rightRotation(Node node) {
        Node newParent = node.left;
        node.left = newParent.right;
        newParent.right = node;
        update(node);
        update(newParent);
        return newParent;
    }
    
    //Remove a value from this binary tree if it exists, O(log(n))
    public boolean remove(T elem) {
        if (elem == null) return false;
        
        if (contains(root, elem)) {
            root = remove(root, elem);
            nodeCount--;
            return true;
        }
        return false;
    }
    
    //Removes a value from the AVL tree.
    private Node remove(Node node, T elem) {
        if (node == null) return null;
        
        int cmp = elem.compareTo(node.value);
        
        //Dig into left subtree, the value we're looking
        //for is smaller than the current value
        if (cmp < 0) {
            node.left = remove(node.left, elem);
            
        //Dig into right subtree, the value we're looking
        //for is greater than the current value.
        } else if (cmp > 0) {
            node.right = remove(node.right, elem);
            
        //Found the node we wish to remove
        } else {
            
            //This is the case with only a right subtree or no subtree at all.
            //In this situation just swap the node we wish to remove
            //with its right child
            if (node.left == null) {
                return node.right;
                
            //This is the case with only a left subtree or 
            //no subtree at all. In this situation just
            //swap the node we wish to remove with its left child
            } else if (node.right == null) {
                return node.left;
                
            //When removing a node with two children, its replacement can be
            //either the largest value in the left subtree (the in-order
            //predecessor) or the smallest value in the right subtree (the
            //in-order successor). As a heuristic, remove from the taller
            //subtree in the hope that this helps with balancing.
            } else {
                //Choose to remove from left subtree
                if (node.left.height > node.right.height) {
                    //Swap the value of the successor into the node.
                    T successorValue = findMax(node.left);
                    node.value = successorValue;
                    
                    //Remove the largest node in the left subtree, whose
                    //value was just copied into this node
                    node.left = remove(node.left, successorValue);
                } else {
                    //Swap the value of the successor into the node
                    T successorValue = findMin(node.right);
                    node.value = successorValue;
                    
                    //Go into the right subtree and remove the leftmost node
                    //we found and swapped data with. This prevents us from having
                    //two nodes in our tree with the same value
                    node.right = remove(node.right, successorValue);
                }
            }
        }
        
        //update balance factor and height values
        update(node);
        
        //Re-balance tree
        return balance(node);
    }
    
    //Helper method to find the leftmost node (which has the smallest value)
    private T findMin(Node node) {
        while (node.left != null)
            node = node.left;
        return node.value;
    }
    
    //Helper method to find the rightmost node (which has the largest value)
    private T findMax(Node node) {
        while (node.right != null) 
            node = node.right;
        return node.value;
    }
    
    //Returns an iterator to traverse the tree in order.
    public java.util.Iterator<T> iterator() {
        final int expectedNodeCount = nodeCount;
        final java.util.Stack<Node> stack = new java.util.Stack<>();
        stack.push(root);
        
        return new java.util.Iterator<T> () {
            Node trav = root;
            @Override
            public boolean hasNext() {
                if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
                return root != null && !stack.isEmpty();
            }
            
            @Override
            public T next() {
                if (expectedNodeCount != nodeCount) throw new java.util.ConcurrentModificationException();
                
                while (trav != null && trav.left != null) {
                    stack.push(trav.left);
                    trav = trav.left;
                }
                
                Node node = stack.pop();
                
                if (node.right != null) {
                    stack.push(node.right);
                    trav = node.right;
                }
                return node.value;
            }
            
            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
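
To see the rotation logic above in isolation, here is a minimal sketch of the right-right case. It assumes a simplified, hypothetical `Node` holding an `int` value (the class name `AvlRotationSketch` is also made up for this example), and it reuses the same `update` and `leftRotation` steps as the tree above. It is only an illustration, not a replacement for the full class.

```java
class AvlRotationSketch {

    // Hypothetical, simplified node: int values only.
    static class Node {
        int value, height, bf;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Recompute a node's height and balance factor from its children.
    static void update(Node node) {
        int leftNodeHeight = (node.left == null) ? -1 : node.left.height;
        int rightNodeHeight = (node.right == null) ? -1 : node.right.height;
        node.height = 1 + Math.max(leftNodeHeight, rightNodeHeight);
        node.bf = rightNodeHeight - leftNodeHeight;
    }

    // Rotate left around node and return the new subtree root.
    static Node leftRotation(Node node) {
        Node newParent = node.right;
        node.right = newParent.left;
        newParent.left = node;
        update(node);       // the old root is now lower, so update it first
        update(newParent);
        return newParent;
    }

    public static void main(String[] args) {
        // Build the degenerate chain 1 -> 2 -> 3 (a right-right case).
        Node a = new Node(1), b = new Node(2), c = new Node(3);
        a.right = b;
        b.right = c;
        update(c);
        update(b);
        update(a);
        System.out.println("bf before rotation: " + a.bf);

        Node root = leftRotation(a);
        System.out.println("new root: " + root.value
                + ", bf: " + root.bf + ", height: " + root.height);
    }
}
```

Before the rotation the root has a balance factor of +2; afterwards the middle value becomes the root with one child on each side, which is the situation `balance` is trying to restore.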

Indexed Priority Queue

What is an Indexed Priority Queue?

An Indexed Priority Queue is a variant of the traditional priority queue that, on top of the regular PQ operations, supports quick updates and deletions of key-value pairs.

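#Inserts a value into the min indexed binary
#heap. The key index must not already be in
#the heap and the value must not be null.
#values[ki] holds the value for key index ki,
#pm[ki] is the heap position of key index ki and
#im[i] is the key index stored at heap position i.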
function insert(ki, value):
	values[ki] = value
	#'sz' is the current size of the heap
	pm[ki] = sz
	im[sz] = ki
	swim(sz)
	sz = sz + 1
	
#Swims up node i (zero based) until heap
#invariant is satisfied.
function swim(i):
	for (p = (i-1)/2; i > 0 and less(i, p)):
		swap(i, p)
		i = p
		p = (i-1)/2
		
function swap(i, j):
	pm[im[j]] = i
	pm[im[i]] = j
	tmp = im[i]
	im[i] = im[j]
	im[j] = tmp
	
function less(i, j):
	return values[im[i]] < values[im[j]]
        
#Deletes the node with the key index ki
#in the heap. The key index ki must exist
#and be present in the heap.
function remove(ki):
	i = pm[ki]
	#Shrink the heap first so that 'sz' is the
	#position of the last node, then swap it with i
	sz = sz - 1
	swap(i, sz)
	sink(i)
	swim(i)
	values[ki] = null
	pm[ki] = -1
	im[sz] = -1
	
#Sinks the node at index i by swapping
#itself with the smallest of the left
#or the right child node.
function sink(i):
	while true:
		left = 2*i + 1
		right = 2*i + 2
		smallest = left
		
		if right < sz and less(right, left):
			smallest = right
			
		if left >= sz or less(i, smallest):
			break
			
		swap(smallest, i)
		i = smallest
		
#Updates the value of a key in the binary
#heap. The key index must exist and the 
#value must not be null.
function update(ki, value):
	i = pm[ki]
	values[ki] = value
	sink(i)
	swim(i)
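
The pseudocode above can be turned into a small runnable Java sketch. The version below is only an illustration under a few assumptions: values are plain `Integer`s, key indices are `int`s in the range `[0, maxSize)`, and the class name `MinIndexedHeapSketch` and the helper `pollMinKeyIndex` are hypothetical names that do not appear in the notes. The arrays `values`, `pm` and `im` play the same roles as in the pseudocode.

```java
import java.util.Arrays;

class MinIndexedHeapSketch {
    final Integer[] values; // values[ki] = value associated with key index ki
    final int[] pm;         // pm[ki] = heap position of key index ki, or -1
    final int[] im;         // im[i] = key index at heap position i, or -1
    int sz = 0;             // current number of elements in the heap

    MinIndexedHeapSketch(int maxSize) {
        values = new Integer[maxSize];
        pm = new int[maxSize];
        im = new int[maxSize];
        Arrays.fill(pm, -1);
        Arrays.fill(im, -1);
    }

    int size() { return sz; }
    boolean contains(int ki) { return pm[ki] != -1; }

    void insert(int ki, int value) {
        if (contains(ki)) throw new IllegalArgumentException("key index in use: " + ki);
        values[ki] = value;
        pm[ki] = sz;
        im[sz] = ki;
        swim(sz);
        sz++;
    }

    void update(int ki, int value) {
        int i = pm[ki];
        values[ki] = value;
        sink(i);
        swim(i);
    }

    void remove(int ki) {
        int i = pm[ki];
        sz--;           // sz now points at the last node in the heap
        swap(i, sz);
        sink(i);
        swim(i);
        values[ki] = null;
        pm[ki] = -1;
        im[sz] = -1;
    }

    // Hypothetical helper: removes and returns the key index with the smallest value.
    int pollMinKeyIndex() {
        int ki = im[0];
        remove(ki);
        return ki;
    }

    private void swim(int i) {
        for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
            swap(i, p);
            i = p;
        }
    }

    private void sink(int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2, smallest = left;
            if (right < sz && less(right, left)) smallest = right;
            if (left >= sz || less(i, smallest)) break;
            swap(smallest, i);
            i = smallest;
        }
    }

    private void swap(int i, int j) {
        pm[im[j]] = i;
        pm[im[i]] = j;
        int tmp = im[i];
        im[i] = im[j];
        im[j] = tmp;
    }

    private boolean less(int i, int j) {
        return values[im[i]] < values[im[j]];
    }

    public static void main(String[] args) {
        MinIndexedHeapSketch pq = new MinIndexedHeapSketch(8);
        pq.insert(0, 30);
        pq.insert(1, 10);
        pq.insert(2, 20);
        pq.update(0, 5); // key index 0 now has the smallest value
        pq.remove(1);    // delete key index 1 directly
        while (pq.size() > 0) System.out.println(pq.pollMinKeyIndex());
    }
}
```

Because `pm` gives the heap position of any key index in constant time, `update` and `remove` never have to search the heap; they only pay for the `sink` and `swim` passes.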
