Review of the knapsack problem: recall the definition of an optimization problem (maximize or minimize a certain objective function under certain constraints). There is also a yes/no decision version.
In most cases, if we can solve the yes/no decision version in polynomial time, then we can solve the optimization version in polynomial time as well.
Methodology for dealing with optimization problems: greedy algorithms (easy to implement and fast, but usually not optimal) and search algorithms (can find the optimal solution, but usually very slow).
Can we solve it with a divide-and-conquer-like algorithm? Subproblem: A is a table (2-D array), where A[k, j] is the best value you can achieve using items 0 through k and volume j.
The dependency is something like this: A[k+1, s] = max( value(k+1) + A[k, s-weight(k+1)], A[k, s] ), where the first option is only available when weight(k+1) <= s.
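The recurrence above can be sketched as a table-filling loop. This is a minimal sketch, not a definitive implementation: the function name `knapsack_table`, the 0-indexed item vectors, and the assumption that weights are non-negative integers are all choices made here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 0/1 knapsack with the full table (assumes non-negative integer weights).
// best[k][s] = best total value using only the first k items with capacity s.
// Items are 0-indexed, so item k-1 plays the role of the "k-th" item.
int knapsack_table(const std::vector<int>& value,
                   const std::vector<int>& weight,
                   int capacity) {
    assert(value.size() == weight.size());
    assert(capacity >= 0);
    const std::size_t n = value.size();
    std::vector<std::vector<int>> best(n + 1, std::vector<int>(capacity + 1, 0));
    for (std::size_t k = 1; k <= n; ++k) {
        for (int s = 0; s <= capacity; ++s) {
            best[k][s] = best[k - 1][s];            // option 1: skip the item
            if (weight[k - 1] <= s) {               // option 2: take it if it fits
                best[k][s] = std::max(
                    best[k][s],
                    value[k - 1] + best[k - 1][s - weight[k - 1]]);
            }
        }
    }
    return best[n][capacity];
}
```

Keeping the whole table costs O(n * capacity) memory, but it is what makes it possible to trace back which items were chosen.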
Dynamic programming: here "programming" means optimization/planning, not coding.
Claim: instead of using a table, we can just use a 1-D array!
Proof: Observe that, in fact, when we are looking at the i-th item, the only information we need is the last row, i.e. dp[i-1,…]. That is, most dp entries are kind of useless. In fact, we could find the maximum value (but this way cannot tell us the optimal way of packing, sadly), by just using a one-dimensional array. This reduces the space complexity.
Specifically, we do the following
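// suppose there are n items and the maximum capacity is m
// V is the value vector, W is the weight vector
dp = array of size m+1, indexed 0..m, initialized to 0;
for i in 1:n
    // j must run from m DOWN to W[i]: dp[j-W[i]] must still hold the value
    // from row i-1, otherwise item i could be packed more than once.
    // Stopping at W[i] also takes care of the boundary check.
    for j in m:-1:W[i]
        dp[j] = max(dp[j], V[i] + dp[j-W[i]]);
// compare it with the original 2-D case:
// dp[i, j] = max(dp[i-1, j], V[i] + dp[i-1, j-W[i]])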
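A runnable version of the 1-D idea might look like the following. It is a sketch under assumptions: the name `knapsack_1d` and the 0-indexed vectors are illustrative, and weights are assumed to be non-negative integers. The inner loop runs over capacities in decreasing order so that `dp[j - weight[i]]` still refers to the previous row, which keeps each item used at most once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 0/1 knapsack using a single row (assumes non-negative integer weights).
// After processing item i, dp[j] is the best value with capacity j
// using items 0..i.
int knapsack_1d(const std::vector<int>& value,
                const std::vector<int>& weight,
                int capacity) {
    assert(value.size() == weight.size());
    assert(capacity >= 0);
    std::vector<int> dp(capacity + 1, 0);
    for (std::size_t i = 0; i < value.size(); ++i) {
        // Decreasing j: dp[j - weight[i]] has not been updated for item i yet.
        for (int j = capacity; j >= weight[i]; --j) {
            dp[j] = std::max(dp[j], value[i] + dp[j - weight[i]]);
        }
    }
    return dp[capacity];
}
```

With an increasing inner loop the same code would solve the unbounded variant, where an item may be taken repeatedly. For one item of value 10 and weight 1 with capacity 3, the decreasing loop gives 10, while an increasing loop would give 30.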
Divide and conquer:
Break up a problem into 2 subproblems, solve each independently, and combine the solutions to the subproblems to form a solution for the original problem.
Dynamic programming:
Break up a problem into several combinations of subproblems, and build up solutions from smaller subproblems to larger and larger ones by optimizing the choice among the combinations.
Homework: meeting schedule problem
Think about the methodology and the mathematical structure. Why does this problem have a polynomial-time algorithm, while the knapsack problem has (so far) only a pseudo-polynomial one?
Associative Data Structure
Abstract relationship:
For example:
Set U, a collection of items; the Cartesian product of U: $U \times U$; a relationship set $E \subset U \times U$ describing the relationship between items. Elements of E are called edges or relations. If we call (a,b) an edge, we imagine that element a can reach b directly. If we call it a relation, it is more of an abstract concept.
A concrete example:
U = all students on campus
E = {(a,b) if a knows about b}
U = all accounts on Weibo
E = {(a,b) if a follows b}
U = all bus stations in town
E = {(a,b) if the next stop of a is b}
The relationship could be:
- If for any (a,b) in E, (b,a) is also in E, then E is symmetric; otherwise E is not symmetric
- If whenever (a,b) is in E and (b,c) is in E, (a,c) is also in E, then E is transitive
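The two properties above can be checked directly on a finite relation. The sketch below assumes items are labeled by integers and stores E as a set of pairs. The alias `Relation` and the function names are made up for illustration.

```cpp
#include <cassert>
#include <set>
#include <utility>

// A finite relation E as a set of ordered pairs (a, b); items are ints here.
using Relation = std::set<std::pair<int, int>>;

// Symmetric: every (a, b) in E has its mirror (b, a) in E.
bool is_symmetric(const Relation& E) {
    for (const auto& e : E) {
        if (E.count(std::make_pair(e.second, e.first)) == 0) return false;
    }
    return true;
}

// Transitive: whenever (a, b) and (b, c) are in E, (a, c) is in E as well.
bool is_transitive(const Relation& E) {
    for (const auto& e1 : E) {
        for (const auto& e2 : E) {
            if (e1.second == e2.first &&
                E.count(std::make_pair(e1.first, e2.second)) == 0) {
                return false;
            }
        }
    }
    return true;
}

// Small usage example: "follows" with 1 -> 2 only is not symmetric.
void relation_example() {
    Relation follows;
    follows.insert(std::make_pair(1, 2));
    assert(!is_symmetric(follows));
    assert(is_transitive(follows));  // no chain (a,b),(b,c) exists to violate it
}
```

The transitivity check is quadratic in the number of pairs, which is fine for small examples like the ones in these notes.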
Path from a to b is a sequence (ordered set) of edges satisfying
- If (u,v) is the immediate next element of (s,t) in the sequence, then t = u
- the first element of the sequence starts with (a,x) and the last element of the sequence ends with (y,b)
- no edge appears twice in the sequence (not allowed going back and forth)
A relationship is called cyclic if it contains a cycle (a path of length at least two from a to b with a = b).
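One way to test whether a finite relation is cyclic in the sense above is a depth-first search that reports an edge leading back to a node on the current search path. This is a sketch under assumptions: items are integers, the relation is read as directed, and pairs of the form (a,a) are skipped because they would only give a path of length one. The names `Relation`, `is_cyclic`, and `dfs_finds_cycle` are illustrative.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Relation = std::set<std::pair<int, int>>;

// state: 0 = unvisited, 1 = on the current DFS path, 2 = finished.
static bool dfs_finds_cycle(int u,
                            const std::map<int, std::vector<int>>& next,
                            std::map<int, int>& state) {
    state[u] = 1;
    auto it = next.find(u);
    if (it != next.end()) {
        for (int v : it->second) {
            if (state[v] == 1) return true;  // edge back into the current path
            if (state[v] == 0 && dfs_finds_cycle(v, next, state)) return true;
        }
    }
    state[u] = 2;
    return false;
}

bool is_cyclic(const Relation& E) {
    std::map<int, std::vector<int>> next;  // adjacency lists built from E
    for (const auto& e : E) {
        if (e.first != e.second) next[e.first].push_back(e.second);
    }
    std::map<int, int> state;
    for (const auto& kv : next) {
        if (state[kv.first] == 0 && dfs_finds_cycle(kv.first, next, state)) {
            return true;
        }
    }
    return false;
}

// Usage example: bus stops 1 -> 2 -> 3 -> 1 form a loop line.
void cycle_example() {
    Relation stops;
    stops.insert(std::make_pair(1, 2));
    stops.insert(std::make_pair(2, 3));
    stops.insert(std::make_pair(3, 1));
    assert(is_cyclic(stops));
}
```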
Abstract Linked List
[-] → [-] → [-] → [-] → [-]
Linked list ADT
A list of items: each item has exactly one successor, except for the last item, and from one item you can access the next item.
Supports a next() operation, which reads the next item from the current item, and a find() operation.
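template <typename D>
struct linked_list_node {
    linked_list_node<D> *next;  // successor, or nullptr for the last node
    D data;
};
linked_list_node<D> *head = nullptr, *tail = nullptr;
// ask the OS to allocate new memory for the first node
head = new linked_list_node<D>(...);
head->next = nullptr;
tail = head;
// insert one more element at the end of the list
linked_list_node<D> *item = new linked_list_node<D>(...);
item->next = nullptr;  // the new node becomes the last node
tail->next = item;
tail = item;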
Traversal
template <typename D>
void iter(linked_list_node<D> *h) {
    if (h == nullptr) return;  // empty list, or we walked past the last node
    print(h->data);
    iter(h->next);
}
By swapping the print and the recursive call, we get a traversal that prints the list in reverse order.
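The effect of swapping the two statements can be seen in the sketch below. To keep it easy to check, it appends to a vector instead of printing. The `Node` struct with an `int` payload and both function names are assumptions made for this example.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int data;
    Node* next;  // nullptr for the last node
};

// Handle the node first, then recurse: items come out in list order.
void collect_forward(const Node* h, std::vector<int>& out) {
    if (h == nullptr) return;
    out.push_back(h->data);
    collect_forward(h->next, out);
}

// Recurse first, handle the node on the way back: items come out reversed.
void collect_reversed(const Node* h, std::vector<int>& out) {
    if (h == nullptr) return;
    collect_reversed(h->next, out);
    out.push_back(h->data);
}

// Usage example on the list 1 -> 2 -> 3.
void traversal_example() {
    Node c{3, nullptr};
    Node b{2, &c};
    Node a{1, &b};
    std::vector<int> out;
    collect_reversed(&a, out);
    assert(out.size() == 3 && out[0] == 3 && out[2] == 1);
}
```

Both versions use one stack frame per node, so on a very long list an iterative loop would be the safer choice.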
Queue implementation: array vs. linked list
Array: fixed size; needs a cyclic (circular) array to get O(1) enqueue and dequeue
Linked List: variable size; enqueue and dequeue are naturally O(1); loses the ability to peek into the middle of the queue.
[Improve: doubly linked list node]
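The cyclic-array queue mentioned above can be sketched as follows. The class name `RingQueue`, the `int` element type, and the choice to report failure through a `bool` return value are assumptions for illustration, not something the notes prescribe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-capacity FIFO queue on a circular array.
// head_ is the index of the oldest element, count_ the number stored.
class RingQueue {
public:
    explicit RingQueue(std::size_t capacity)
        : buf_(capacity), head_(0), count_(0) {}

    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == buf_.size(); }

    // O(1): write at the slot just past the newest element, wrapping around.
    bool enqueue(int x) {
        if (full()) return false;
        buf_[(head_ + count_) % buf_.size()] = x;
        ++count_;
        return true;
    }

    // O(1): read the oldest element and advance head_, wrapping around.
    bool dequeue(int& out) {
        if (empty()) return false;
        out = buf_[head_];
        head_ = (head_ + 1) % buf_.size();
        --count_;
        return true;
    }

private:
    std::vector<int> buf_;
    std::size_t head_;
    std::size_t count_;
};

// Usage example: the third enqueue reuses the slot freed by the dequeue.
void ring_queue_example() {
    RingQueue q(2);
    int x = 0;
    assert(q.enqueue(1) && q.enqueue(2));
    assert(q.dequeue(x) && x == 1);
    assert(q.enqueue(3));
}
```

Without the wrap-around, a plain array would have to shift every element on each dequeue, which costs O(n).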
Problem: how do we print the middle element of a singly linked list in one pass?
(Maintain two pointers! One moves 2 steps at a time, the other moves 1 step. When the fast one reaches the end, the slow one is at the middle. Amazing!)
So in general, we can do it in one pass.
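A minimal sketch of the two-pointer idea, assuming a hypothetical `Node` struct with an `int` payload. For a list with an even number of elements this version returns the second of the two middle nodes. That is a convention chosen here, not something the notes specify.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* next;  // nullptr for the last node
};

// Returns the middle node in one pass, or nullptr for an empty list.
// fast advances two nodes per step, slow advances one.
const Node* middle(const Node* head) {
    const Node* slow = head;
    const Node* fast = head;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
    }
    return slow;
}

// Usage example on the list 1 -> 2 -> 3.
void middle_example() {
    Node c{3, nullptr};
    Node b{2, &c};
    Node a{1, &b};
    assert(middle(&a)->data == 2);
}
```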
Tree
Has a unique node called root and other nodes
Parent branches to children
No node has more than 1 parent
Trees Examples:
- Directories
- Ancestor relationship
- Recursion iteration paths
- Covid-19 infection paths (likely to be a tree)
- Decision tree
- Phylogenetic Tree
Rigorously,
T = (r, V, E), where V is a set of nodes, E is a set of edges between nodes, and r is the root
Nodes without children are called leaves
A tree is also a recursive data structure
A tree's edges are undirected as a math concept, but directed as a data structure.
Terminologies (for rooted tree)
- root
- parent/children
- ancestors (parents and their parents, and so on…), predecessor (parent), successors (children)
- least common ancestor of two nodes: LCA(x, z) = y
- Leaves, internal nodes = nodes - leaves
- degree of node = number of children of the node
- depth/level of the node = distance from the root to the node
- height of the tree = number of edges from the root to the deepest leaf; the root is at depth 0
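Several of the terms above (parent, depth, least common ancestor) can be illustrated with a node that stores only a parent pointer. This is a sketch under assumptions: the struct `TreeNode`, the functions `depth` and `lca`, and the parent-pointer representation are chosen for this example. The binary-tree node used elsewhere in these notes stores children instead.

```cpp
#include <cassert>

struct TreeNode {
    TreeNode* parent;  // nullptr for the root
};

// Depth = number of edges on the path from the root to x (root has depth 0).
int depth(const TreeNode* x) {
    assert(x != nullptr);
    int d = 0;
    while (x->parent != nullptr) {
        x = x->parent;
        ++d;
    }
    return d;
}

// Least common ancestor: lift the deeper node to the same depth,
// then move both up together until they meet.
// Returns nullptr if the two nodes are not in the same tree.
const TreeNode* lca(const TreeNode* x, const TreeNode* z) {
    int dx = depth(x);
    int dz = depth(z);
    while (dx > dz) { x = x->parent; --dx; }
    while (dz > dx) { z = z->parent; --dz; }
    while (x != z) {
        x = x->parent;
        z = z->parent;
    }
    return x;
}

// Usage example: root has children a and b; a has children c and d.
void lca_example() {
    TreeNode root{nullptr};
    TreeNode a{&root};
    TreeNode b{&root};
    TreeNode c{&a};
    TreeNode d{&a};
    assert(lca(&c, &d) == &a);
    assert(lca(&c, &b) == &root);
}
```

This walk takes time proportional to the depth of the two nodes. Faster LCA schemes exist, but they are outside the scope of these notes.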
Math properties
P1: There is a unique path from root to any node
because no node can have 2 parents
P2: There is no cycle in the tree
because of P1
D-regular tree
Trees in which every internal node has degree D
Special name for D = 2: binary tree
Unbounded degree tree
Class of trees such that the degree isn’t bounded as number of nodes N goes to infinity
E.g. directories in OS
Two perspectives on trees
As a math object …
As concrete data structure… (red-black tree, AVL tree; competition tree; heap)
Time complexity really depends on the implementation and the tree structure. No generic definitions for delete, insert, find,…
template <typename D>
struct tree_node {
    tree_node<D> *left, *right;  // children, or nullptr if absent
    D data;
}; // a binary tree node
tree_node<D> *root; // root of a tree
Complete binary tree
Each internal node of the binary tree has 2 children, and all leaves are at the same depth
Property:
let the root be at depth 0
a complete tree of height H has $\sum_{h=0}^{H} 2^h = 2^{H+1}-1$ nodes, since level h holds $2^h$ nodes
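The node count can be checked by building such a tree recursively and counting. In this sketch, `build_perfect` and `count_nodes` are hypothetical helper names, and the tree built is one in which every level is completely filled, which is the case the formula covers.

```cpp
#include <cassert>
#include <memory>

struct TreeNode {
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
};

// Builds a binary tree of the given height with every level completely filled.
std::unique_ptr<TreeNode> build_perfect(int height) {
    assert(height >= 0);
    std::unique_ptr<TreeNode> node(new TreeNode());
    if (height > 0) {
        node->left = build_perfect(height - 1);
        node->right = build_perfect(height - 1);
    }
    return node;
}

// Counts nodes recursively: this node plus both subtrees.
long count_nodes(const TreeNode* t) {
    if (t == nullptr) return 0;
    return 1 + count_nodes(t->left.get()) + count_nodes(t->right.get());
}

// Usage example: height 3 gives 2^4 - 1 = 15 nodes.
void perfect_tree_example() {
    std::unique_ptr<TreeNode> t = build_perfect(3);
    assert(count_nodes(t.get()) == 15);
}
```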
Tree and recursions
Trees are naturally recursive structures:
- each left or right subtree is a self-repeating structure
- when solving problems on/using trees, recursive and divide-and-conquer algorithms might be the top choice
  - resolve the left subtree subproblem
  - resolve the right subtree subproblem
  - merge the results at the current node
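The three steps above can be sketched with tree height as the example problem. The `TreeNode` struct with raw child pointers and the convention that an empty tree has height -1 (so that a single node has height 0) are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>

struct TreeNode {
    int data;
    TreeNode* left;   // nullptr if there is no left child
    TreeNode* right;  // nullptr if there is no right child
};

// Height in edges. An empty tree gets -1 so that a lone node gets 0.
int height(const TreeNode* t) {
    if (t == nullptr) return -1;
    int left_height = height(t->left);             // solve the left subtree
    int right_height = height(t->right);           // solve the right subtree
    return 1 + std::max(left_height, right_height);  // merge at this node
}

// Usage example: a root with a single leaf child has height 1.
void height_example() {
    TreeNode leaf{2, nullptr, nullptr};
    TreeNode root{1, &leaf, nullptr};
    assert(height(&root) == 1);
}
```

Other quantities such as node count or sum of values follow the same shape. Only the base case and the merge line change.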
Expressions and trees
Any expression can be put on a tree
- internal nodes are operators
- leaves are numbers
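An expression tree of this shape can be evaluated with the same left/right/merge recursion. The sketch below assumes binary operators only (`+`, `-`, `*`, `/`). The struct `ExprNode` and the helpers `make_leaf`, `make_op`, and `eval` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct ExprNode {
    char op;       // operator for internal nodes, unused for leaves
    double value;  // number for leaves, unused for internal nodes
    std::unique_ptr<ExprNode> left;
    std::unique_ptr<ExprNode> right;
};

std::unique_ptr<ExprNode> make_leaf(double v) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->op = '\0';
    n->value = v;
    return n;
}

std::unique_ptr<ExprNode> make_op(char op,
                                  std::unique_ptr<ExprNode> l,
                                  std::unique_ptr<ExprNode> r) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->op = op;
    n->value = 0.0;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Leaves return their number; internal nodes combine both subtrees.
double eval(const ExprNode& e) {
    if (!e.left && !e.right) return e.value;
    assert(e.left && e.right);  // binary operators need two operands
    double a = eval(*e.left);
    double b = eval(*e.right);
    switch (e.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': return a / b;
    }
    assert(false && "unknown operator");
    return 0.0;
}

// Usage example: the tree for (1 + 2) * 3.
void expr_example() {
    std::unique_ptr<ExprNode> t =
        make_op('*', make_op('+', make_leaf(1), make_leaf(2)), make_leaf(3));
    assert(eval(*t) == 9.0);
}
```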
Prefix expressions are actually easy to convert to a tree
- first op is the root
- then do the recursive on