1. Graph Primitives
1.1 Generic Graph Search
Goal:
1) find everything findable from a given start vertex.
2) don’t explore anything twice. Target running time: O(m+n).
Generic Algorithm (given graph G, vertex s)
1) initially s is explored, all other vertices unexplored
2) while possible:
    choose an edge (u,v) with u explored and v unexplored, and mark v explored.
1.2 BFS
Breadth-First Search (BFS)
1) explores nodes in “layers” (O(m+n) time)
2) can compute shortest paths (using a queue, FIFO)
3) can compute connected components of an undirected graph.
1.3 The Code
BFS (graph G, start vertex s)
[all nodes initially unexplored]
1) mark s as explored
2) let Q = queue data structure (FIFO), initialized with s.
3) while Q ≠ ∅:
    remove the first node of Q, call it v
    for each edge (v,w):
        if w unexplored:
            mark w as explored
            add w to Q (at the end)
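The pseudocode above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the graph is stored as a dict mapping each vertex to a list of its neighbors, and the names bfs and example are made up for illustration.

```python
from collections import deque

def bfs(graph, s):
    """Return the set of vertices reachable from s.

    Assumption: graph is a dict {vertex: [neighbors]} (adjacency lists).
    """
    explored = {s}           # 1) mark s as explored
    queue = deque([s])       # 2) FIFO queue initialized with s
    while queue:             # 3) while Q is not empty
        v = queue.popleft()  # remove the first node of Q
        for w in graph.get(v, []):
            if w not in explored:
                explored.add(w)
                queue.append(w)  # add w at the end of Q
    return explored

# Hypothetical example: an undirected 4-cycle plus an isolated vertex x.
example = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b"],
    "x": [],
}
reachable = bfs(example, "s")
```

Each vertex enters the queue at most once and each adjacency list is scanned at most once, which is where the linear running time comes from.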
1.4 Basic BFS Properties
Claim #1: at the end of BFS, v explored <==>
G has a path from s to v.
Claim #2: running time of main while loop = O(n_s + m_s), where
n_s = # of nodes reachable from s,
m_s = # of edges reachable from s.
1.5 Application: Shortest Paths
Goal: compute dist(v), the fewest # of edges on path from s to v.
Extra code: initialize dist(v) = 0 if v = s, and dist(v) = +∞ if v ≠ s.
When considering edge (v,w):
    if w unexplored, then set dist(w) = dist(v) + 1
Claim: at termination, dist(v) = i <==> v is in the ith layer
(i.e., the shortest s-v path has i edges).
Proof Idea: every layer-i node w is added to Q by a layer-(i-1) node v via the edge (v,w).
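The extra bookkeeping for distances can be sketched in Python like this. It is a sketch under assumptions: the graph is a dict of adjacency lists in which every vertex appears as a key, and an infinite distance doubles as the "unexplored" marker. The names bfs_dist and example are hypothetical.

```python
from collections import deque

def bfs_dist(graph, s):
    """Return dist[v] = fewest number of edges on a path from s to v.

    Assumption: graph is a dict {vertex: [neighbors]} with every vertex as a key.
    Unreachable vertices keep dist = infinity.
    """
    inf = float("inf")
    dist = {v: inf for v in graph}  # dist(v) = +inf for v != s
    dist[s] = 0                     # dist(s) = 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if dist[w] == inf:          # w is still unexplored
                dist[w] = dist[v] + 1   # w lies one layer below v
                queue.append(w)
    return dist

# Hypothetical example graph (undirected, stored in both directions).
example = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
    "x": [],
}
dist = bfs_dist(example, "s")
```

In this example a and b form layer 1, c is layer 2, d is layer 3, and x is never reached.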
1.6 Application: Undirected Connectivity
Let G = (V,E) be an undirected graph.
Connected components = the “pieces” of G.
Formal Definition: equivalence classes of the relation u<->v <==> there exists a u-v path in G.
Goal: compute all connected components.
1.7 Connected Components via BFS
To compute all components (undirected case):
(1) initialize all nodes as unexplored (O(n))
(2) for i = 1 to n (O(n))
    if i not yet explored
        BFS(G,i)
Notes: each call BFS(G,i) discovers exactly the connected component of i, so the loop finds every connected component.
Running time: O(m+n).
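A possible Python sketch of this loop, with the BFS written inline so the block is self-contained. It assumes an undirected graph stored as a dict of adjacency lists, with each edge listed in both directions. The function name connected_components is illustrative only.

```python
from collections import deque

def connected_components(graph):
    """Return a list of components, each a list of vertices.

    Assumption: graph is a dict {vertex: [neighbors]} for an undirected graph,
    with each edge listed in both directions.
    """
    explored = set()                 # (1) all nodes start unexplored
    components = []
    for i in graph:                  # (2) outer loop over all vertices
        if i not in explored:
            # BFS(G, i): collects exactly the component containing i
            explored.add(i)
            component = [i]
            queue = deque([i])
            while queue:
                v = queue.popleft()
                for w in graph[v]:
                    if w not in explored:
                        explored.add(w)
                        component.append(w)
                        queue.append(w)
            components.append(component)
    return components

# Hypothetical example: a path 1-2-3, an edge 4-5, and an isolated vertex 6.
example = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
parts = connected_components(example)
```

The explored set is shared across all BFS calls, so no vertex or edge is processed twice and the total work stays O(m+n).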
2. Depth-First Search (DFS): explore aggressively, only backtrack when necessary.
(1) also computes a topological ordering of a directed acyclic graph.
(2) and strongly connected components of directed graphs.
Running time: O(m+n)
The Code: mimic BFS code, use a stack instead of a queue.
Recursive version: DFS(graph G, start vertex s)
1) mark s as explored.
2) for every edge (s,v):
    if v unexplored
        DFS(G,v)
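Both variants can be sketched in Python as below, assuming a dict of adjacency lists with every vertex as a key. The names dfs_recursive, dfs_iterative and example are made up for illustration. In the stack version a vertex is marked when it is popped rather than when it is pushed, which is one common way to keep the visiting order depth-first.

```python
def dfs_recursive(graph, s, explored=None):
    """Recursive DFS; returns the set of vertices reachable from s."""
    if explored is None:
        explored = set()
    explored.add(s)                  # 1) mark s as explored
    for v in graph[s]:               # 2) for every edge (s, v)
        if v not in explored:
            dfs_recursive(graph, v, explored)
    return explored

def dfs_iterative(graph, s):
    """Stack-based DFS: same loop shape as BFS, with a LIFO stack."""
    explored = set()
    stack = [s]
    while stack:
        v = stack.pop()              # take the most recently added vertex
        if v not in explored:
            explored.add(v)
            for w in graph[v]:
                if w not in explored:
                    stack.append(w)
    return explored

# Hypothetical directed example: x points to s, but s cannot reach x.
example = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "x": ["s"]}
found_rec = dfs_recursive(example, "s")
found_it = dfs_iterative(example, "s")
```

For very deep graphs the recursive version can hit Python's recursion limit, so the stack version is the safer choice there.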
Basic DFS Properties
(1) At the end of the algorithm, v marked as explored <==>
there exists a path from s to v in G.
(2) Running time is O(n_s + m_s),
where n_s = # of nodes reachable from s,
m_s = # of edges reachable from s.
2.1 Application: Topological Sort
Definition: A topological ordering of a directed graph G is a labeling f of G’s nodes such that:
(1) The f(v)’s are the set {1,2,...,n}
(2) (u,v) ∈ G => f(u) < f(v) (tail less than head)
Note: G has directed cycle => no topological ordering.
Theorem: no directed cycle => can compute topological ordering in O(n+m) time.
2.2 Straightforward Solution
Note: every directed acyclic graph has a sink vertex (no outgoing arcs).
To compute topological ordering:
(1) let v be a sink vertex of G.
(2) set f(v) = n
(3) recurse on G-{v}. (Delete the sink vertex and repeat the steps above on the remaining graph.)
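The sink-removal idea can be sketched in Python as a loop instead of a recursion. This is a deliberately simple sketch, assuming a dict of adjacency lists with every vertex as a key. It rescans for a sink on every round, so it takes roughly quadratic time rather than O(m+n). The name topo_sort_by_sinks is hypothetical.

```python
def topo_sort_by_sinks(graph):
    """Return a labeling f with f[u] < f[v] for every edge (u, v).

    Assumption: graph is a dict {vertex: [out-neighbors]}.
    Raises ValueError if no sink exists, i.e. the graph has a directed cycle.
    """
    # Work on a copy so the caller's graph is not modified.
    out = {v: set(ws) for v, ws in graph.items()}
    f = {}
    label = len(out)                 # the first sink found gets label n
    while out:
        # (1) find a sink: a vertex with no remaining outgoing arcs
        sink = next((v for v, ws in out.items() if not ws), None)
        if sink is None:
            raise ValueError("graph has a directed cycle")
        f[sink] = label              # (2) set f(v) = current n
        label -= 1
        del out[sink]                # (3) continue on G - {v}
        for ws in out.values():
            ws.discard(sink)
    return f

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d.
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
f = topo_sort_by_sinks(example)
```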
2.3 Topological Sort via DFS (slick)
DFS-Loop (graph G)
(1) mark all nodes unexplored.
(2) current_label = n
(3) for each vertex v
    if v not yet explored
        DFS(G,v)
where:
DFS(graph G, start vertex s)
mark s explored
for every edge (s,v)
    if v not yet explored
        DFS(G,v)
set f(s) = current_label
current_label = current_label - 1
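A Python sketch of DFS-Loop with labels assigned as each recursive call finishes. It assumes the input is a directed acyclic graph stored as a dict of adjacency lists with every vertex as a key. The sketch does not check for cycles. The names topological_sort and example are illustrative.

```python
def topological_sort(graph):
    """Return a labeling f with f[u] < f[v] for every edge (u, v).

    Assumption: graph is a DAG given as {vertex: [out-neighbors]}.
    """
    explored = set()                 # (1) all nodes start unexplored
    f = {}
    current_label = len(graph)       # (2) current_label = n

    def dfs(s):
        nonlocal current_label
        explored.add(s)              # mark s explored on entry
        for v in graph[s]:
            if v not in explored:
                dfs(v)
        f[s] = current_label         # s finishes: it takes the largest free label
        current_label -= 1

    for v in graph:                  # (3) outer loop over all vertices
        if v not in explored:
            dfs(v)
    return f

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d.
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
f = topological_sort(example)
```

A vertex receives its label only after everything reachable from it has been labeled, so the vertices that finish first end up with the largest labels.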
2.4 Topological Sort via DFS
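Running Time: O(m+n)
Reason: O(1) time per node, O(1) time per edge.
Correctness: need to show that if (u,v) is an edge, then f(u) < f(v).
Case 1: u visited by DFS before v => the recursive call corresponding to v finishes before that of u (since DFS) => f(v) > f(u).
Case 2: v visited before u => there is no path from v to u (it would create a directed cycle), so v’s recursive call finishes before u’s even starts => f(v) > f(u).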
3. An O(m+n) Algorithm for Computing Strong Components.
3.1 Strongly Connected Components
Formal Definition: the strongly connected components (SCCs) of a directed graph G are the equivalence classes of the relation
u<->v <==> there exists a path u->v and a path v->u in G.
3.2 Kosaraju’s Two-Pass Algorithm
Theorem: can compute SCCs in O(m+n) time.
Algorithm: (given directed graph G)
(1) Let Grev = G with all arcs reversed.
(2) Run DFS-Loop on Grev. Goal: compute “magical ordering” of nodes.
Let f(v) = “finishing time” of each v in V.
(3) Run DFS-Loop on G. Goal: discover the SCCs one-by-one,
processing nodes in decreasing order of finishing times.
[SCCs = nodes with the same “leader”]
The DFS-LOOP subroutine
Input: a directed graph G = (V,E), in adjacency list representation.
(1) Initialize a global variable t to 0.
[This keeps track of the number of vertices that have been fully explored.]
(2) Initialize a global variable s to NULL.
[This keeps track of the vertex from which the last DFS call was invoked.]
(3) For i = n down to 1:
[In the first call, vertices are labeled 1,2,...,n arbitrarily. In the second call, vertices are labeled by their f(v)-values from the first call.]
(a) if i not yet explored:
{
    set s := i
    DFS(G,i)
}
The DFS subroutine
Input: a directed graph G = (V,E), in adjacency list representation, and a source vertex i ∈ V.
(1) Mark i as explored.
[It remains explored for the entire duration of the DFS-Loop call.]
(2) Set leader(i) := s
(3) For each arc (i,j) ∈ G:
{
    if j not yet explored:
        DFS(G,j)
}
(4) t++
(5) Set f(i) := t
The f-values only need to be computed during the first call to DFS-Loop, and the leader values only need to be computed during the second call to DFS-Loop.
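The two passes can be sketched in Python as below. This is a sketch under assumptions, not the notes' exact routine: the graph is a dict of adjacency lists with every vertex as a key, the global counter t is replaced by a list that records vertices in increasing finishing time, and the first pass visits vertices in dict order rather than from n down to 1 (any order works for the first pass). The names kosaraju_scc, dfs and example are hypothetical.

```python
def kosaraju_scc(graph):
    """Return leader[v] for every vertex; equal leaders mean the same SCC.

    Assumption: graph is a dict {vertex: [out-neighbors]}.
    """
    # (1) Grev = G with all arcs reversed
    rev = {v: [] for v in graph}
    for u, ws in graph.items():
        for w in ws:
            rev[w].append(u)

    def dfs(adj, i, explored, on_finish):
        explored.add(i)
        for j in adj[i]:
            if j not in explored:
                dfs(adj, j, explored, on_finish)
        on_finish(i)                 # called when i is fully explored

    # (2) first pass on Grev: record vertices in increasing finishing time
    finish_order = []
    explored = set()
    for i in graph:
        if i not in explored:
            dfs(rev, i, explored, finish_order.append)

    # (3) second pass on G, in decreasing order of finishing times
    leader = {}
    explored = set()
    for s in reversed(finish_order):
        if s not in explored:
            members = []             # everything discovered from leader s
            dfs(graph, s, explored, members.append)
            for v in members:
                leader[v] = s
    return leader

# Hypothetical example: cycles 1->2->3->1 and 4->5->6->4, arc 3->4, isolated 7.
example = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [6], 6: [4], 7: []}
leader = kosaraju_scc(example)
```

Grouping vertices by their leader value gives the SCCs. The recursion depth can reach the number of vertices, so large inputs would need an explicit stack or a raised recursion limit.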