Algorithm Design and Analysis: Tarjan's Algorithm and its Proof

Hi there,

Today, I want to talk about Tarjan’s algorithm, which finds the strongly connected components (SCCs) of a directed graph. Before we dive into the algorithm, let’s define an SCC.

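SCC: in a directed graph, an SCC is a maximal group of nodes in which every node can find a path to every other node in the group. Informally, you can think of it as a self-contained cycle; a node that lies on no cycle forms an SCC by itself.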
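
To make the definition concrete, here is a minimal sketch that tests whether two nodes are in the same SCC by checking reachability in both directions. It is a brute-force illustration of the definition, not Tarjan’s algorithm; the class name SccDefinitionSketch and the helpers buildGraph, canReach and sameScc are hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SccDefinitionSketch {

    // Build adjacency lists for a directed graph with n nodes from {src, dest} pairs.
    static List<List<Integer>> buildGraph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e[1]);
        return adj;
    }

    // Returns true if there is a directed path from 'from' to 'to' (iterative DFS).
    static boolean canReach(List<List<Integer>> adj, int from, int to) {
        boolean[] seen = new boolean[adj.size()];
        Deque<Integer> todo = new ArrayDeque<>();
        todo.push(from);
        seen[from] = true;
        while (!todo.isEmpty()) {
            int cur = todo.pop();
            if (cur == to) return true;
            for (int next : adj.get(cur)) {
                if (!seen[next]) {
                    seen[next] = true;
                    todo.push(next);
                }
            }
        }
        return false;
    }

    // Two nodes are in the same SCC exactly when each can reach the other.
    static boolean sameScc(List<List<Integer>> adj, int a, int b) {
        return canReach(adj, a, b) && canReach(adj, b, a);
    }

    public static void main(String[] args) {
        // cycle 0 -> 1 -> 2 -> 0, plus a one-way edge 2 -> 3
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {2, 3}};
        List<List<Integer>> adj = buildGraph(4, edges);
        System.out.println(sameScc(adj, 0, 2)); // true
        System.out.println(sameScc(adj, 0, 3)); // false
    }
}
```

Node 3 can be reached from the cycle but cannot get back into it, so it is not part of the SCC formed by nodes 0, 1 and 2.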

We are especially interested in finding the SCCs of a graph because, by knowing them, we can better understand the structure of the graph. One specific application of the algorithm is to find groups of people who are closely related to each other in a social network. When everyone in a social group can reach every other person in the same group through a chain of interactions, we can generally assume that this group of people is tightly connected.

OK, enough appetite-whetting. Let me introduce the algorithm now. The main body of Tarjan’s algorithm is a depth-first search (DFS) through the graph. During the DFS, we assign a timestamp to each node in the order in which it is first visited. For example, if node x is the first node visited by the DFS, then the timestamp of node x is 0 (the implementation below counts from 0).
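
As a small illustration of the timestamping step on its own, the sketch below runs a plain DFS and records the order of first visits. The class name DfsTimestampSketch and the methods assignTimestamps and visit are hypothetical; the counter starts at 0 to match the timeStamp variable in the implementation further down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DfsTimestampSketch {
    static int clock; // next timestamp to hand out

    // Returns time[v] = position of node v in the DFS visiting order, counting from 0.
    static int[] assignTimestamps(List<List<Integer>> adj) {
        int[] time = new int[adj.size()];
        Arrays.fill(time, -1); // -1 means "not visited yet"
        clock = 0;
        for (int v = 0; v < adj.size(); v++) {
            if (time[v] == -1) visit(adj, v, time);
        }
        return time;
    }

    static void visit(List<List<Integer>> adj, int node, int[] time) {
        time[node] = clock++;
        for (int next : adj.get(node)) {
            if (time[next] == -1) visit(adj, next, time);
        }
    }

    public static void main(String[] args) {
        // edges: 0 -> 2, 0 -> 3, 2 -> 1
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < 4; i++) adj.add(new ArrayList<>());
        adj.get(0).add(2);
        adj.get(0).add(3);
        adj.get(2).add(1);
        System.out.println(Arrays.toString(assignTimestamps(adj))); // [0, 2, 1, 3]
    }
}
```

The DFS visits the nodes in the order 0, 2, 1, 3, so node 2 gets timestamp 1 and node 1 gets timestamp 2.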

Based on the above, we define the “Low Link Value” (LLV) of a node x as “the smallest timestamp among all nodes that are reachable from node x by a DFS, including x itself”. The validity of Tarjan’s algorithm rests on the following proposition:

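If two nodes belong to the same SCC, then they share the same low link value.

We can prove this proposition by contradiction. Suppose there exist two nodes A and B that belong to the same SCC but have different low link values. One of the two values must be smaller; assume the LLV of node A is smaller than the LLV of node B. This means that node A can reach some node C, the one carrying that smaller timestamp, that node B cannot reach; otherwise the LLV of node B would be at most the LLV of node A. But by the definition of an SCC, node B can reach node A, and therefore every node that A can reach, including C. This is a contradiction, so two nodes in the same SCC must have the same low link value. The converse needs more care: a node may be able to reach a different SCC that was visited earlier, so the implementation only lets a node inherit low link values from nodes that are still on the DFS stack.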
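
To see the definition at work, the following brute-force sketch computes the low link value of every node directly from the definition: assign DFS timestamps, then take the minimum timestamp over everything each node can reach. This is only an illustration under that literal reading of the definition, not Tarjan’s single-pass method; the class name LowLinkByDefinitionSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LowLinkByDefinitionSketch {

    static List<List<Integer>> buildGraph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e[1]);
        return adj;
    }

    // Assign DFS timestamps starting at 'clock'; returns the next unused timestamp.
    static int dfsOrder(List<List<Integer>> adj, int node, int[] time, int clock) {
        time[node] = clock++;
        for (int next : adj.get(node)) {
            if (time[next] == -1) clock = dfsOrder(adj, next, time, clock);
        }
        return clock;
    }

    // Mark every node reachable from 'node', including 'node' itself.
    static void markReachable(List<List<Integer>> adj, int node, boolean[] seen) {
        seen[node] = true;
        for (int next : adj.get(node)) {
            if (!seen[next]) markReachable(adj, next, seen);
        }
    }

    // low[v] = smallest timestamp among all nodes reachable from v (no stack restriction).
    static int[] lowLinkByDefinition(List<List<Integer>> adj) {
        int n = adj.size();
        int[] time = new int[n];
        Arrays.fill(time, -1);
        int clock = 0;
        for (int v = 0; v < n; v++) {
            if (time[v] == -1) clock = dfsOrder(adj, v, time, clock);
        }
        int[] low = new int[n];
        for (int v = 0; v < n; v++) {
            boolean[] seen = new boolean[n];
            markReachable(adj, v, seen);
            low[v] = Integer.MAX_VALUE;
            for (int u = 0; u < n; u++) {
                if (seen[u]) low[v] = Math.min(low[v], time[u]);
            }
        }
        return low;
    }

    public static void main(String[] args) {
        // the same edges that the implementation further down uses as its example
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {0, 3}, {3, 4}, {4, 5}, {5, 4}, {4, 3}, {3, 6}};
        System.out.println(Arrays.toString(lowLinkByDefinition(buildGraph(7, edges)))); // [0, 0, 0, 3, 3, 3, 6]

        // a single edge 1 -> 0: two separate SCCs
        int[][] single = {{1, 0}};
        System.out.println(Arrays.toString(lowLinkByDefinition(buildGraph(2, single)))); // [0, 0]
    }
}
```

In the first graph, nodes in the same SCC end up with the same value, as the proof predicts. In the second graph, nodes 0 and 1 are in different SCCs yet receive the same value under this unrestricted computation, which is the situation the onStack check in the implementation is there to rule out.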

OK, now we know that all nodes in the same SCC share the same low link value. All that Tarjan’s algorithm does is assign a timestamp to each node and update the low link value of each node during a single DFS pass, while keeping the nodes of unfinished SCCs on a stack. The implementation below is heavily commented and shows the details. I hope you enjoy this short essay on Tarjan’s algorithm. Thanks.

package GraphRelated;

import java.util.LinkedList;

public class TarjanSCC {
	//timestamp counter for the DFS, starting at 0
	//e.g. if node x is the first node visited in the DFS, the timestamp of node x is 0
	static int timeStamp = 0;
	
	public static void main(String[] args) {
		
		int[] edges = {0, 1, 2, 0, 3, 4, 5, 4, 3, 6}; 
		//each number in this array is the index of a node in the given graph 
		//every two adjacent numbers represent a directed edge between two nodes,
		//so the array encodes the edges (0,1), (1,2), (2,0), (0,3), (3,4), (4,5), (5,4), (4,3), (3,6)
		//notice that we have 3 SCCs in this graph: 0-1-2, 3-4-5, 6 
		
		Graph g = new Graph(7); 
		//initialize a graph object
		
		
		//populate the graph with the edges given by the edges array
		for(int i = 0; i < edges.length - 1; i++) {
			addEdge(g, edges[i], edges[i+1]); 
		}
		
		//find strongly connected components in the given graph
		int[] scc = findSCC(g); 
		
		//print out the strongly connected components array
		//output: 0 0 0 3 3 3 6; this means nodes 0, 1, 2 are in the same SCC and were marked with the timestamp of the pivot node 0;
		//nodes 3, 4, 5 are in the same SCC and were marked with the timestamp of the pivot node 3; 
		//node 6 forms an SCC by itself
		for(int i: scc) System.out.printf("%d ", i);
		
		
	}
	
	static int[] findSCC(Graph g) {
		
		int n = g.V; //the number of nodes in the graph
		
		int[] visited = new int[n]; 
		//this array records whether a node has been visited. it also records the timestamp of the visited node
		//e.g. if visited[0] == -1, it means "node 0 has not been visited"; 
		//if visited[0] == 3, it means "node 0 received timestamp 3", i.e. it was the fourth node visited by the DFS
		for(int i = 0; i < n; i++) visited[i] = -1; 
		//initialize the visited array
		
		int[] low = new int[n]; 
		//this array records the low link value of each node
		//the low link value is the smallest timestamp among all nodes that the current node can reach
		//(restricted, via onStack below, to nodes whose SCC has not been completed yet)
		//e.g. low[0] = 1 means that "1 is the smallest timestamp among all nodes that node 0 can reach"
		
		for(int i = 0; i < n; i++) low[i] = -1;
		//initialize the low array
		
		LinkedList<Integer> stack = new LinkedList<>();
		//the stack holds the visited nodes that have not yet been assigned to a completed SCC
		
		boolean[] onStack = new boolean[n]; 
		//this array records whether a node is currently on the stack
		for(int i = 0; i < n; i++) onStack[i] = false;
		//initialize the onStack array (Java already defaults booleans to false; the loop just makes it explicit)
		 
		
		//perform the DFS to visit all nodes
		for(int i = 0; i < n; i++) {
			if(visited[i] == -1) {
				dfs(g, i, visited, low, onStack, stack);
				 
			}
		}
		
		
		return low; 
	}
	
	// this is a recursive function implementing a depth first search
	static void dfs(Graph g, int node, int[] visited, int[] low, boolean[] onStack, LinkedList<Integer> stack) {
		
		if(visited[node] != -1)  return ; 
		//base case: when the dfs visits an already visited node, we stop going further
			
		visited[node] = timeStamp++; 
		//assign the timestamp to the node that dfs is currently visiting

		low[node] = visited[node];
		//the low link value of the current node is its own timestamp for the time being
		//we will update it as the recursive calls return
		
		stack.add(node); 		
		onStack[node] = true; 
		//add the current node to stack 
		
		//iterate recursively through all adjacent nodes of the current node (i.e. DFS process) 
		for(int adjNode: g.adjList[node]) {
			
			if(visited[adjNode] == -1) dfs(g, adjNode, visited, low, onStack, stack);
			
			//at this point the recursive call (if any) has returned
			//if adjNode is still on the stack, its SCC has not been completed yet,
			//so we update the low link value of the current node to min(low[adjNode], low[node])
			if(onStack[adjNode]) {
				low[node] = Math.min(low[node], low[adjNode]);
			}
		}
		
		
		int stackPop  = -1 ;
		
		//if the low link value equals the timestamp of the current node,
		//the current node is the pivot (root) of an SCC and the whole SCC has been explored
		if(low[node] == visited[node]) {
			
			//then we pop all nodes that belong to the current SCC off the stack
			//this prepares the stack for the next SCC
			while(stackPop != node) {
				stackPop = stack.pollLast(); 
				onStack[stackPop] = false;
				
				//set the low link value of every node in this SCC to the timestamp of its pivot node
				low[stackPop] = low[node]; 
			}
		}
		
	}
	
	
	//below is the basic infrastructure: a Graph object and a function that adds a directed edge to a graph
	static void addEdge(Graph g, int src, int dest) {
		g.adjList[src].add(dest); 
	}
	
	static class Graph{
		int V;
		LinkedList<Integer>[] adjList;
		
		Graph(int V){
			this.V = V; 
			adjList = new LinkedList[V]; 
			for(int i = 0; i < V; i++) {
				adjList[i] = new LinkedList<>(); 
			}
		}
	}
}
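
The array returned by findSCC labels each node with the timestamp of its pivot, which is compact but not very readable. As a possible follow-up step, the sketch below groups node indices by that label. The class name SccGroupingSketch and the method groupByPivot are hypothetical and not part of the implementation above; the input array is the output shown in the comments of main.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SccGroupingSketch {

    // Groups node indices by their label: key = pivot timestamp, value = nodes in that SCC.
    static Map<Integer, List<Integer>> groupByPivot(int[] low) {
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (int node = 0; node < low.length; node++) {
            groups.computeIfAbsent(low[node], k -> new ArrayList<>()).add(node);
        }
        return groups;
    }

    public static void main(String[] args) {
        int[] scc = {0, 0, 0, 3, 3, 3, 6}; // the array printed by the example above
        System.out.println(groupByPivot(scc)); // {0=[0, 1, 2], 3=[3, 4, 5], 6=[6]}
    }
}
```

Note that the keys are timestamps rather than node indices; in this example the two coincide only because the DFS happens to visit the nodes in index order.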

Best,

Ben

 
