1. Review
- Depth-first search is a generalization of preorder traversal.
- We implicitly assume that for undirected graphs every edge (v, w) appears twice in the adjacency lists: once as (v, w) and once as (w, v). The procedure in Figure 9.59 performs a depth-first search (and does absolutely nothing else) and is a template for the general style.
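Because preorder traversal came up countless times in the tree chapters, the concept is quick to pick up here. For a binary tree the traversal was not written as a for loop, since there are only two branches, but the two forms are essentially the same. One difference: a tree is always acyclic, so preorder traversal never reaches a node twice, whereas on a general graph the visited nodes must be marked. The DFS template is: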
void dfs( vertex v ){
    visited[v] = TRUE;
    for each w adjacent to v
        if( !visited[w] )
            dfs( w );
}
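The template above is pseudocode. A minimal runnable sketch in C++ might look like the following, assuming vertices are numbered 0 to n-1 and the graph is stored as adjacency lists. The names `Graph`, `dfs`, and `order` are illustrative and do not come from the textbook.

```cpp
#include <vector>

// Assumption: vertices are 0..n-1 and adj[v] lists the neighbors of v.
using Graph = std::vector<std::vector<int>>;

// Visits every vertex reachable from v and appends each one to `order`
// at the moment it is marked, i.e. in preorder.
void dfs(const Graph& adj, int v, std::vector<bool>& visited, std::vector<int>& order) {
    visited[v] = true;                 // mark before recursing so cycles terminate
    order.push_back(v);
    for (int w : adj[v]) {             // "for each w adjacent to v"
        if (!visited[w]) {
            dfs(adj, w, visited, order);
        }
    }
}
```

The caller is expected to pass a `visited` vector of size n filled with `false`, which plays the role of the global array in the template.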
2. Analysis
- The (global) boolean array visited[ ] is initialized to FALSE. By recursively calling the procedures only on nodes that have not been visited, we guarantee that we do not loop indefinitely.
A tree is a directed graph and is always acyclic, so preorder traversal can never loop. Other graphs offer no such guarantee.
- If the graph is undirected and not connected, or directed and not strongly connected, this strategy might fail to visit some nodes. We then search for an unmarked node, apply a depth-first traversal there, and continue this process until there are no unmarked nodes.
- Because this strategy guarantees that each edge is encountered only once, the total time to perform the traversal is O(|E| + |V|), as long as adjacency lists are used.
Note that when DFS is used to traverse a graph that is not connected, it has to be called several times. For a graph represented with adjacency lists, the time complexity is O(|E| + |V|).
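The restart loop and the cost bound can be checked with a small sketch. It assumes the same adjacency-list representation with vertices 0 to n-1; `TraversalStats`, `dfsFrom`, and `dfsAll` are hypothetical names introduced only for this illustration.

```cpp
#include <vector>

using Graph = std::vector<std::vector<int>>;

struct TraversalStats {
    int restarts = 0;        // number of top-level DFS calls
    int verticesMarked = 0;  // ends at |V|
    int entriesScanned = 0;  // adjacency-list entries examined
};

void dfsFrom(const Graph& adj, int v, std::vector<bool>& visited, TraversalStats& stats) {
    visited[v] = true;
    ++stats.verticesMarked;
    for (int w : adj[v]) {
        ++stats.entriesScanned;  // each list entry is looked at exactly once
        if (!visited[w]) dfsFrom(adj, w, visited, stats);
    }
}

// Keeps searching for an unmarked vertex and starts a new DFS there
// until no unmarked vertex is left.
TraversalStats dfsAll(const Graph& adj) {
    std::vector<bool> visited(adj.size(), false);
    TraversalStats stats;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (!visited[v]) {
            ++stats.restarts;
            dfsFrom(adj, v, visited, stats);
        }
    }
    return stats;
}
```

For an undirected graph every edge is stored twice, so `entriesScanned` ends at 2|E| and `verticesMarked` at |V|, which is where the O(|E| + |V|) bound comes from.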
3. Simple applications
- An undirected graph is connected if and only if a depth-first search starting from any node visits every node. If they are not, then we can find all the connected components and apply our algorithm on each of these in turn.
- The tree will simulate the traversal we performed. A preorder numbering of the tree, using only tree edges, tells us the order in which the vertices were marked. If the graph is not connected, then processing all nodes (and edges) requires several calls to dfs, and each generates a tree. This entire collection is a depth-first spanning forest, which is so named for obvious reasons.
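I find the second point very important: when a problem is solved with DFS, the whole traversal can be described by a tree. If the graph is not connected, several DFS trees are produced. DFS can therefore be used to find the connected components of a graph, or to test whether the graph is connected.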
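As a sketch of both uses, the code below records the depth-first spanning forest as a parent array and labels each vertex with the index of its tree. It assumes an undirected graph stored as adjacency lists with vertices 0 to n-1; `DfsForest`, `buildTree`, `depthFirstForest`, and `isConnected` are illustrative names, not part of the textbook.

```cpp
#include <vector>

using Graph = std::vector<std::vector<int>>;

struct DfsForest {
    std::vector<int> component;  // component[v] = index of the DFS tree containing v
    std::vector<int> parent;     // parent[v] = parent of v in its DFS tree, -1 for a root
    int treeCount = 0;           // number of trees = number of connected components
};

void buildTree(const Graph& adj, int v, int p, DfsForest& f) {
    f.component[v] = f.treeCount;
    f.parent[v] = p;
    for (int w : adj[v]) {
        if (f.component[w] == -1) buildTree(adj, w, v, f);  // (v, w) becomes a tree edge
    }
}

DfsForest depthFirstForest(const Graph& adj) {
    DfsForest f;
    f.component.assign(adj.size(), -1);  // -1 doubles as "not visited yet"
    f.parent.assign(adj.size(), -1);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (f.component[v] == -1) {      // each restart generates one more tree
            buildTree(adj, v, -1, f);
            ++f.treeCount;
        }
    }
    return f;
}

// Convention assumed here: an empty graph counts as connected.
bool isConnected(const Graph& adj) {
    return depthFirstForest(adj).treeCount <= 1;
}
```

Following `parent` links from any vertex leads to the root of its tree, and two vertices lie in the same connected component exactly when their `component` values match.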
4. LeetCode: Restore IP Addresses
My solution
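#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> restoreIpAddresses(string s) {
        vector<string> answer;
        dfs(answer, "", s, 0, 0, 0);   // len == 0 marks the root call: no segment consumed yet
        return answer;
    }

    // ans: address built so far; s[idx, idx + len) is the segment this call consumes;
    // count: number of segments already placed in ans.
    void dfs(vector<string>& answer, string ans, const string& s, int idx, int len, int count) {
        if (count >= 4)
            return;                                   // four segments are already placed
        if (len > 1 && s[idx] == '0')
            return;                                   // prune: multi-digit segment with a leading zero
        int tmp = (len != 0) ? stoi(s.substr(idx, len)) : 0;
        if (tmp > 255)
            return;                                   // prune: segment out of range
        if (len != 0) {
            ans += to_string(tmp);                    // same text as the substring, since leading zeros were pruned
            if (++count < 4)
                ans += '.';
        }
        if (count == 4 && ans.size() == s.size() + 3) {   // four segments and every digit used
            answer.push_back(ans);
            return;
        }
        // Try a next segment of length 1, 2, or 3, as long as it fits inside s.
        for (int i = 1; i < 4 && idx + len - 1 + i < static_cast<int>(s.size()); ++i)
            dfs(answer, ans, s, idx + len, i, count);
    }
};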
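I wrote this following the standard DFS template; the solution space is shown in the figure. In effect it is a backtracking algorithm, although the implementation is rather awkward.
A cleaner solution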
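import java.util.ArrayList;
import java.util.List;

class Solution {
    public List<String> restoreIpAddresses(String s) {
        List<String> solutions = new ArrayList<String>();
        restoreIp(s, solutions, 0, "", 0);
        return solutions;
    }

    private void restoreIp(String ip, List<String> solutions, int idx, String restored, int count) {
        // Base case: more than four segments can never form an address.
        if (count > 4) return;
        // A complete answer has four segments and has consumed the whole string (idx == ip.length()).
        if (count == 4 && idx == ip.length()) solutions.add(restored);
        // The next segment can have length 1, 2, or 3.
        for (int i = 1; i < 4; i++) {
            // Pruning: the segment would run past the end of the string.
            if (idx + i > ip.length()) break;
            // Candidate segment at the current expansion node.
            String s = ip.substring(idx, idx + i);
            // Pruning: reject a leading zero in a multi-digit segment and values above 255.
            // The i == 3 test is redundant and can be dropped, since shorter segments are always below 256.
            if ((s.startsWith("0") && s.length() > 1) || (i == 3 && Integer.parseInt(s) >= 256)) continue;
            // Recurse; append '.' after every segment except the fourth.
            restoreIp(ip, solutions, idx + i, restored + s + (count == 3 ? "" : "."), count + 1);
        }
    }
}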
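The version above is a fairly standard recursive implementation of backtracking. I have not yet written notes on backtracking, but I have already run into a great many backtracking problems on LeetCode.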
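5. A brief look at backtracking
Backtracking is usually combined with BFS or DFS, and there are heuristic search variants as well. On Zhihu I saw someone classify BFS and DFS as brute-force search frameworks; seen that way, backtracking is an optimization of brute-force search.
For example, the LeetCode permutations problem I did earlier reduces to a plain DFS problem. Another example is generating n pairs of parentheses "()". It can also be solved with DFS, but some candidate strings are invalid and have to be discarded: during the search, as soon as the number of left parentheses is no longer greater than or equal to the number of right parentheses, the call returns immediately. This is called pruning, and the property being tested is called the pruning function.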
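A minimal C++ sketch of the parentheses example, with the pruning test written out explicitly. The function names and the `open`/`close` counters are assumptions made for this illustration.

```cpp
#include <string>
#include <vector>

// open / close: how many '(' and ')' have been placed in `current` so far.
void generate(int n, int open, int close, std::string& current, std::vector<std::string>& out) {
    // Pruning function: a prefix is hopeless once it has more ')' than '('
    // or more than n '('.
    if (close > open || open > n) return;
    if (open == n && close == n) {  // a complete, valid string
        out.push_back(current);
        return;
    }
    current.push_back('(');
    generate(n, open + 1, close, current, out);
    current.pop_back();             // undo the choice before trying the other branch
    current.push_back(')');
    generate(n, open, close + 1, current, out);
    current.pop_back();
}

std::vector<std::string> generateParenthesis(int n) {
    std::vector<std::string> out;
    std::string current;
    generate(n, 0, 0, current, out);
    return out;
}
```

Without the first `if`, the same DFS would enumerate every string of brackets up to the depth limit and filter afterwards; the pruning test cuts each invalid subtree at its root instead.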
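Now take Restore IP Addresses as the example. With backtracking, the first step is to determine the solution space. For this problem it is the set of strings of the form x.x.x.x, where each x is a string representing a number from 0 to 255. The solution space is usually organized as a tree or a graph so that BFS or DFS can be applied; for this problem a tree is clearly the better fit (a tree is also a kind of graph). Next, define the solution vector (x1, x2, x3, x4), where each xi is a string representing 0-255. That range is the explicit constraint. The implicit constraints for this problem are the relationships among x1, x2, x3, and x4 (for example, characters already taken by x1 cannot be taken again by x2). Note also that not every element of the solution space is an acceptable solution, so candidates have to be filtered.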
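To describe how backtracking proceeds, the node xn currently being processed is called the expansion node; nodes that have not been processed yet (the other possible values of xn, the other nodes of xn-1, and so on) are called live nodes; and a node whose children have all been processed is called a dead node. The usual framework for solving a backtracking problem is then: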
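- Define the solution space for the given problem.
- Decide how to represent the solution space (usually only conceptually; sometimes it has to be built explicitly).
- Search it with DFS, applying pruning functions.
- Common pruning functions:
  - A constraint function cuts off, at the current expansion node, subtrees that violate the constraints.
  - A bounding function cuts off subtrees that cannot lead to an optimal solution.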
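The framework can be sketched in C++ on the Restore IP Addresses problem. Here the solution vector is kept in `parts`, the explicit constraint is a separate constraint function, and the implicit constraint is the check that the four segments consume the whole string. The names `isValidSegment` and `backtrack` are hypothetical, and no bounding function appears because the problem asks for all solutions rather than an optimal one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Constraint function (explicit constraint): a segment has 1-3 digits,
// no leading zero unless it is exactly "0", and a value of at most 255.
bool isValidSegment(const std::string& seg) {
    if (seg.empty() || seg.size() > 3) return false;
    if (seg.size() > 1 && seg[0] == '0') return false;
    int value = 0;
    for (char c : seg) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    return value <= 255;
}

// parts holds x1..xk chosen so far; idx is the first unused character of s.
void backtrack(const std::string& s, std::size_t idx,
               std::vector<std::string>& parts, std::vector<std::string>& out) {
    if (parts.size() == 4) {
        if (idx == s.size()) {  // implicit constraint: every character used exactly once
            out.push_back(parts[0] + "." + parts[1] + "." + parts[2] + "." + parts[3]);
        }
        return;
    }
    for (std::size_t len = 1; len <= 3 && idx + len <= s.size(); ++len) {
        std::string seg = s.substr(idx, len);
        if (!isValidSegment(seg)) continue;  // prune this subtree
        parts.push_back(seg);                // extend the partial solution
        backtrack(s, idx + len, parts, out);
        parts.pop_back();                    // undo the choice and try the next one
    }
}

std::vector<std::string> restoreIpAddresses(const std::string& s) {
    std::vector<std::string> out;
    std::vector<std::string> parts;
    backtrack(s, 0, parts, out);
    return out;
}
```

The solution tree is never built in memory; it exists only as the sequence of recursive calls, which is the "conceptual" representation mentioned in the second step.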
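6. Summary
This was only a brief first contact with backtracking; a real feel for it will have to come from practice on problems. Because it is an optimization of brute-force search, a great many problems can be solved with it.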