LeetCode #126 Word Ladder II

Latest recommended article published on 2024-11-09 21:51:15

SquirrelYuyu

Category: LeetCode. Tags: LeetCode, algorithms

Article link: https://blog.csdn.net/SquirrelYuyu/article/details/82931232

Contents

Problem
Analysis
Test code

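Because of the National Day holiday and some rescheduled classes, I am only now publishing last week's algorithm post. I also took quite a few detours on this problem, which tells me my thinking still needs practice.

This post describes my thought process in detail. It serves as a record, and I hope it reminds me to think more thoroughly and efficiently in the future.

Here is the problem:

Problem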

Word Ladder II

Given two words (beginWord and endWord), and a dictionary’s word list, find all shortest transformation sequence(s) from beginWord to endWord, such that:

Only one letter can be changed at a time
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Note:

Return an empty list if there is no such transformation sequence.
All words have the same length.
All words contain only lowercase alphabetic characters.
You may assume no duplicates in the word list.
You may assume beginWord and endWord are non-empty and are not the same.

Example 1:

Input:
beginWord = "hit",
endWord = "cog",
wordList = ["hot","dot","dog","lot","log","cog"]

Output:
[
  ["hit","hot","dot","dog","cog"],
  ["hit","hot","lot","log","cog"]
]

Example 2:

Input:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]

Output: []

Explanation: The endWord "cog" is not in wordList, therefore no possible transformation.

Analysis

First, define how to check whether two strings differ in exactly one position:

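#include <algorithm>	// reverse
#include <climits>	// INT_MAX
#include <iostream>
#include <queue>
#include <stack>
#include <string>
#include <vector>
using namespace std;
inline bool cmp(const string& s1, const string& s2){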
	int len = s1.length();
	int diff = 0;
	for(int i = 0; i < len; i++){
		if(s1[i] != s2[i]) diff++;
	}
	return diff == 1;
}
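
As a small aside, the same check can stop scanning as soon as a second mismatch shows up. The sketch below is only an illustration: the name differsByOne is hypothetical and not part of the solution above, and the length check is an extra safeguard, since the problem guarantees that all words have the same length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original solution): same contract as cmp,
// but returns early once a second differing position is found.
bool differsByOne(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;   // safeguard; the problem guarantees equal lengths
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i] && ++diff > 1) return false;   // second mismatch: give up
    }
    return diff == 1;   // identical words (diff == 0) are not neighbors
}
```

For example, differsByOne("hit", "hot") is true, while differsByOne("hit", "hit") and differsByOne("hit", "dog") are both false.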

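Building the shortest-path graph between the two endpoints

Since we need the shortest paths between begin and end, we use BFS (breadth-first search).

Each word can be treated as a node, so this is essentially a graph problem.

For convenience, each word is represented by its index in a vector. The type vertex is a vector<int> holding a node's list of predecessors, and in a vector<vertex>, element i is the predecessor list of node i.

Because we need all shortest paths, and several nodes may lead to end, we cannot stop as soon as end is found: we must keep going to find the other nodes on the previous level that also lead to end.

The same holds for every other node, which may have several predecessors. During BFS, if the current node has a successor that is already in the BFS queue (that is, its distance value has already been set) and that successor lies exactly one level further from begin, the current node must still be added to that successor's predecessor list. Only successors that have not been enqueued yet are pushed onto the BFS queue and given a distance value.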

typedef vector<int> vertex;	// prev-list

class Solution{
private:
	vector<vertex> buildGraph(bool& accessible, const int& begin, const int& end, 
			const int& size, const vector<string>& dictionary);
	void findPaths(...);	// placeholder: the parameters differ between the two versions, see the code blocks below
public:
	vector<vector<string>> findLadders(string beginWord, string endWord, vector<string>& wordList);
};

vector<vertex> Solution::buildGraph(bool& accessible, const int& begin, const int& end, 
			const int& size, const vector<string>& dictionary){
	vector<vertex> vers;	// saves prev-lists of vertexes
	vers.assign(size, vertex());
	
	vector<int> distance(size, INT_MAX);	// INT_MAX marks a node that has not been reached yet
	distance[begin] = 0;
	
	queue<int> bfs;
	bfs.push(begin);
	
	int levelFlag = -1;		// records the distance from the end to the begin
	
	// builds a graph and makes bfs
	while(!bfs.empty()){
		int index = bfs.front();
		bfs.pop();
		
		if(distance[index] == levelFlag) break;
		else if(cmp(dictionary[index], dictionary[end])){
			vers[end].push_back(index);
			
			// cout << "[" << dictionary[index] << " " << dictionary[end] << "]" << endl;
			
			if(distance[end] > distance[index] + 1){	// end is found for the first time
				levelFlag = distance[end] = distance[index] + 1;
				bfs.push(end);
				accessible = true;
			}
			
			continue;
		}
		
		for(int i = 0; i < size; i++){
			if(i != index && cmp(dictionary[index], dictionary[i])){
				if(distance[i] >= distance[index] + 1){	// index -> i
					
					// cout << "[" << index << " " << dictionary[index] << " " 
					// << i << " " << dictionary[i] << "]" << endl;
					
					vers[i].push_back(index);
					
				}
				if(distance[i] > distance[index] + 1){
					distance[i] = distance[index] + 1;
					bfs.push(i);
				}
			}
		}
	}
	
	return vers;
}
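
To make the predecessor-list idea easier to try in isolation, here is a condensed, self-contained sketch. It is not the buildGraph above: the names oneApart and predecessorLists are hypothetical, and for brevity it does not stop at the level of end but labels every reachable word.

```cpp
#include <climits>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// True when a and b (assumed to have equal length) differ in exactly one position.
static bool oneApart(const std::string& a, const std::string& b) {
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) diff += (a[i] != b[i]);
    return diff == 1;
}

// Hypothetical condensed variant: pred[v] lists every word that is one letter
// away from words[v] and exactly one BFS level closer to words[begin].
std::vector<std::vector<int>> predecessorLists(const std::vector<std::string>& words, int begin) {
    const int n = static_cast<int>(words.size());
    std::vector<std::vector<int>> pred(n);
    std::vector<int> dist(n, INT_MAX);   // INT_MAX means "not reached yet"
    std::queue<int> bfs;
    dist[begin] = 0;
    bfs.push(begin);
    while (!bfs.empty()) {
        const int u = bfs.front();
        bfs.pop();
        for (int v = 0; v < n; ++v) {
            if (!oneApart(words[u], words[v])) continue;
            if (dist[v] == INT_MAX) {        // first visit: fix the level and enqueue
                dist[v] = dist[u] + 1;
                bfs.push(v);
            }
            if (dist[v] == dist[u] + 1) {    // u lies on some shortest path to v
                pred[v].push_back(u);
            }
        }
    }
    return pred;
}
```

With the words of Example 1 plus "hit" appended at index 6, the predecessor list of "cog" (index 5) comes out as {2, 4}, that is "dog" and "log", which is exactly the situation where one node has several predecessors.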

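Recording all paths

By starting from end and backtracking through the predecessor lists, we never have to consider nodes that are not on a shortest path. The paths produced this way run in the opposite direction from the expected one, so each path has to be reversed at the end.

Here is a fairly complex example of a graph that can occur:

[Figure: example graph]

The graph contains cycles formed by branches that split and merge again, so the traversal has to deal with choosing between branches and with branches that rejoin.

Dead end: node splitting

I once considered splitting every node with in-degree greater than one into several copies, to obtain a tree in which branches never merge. (Here, if A is in B's predecessor list, the backtracking from end has an edge from B to A, which contributes one to A's in-degree.) This would produce several independent, non-crossing paths between end and begin.

Something like the following:

[Figure: the graph after node splitting]

But this turned out to be a dead end that is hard to implement. We have to consider the more complex case in which several nodes have in-degree greater than 1.

① As in the figure below, if nodes with several successors are split during the BFS from begin to end, the nested-cycle structure still reappears further along.

(A is begin and H is end. An arrow from u to v means that v is in u's predecessor list: u is a successor of v in the BFS, and the backtracking from end goes from u to v.)

[Figure: nested cycles between A and H]

② Alternatively, split every node with in-degree greater than 1 while backtracking from end to begin, so that each node keeps an in-degree of one.

This does eliminate the nested cycles, but it needs more complex data structures, for example a predecessor list and a successor list for every node. The space cost is also somewhat higher than in ①.

Moreover, a DFS traversal is still needed after splitting, because branches still exist; they just no longer merge.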
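
To get a feel for how much a fully split tree would grow, one can count the distinct backward paths from a node to begin: that count equals the number of copies of begin the split tree would contain. The sketch below is only an illustration under that reading; countPaths is a hypothetical helper, and it assumes the predecessor lists form an acyclic graph, which holds when every predecessor is one BFS level closer to begin.

```cpp
#include <vector>

// Hypothetical helper: number of distinct paths from `node` back to `begin`
// along predecessor lists. memo must be sized like pred and filled with -1.
long long countPaths(const std::vector<std::vector<int>>& pred, int node, int begin,
                     std::vector<long long>& memo) {
    if (node == begin) return 1;
    if (memo[node] >= 0) return memo[node];   // already computed for this node
    long long total = 0;
    for (int p : pred[node]) {
        total += countPaths(pred, p, begin, memo);
    }
    return memo[node] = total;
}
```

For two diamonds stacked on top of each other (pred = {{}, {0}, {0}, {1,2}, {3}, {3}, {4,5}} with begin 0 and end 6) the count is 4, and each further diamond doubles it, while the shared graph only grows by three nodes per diamond.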

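The right way: DFS without deleting edges

Time to honestly re-examine these graphs.

Walking through the graph by hand again shows that the traversal is still DFS. In earlier DFS traversals of trees, I would usually delete a visited node from the tree when popping it. In this graph, where merging branches form cycles, nodes cannot be deleted, because we may have to come back and visit them again later.

So an auxiliary data structure may be needed to record which branches have been visited.

Recursive implementation

The recursion's call stack carries out the DFS. Only words, which records the current path, is modified; the graph itself is left unchanged.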

void Solution::findPaths(const int& index, vector<vector<string>>& result, vector<string>& words,
			const vector<vertex>& graph, const vector<string>& dictionary){
	words.push_back(dictionary[index]);
	const vertex& list = graph[index];
	for(const int& v : list){
		findPaths(v, result, words, graph, dictionary);
	}
	if(index == graph.size() - 1){	// index == begin
		result.push_back(words);
	}
	words.pop_back();
}
	
vector<vector<string>> Solution::findLadders(string beginWord, string endWord, vector<string>& wordList){
	
	// finds if endWord is in dictionary
	int e = -1;
	for(int i = 0; i < wordList.size(); i++){
   		if(wordList[i] == endWord){
   			e = i;
   			break;
		}
	}
	if(e == -1) return vector<vector<string>>();
	
   	vector<string> dictionary = wordList;
   	dictionary.push_back(beginWord);
	const int size = dictionary.size();
	const int begin = size - 1;
	const int end = e;
	
	bool accessible = false;	// set to true once end is found to be reachable
	
	// builds a graph by bfs
	const vector<vertex> graph = buildGraph(accessible, begin, end, size, dictionary);
	
	if(!accessible) return vector<vector<string>>();
	
	// finds all the shortest paths by dfs
	vector<vector<string>> result;
	vector<string> words;
	findPaths(end, result, words, graph, dictionary);
	
	for(vector<string>& v : result){
		reverse(v.begin(), v.end());
	}
	
	return result;
}
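
The recursive backtracking can also be tried on a hand-built predecessor graph, without any words. The following is a sketch under that assumption; collectPaths is a hypothetical index-only variant, and unlike the code above it stores each path already in begin-to-end order instead of reversing afterwards.

```cpp
#include <vector>

// Hypothetical index-only variant of the recursive DFS: follows predecessor
// lists from `node` back to `begin` and never modifies pred.
void collectPaths(const std::vector<std::vector<int>>& pred, int node, int begin,
                  std::vector<int>& current, std::vector<std::vector<int>>& out) {
    current.push_back(node);
    if (node == begin) {
        // current holds end ... begin; store a copy in begin ... end order
        out.emplace_back(current.rbegin(), current.rend());
    } else {
        for (int p : pred[node]) {
            collectPaths(pred, p, begin, current, out);
        }
    }
    current.pop_back();   // undo only the path record; the graph is untouched
}
```

On the diamond pred = {{}, {0}, {0}, {1, 2}} with begin 0 and end 3, this yields the two paths 0 1 3 and 0 2 3, and node 0 is visited once per path because nothing is ever removed from the graph.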

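Runtime: 496 ms.

The BFS visits each node and edge a constant number of times, which gives a bound of $O(V + E)$, where $V$ is the number of nodes and $E$ the number of edges. Two costs come on top of that: buildGraph finds neighbors by comparing the current word against all $V$ words, so constructing the graph takes $O(V^2 L)$ for words of length $L$, and the DFS takes time proportional to the total length of all the paths it outputs.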

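Loop and stack implementation

The drawback of recursion is that the stack operations for function calls are relatively expensive. Let's now implement the DFS with a loop.

In a loop we cannot simply scan the predecessor list and descend into each branch as the recursion does. We need a stack that records which branch was chosen at each fork.

Take the following graph as an example:

[Figure: example graph]

The branch stack changes as follows:

[Figure: states of the branch stack]

The algorithm below uses two stacks of int, dfs and branch.

The dfs stack records the nodes met during the traversal. The branch stack records the branch chosen at each fork (a node with more than one predecessor).

During the traversal, push the first predecessor of the node on top of dfs onto dfs. If the node that is now on top has more than one predecessor, push 0 onto branch to indicate that its first branch is chosen. Keep pushing until begin is on the dfs stack.
Then pop from dfs. Whenever the node on top has more than one predecessor, compare the top of branch with that node's number of predecessors to see whether its last branch has already been taken.

If it was the last branch, pop branch and keep popping dfs, until a fork with an unexplored branch is reached. That node is not popped.
Stop popping and increment the top of branch; call the result top. It means that branch number top+1 of the node on top of dfs is chosen.

Push the (top+1)-th predecessor of the node on top of dfs onto dfs; if that node also has more than one predecessor, push 0 onto branch.
Repeat this process until the dfs stack is empty.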

void Solution::findPaths(vector<vector<string>>& result, const int end,
		const vector<vertex>& graph, const vector<string>& dictionary){
	const int begin = graph.size() - 1;
	vector<string> words;
	stack<int> dfs;
	stack<int> branch;
	dfs.push(end);
	// cout << "push " << end << " " << dictionary[end] << endl;
	words.push_back(dictionary[end]);
	if(graph[end].size() > 1){
		branch.push(0);
		// cout << "branch push 0" << endl;
	}
	while(1){
		// push
		while(dfs.top() != begin){
			const int& prev = dfs.top();
			const int& next = graph[prev][0];
			dfs.push(next);
			words.push_back(dictionary[next]);
			// cout << "push " << next << " " << dictionary[next] << endl;
			if(graph[next].size() > 1){
				branch.push(0);
				// cout << "branch push 0" << endl;
			}
		}
		
		result.push_back(words);
		
		// pop until the top is a vertex with an unchosen branch
		while(!dfs.empty()){
			const int& prev = dfs.top();
			if(graph[prev].size() > 1){
				if(graph[prev].size() != branch.top() + 1){
					break;
				}
				else{
					// cout << "branch pop" << branch.top();
					branch.pop();
				}
			}
			// cout << " pop " << dfs.top() << " " << words.back() << endl;
			dfs.pop();
			words.pop_back();
		}
		
		if(dfs.empty()) break;
		// cout << dfs.top() << endl; 
		
		// turn to another branch
		int top = branch.top();
		top++;
		branch.pop();
		branch.push(top);
		// cout << branch.top() << " branch top" << endl; 
		
		const int& prev = dfs.top();
		const int& next = graph[prev][top];
		dfs.push(next);
		words.push_back(dictionary[next]);
		// cout << "push " << next << " " << dictionary[next] << endl;
		if(graph[next].size() > 1) {
			branch.push(0);
			// cout << "branch push 0" << endl;
		}
	}
}
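
As a possible simplification (a variation, not the two-stack method above), a single stack can hold pairs of a node and the index of the next predecessor to try, so every node carries its own branch cursor whether or not it is a fork. The sketch below assumes index-only predecessor lists; collectPathsIterative is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical one-stack variant: each entry is (node, index of the next
// predecessor to try). The stack doubles as the current path from end.
std::vector<std::vector<int>> collectPathsIterative(
        const std::vector<std::vector<int>>& pred, int end, int begin) {
    std::vector<std::vector<int>> out;
    std::vector<std::pair<int, std::size_t>> stack;
    stack.push_back({end, 0});
    while (!stack.empty()) {
        const int node = stack.back().first;
        if (node == begin) {
            // read the stack top-down to get the path in begin ... end order
            std::vector<int> path;
            for (auto it = stack.rbegin(); it != stack.rend(); ++it) path.push_back(it->first);
            out.push_back(path);
            stack.pop_back();
            continue;
        }
        std::size_t& cursor = stack.back().second;
        if (cursor < pred[node].size()) {
            const int next = pred[node][cursor];
            ++cursor;                     // advance first: push_back may invalidate cursor
            stack.push_back({next, 0});
        } else {
            stack.pop_back();             // every branch of this node has been explored
        }
    }
    return out;
}
```

On the diamond pred = {{}, {0}, {0}, {1, 2}} with end 3 and begin 0 it returns the paths 0 1 3 and 0 2 3. The trade-off is one extra cursor per stacked node in exchange for not having to keep two stacks in sync.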

vector<vector<string>> Solution::findLadders(string beginWord, string endWord, vector<string>& wordList){
	
	// finds if endWord is in dictionary
	int e = -1;
	for(int i = 0; i < wordList.size(); i++){
   		if(wordList[i] == endWord){
   			e = i;
   			break;
		}
	}
	if(e == -1) return vector<vector<string>>();
	
   	vector<string> dictionary = wordList;
   	dictionary.push_back(beginWord);
	const int size = dictionary.size();
	const int begin = size - 1;
	const int end = e;
	
	bool accessible = false;	// set to true once end is found to be reachable
	
	// builds a graph by bfs
	const vector<vertex> graph = buildGraph(accessible, begin, end, size, dictionary);
	
	if(!accessible) return vector<vector<string>>();
	
	// finds all the shortest paths by dfs
	vector<vector<string>> result;
	vector<string> words;
	findPaths(result, end, graph, dictionary);
	
	for(vector<string>& v : result){
		reverse(v.begin(), v.end());
	}
	
	return result;
}

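Runtime: 316 ms.

The complexity analysis is the same as for the recursive version: $O(V + E)$ for the BFS traversal, plus the cost of the pairwise word comparisons in buildGraph and of writing out every path.

Test code

Here is the test code, for the convenience of later readers: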

void print(vector<vector<string>> result){
	for(vector<string>& v : result){
		for(string& s : v){
			cout << s << " ";
		}
		cout << endl;
	}
	cout << endl;
}

int main(){
	
	string b1 = "hit", e1 = "cog";
	vector<string> l1;
	l1.push_back("hot"); l1.push_back("dot"); l1.push_back("dog");
	l1.push_back("lot"); l1.push_back("log"); l1.push_back("cog");
	
	string b2 = "hit", e2 = "cog";
	vector<string> l2;
	l2.push_back("hot"); l2.push_back("dot"); l2.push_back("dog"); 
	l2.push_back("lot"); l2.push_back("log");
	
	string b3 = "red", e3 = "tax";
	vector<string> l3;
	l3.push_back("ted"); l3.push_back("tex"); l3.push_back("red");
	l3.push_back("tax"); l3.push_back("tad"); l3.push_back("den");
	l3.push_back("rex"); l3.push_back("pee");
	
	string b4 = "magic", e4 = "pearl";
	vector<string> l4;
	string strs[20] = {"magic","manic","mania","maria","marta","maris","marty","paris","marks","party",
		"marry","parks","parry","merry","perks","perry","peaks","peary","pears","pearl"};
	for(int i = 0; i < 20; i++){
		l4.push_back(strs[i]);
	}
	
	string b5 = "qa", e5 = "sq";
	vector<string> l5;
	string qq[95] = {"si","go","se","cm","so","ph","mt","db","mb","sb",
		"kr","ln","tm","le","av","sm","ar","ci","ca","br",
		"ti","ba","to","ra","fa","yo","ow","sn","ya","cr",
		"po","fe","ho","ma","re","or","rn","au","ur","rh",
		"sr","tc","lt","lo","as","fr","nb","yb","if","pb",
		"ge","th","pm","rb","sh","co","ga","li","ha","hz",
		"no","bi","di","hi","qa","pi","os","uh","wm","an",
		"me","mo","na","la","st","er","sc","ne","mn","mi",
		"am","ex","pt","io","be","fm","ta","tb","ni","mr",
		"pa","he","lr","sq","ye"};
	for(int i = 0; i < 95; i++){
		l5.push_back(qq[i]);
	}
	
	Solution s;
	print(s.findLadders(b1,e1,l1));
	print(s.findLadders(b2,e2,l2));
	print(s.findLadders(b3,e3,l3));
	print(s.findLadders(b4,e4,l4));
	print(s.findLadders(b5,e5,l5));
	
	return 0;
}
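
Comparing printed output by eye gets tedious for the larger cases, so a small checker can help. The sketch below is an assumption-laden illustration: isValidLadder is a hypothetical helper that only verifies that one ladder is well formed; it does not check that the ladder is shortest or that no ladder is missing.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical checker: true when `ladder` starts at beginWord, ends at endWord,
// changes exactly one letter per step, and every word after the first is in wordList.
bool isValidLadder(const std::vector<std::string>& ladder, const std::string& beginWord,
                   const std::string& endWord, const std::vector<std::string>& wordList) {
    if (ladder.size() < 2 || ladder.front() != beginWord || ladder.back() != endWord) {
        return false;
    }
    const std::unordered_set<std::string> dict(wordList.begin(), wordList.end());
    for (std::size_t i = 1; i < ladder.size(); ++i) {
        const std::string& a = ladder[i - 1];
        const std::string& b = ladder[i];
        if (a.size() != b.size() || dict.count(b) == 0) return false;
        int diff = 0;
        for (std::size_t k = 0; k < a.size(); ++k) diff += (a[k] != b[k]);
        if (diff != 1) return false;   // a step must change exactly one letter
    }
    return true;
}
```

It could be applied to every ladder returned by findLadders, for instance inside the loop of print, alongside a check that all returned ladders have the same length.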

The correct output is as follows:

hit hot dot dog cog
hit hot lot log cog

red ted tex tax
red rex tex tax
red ted tad tax

magic manic mania maria marta marty party parry perry peary pearl
magic manic mania maria marta marty marry parry perry peary pearl
magic manic mania maria marta marty marry merry perry peary pearl
magic manic mania maria maris paris parks perks peaks pears pearl
magic manic mania maria maris marks parks perks peaks pears pearl

qa ca cm sm sq
qa fa fm sm sq
qa ta tm sm sq
qa pa pm sm sq
qa ca ci si sq
qa ba bi si sq
qa ma mi si sq
qa ha hi si sq
qa na ni si sq
qa la li si sq
qa ta ti si sq
qa pa pi si sq
qa ca cr sr sq
qa ba br sr sq
qa fa fr sr sq
qa ma mr sr sq
qa la lr sr sq
qa ca co so sq
qa ya yo so sq
qa ma mo so sq
qa ga go so sq
qa ha ho so sq
qa na no so sq
qa la lo so sq
qa ta to so sq
qa pa po so sq
qa ba be se sq
qa ra re se sq
qa fa fe se sq
qa ya ye se sq
qa ma me se sq
qa ga ge se sq
qa ha he se sq
qa na ne se sq
qa la le se sq
qa ra rn sn sq
qa ma mn sn sq
qa la ln sn sq
qa ra rh sh sq
qa ta th sh sq
qa pa ph sh sq
qa ra rb sb sq
qa ya yb sb sq
qa ma mb sb sq
qa na nb sb sq
qa ta tb sb sq
qa pa pb sb sq
qa ma mt st sq
qa la lt st sq
qa pa pt st sq
qa ta tc sc sq