[DP][DFS] LeetCode - Word Break I and II, Concatenated Words

LeetCode - 139. Word Break

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:
The same word in the dictionary may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.

Example 1:
Input: s = “leetcode”, wordDict = [“leet”, “code”]
Output: true

Example 2:
Input: s = “applepenapple”, wordDict = [“apple”, “pen”]
Output: true
Explanation: Return true because “applepenapple” can be segmented as “apple pen apple”.
Note that you are allowed to reuse a dictionary word.

Example 3:
Input: s = “catsandog”, wordDict = [“cats”, “dog”, “sand”, “and”, “cat”]
Output: false

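The task: decide whether the given string s can be assembled by concatenating words from a vector, where each word may be reused any number of times.
Allocate a dp array of size n + 1, where dp[i] records whether the first i characters of s can be assembled. dp[0] = true, and every other entry starts as false. For each position i, try each candidate length j of the last word, from 0 to i: if dp[i - j] is true and s.substr(i - j, j) is in the dictionary, then the segmentation behind dp[i - j] can be extended with s.substr(i - j, j) to cover the first i characters, so dp[i] = true.
When computing dp[i], not every length from 0 to i has to be tried. If the shortest dictionary word has length 4, for example, the start positions i, i - 1, i - 2 and i - 3 (lengths 0 to 3) can be skipped. In the same way, no length above that of the longest word needs to be tried. (beats 100%)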

#include <algorithm>
#include <climits>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

bool wordBreak(string s, vector<string>& wordDict) {
    unordered_set<string> dic;
    const size_t n = s.length();
    size_t max_len = 0, min_len = INT_MAX;
    vector<bool> dp(n + 1, false);
    dp[0] = true;  // the empty prefix is always reachable
    for (const string& w : wordDict) {
        dic.insert(w);
        max_len = max(max_len, w.length());
        min_len = min(min_len, w.length());
    }
    // i: prefix length, j: length of the candidate last word
    for (size_t i = min_len; i <= n; ++i)
        for (size_t j = min_len; j <= i && j <= max_len; ++j)
            if (dp[i - j] && dic.find(s.substr(i - j, j)) != dic.end()) {
                dp[i] = true;
                break;
            }
    return dp[n];
}

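To make the recurrence concrete, here is a small sketch that returns the whole dp table instead of only dp[n]. The helper name prefix_table is made up for illustration and is not part of the solution above; the loop bounds are meant to mirror it, under the problem's assumption that dictionary words are non-empty.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original solution): returns the full
// dp table, where dp[i] says whether the first i characters of s can be
// built from dictionary words.
std::vector<bool> prefix_table(const std::string& s,
                               const std::vector<std::string>& wordDict) {
    std::unordered_set<std::string> dict(wordDict.begin(), wordDict.end());
    const std::size_t n = s.size();
    // With an empty dictionary min_len stays at n + 1 and the loops never run.
    std::size_t min_len = n + 1, max_len = 0;
    for (const std::string& w : wordDict) {
        min_len = std::min(min_len, w.size());
        max_len = std::max(max_len, w.size());
    }
    std::vector<bool> dp(n + 1, false);
    dp[0] = true;  // the empty prefix needs no words
    for (std::size_t i = min_len; i <= n; ++i) {
        // j is the length of the candidate last word; only lengths between
        // the shortest and longest dictionary word are tried.
        for (std::size_t j = min_len; j <= i && j <= max_len; ++j) {
            if (dp[i - j] && dict.count(s.substr(i - j, j))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp;
}
```

For s = "leetcode" and wordDict = ["leet", "code"], the table is true only at indices 0, 4 and 8: the empty prefix, "leet", and "leetcode".
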
LeetCode - 140. Word Break II

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such possible sentences.
The same word in the dictionary may be reused multiple times in the segmentation.

Example 1:
Input: s = “catsanddog” wordDict = [“cat”, “cats”, “and”, “sand”, “dog”]
Output: [ “cats and dog”, “cat sand dog” ]

Example 2:
Input: s = “pineapplepenapple” wordDict = [“apple”, “pen”, “applepen”, “pine”, “pineapple”]
Output: [ “pine apple pen apple”, “pineapple pen apple”, “pine applepen apple” ]

Example 3:
Input: s = “catsandog” wordDict = [“cats”, “dog”, “sand”, “and”, “cat”]
Output: []

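This problem is the same as the one above, except that instead of returning whether s can be segmented, it returns every possible segmentation. The solution uses DFS + DP: the DP pass first decides whether any segmentation exists at all, and the DFS enumerates the sentences only if it does. Code below (beats 100%)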

#include <algorithm>
#include <climits>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

// State shared with dfs(); as file-scope variables these persist between
// calls, so reset them if wordBreak is called more than once.
size_t max_len = 0, min_len = INT_MAX;
unordered_set<string> dict;

// retain: words chosen so far, idx: start of the part still to be segmented
void dfs(const string& s, vector<string>& retain, size_t idx, vector<string>& res) {
    if (idx == s.size()) {
        string tmp;
        for (const string& str : retain)
            tmp += str + " ";
        tmp.pop_back();  // drop the trailing space
        res.push_back(tmp);
        return;
    }

    for (size_t len = min_len; len <= max_len && idx + len <= s.size(); len++)
        if (dict.find(s.substr(idx, len)) != dict.end()) {
            retain.push_back(s.substr(idx, len));
            dfs(s, retain, idx + len, res);
            retain.pop_back();
        }
}

vector<string> wordBreak(string s, vector<string>& wordDict) {
    const size_t N = s.length();
    vector<bool> dp(N + 1, false);
    dp[0] = true;
    for (const auto& w : wordDict) {
        dict.insert(w);
        max_len = max(max_len, w.length());
        min_len = min(min_len, w.length());
    }
    for (size_t i = min_len; i <= N; i++)
        for (size_t j = min_len; j <= i && j <= max_len; ++j)  // j: length of the word being looked up; j <= i must NOT be written as i - j >= 0
            if (dp[i - j] && dict.find(s.substr(i - j, j)) != dict.end()) {
                dp[i] = true;
                break;
            }

    vector<string> res, retain;
    if (dp[N])  // run the DFS only when at least one segmentation exists
        dfs(s, retain, 0, res);
    return res;
}
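
For trying the approach outside LeetCode, the same DFS + DP idea can be packed into one struct so that no global state is left behind between calls. This is only a sketch: the name SentenceBuilder and its member functions are assumptions made for this example, and empty dictionary words are skipped as a precaution.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Illustrative wrapper (names are assumptions, not the original code).
struct SentenceBuilder {
    std::unordered_set<std::string> dict;
    // npos is the largest size_t, so an empty dictionary makes every loop a no-op.
    std::size_t min_len = std::string::npos, max_len = 0;

    explicit SentenceBuilder(const std::vector<std::string>& words) {
        for (const std::string& w : words) {
            if (w.empty()) continue;  // an empty word would never advance the DFS
            dict.insert(w);
            min_len = std::min(min_len, w.size());
            max_len = std::max(max_len, w.size());
        }
    }

    // DP pass: can s be segmented at all?
    bool breakable(const std::string& s) const {
        std::vector<bool> dp(s.size() + 1, false);
        dp[0] = true;
        for (std::size_t i = min_len; i <= s.size(); ++i)
            for (std::size_t j = min_len; j <= i && j <= max_len; ++j)
                if (dp[i - j] && dict.count(s.substr(i - j, j))) {
                    dp[i] = true;
                    break;
                }
        return dp[s.size()];
    }

    // DFS pass: extend the current choice of words from position idx.
    void dfs(const std::string& s, std::size_t idx,
             std::vector<std::string>& picked,
             std::vector<std::string>& out) const {
        if (idx == s.size()) {
            std::string sentence;
            for (const std::string& w : picked) {
                if (!sentence.empty()) sentence += ' ';
                sentence += w;
            }
            out.push_back(sentence);
            return;
        }
        for (std::size_t len = min_len;
             len <= max_len && idx + len <= s.size(); ++len) {
            std::string piece = s.substr(idx, len);
            if (dict.count(piece)) {
                picked.push_back(piece);
                dfs(s, idx + len, picked, out);
                picked.pop_back();
            }
        }
    }

    std::vector<std::string> sentences(const std::string& s) const {
        std::vector<std::string> out, picked;
        if (breakable(s)) dfs(s, 0, picked, out);  // skip the DFS when hopeless
        return out;
    }
};
```

Because shorter pieces are tried first, "catsanddog" with the dictionary from Example 1 comes out as "cat sand dog" followed by "cats and dog", which is the reverse of the order listed in the problem statement.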

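One thing worth noting in the DP loop: because i and j are size_t, the guard must be written as j <= i and not as i - j >= 0. A size_t value is always >= 0, so the subtraction wraps around instead of going negative and the test can never fail.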
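
The wraparound is easy to see in isolation. The sketch below uses two hypothetical helpers, wrapped_difference and fits, which are not from the solution above; it only illustrates what unsigned subtraction does when j > i and why comparing before subtracting is the safe guard.

```cpp
#include <cstddef>

// Hypothetical helper: what i - j evaluates to for unsigned operands.
// When j > i the result wraps around to a huge value instead of going negative.
std::size_t wrapped_difference(std::size_t i, std::size_t j) {
    return i - j;
}

// Hypothetical helper: the safe form of the guard.
// Comparing before subtracting never wraps.
// (Writing `i - j >= 0` instead would compile, but it is always true.)
bool fits(std::size_t i, std::size_t j) {
    return j <= i;
}
```

With i = 3 and j = 5, wrapped_difference returns the largest size_t value minus one, while fits(3, 5) is false.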

LeetCode - 472. Concatenated Words

Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words. A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.

Example:
Input: [“cat”,“cats”,“catsdogcats”,“dog”,“dogcatsdog”,“hippopotamuses”,“rat”,“ratcatdogcat”]
Output: [“catsdogcats”,“dogcatsdog”,“ratcatdogcat”]

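The task: from a string array words, find every word that can be formed by concatenating two or more words from words.
A recursive DFS checks whether a string can be assembled from the other strings. Since every word can trivially be formed from itself, a counter cur_words records how many strings have been used so far, and the check returns true only when cur_words > 1. Some solutions online instead erase the word from the set before the check and insert it back afterwards. That is not done here, for two reasons:

  • Erasing and inserting cost time as well, and the one extra match against the word itself adds little.
  • With cur_words available, any variant that requires at least some given number of words can be handled too.
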
#include <algorithm>
#include <climits>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

unordered_set<string> dict;
size_t max_len = 0, min_len = INT_MAX, start;

// cur_words: number of dictionary words consumed so far
bool dfs_check(const string& word, int cur_words) {
    if (word.empty() && cur_words > 1) return true;
    for (size_t i = start; i <= min(word.size(), max_len); ++i)
        if (dict.find(word.substr(0, i)) != dict.end()
            && dfs_check(word.substr(i), cur_words + 1))
            return true;
    return false;
}

vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
    for (const string& w : words) {
        dict.insert(w);
        max_len = max(max_len, w.length());
        min_len = min(min_len, w.length());
    }
    start = max(min_len, size_t(1));    // the input may contain "", so never try a prefix of length 0
    vector<string> res;
    for (const string& w : words)
        if (dfs_check(w, 0))
            res.push_back(w);
    return res;
}
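
The second reason above, that the counter makes "at least k words" variants easy, can be sketched as follows. Everything named here (ConcatChecker, check, concatenated, the min_words parameter) is an assumption for illustration, not the original code. The sketch also passes an index instead of copying the suffix on each call, and it skips empty words when building the set.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical generalisation: min_words replaces the fixed "> 1" test.
struct ConcatChecker {
    std::vector<std::string> words;
    std::unordered_set<std::string> dict;
    // npos is the largest size_t, so an empty dictionary makes the loop a no-op.
    std::size_t min_len = std::string::npos, max_len = 0;

    explicit ConcatChecker(const std::vector<std::string>& input) : words(input) {
        for (const std::string& w : words) {
            if (w.empty()) continue;  // "" can never contribute a piece
            dict.insert(w);
            min_len = std::min(min_len, w.size());
            max_len = std::max(max_len, w.size());
        }
    }

    // True if word[pos..] splits into dictionary words so that the total
    // number of pieces reaches min_words (intended for min_words >= 1).
    bool check(const std::string& word, std::size_t pos,
               std::size_t used, std::size_t min_words) const {
        if (pos == word.size()) return used >= min_words;
        const std::size_t limit = std::min(word.size() - pos, max_len);
        for (std::size_t len = min_len; len <= limit; ++len)
            if (dict.count(word.substr(pos, len)) &&
                check(word, pos + len, used + 1, min_words))
                return true;
        return false;
    }

    // All input words that have a segmentation into at least min_words pieces.
    std::vector<std::string> concatenated(std::size_t min_words) const {
        std::vector<std::string> res;
        for (const std::string& w : words)
            if (check(w, 0, 0, min_words)) res.push_back(w);
        return res;
    }
};
```

With the example input, concatenated(2) reproduces the expected output of the problem, while concatenated(4) keeps only "ratcatdogcat", the one word that splits into four pieces.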