LeetCode 139. Word Break

wenyq7

Published 2019-12-03 04:02:01

Article link: https://blog.csdn.net/qq_37333947/article/details/103348165

Problem:

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

The same word in the dictionary may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.

Example 1:

Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
             Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

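Yet another problem that gave me a real headache.

The first approach is the plain brute-force one: recursion (recursion seems to be a classic way to handle substring problems). If we check that a prefix of the string is in the dict, we then need to decide whether the remainder can also be segmented. So we write a helper function whose parameters include the position currently being examined. Because the problem gives the dict as a vector, we store it in a hash set to speed up lookups. In the recursive function, if the position has reached the end of the string, the match succeeds; otherwise we enumerate the substrings starting at this position, and whenever a substring is in the dictionary we call the function recursively to check whether the rest can be broken. Done this way the time complexity is high, because the recursion repeats a lot of computation. So we add a memo array that stores whether the suffix starting at each index can be segmented: initialize it to -1, set it to 1 if it can be segmented and 0 if it cannot, so that whenever a memo entry exists we return it directly instead of recomputing.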

Runtime: 20 ms, faster than 30.84% of C++ online submissions for Word Break.

Memory Usage: 15.7 MB, less than 26.41% of C++ online submissions for Word Break.

class Solution {
public:
    bool helper(const string& s, unordered_set<string>& dict, int index, vector<int>& memo) {
        if (index == s.size()) {
            return true;
        }
        if (memo[index] != -1) {
            return memo[index];
        }
        for (int i = index; i < s.size(); i++) {
            if (dict.count(s.substr(index, i - index + 1)) && helper(s, dict, i + 1, memo)) {
                memo[index] = 1;
                return true;
            }
        }
        memo[index] = 0;
        return false;
    }
    bool wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict(wordDict.begin(), wordDict.end());
        vector<int> memo(s.size(), -1);
        return helper(s, dict, 0, memo);
    }
};
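
To see how much repeated work the memo array saves, the two variants described above can be compared side by side. This is only a minimal sketch: `naiveBreak`, `memoBreak` and the `calls` counter are names invented for this illustration and are not part of the submitted solution.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Plain recursion without a memo. `calls` (hypothetical) counts how often the function is entered.
bool naiveBreak(const std::string& s, const std::unordered_set<std::string>& dict,
                std::size_t index, long long& calls) {
    ++calls;
    if (index == s.size()) return true;  // reached the end: everything before was matched
    for (std::size_t i = index; i < s.size(); ++i) {
        // s[index..i] inclusive has length i - index + 1
        if (dict.count(s.substr(index, i - index + 1)) && naiveBreak(s, dict, i + 1, calls)) {
            return true;
        }
    }
    return false;
}

// Same recursion, but memo[index] caches the answer: -1 unknown, 0 no, 1 yes.
bool memoBreak(const std::string& s, const std::unordered_set<std::string>& dict,
               std::size_t index, std::vector<int>& memo, long long& calls) {
    ++calls;
    if (index == s.size()) return true;
    if (memo[index] != -1) return memo[index] == 1;
    for (std::size_t i = index; i < s.size(); ++i) {
        if (dict.count(s.substr(index, i - index + 1)) && memoBreak(s, dict, i + 1, memo, calls)) {
            memo[index] = 1;
            return true;
        }
    }
    memo[index] = 0;
    return false;
}
```

On an input such as fifteen 'a' characters followed by a 'b', with the dictionary {"a", "aa", "aaa"}, both functions return false, but the unmemoized version is entered far more often because it re-explores the same suffixes again and again.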

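2020.10.6 Java DP solution

Let dp[i] mean that the substring s[0, i) can be broken, so the whole problem reduces to whether dp[len] is true, i.e. whether s[0, len) can be broken. We therefore declare an array of length len + 1 to hold dp[0] through dp[len]. For each dp[i], we split the prefix into two parts at some position j; dp[i] == true requires dp[j] == true (s[0, j) can be broken) && s[j, i) is in the word list. The main thing is to handle the boundary conditions carefully (dp[0] = true for the empty prefix). After that the code is pleasant to write.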
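
Written as a recurrence, using the same half-open notation s[j, i) as the text and with len standing for the length of s, the idea can be summarized roughly as:

```latex
dp[0] = \text{true}, \qquad
dp[i] = \bigvee_{0 \le j < i} \Bigl( dp[j] \,\land\, s[j, i) \in \text{wordDict} \Bigr)
\quad \text{for } 1 \le i \le \text{len}
```

The answer to the problem is then dp[len].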

Runtime: 6 ms, faster than 65.16% of Java online submissions for Word Break.

Memory Usage: 39.2 MB, less than 72.98% of Java online submissions for Word Break.

class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict);
        boolean[] dp = new boolean[s.length() + 1];
        dp[0] = true;
        for (int i = 0; i < dp.length; i++) {
            for (int j = 0; j < i; j++) {
                // dp[j]: [0, j) can be broken
                // s.substring(j ,i): [j, i) is in set
                if (dp[j] && wordSet.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[dp.length - 1];
    }
}
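
To make the table concrete, here is a small C++ sketch of the same DP that returns the whole dp array instead of only its last entry. The function name `breakTable` is hypothetical and exists only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the full dp table; dp[i] is true when the prefix s[0, i) can be segmented.
std::vector<bool> breakTable(const std::string& s, const std::vector<std::string>& wordDict) {
    std::unordered_set<std::string> words(wordDict.begin(), wordDict.end());
    std::vector<bool> dp(s.size() + 1, false);
    dp[0] = true;  // the empty prefix needs no words
    for (std::size_t i = 1; i <= s.size(); ++i) {
        for (std::size_t j = 0; j < i; ++j) {
            // dp[j]: s[0, j) can be broken; s.substr(j, i - j) is the piece s[j, i)
            if (dp[j] && words.count(s.substr(j, i - j))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp;
}
```

For Example 1 ("leetcode" with "leet" and "code"), only dp[0], dp[4] and dp[8] come out true. For Example 3 ("catsandog"), dp[0], dp[3], dp[4] and dp[7] are true, but dp[9] stays false because no dictionary word ends the string.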

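Below are my earlier C++ notes:

I looked at the DP solution: dp[i] indicates whether s[0, 1, ..., i - 1] can be segmented. To decide dp[i], we need an inner loop that tries every split position within this substring. A few points to note. First, the dp array must have size s.size() + 1, because the empty string has to be included; substring DP problems all seem to follow this pattern. The for loop must also run up to dp's size rather than s's size. In the inner loop, the substring length is i - j with no + 1; I had not fully thought this through at the time, but it follows from s[j, i) being a half-open range that covers indices j through i - 1. Also, once the if condition is true we can break immediately rather than keep checking. Below are the time and space costs before and after adding break: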

Runtime: 16 ms, faster than 47.49% of C++ online submissions for Word Break.

Memory Usage: 14.1 MB, less than 52.83% of C++ online submissions for Word Break.

Runtime: 8 ms, faster than 76.77% of C++ online submissions for Word Break.

Memory Usage: 14.4 MB, less than 43.40% of C++ online submissions for Word Break.

class Solution {
public:
    bool wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict(wordDict.begin(), wordDict.end());
        vector<int> dp(s.size() + 1, 0);  // should be size + 1 to consider empty string
        dp[0] = true;  // make empty string to be true
        for (int i = 0; i < dp.size(); i++) {  // use dp.size(), not s.size()
            for (int j = 0; j < i; j++) {
                if (dp[j] && dict.count(s.substr(j, i - j))) {  // substr len is i - j, no + 1
                    dp[i] = true;
                    break;
                }
            }
        }
        
        return dp.back();
    }
};
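
The point about the substring length can be checked in isolation. The sketch below assumes a hypothetical helper `halfOpenSlice`, which is not part of the solution; it only wraps the standard `std::string::substr(pos, count)` call to show why the count is i - j.

```cpp
#include <cstddef>
#include <string>

// Returns the half-open slice s[j, i), i.e. the characters at indices j .. i - 1.
std::string halfOpenSlice(const std::string& s, std::size_t j, std::size_t i) {
    // substr takes (start, length), and the range [j, i) holds exactly i - j characters.
    return s.substr(j, i - j);
}
```

With s = "leetcode", the slice for j = 0, i = 4 is "leet" and the slice for j = 4, i = 8 is "code". The recursive solution earlier uses an inclusive end index instead, which is why its length is i - index + 1.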