Java Programming: Re-Space (LeetCode: Interview Question 17.13)

Latest recommended article published on 2022-07-02 10:00:32

信仰..

Category: LeetCode Problem-Solving Notes (Java)

Article link: https://blog.csdn.net/haut_ykc/article/details/107220039

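Oh, no! You have accidentally removed all the spaces and punctuation from a lengthy document and turned every uppercase letter into lowercase. A sentence like "I reset the computer. It still didn’t boot!" has become "iresetthecomputeritstilldidntboot". Before dealing with punctuation and capitalization, you first have to split it back into words. Of course, you have a thick dictionary, dictionary, although some words are not in it. Given the document as sentence, design an algorithm that splits the document so that the number of unrecognized characters is minimized, and return that number.

Note: this problem is slightly modified from the original; you only need to return the number of unrecognized characters.

Example:

Input:
dictionary = ["looked","just","like","her","brother"]
sentence = "jesslookedjustliketimherbrother"
Output: 7
Explanation: After splitting, the text is "jess looked just like tim her brother", with 7 unrecognized characters in total.

Constraints:

0 <= len(sentence) <= 1000
The total number of characters in dictionary does not exceed 150000.
You may assume that dictionary and sentence contain only lowercase letters.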

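Approach: dynamic programming. Define dp[i] as the minimum number of unrecognized characters among the first i characters of the sentence. The first implementation below tracks the complementary quantity: its dp[j] is the maximum number of characters among the first j that are covered by dictionary words, so the answer is n - dp[n].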
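Written out as a recurrence, the definition above can be sketched as follows. The symbol D for the set of dictionary words and the half-open substring notation sentence[j..i) are my own shorthand, and an inner minimum over no matching j is read as +∞:

```latex
dp[0] = 0, \qquad
dp[i] = \min\Bigl( dp[i-1] + 1,\;
        \min_{\substack{0 \le j < i \\ \text{sentence}[j..i) \in D}} dp[j] \Bigr),
\quad 1 \le i \le n .
```

The first term leaves character i unrecognized; the second ends a dictionary word exactly at position i. The answer is dp[n].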

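import java.util.HashMap;
import java.util.Map;

class Solution {
    public int respace(String[] dictionary, String sentence) {

        int n = sentence.length();
        // dp[j]: maximum number of characters among the first j covered by dictionary words
        int[] dp = new int[n + 1];
        // s[i][j]: the substring sentence[i..j] (inclusive); O(n^2) strings, so memory-hungry
        String[][] s = new String[n][n];
        Map<String, Boolean> map = new HashMap<>();

        for (String word : dictionary)
            map.put(word, true);

        // precompute every substring, growing each one character at a time
        for (int i = 0; i < n; i++) {
            StringBuilder tmp = new StringBuilder();
            for (int j = i; j < n; j++) {
                tmp.append(sentence.charAt(j));
                s[i][j] = tmp.toString();
            }
        }

        // i is the 1-based start of a candidate word, j its 1-based end
        for (int i = 1; i <= n; i++) {
            for (int j = i; j <= n; j++) {
                // leaving character j unrecognized keeps the covered count of the prefix
                dp[j] = Math.max(dp[j], dp[j - 1]);
                if (map.containsKey(s[i - 1][j - 1])) {
                    int len = j - i + 1;
                    dp[j] = Math.max(dp[j], dp[j - len] + len);
                }
            }
        }

        // unrecognized characters = total length - covered characters
        return n - dp[n];

    }
}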
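For comparison, here is a minimal sketch of the same hash-based idea written directly in the "minimum unrecognized characters" form, without the n×n substring table. The class name RespaceSketch is hypothetical, and this is a compact variant for illustration, not the code above:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RespaceSketch {
    // Returns the minimum number of unrecognized characters in sentence.
    static int respace(String[] dictionary, String sentence) {
        Set<String> words = new HashSet<>(Arrays.asList(dictionary));
        int n = sentence.length();
        int[] dp = new int[n + 1]; // dp[i]: min unrecognized chars among the first i chars
        for (int i = 1; i <= n; i++) {
            dp[i] = dp[i - 1] + 1; // leave character i unrecognized
            for (int j = 0; j < i; j++) {
                // a dictionary word occupying sentence[j..i) adds no unrecognized chars
                if (words.contains(sentence.substring(j, i))) {
                    dp[i] = Math.min(dp[i], dp[j]);
                }
            }
        }
        return dp[n];
    }

    public static void main(String[] args) {
        String[] dictionary = {"looked", "just", "like", "her", "brother"};
        System.out.println(respace(dictionary, "jesslookedjustliketimherbrother")); // prints 7
    }
}
```

Each substring call still copies up to i characters, so this remains cubic in the worst case; it only trades the precomputed table for on-demand substrings.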

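Further optimization: in my previous method, a HashMap lookup is not really constant time, because hashing a candidate substring costs time proportional to its length, and that cost is paid for each of the O(n^2) substrings. So we need to speed up the word-lookup step, and a trie is a natural choice. The difference in running time turned out to be quite large.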

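class Solution {

    static class Trie {

        Trie[] next;   // one child slot per lowercase letter
        boolean isEnd; // true if a (reversed) dictionary word ends at this node

        public Trie() {
            next = new Trie[26];
            isEnd = false;
        }

        // insert the word back to front, so lookups can scan the sentence right to left
        public void insert(String s) {
            Trie root = this;

            for (int i = s.length() - 1; i >= 0; i--) {
                int t = s.charAt(i) - 'a';
                if (root.next[t] == null)
                    root.next[t] = new Trie();
                root = root.next[t];
            }
            root.isEnd = true;
        }
    }

    public int respace(String[] dictionary, String sentence) {

        Trie root = new Trie();
        int n = sentence.length();
        // dp[i]: minimum number of unrecognized characters among the first i characters
        int[] dp = new int[n + 1];

        for (String word : dictionary)
            root.insert(word);

        dp[0] = 0;
        for (int i = 1; i <= n; i++) {
            dp[i] = dp[i - 1] + 1; // leave character i unrecognized
            Trie cur = root;
            // walk left from position i, following the reversed words in the trie
            for (int j = i; j >= 1; j--) {
                int t = sentence.charAt(j - 1) - 'a';
                if (cur.next[t] == null)
                    break; // no dictionary word ends at i and extends this far left
                else if (cur.next[t].isEnd)
                    dp[i] = Math.min(dp[i], dp[j - 1]); // sentence[j-1..i-1] is a word
                if (dp[i] == 0)
                    break; // cannot do better than zero
                cur = cur.next[t];
            }
        }

        return dp[n];

    }
}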
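To see why the words are inserted back to front, the following standalone sketch lists every dictionary word that ends at a given position of a text by walking leftwards through a reversed trie. The class name ReverseTrieDemo and the method wordLengthsEndingAt are hypothetical names introduced only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

class ReverseTrieDemo {
    private final ReverseTrieDemo[] next = new ReverseTrieDemo[26];
    private boolean isEnd;

    // Insert the word back to front, so a walk from the root spells it in reverse.
    void insert(String word) {
        ReverseTrieDemo node = this;
        for (int i = word.length() - 1; i >= 0; i--) {
            int c = word.charAt(i) - 'a';
            if (node.next[c] == null) node.next[c] = new ReverseTrieDemo();
            node = node.next[c];
        }
        node.isEnd = true;
    }

    // Lengths of all inserted words that end exactly at index `end` (exclusive) of text.
    List<Integer> wordLengthsEndingAt(String text, int end) {
        List<Integer> lengths = new ArrayList<>();
        ReverseTrieDemo node = this;
        for (int j = end - 1; j >= 0; j--) {
            node = node.next[text.charAt(j) - 'a'];
            if (node == null) break;               // no word extends this far left
            if (node.isEnd) lengths.add(end - j);  // text[j..end) is a word
        }
        return lengths;
    }

    public static void main(String[] args) {
        ReverseTrieDemo root = new ReverseTrieDemo();
        root.insert("looked");
        root.insert("ked");
        System.out.println(root.wordLengthsEndingAt("jesslooked", 10)); // prints [3, 6]
    }
}
```

A single leftward walk finds all matching words at once and stops as soon as no word can match, which is what lets the DP avoid building any substrings.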

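The trie method improves the time complexity further, but its space complexity is still high. This is mainly due to the tree structure: every new node carries an array of 26 child slots, so the total space complexity becomes O(|dictionary| * 26 + n), where |dictionary| is the total number of characters in the dictionary. To optimize further, we need something that offers the trie's fast lookups without that overhead. String hashing is a technique we often reach for in substring problems, so we borrow the idea of the Rabin-Karp algorithm to solve this one. If you are not familiar with the algorithm, a quick search will explain it; it is not difficult. Keep in mind that matching on hash values alone can, in principle, report a false match when two different strings collide.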

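import java.util.HashSet;
import java.util.Set;

class Solution {

    static final long P = Integer.MAX_VALUE; // modulus: 2^31 - 1, a prime
    static final long BASE = 41; // radix of the polynomial hash

    public int respace(String[] dictionary, String sentence) {

        // hashes of all dictionary words; matching on hashes assumes no collisions
        Set<Long> hashValues = new HashSet<>();

        for (String word : dictionary)
            hashValues.add(getHash(word));

        int n = sentence.length();
        // dp[i]: minimum number of unrecognized characters among the first i characters
        int[] dp = new int[n + 1];

        dp[0] = 0;
        for (int i = 1; i <= n; i++) {
            dp[i] = dp[i - 1] + 1; // leave character i unrecognized
            long hashValue = 0;
            // extend the candidate word sentence[j-1..i-1] one character to the left per step
            for (int j = i; j >= 1; j--) {
                int t = sentence.charAt(j - 1) - 'a' + 1;
                hashValue = (hashValue * BASE + t) % P;
                if (hashValues.contains(hashValue))
                    dp[i] = Math.min(dp[i], dp[j - 1]);
            }
        }

        return dp[n];

    }

    // hashes s from its last character to its first, matching the scan order in respace
    private long getHash(String s) {
        long hashValue = 0;
        for (int i = s.length() - 1; i >= 0; i--)
            hashValue = (hashValue * BASE + (s.charAt(i) - 'a' + 1)) % P;
        return hashValue;
    }
}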
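Because a hash match is only probabilistic evidence, one possible safeguard is to confirm each hash hit against the actual word before using it. The sketch below is my own variant under that assumption, not the method above: the class name VerifiedHashRespace is hypothetical, and it keeps the dictionary words in a second set, which gives back some of the memory savings in exchange for exact answers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class VerifiedHashRespace {
    static final long P = Integer.MAX_VALUE; // 2^31 - 1, a prime modulus
    static final long BASE = 41;

    // Polynomial hash of s, reading characters from last to first.
    static long hash(String s) {
        long h = 0;
        for (int i = s.length() - 1; i >= 0; i--) {
            h = (h * BASE + (s.charAt(i) - 'a' + 1)) % P;
        }
        return h;
    }

    static int respace(String[] dictionary, String sentence) {
        Set<String> words = new HashSet<>(Arrays.asList(dictionary));
        Set<Long> hashes = new HashSet<>();
        for (String w : words) hashes.add(hash(w));

        int n = sentence.length();
        int[] dp = new int[n + 1]; // dp[i]: min unrecognized chars among the first i chars
        for (int i = 1; i <= n; i++) {
            dp[i] = dp[i - 1] + 1;
            long h = 0;
            for (int j = i; j >= 1; j--) {
                h = (h * BASE + (sentence.charAt(j - 1) - 'a' + 1)) % P;
                // the cheap hash test filters most candidates; the substring is
                // built and compared only when the hash already matches
                if (hashes.contains(h) && words.contains(sentence.substring(j - 1, i))) {
                    dp[i] = Math.min(dp[i], dp[j - 1]);
                }
            }
        }
        return dp[n];
    }

    public static void main(String[] args) {
        String[] dictionary = {"looked", "just", "like", "her", "brother"};
        System.out.println(respace(dictionary, "jesslookedjustliketimherbrother")); // prints 7
    }
}
```

Whether the extra check is worth it depends on how much a rare wrong answer matters; for a judged submission the unverified version is usually accepted in practice, but that is not a guarantee.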
