Trie Applications and Replacing the Trie with a Hash Table

Latest recommended article published on 2023-03-20 16:07:43

weixin_30765505

Tags: Data Structures and Algorithms

Original link: http://www.cnblogs.com/echie/p/9598569.html

[LeetCode 30] Substring with Concatenation of All Words

Problem

You are given a string s, and a list of words words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

Test cases

Input:
  s = "barfoothefoobarman",
  words = ["foo","bar"]
Output: [0,9]
Explanation: Substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.
The output order does not matter, returning [9,0] is fine too.

Input:
  s = "wordgoodstudentgoodword",
  words = ["word","student"]
Output: []

Approach

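Let num be the number of words in the dictionary, strlength the length of each word, and length the length of the string. For convenience, an entry of the dictionary is called a word, and a substring of s of length strlength is called a str.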

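First, put every word of the dictionary into a trie or a hash table. Both look up a word in time proportional to its length (expected time, in the case of the hash table); the advantage of the hash table is that it is simple to build, whereas building a trie takes considerably more code. The key is the word itself and the value is the word's index in the dictionary. When the same word occurs more than once, only the first occurrence is inserted; later occurrences are not inserted and do not change the value.
During insertion, an array records how many times each word occurs in the dictionary. The count is stored in record[value], where value is the word's value in the hash table.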
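
As a minimal sketch of the two steps above, the following builds the hash table and the record array in one pass. The class name DictionaryIndex and the method name buildDict are made up for this illustration, and using Map.putIfAbsent is a choice of this sketch, not something the article prescribes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the names are illustrative only.
class DictionaryIndex {
    // Maps each distinct word to the index of its FIRST occurrence in words,
    // and counts occurrences of that word in record[thatIndex].
    static Map<String, Integer> buildDict(String[] words, int[] record) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < words.length; i++) {
            // putIfAbsent returns null if the word was new, else the stored index
            Integer first = dict.putIfAbsent(words[i], i);
            record[first == null ? i : first]++;
        }
        return dict;
    }

    public static void main(String[] args) {
        String[] words = {"foo", "bar", "foo"};
        int[] record = new int[words.length];
        Map<String, Integer> dict = buildDict(words, record);
        System.out.println(dict.get("foo") + " " + dict.get("bar")); // 0 1
        System.out.println(Arrays.toString(record));                 // [2, 1, 0]
    }
}
```

Note that record slots belonging to later duplicates (index 2 here) stay at 0 and are never looked up, because the hash table only ever returns first-occurrence indices.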
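Because all words in the dictionary have the same length, the string can be split into strlength sequences. The i-th sequence contains k strs, where k is the largest integer that satisfies the constraint below:
\[ s[i,\, i + strlength),\ \dots,\ s[i + (k - 1)\cdot strlength,\, i + k\cdot strlength) \quad \text{s.t.}\ 0 \leq i \lt strlength,\ i + k\cdot strlength \leq length \]
Here s[i, j) denotes the substring s.substring(i, j).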
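
To make the decomposition concrete, here is a small sketch that lists the strs of every sequence for a given string. The class OffsetSequences and the method split are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the names are illustrative only.
class OffsetSequences {
    // Returns one list per start offset 0..strlength-1; each list holds the
    // consecutive, non-overlapping substrings of length strlength from that offset.
    static List<List<String>> split(String s, int strlength) {
        List<List<String>> sequences = new ArrayList<>();
        for (int start = 0; start < strlength && start < s.length(); start++) {
            List<String> seq = new ArrayList<>();
            for (int i = start; i + strlength <= s.length(); i += strlength) {
                seq.add(s.substring(i, i + strlength));
            }
            sequences.add(seq);
        }
        return sequences;
    }

    public static void main(String[] args) {
        for (List<String> seq : split("barfoothefoobarman", 3)) {
            System.out.println(seq);
        }
        // [bar, foo, the, foo, bar, man]
        // [arf, oot, hef, oob, arm]
        // [rfo, oth, efo, oba, rma]
    }
}
```

Every candidate concatenation starts at some offset modulo strlength, so it lies entirely inside exactly one of these sequences.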
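Finding all concatenations in s therefore reduces to finding the concatenations in each sequence separately. Within a sequence, this means deciding whether there is a run of consecutive strs whose kinds and counts match those of the words in the dictionary exactly.
By querying the hash table, each str in a sequence can be converted to its index in the dictionary, because different words in the dictionary have different values. A str that is not in the dictionary has no index (the lookup returns null); index 0 cannot serve as the "absent" marker, since it is the value of the first word.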
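
The conversion can be sketched as follows. This is an illustration under assumptions: the names IndexMapping, toIndices and ABSENT are hypothetical, and the sentinel -1 for "not in the dictionary" is a choice of this sketch (the article's code checks for null instead).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the names and the -1 sentinel are illustrative only.
class IndexMapping {
    static final int ABSENT = -1; // assumed marker for a str that is not a word

    // Converts the sequence starting at offset start into dictionary indices.
    static int[] toIndices(String s, int start, int strlength, Map<String, Integer> dict) {
        int k = Math.max(0, (s.length() - start) / strlength);
        int[] indices = new int[k];
        for (int j = 0; j < k; j++) {
            int from = start + j * strlength;
            indices[j] = dict.getOrDefault(s.substring(from, from + strlength), ABSENT);
        }
        return indices;
    }

    public static void main(String[] args) {
        Map<String, Integer> dict = new HashMap<>();
        dict.put("foo", 0);
        dict.put("bar", 1);
        int[] indices = toIndices("barfoothefoobarman", 0, 3, dict);
        System.out.println(Arrays.toString(indices)); // [1, 0, -1, 0, 1, -1]
    }
}
```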
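The problem thus becomes: given an integer sequence s and an integer array dict, find all contiguous subsequences of s whose elements match the elements of dict in kind and count, in any order. Using the record array, this can be solved in linear time. The details are omitted here.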
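
One possible way to fill in the omitted details is a two-pointer sliding window over the integer sequence. This is a sketch, not the article's queue-based code: the names WindowMatch and findWindows are hypothetical, absent strs are assumed to be encoded as a negative value, and every non-negative entry of seq is assumed to be a first-occurrence index with record[index] >= 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a two-pointer variant, not the article's queue-based code.
class WindowMatch {
    // seq: dictionary indices of the strs in one sequence, negative = not a word.
    // record: record[v] = how many times the word with value v occurs in the dictionary.
    // Returns every position p such that seq[p .. p+num) uses each word exactly
    // as often as the dictionary does (num = total number of words).
    static List<Integer> findWindows(int[] seq, int[] record) {
        int num = 0;
        for (int c : record) num += c;
        int[] remaining = record.clone(); // counts still available to the window
        List<Integer> starts = new ArrayList<>();
        int left = 0; // the window is seq[left .. right]
        for (int right = 0; right < seq.length; right++) {
            int idx = seq[right];
            if (idx < 0) {
                // No word matches: give back everything in the window and restart after it.
                while (left < right) remaining[seq[left++]]++;
                left = right + 1;
                continue;
            }
            // Word already used up: drop elements from the left until one copy is free.
            while (remaining[idx] == 0) remaining[seq[left++]]++;
            remaining[idx]--;
            if (right - left + 1 == num) starts.add(left);
        }
        return starts;
    }

    public static void main(String[] args) {
        int[] seq = {1, 0, -1, 0, 1, -1}; // "bar foo the foo bar man" with foo=0, bar=1
        System.out.println(findWindows(seq, new int[]{1, 1})); // [0, 3]
    }
}
```

Each element enters and leaves the window at most once, so the scan is linear in the length of the sequence. A window position p in the sequence with offset start corresponds to index start + p * strlength in the original string, so [0, 3] above maps to the expected answer [0, 9].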

The code is as follows

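import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        int num = words.length, length = s.length();
        List<Integer> res = new LinkedList<>();
        if(num == 0 || length == 0){
            return res;
        }
        int strlen = words[0].length();
        // record[v]: remaining count of the word whose value is v. It starts at the
        // number of occurrences of that word and is decremented on each match.
        int[] record = new int[num];
        HashMap<String, Integer> dict = new HashMap<>(num * 4/3 + 1);
        // Put every word into the hash table
        for(int i = 0; i < num; i++){
            Integer temp;
            // Word not seen before: insert it with its index as the value and update record[i]
            if((temp = dict.get(words[i])) == null){
                dict.put(words[i], i);
                record[i]++;
            }
            // Word already present: take its value from the hash table and update record[value]
            else{
                record[temp]++;
            }
        }
        // Number of matched words currently held in the queue
        int count = 0;

        // The queue stores the indices of the words matched so far. When the current str
        // cannot extend the match, earlier records are cleared by dequeuing.
        Queue<Integer> queue = new LinkedList<>();
        // Consider each of the strlen sequences
        for(int start = 0; start < strlen && start < length; start++){
            // At the start of each pass the queue must be empty, count must be 0,
            // and record must hold the initial counts again
            while(count > 0){
                count--;
                record[queue.poll()]++;
            }
            // Check whether the str starting at i matches some word
            for(int i = start; i + strlen <= length; i += strlen){
                // Look up the value of the str
                Integer index = dict.get(s.substring(i, i + strlen));
                // No word matches: reset record, queue and count
                if(index == null){
                    while(count > 0){
                        count--;
                        record[queue.poll()]++;
                    }
                }
                // All copies of the matching word are already used: dequeue up to and
                // including its earliest occurrence, restoring the counts of the words before it.
                else if(record[index] == 0){
                    int temp;
                    while((temp = queue.poll()) != index){
                        count--;
                        record[temp]++;
                    }
                    // The dequeued occurrence is replaced by the current one,
                    // so count and record[index] stay unchanged
                    queue.offer(index);
                    if(count == num){
                        res.add(i + strlen - num * strlen);
                    }
                }
                // Found a valid match
                else{
                    if(++count == num){
                        res.add(i + strlen - num * strlen);
                    }
                    record[index]--;
                    queue.offer(index);
                }
            }
        }
        return res;
    }
}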

Reposted from: https://www.cnblogs.com/echie/p/9598569.html
