395. Longest Substring with At Least K Repeating Characters

chiping6006

Original link: https://my.oschina.net/cofama/blog/873084

Original problem:

Find the length of the longest substring T of a given string (consists of lowercase letters only) such that every character in T appears no less than k times.

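This problem can be solved with divide and conquer, implemented recursively. First, scan the whole string once and count how many times each character occurs. Then scan it a second time to locate every character that occurs fewer than k times. Such a character cannot appear in the substring we are looking for, so its positions serve as split points, and each piece produced by the split is processed recursively.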
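The two scans described above can be sketched in isolation. This is only an illustration: `findSplitPoints` is a hypothetical helper name that does not appear in the solution below, and it assumes the input contains lowercase letters only, as the problem states.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solution): returns the indices
// of characters that occur fewer than k times in s. These are the split points.
std::vector<int> findSplitPoints(const std::string& s, int k) {
    int count[26] = {0};
    // First pass: frequency of each lowercase letter in the whole string.
    for (char c : s) count[c - 'a']++;

    std::vector<int> splits;
    // Second pass: a letter with fewer than k occurrences can never be part
    // of a valid substring, so its position is recorded as a split point.
    for (int i = 0; i < static_cast<int>(s.size()); i++) {
        if (count[s[i] - 'a'] < k) splits.push_back(i);
    }
    return splits;
}
```

For "ababbc" with k = 2, the only split point is index 5 (the single 'c'), which leaves "ababb" as the piece to recurse on.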

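How the pieces are split: first and last track the start and the (exclusive) end of the current piece, and both initially point to the first position of the string. If the current character occurs at least k times, last moves to the position just after it. If the current character occurs fewer than k times, we have hit a split point. At that moment, check whether the piece marked by first and last has length at least k: if so, process it recursively; otherwise it cannot contain any character repeated k times, so skip it. The result returned by the recursion is compared with the current maximum length, which is updated if needed. Whether or not the piece was recursed on, first always moves to the position just after the split point, marking that the previous piece is finished and a new one begins.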

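In my algorithm, recursion on a piece can only be triggered when a character occurring fewer than k times is encountered. So there is one case to watch for: if the last character scanned occurs at least k times, the piece that starts at first and runs to the end of the string is missed. After the loop, an extra check is therefore needed: if last-first>=k, this piece must be recursed on as well. There is an even more special case within this one: if first never moved during the whole scan, the entire string already satisfies the requirement, and its length is the answer directly.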

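#include <algorithm>
#include <string>
using namespace std;

class Solution {
public:
    int longestSubstring(string s, int k) {
        return helper(s, 0, s.size(), k);
    }

    // Returns the answer for the half-open range s[start, end).
    int helper(string &s, int start, int end, int k) {
        if(end-start<k) return 0;  // too short to hold any character k times
        int count[26] = {0};
        int i;
        int first=start, last=start;  // current candidate piece is s[first, last)
        int maxlen = 0;
        for(i=start; i<end; i++) count[s[i]-'a']++;
        for(i=start; i<end; i++) {
            if(count[s[i]-'a']<k) {
                // split point: recurse on the finished piece if it is long enough
                if(last-first>=k) maxlen = max(maxlen, helper(s,first,last,k));
                first = i+1;
            }
            else {
                last = i+1;
            }
        }
        // trailing piece that was not closed by a split point
        if(last-first>=k) {
            if(first==start) maxlen = last-first;  // no split point: whole range is valid
            else maxlen = max(maxlen, helper(s,first,last,k));
        }
        return maxlen;
    }
};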
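To sanity-check a solution like the one above on small inputs, a brute-force reference can be handy. The sketch below is not the author's method; `longestSubstringBruteForce` is a hypothetical name, and the code simply tries every substring, so it is only suitable for short test strings of lowercase letters.

```cpp
#include <algorithm>
#include <string>

// Brute-force reference (an assumption for testing, not the author's method):
// tries every start index and extends the substring one character at a time.
int longestSubstringBruteForce(const std::string& s, int k) {
    int n = static_cast<int>(s.size());
    int best = 0;
    for (int i = 0; i < n; i++) {
        int count[26] = {0};  // letter counts for the substring s[i..j]
        for (int j = i; j < n; j++) {
            count[s[j] - 'a']++;
            // The substring is valid if every letter present occurs >= k times.
            bool ok = true;
            for (int c = 0; c < 26; c++) {
                if (count[c] > 0 && count[c] < k) { ok = false; break; }
            }
            if (ok) best = std::max(best, j - i + 1);
        }
    }
    return best;
}
```

On "aaabb" with k = 3 this returns 3 (the substring "aaa"), and on "ababbc" with k = 2 it returns 5 (the substring "ababb"), which are the values the recursive solution should reproduce.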

Second approach (I did not write this one, but it offers a different line of thinking): two pointers

public int longestSubstring(String s, int k) {
    int d = 0;
    
    for (int numUniqueTarget = 1; numUniqueTarget <= 26; numUniqueTarget++)
        d = Math.max(d, longestSubstringWithNUniqueChars(s, k, numUniqueTarget));
    
    return d;
}

private int longestSubstringWithNUniqueChars(String s, int k, int numUniqueTarget) {
    int[] map = new int[128];
    int numUnique = 0; // counter 1
    int numNoLessThanK = 0; // counter 2
    int begin = 0, end = 0;
    int d = 0;
    
    while (end < s.length()) {
        if (map[s.charAt(end)]++ == 0) numUnique++; // increment map[c] after this statement
        if (map[s.charAt(end++)] == k) numNoLessThanK++; // inc end after this statement
        
        while (numUnique > numUniqueTarget) {
            if (map[s.charAt(begin)]-- == k) numNoLessThanK--; // decrement map[c] after this statement
            if (map[s.charAt(begin++)] == 0) numUnique--; // inc begin after this statement
        }
        
        // if we found a string where the number of unique chars equals our target
        // and all those chars are repeated at least K times then update max
        if (numUnique == numUniqueTarget && numUnique == numNoLessThanK)
            d = Math.max(end - begin, d);
    }
    
    return d;
}

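The so-called two-pointer approach also uses two variables to maintain the start and end of a substring. Its idea differs from mine, though: it does not split the string. Since we do not know how many distinct characters the target substring contains (anywhere from 1 to 26), we can iterate over every possible number of distinct characters. Each pass scans the string from start to end while counting the occurrences of each character; note that this count is the number of occurrences within the current substring, not within the whole string. numUnique records the number of distinct characters in the current substring, and numNoLessThanK records how many of those distinct characters occur at least k times.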

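For example, in the pass where the number of distinct characters is fixed at 2, begin and end mark the start and end of the substring, and the substring must never contain more than two distinct characters. If the current substring is aabbb and the next character is c, adding c would exceed the limit, so begin has to move forward until both a's are dropped and only bbb remains alongside c. (In the code, c is counted first and begin then advances until the window is bbbc.) Meanwhile, whenever the substring has exactly 2 distinct characters and every one of them occurs at least k times (numUnique==numNoLessThanK), it is a candidate for the substring we are looking for, so its length is recorded and the maximum is updated.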
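For readers following the first solution's language, the window logic can also be sketched in C++. This is a port written for illustration, not the original author's code: `longestWithNUnique` and `longestSubstringTwoPointer` are hypothetical names, the variable names mirror the Java version, and a 26-entry array is used on the assumption that the input is lowercase only.

```cpp
#include <algorithm>
#include <string>

// Longest window with exactly numUniqueTarget distinct letters,
// each occurring at least k times (illustrative C++ port).
int longestWithNUnique(const std::string& s, int k, int numUniqueTarget) {
    int count[26] = {0};      // occurrences inside the current window only
    int numUnique = 0;        // distinct letters in the window
    int numNoLessThanK = 0;   // distinct letters with count >= k in the window
    int begin = 0, best = 0;
    for (int end = 0; end < static_cast<int>(s.size()); end++) {
        int c = s[end] - 'a';
        if (count[c]++ == 0) numUnique++;        // letter enters the window
        if (count[c] == k) numNoLessThanK++;     // letter just reached k
        // Shrink from the left while there are too many distinct letters.
        while (numUnique > numUniqueTarget) {
            int b = s[begin++] - 'a';
            if (count[b]-- == k) numNoLessThanK--;  // letter drops below k
            if (count[b] == 0) numUnique--;         // letter leaves the window
        }
        // Window is s[begin..end], inclusive on both sides here.
        if (numUnique == numUniqueTarget && numUnique == numNoLessThanK)
            best = std::max(best, end - begin + 1);
    }
    return best;
}

// Tries every possible number of distinct letters, 1 through 26.
int longestSubstringTwoPointer(const std::string& s, int k) {
    int best = 0;
    for (int target = 1; target <= 26; target++)
        best = std::max(best, longestWithNUnique(s, k, target));
    return best;
}
```

Running the example from the text, `longestWithNUnique("aabbbc", 2, 2)` records the window aabbb (length 5) before c arrives, then shrinks to bbbc, which is not counted because c occurs only once.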

Reposted from: https://my.oschina.net/cofama/blog/873084
