This post is one of the code examples in a series of Python algorithm problems.
Problem 76: Minimum Window Substring
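Description: Given a string s and a string t, return the shortest substring of s that covers every character of t. If s contains no substring that covers every character of t, return the empty string "".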
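Note:
- For characters that repeat in t, the substring we are looking for must contain at least as many of that character as t does.
- If such a substring exists in s, we guarantee that it is the unique answer.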
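Example 1:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" contains 'A', 'B' and 'C' from string t.
Example 2:
Input: s = "a", t = "a"
Output: "a"
Explanation: The entire string s is the minimum window substring.
Example 3:
Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a' characters in t must be contained in the substring of s, so no substring qualifies and the empty string is returned.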
Constraints:
- m == s.length
- n == t.length
- 1 <= m, n <= 10^5
- s and t consist of English letters
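- Problem analysis
  - The problem asks for a contiguous substring of s; t is treated as a set of characters (with multiplicity) that the substring must contain.
  - There are three main computations: (1) traversing the substrings of s, (2) comparing a substring against the character set of t, and (3) comparing substring lengths.
  - The basic traversal is a double loop: starting from each element, check the substrings that begin at that element for coverage of t. The baseline time complexity is therefore O(n^2). This still has to be multiplied by the number of elements in t's character set, so in the limit it is O(n^3).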
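- Optimization ideas
  - There are two directions: simplify the comparison between a substring and the character set of t, and reduce the number of loop iterations spent comparing substring lengths.
  - Because t is fixed, the comparison between a substring and t can be broken down into comparing the count of each character of t.
  - The substring length comparison can be done with a sliding window (two pointers).
  - It follows from the derivation that moving the smallest window that satisfies the condition from left to right is enough to find the minimum window substring.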
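The analysis above can be sketched as a small reference implementation: a double loop over start and end positions, with the "substring vs. t" comparison reduced to per-character counts. The names `covers` and `min_window_bruteforce` are illustrative assumptions and are not part of the solutions in this post. It is far too slow for the large test case, but it is handy for checking the faster versions on short inputs.

```python
from collections import Counter


def covers(window_counts, need):
    """True if the window holds at least need[c] copies of every character c of t."""
    return all(window_counts.get(c, 0) >= k for c, k in need.items())


def min_window_bruteforce(s, t):
    """Reference solution: for every start, extend right until t is covered."""
    need = Counter(t)  # per-character counts of t, computed once
    best = ""
    for left in range(len(s)):
        window_counts = {}
        for right in range(left, len(s)):
            ch = s[right]
            window_counts[ch] = window_counts.get(ch, 0) + 1
            if covers(window_counts, need):
                if best == "" or right - left + 1 < len(best):
                    best = s[left:right + 1]
                break  # extending further from this start only gets longer
    return best


print(min_window_bruteforce("ADOBECODEBANC", "ABC"))  # BANC
```

Each call to `covers` costs one pass over the distinct characters of t, which is where the extra factor on top of O(n^2) comes from.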
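- CheckFuncPerf is a module I wrote to measure a function's running time and memory usage. It is available here: CheckFuncPerf.py, a code unit for measuring function running time and memory usage, with usage instructions
- The very long test string file comes from the official site and has been uploaded to CSDN. It is available here: LeetCode: Minimum Window Substring test case, a 100,000-character string with a 10,000-character substring (expected to pass review around January 31)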
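For readers without CheckFuncPerf at hand, the sketch below is a rough stand-in built only on the standard library (`time.perf_counter` and `tracemalloc`). It is not the CheckFuncPerf implementation. The function name `measure` and the message format are assumptions, and only the returned dictionary keys `msg` and `result` mirror how `cfp.getTimeMemoryStr` is used in this post.

```python
import time
import tracemalloc


def measure(func, *args):
    """Hypothetical stand-in: run func(*args), report wall time and peak traced memory."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    msg = "{} took {:.2f} ms; peak traced memory {:.2f} KB".format(
        func.__name__, elapsed_ms, peak / 1024)
    return {"msg": msg, "result": result}


report = measure(sorted, [3, 1, 2])
print(report["msg"], "result =", report["result"])
```

Tracing allocations slows the measured function down, so timings taken this way are likely to be higher than an untraced run and are not directly comparable with the figures quoted in this post.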
- Standard solution with one level of optimization: it fell just before dawn and failed with a timeout

    import CheckFuncPerf as cfp

    def minWindow(s, t):
        dict_t, dict_s = {}, {}
        list_buffer = []
        # Count how many of each character t requires
        for achar in t:
            dict_t[achar] = dict_t.get(achar, 0) + 1
        ineedMeet = len(dict_t.keys())
        imeet, iright = 0, 0
        # Keep only the positions in s whose character appears in t
        for iIdx in range(len(s)):
            if s[iIdx] in dict_t:
                list_buffer.append(iIdx)
                dict_s[s[iIdx]] = dict_s.get(s[iIdx], [])
                dict_s[s[iIdx]].append(iIdx)
        if len(dict_s.values()) < len(dict_t.values()):
            return ""
        # Make sure s as a whole can cover t at all
        dict_check = {}
        for key, value in dict_t.items():
            dict_check[key] = len(dict_s[key])
            if dict_check[key] >= value:
                imeet += 1
        if imeet < ineedMeet:
            return ""
        iminlen = len(s)
        bcanleft = True
        minleft, minright = 0, 0
        while bcanleft:
            lcharidx = list_buffer[0]
            ileft = lcharidx
            # For each left boundary, shrink a copy of the window from the right
            tmpdict = dict_check.copy()
            tmplist_buffer = list_buffer.copy()
            bcanright = True
            while bcanright:
                rcharidx = tmplist_buffer[-1]
                iright = rcharidx
                if tmpdict[s[rcharidx]] == dict_t[s[rcharidx]]:
                    bcanright = False
                else:
                    tmpdict[s[rcharidx]] -= 1
                    tmplist_buffer.pop(-1)
            if iminlen > iright - ileft:
                iminlen = iright - ileft + 1
                minleft = ileft
                minright = iright
            # Stop once dropping the left character would break coverage
            if dict_check[s[lcharidx]] == dict_t[s[lcharidx]]:
                bcanleft = False
            else:
                list_buffer.pop(0)
                dict_check[s[lcharidx]] -= 1
        return s[minleft:minright+1]

    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow, s, t)
    # Prints the measurement message and the length of the returned substring
    print(result['msg'], '执行结果 = {}'.format(len(result['result'])))

    # Run result (running time 1233597.94 ms, memory usage 176.00 KB, result length 10742)
    # 函数 minWindow 的运行时间为 1233597.94 ms;内存使用量为 176.00 KB 执行结果 = 10742
- Optimized version [filter to the characters of t + sliding window]: so-so, beats 64%

    def minWindow_ext1(s, t):
        dict_t, dict_s = {}, {}
        list_buffer = []
        # Count how many of each character t requires
        for achar in t:
            dict_t[achar] = dict_t.get(achar, 0) + 1
        ineedMeet = len(dict_t.keys())
        imeet, iright = 0, 0
        # Keep only the positions in s whose character appears in t
        for iIdx in range(len(s)):
            if s[iIdx] in dict_t:
                list_buffer.append(iIdx)
                dict_s[s[iIdx]] = dict_s.get(s[iIdx], [])
                dict_s[s[iIdx]].append(iIdx)
        if len(dict_s.values()) < len(dict_t.values()):
            return ""
        # Make sure s as a whole can cover t at all
        dict_check = {}
        for key, value in dict_t.items():
            dict_check[key] = len(dict_s[key])
            if dict_check[key] >= value:
                imeet += 1
        if imeet < ineedMeet:
            return ""
        # Here iminlen holds (right - left), i.e. the window length minus one
        iminlen, imaxright = len(s), list_buffer[-1]
        minleft, minright, imeet, ilistpos, ileftpos = 0, 0, 0, 0, 0
        ileft, iright = list_buffer[0], list_buffer[0]
        dict_check = {}
        # Slide the window over the filtered positions only
        while ilistpos < len(list_buffer):
            iright = list_buffer[ilistpos]
            acharidx = list_buffer[ilistpos]
            dict_check[s[acharidx]] = dict_check.get(s[acharidx], 0) + 1
            if dict_check[s[iright]] == dict_t[s[iright]]:
                imeet += 1
            # While the window covers t, record it and shrink from the left
            while imeet == ineedMeet:
                if iminlen > iright - ileft:
                    iminlen = iright - ileft
                    minleft = ileft
                    minright = iright
                dict_check[s[ileft]] -= 1
                if dict_check[s[ileft]] < dict_t[s[ileft]]:
                    imeet -= 1
                ileftpos += 1
                # Guard against indexing past the end of list_buffer,
                # which otherwise raises IndexError (e.g. s = "a", t = "a")
                if ileftpos < len(list_buffer):
                    ileft = list_buffer[ileftpos]
            ilistpos += 1
        return s[minleft:minright + 1]

    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow_ext1, s, t)
    # Prints the measurement message and the length of the returned substring
    print(result['msg'], '执行结果 = {}'.format(len(result['result'])))

    # Run result (running time 84.02 ms, memory usage 1036.00 KB, result length 10742)
    # 函数 minWindow_ext1 的运行时间为 84.02 ms;内存使用量为 1036.00 KB 执行结果 = 10742
- Enhanced version [sliding window + dictionary-based count comparison]: somewhat better, beats 77%

    def minWindow_ext2(s, t):
        # Count how many of each character t requires
        dict_t = {}
        for tchar in t:
            dict_t[tchar] = dict_t.get(tchar, 0) + 1
        dict_window = {}
        imeet = 0                  # number of distinct characters currently satisfied
        ineedmeet = len(dict_t)    # number of distinct characters that must be satisfied
        ileft, iright, istartpos = 0, 0, 0
        iminlen = len(s)+1         # sentinel: no valid window found yet
        while iright < len(s):
            # Expand the window to the right
            achar = s[iright]
            if achar in dict_t:
                dict_window[achar] = dict_window.get(achar, 0) + 1
                if dict_window[achar] == dict_t[achar]:
                    imeet += 1
            iright += 1
            # While the window s[ileft:iright] covers t, record it and shrink from the left
            while imeet == ineedmeet:
                if iright - ileft < iminlen:
                    istartpos = ileft
                    iminlen = iright - ileft
                tmpChar = s[ileft]
                if tmpChar in dict_window:
                    if dict_window[tmpChar] == dict_t[tmpChar]:
                        imeet -= 1
                    dict_window[tmpChar] -= 1
                ileft += 1
        if iminlen == len(s)+1:
            return ""
        return s[istartpos:istartpos + iminlen]

    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow_ext2, s, t)
    # Prints the measurement message and the length of the returned substring
    print(result['msg'], '执行结果 = {}'.format(len(result['result'])))

    # Run result (running time 77.02 ms, memory usage 8.00 KB, result length 10742)
    # 函数 minWindow_ext2 的运行时间为 77.02 ms;内存使用量为 8.00 KB 执行结果 = 10742
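To see how the window travels from left to right, the sketch below follows the same expand-then-shrink pattern as the enhanced version but records every covering window it visits instead of keeping only the shortest. The function `trace_windows` and its variable names are illustrative assumptions, and it assumes t is non-empty, as the constraints guarantee.

```python
def trace_windows(s, t):
    """Return, in visiting order, every window that covers t during the slide."""
    need = {}
    for ch in t:
        need[ch] = need.get(ch, 0) + 1
    window = {}   # counts of the needed characters inside the current window
    met = 0       # number of distinct characters whose required count is reached
    left = 0
    visited = []
    for right, ch in enumerate(s):
        # Expand: take s[right] into the window
        if ch in need:
            window[ch] = window.get(ch, 0) + 1
            if window[ch] == need[ch]:
                met += 1
        # Shrink: while the window still covers t, record it and drop s[left]
        while met == len(need):
            visited.append(s[left:right + 1])
            out = s[left]
            if out in need:
                if window[out] == need[out]:
                    met -= 1
                window[out] -= 1
            left += 1
    return visited


windows = trace_windows("ADOBECODEBANC", "ABC")
print(windows)
print(min(windows, key=len))  # BANC
```

On Example 1 the first window recorded is "ADOBEC" and the last is "BANC". Both pointers only ever move forward, so each character of s enters and leaves the window at most once.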
A day of practice is a day of progress; skip one day and ten days of work are lost.
may the odds be ever in your favor ~