Substring and Subsequence Problems

-柚子皮-

Article link: https://blog.csdn.net/pipisorry/article/details/39434403

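Longest Common Subsequence

Given two strings text1 and text2, return the length of their longest common subsequence. If there is no common subsequence, return 0.

A subsequence of a string is a new string formed from the original by deleting some characters (possibly none) without changing the relative order of the remaining characters.

For example, "ace" is a subsequence of "abcde", but "aec" is not.

A common subsequence of two strings is a subsequence that both strings share.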
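
The definition above can be checked with a single left-to-right scan. The following is a minimal sketch; the helper name is_subsequence is chosen here for illustration and is not part of the original problem.

```python
def is_subsequence(sub: str, text: str) -> bool:
    """Return True if `sub` can be obtained from `text` by deleting characters."""
    i = 0  # index of the next character of `sub` that still has to be matched
    for ch in text:
        if i < len(sub) and ch == sub[i]:
            i += 1
    return i == len(sub)


print(is_subsequence("ace", "abcde"))  # True
print(is_subsequence("aec", "abcde"))  # False
```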

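Example 1:

Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: the longest common subsequence is "ace", and its length is 3.

Example 2:

Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: the longest common subsequence is "abc", and its length is 3.

Example 3:

Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: the two strings have no common subsequence, so the result is 0.

[LeetCode: Longest Common Subsequence]

The solution is very similar to edit distance [Edit distance].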
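
For reference, the recurrence behind the dynamic-programming solutions can be written as follows. Here dp[i][j] is assumed to denote the LCS length of the first i characters of text1 and the first j characters of text2 (0-based string indexing).

```latex
dp[i][j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
dp[i-1][j-1] + 1 & \text{if } text1[i-1] = text2[j-1] \\
\max\bigl(dp[i-1][j],\ dp[i][j-1]\bigr) & \text{otherwise}
\end{cases}
```

The answer is dp[n][m], where n and m are the lengths of text1 and text2.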

Java

    /**
     * Length of the longest common subsequence of two strings
     */
    private static int lcs(String text1, String text2) {
        char[] chars1 = text1.toCharArray();
        char[] chars2 = text2.toCharArray();
        int[][] dp = new int[chars1.length + 1][chars2.length + 1];
        for (int i = 0; i < chars1.length; i++) {
            for (int j = 0; j < chars2.length; j++) {
                dp[i + 1][j + 1] = chars1[i] == chars2[j] ? dp[i][j] + 1 : Math.max(dp[i][j + 1], dp[i + 1][j]);
            }
        }
        return dp[chars1.length][chars2.length];
    }

Python (recursive, memoized)

def longestCommonSubsequence(self, text1: str, text2: str) -> int:
    len1 = len(text1)
    len2 = len(text2)
    his = dict()

    def lcs(text1, text2, i, j):
        if (i, j) in his:
            return his[(i, j)]
        if i >= len1:
            return 0
        if j >= len2:
            return 0
        if text1[i] == text2[j]:
            lcs0 = lcs(text1, text2, i + 1, j + 1) + 1
        else:
            lcs0 = max(lcs(text1, text2, i + 1, j), lcs(text1, text2, i, j + 1))
        his[(i, j)] = lcs0
        return lcs0

    return lcs(text1, text2, 0, 0)

Python (iterative)

def longestCommonSubsequence(self, text1: str, text2: str) -> int:

    def lcs(text1, text2):
        len1 = len(text1)
        len2 = len(text2)
        lcs0 = [[0 for _ in range(len2 + 1)] for _ in range(len1 + 1)]
        for i in range(1, len1 + 1):
            for j in range(1, len2 + 1):
                if text1[i - 1] == text2[j - 1]:
                    lcs0[i][j] = lcs0[i - 1][j - 1] + 1
                else:
                    lcs0[i][j] = max(lcs0[i - 1][j], lcs0[i][j - 1])
        return lcs0[len1][len2]

    return lcs(text1, text2)
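
The solutions above return only the length. If the subsequence itself is wanted, the same table can be walked backwards from the bottom-right corner. The following is a sketch under that assumption; the function name lcs_string is hypothetical, and when several longest common subsequences exist it returns just one of them.

```python
def lcs_string(text1: str, text2: str) -> str:
    n, m = len(text1), len(text2)
    # dp[i][j]: LCS length of text1[:i] and text2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if text1[i - 1] == text2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from dp[n][m], collecting the matched characters in reverse.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if text1[i - 1] == text2[j - 1]:
            out.append(text1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs_string("abcde", "ace"))  # ace
```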

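Minimum Window Subsequence

leetcode[https://leetcode-cn.com/problems/minimum-window-subsequence]

Given strings S and T, find the shortest (contiguous) substring W of S such that T is a subsequence of W.

If no window in S contains all the characters of T, return the empty string "".
If more than one window has the minimum length, return the one whose start position is leftmost.

Example 1:
Input:
S = "abcdebdde", T = "bde"
Output: "bcde"
Explanation:
"bcde" is the answer because it occurs before "bdde", which has the same length.
"deb" is not a shorter answer because the elements of T must appear in order within the window.

Notes:
All input strings contain only lowercase letters.
The length of S is in the range [1, 20000].
The length of T is in the range [1, 100].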

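Solution 1:

Dynamic programming. A two-dimensional array dp[i][j] stores the start index in S such that S[index..j] contains T[0..i] as a subsequence. First, find every position j in S whose character equals T[0] and store j in dp[0][j]: any valid window must start at one of these characters. Then move on to T[1]: while scanning S, a position whose entry from the previous round holds an index means a window can start from that recorded index; when a later S[j] equals T[1], carry that index into dp[1][j], which says that S[index..j] contains T[0..1]. Repeat until every character of T has been processed. If dp[i][j] in the last row holds an index, S[index..j] satisfies the condition; if several windows share the minimum length, output the one found first.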
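
The row-by-row procedure described above can be sketched with a single rolling array instead of the full table. This is a minimal sketch, not the code of the referenced article: the names min_window_subsequence, starts and last are chosen here for illustration, and T is assumed to be non-empty, as the constraints state.

```python
def min_window_subsequence(s: str, t: str) -> str:
    # starts[j]: start index of the tightest window that ends exactly at s[j]
    # and contains t[:i + 1] as a subsequence, or -1 if no such window exists.
    starts = [j if ch == t[0] else -1 for j, ch in enumerate(s)]

    for i in range(1, len(t)):
        new_starts = [-1] * len(s)
        last = -1  # most recent start recorded in the previous row, left of j
        for j, ch in enumerate(s):
            if ch == t[i] and last != -1:
                new_starts[j] = last
            if starts[j] != -1:
                last = starts[j]
        starts = new_starts

    best = ""
    for j, start in enumerate(starts):
        # Strict comparison keeps the leftmost window among equal lengths.
        if start != -1 and (best == "" or j - start + 1 < len(best)):
            best = s[start:j + 1]
    return best


print(min_window_subsequence("abcdebdde", "bde"))  # bcde
```

Keeping only one row at a time reduces the memory from O(|S|·|T|) to O(|S|); the running time stays O(|S|·|T|).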

[[LeetCode] 727. Minimum Window Subsequence]

Solution 2:

Dynamic programming. A two-dimensional array dp[i][j] stores the start index in S such that S[index..j] contains T[0..i] as a subsequence. See the python3 code for the details.

python3 (only checked by hand against Example 1)

w1 = "bde"        # T, the pattern
w2 = "abcdebdde"  # S, the string to search


class match:
    def minWindow(self, s: str, t: str) -> str:
        lens = len(s)
        lent = len(t)
        # matrix[i][j]: largest start index k such that t[:i] is a
        # subsequence of s[k:j], or -1 if there is no such k
        matrix = [[-1 for _ in range(lens + 1)] for _ in range(lent + 1)]
        for j in range(lens + 1):
            matrix[0][j] = j  # the empty pattern fits into the empty window
        for i in range(1, lent + 1):
            for j in range(1, lens + 1):
                if t[i - 1] == s[j - 1]:
                    matrix[i][j] = matrix[i - 1][j - 1]
                else:
                    matrix[i][j] = matrix[i][j - 1]

        minl = lens + 1
        minwin = ''
        # the last row holds, for every end position, the best start for all of t
        for end, start in enumerate(matrix[lent]):
            if start > -1 and end - start < minl:
                minl = end - start
                minwin = s[start:end]
        return minwin


r = match().minWindow(w2, w1)
print(r)

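Minimum Window Substring

Given a string s and a string t, return the shortest substring of s that covers all the characters of t. If s has no substring covering all the characters of t, return the empty string "".

Note: if such a substring exists in s, it is guaranteed to be the unique answer.

Example 1:

Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"

Example 2:

Input: s = "a", t = "a"
Output: "a"

Constraints:

1 <= s.length, t.length <= 10^5
s and t consist of English letters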

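Analysis:

The sliding-window idea:

Let i and j be the left and right boundaries of a sliding window. Moving i and j grows and shrinks the window, which can be pictured as a window travelling along the string. Whenever the window satisfies the condition, that is, it contains every character of T, record its length j-i+1; the minimum of these lengths is the answer.

Step 1

Keep increasing j to grow the window until it contains every character of T.

Step 2

Keep increasing i to shrink the window. Since the shortest substring is wanted, unnecessary characters are pushed out to reduce the length until a character that must stay is reached. At that point nothing more can be dropped without breaking the condition, so record the current window length and keep the minimum.

Step 3

Increase i by one more position. The window now certainly fails the condition, so go back to Step 1 and look for the next valid window. Repeat until j runs past the end of S.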

Python implementation

    def minWindow(self, s: str, t: str) -> str:
        def if_still_need(need):
            still_need = False
            for cnt in need.values():
                if cnt > 0:
                    still_need = True
                    break
            return still_need

        need_dict = dict()
        for i in t:
            if i in need_dict:
                need_dict[i] += 1
            else:
                need_dict[i] = 1
        minstart = minend = start = end = 0
        lens = len(s)
        minlen = lens + 1
        lent = len(t)
        while start <= lens - lent:
            while end < lens and if_still_need(need_dict):
                if s[end] in need_dict:
                    need_dict[s[end]] -= 1
                end += 1
            while start < lens - lent and s[start] not in need_dict:
                start += 1
            if not if_still_need(need_dict) and end - start < minlen:
                minlen = end - start
                minstart = start
                minend = end

            if s[start] in need_dict:
                need_dict[s[start]] += 1
            start += 1
        return s[minstart:minend]

The code above can still be improved: replace the if_still_need check with a counter, updated as if need[c] > 0: needCnt -= 1.
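
The counter idea mentioned above might look like the following sketch. It is one possible way to write it, not the author's code: the function name min_window is chosen here, and need_cnt plays the role of needCnt. A negative entry in need marks a character the window holds in surplus.

```python
from collections import Counter


def min_window(s: str, t: str) -> str:
    if not t:
        return ""  # assumption: an empty t has no meaningful window
    need = Counter(t)   # need[c] > 0: still missing, need[c] < 0: surplus
    need_cnt = len(t)   # number of characters of t still missing
    best_start, best_len = 0, len(s) + 1
    start = 0
    for end, c in enumerate(s):
        if need[c] > 0:
            need_cnt -= 1
        need[c] -= 1
        if need_cnt == 0:
            # Step 2: drop surplus characters from the left.
            while need[s[start]] < 0:
                need[s[start]] += 1
                start += 1
            if end - start + 1 < best_len:
                best_start, best_len = start, end - start + 1
            # Step 3: give up one required character and keep searching.
            need[s[start]] += 1
            need_cnt += 1
            start += 1
    return "" if best_len > len(s) else s[best_start:best_start + best_len]


print(min_window("ADOBECODEBANC", "ABC"))  # BANC
```

With the counter, checking whether the window is valid takes O(1) instead of a scan over the whole dictionary, so the total running time is O(|s| + |t|).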

[LeetCode: Minimum Window Substring]
