Problem
Given two strings text1 and text2, return the length of their longest common subsequence.
A subsequence of a string is a new string generated from the original string by deleting some characters (possibly none) without changing the relative order of the remaining characters (e.g., “ace” is a subsequence of “abcde”, while “aec” is not). A common subsequence of two strings is a subsequence that is common to both strings.
If there is no common subsequence, return 0.
Example 1:
Input: text1 = “abcde”, text2 = “ace”
Output: 3
Explanation: The longest common subsequence is “ace” and its length is 3.
Example 2:
Input: text1 = “abc”, text2 = “abc”
Output: 3
Explanation: The longest common subsequence is “abc” and its length is 3.
Example 3:
Input: text1 = “abc”, text2 = “def”
Output: 0
Explanation: There is no such common subsequence, so the result is 0.
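Dynamic programming
First, a note on the difference between a subsequence and a substring:
A subsequence is made up of characters of the string that need not be contiguous (their relative order is kept), whereas a substring must be contiguous.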
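To make the distinction concrete, here is a minimal sketch of a subsequence check. The helper name `is_subsequence` is hypothetical and not part of the original solution; the substring check uses Python's built-in `in` operator on strings.

```python
def is_subsequence(s, t):
    """Return True if s is a subsequence of t (hypothetical helper for illustration)."""
    it = iter(t)
    # `ch in it` advances the iterator until it finds ch,
    # so characters must appear in t in the same relative order.
    return all(ch in it for ch in s)


print(is_subsequence("ace", "abcde"))  # True: a, c, e appear in order
print(is_subsequence("aec", "abcde"))  # False: no c after e
print("ace" in "abcde")                # False: not a contiguous substring
print("bcd" in "abcde")                # True: contiguous substring
```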
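First, define the state:
dp[i][j]: the LCS length of the first i characters of text1 and the first j characters of text2.
Initialization:
dp[i][0] = 0 and dp[0][j] = 0, because when one of the strings is empty the longest common subsequence has length 0.
State transition: derive dp[i][j] from the previously computed states.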
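If text1[i-1] == text2[j-1], i.e. the last characters of the two prefixes match, that character can serve as the last character of the LCS, since any character in the LCS must appear in both strings:
dp[i][j] = dp[i-1][j-1] + 1
Otherwise, at least one of the two characters text1[i-1] and text2[j-1] is not in the LCS, so one of them has to be dropped:
dp[i][j] = max(dp[i][j-1], dp[i-1][j], dp[i-1][j-1])
Since dp[i-1][j-1] is never greater than dp[i][j-1] or dp[i-1][j] (shortening a prefix cannot lengthen the LCS), and we are taking the maximum, this simplifies to: dp[i][j] = max(dp[i][j-1], dp[i-1][j])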
The answer is dp[n1][n2], where n1 and n2 are the lengths of text1 and text2.
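The recurrence can also be written almost verbatim as a memoized recursion. The following is only a sketch for checking the state definition against the three examples; the function name `lcs_length_top_down` is hypothetical, and `functools.lru_cache` is used for memoization.

```python
from functools import lru_cache


def lcs_length_top_down(text1, text2):
    """Top-down form of the recurrence (hypothetical name, for illustration)."""

    @lru_cache(maxsize=None)
    def dp(i, j):
        # dp(i, j): LCS length of the first i chars of text1
        # and the first j chars of text2.
        if i == 0 or j == 0:
            return 0  # one prefix is empty
        if text1[i - 1] == text2[j - 1]:
            return dp(i - 1, j - 1) + 1
        return max(dp(i, j - 1), dp(i - 1, j))

    return dp(len(text1), len(text2))


print(lcs_length_top_down("abcde", "ace"))  # 3
print(lcs_length_top_down("abc", "abc"))    # 3
print(lcs_length_top_down("abc", "def"))    # 0
```

The recursion depth grows with len(text1) + len(text2), so for long inputs this version may hit Python's recursion limit; filling the table iteratively avoids that.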
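Python code
class Solution(object):
    def longestCommonSubsequence(self, text1, text2):
        """
        :type text1: str
        :type text2: str
        :rtype: int
        """
        n1 = len(text1)
        n2 = len(text2)
        # dp[i][j]: LCS length of text1[:i] and text2[:j]; row 0 and column 0 stay 0.
        dp = [[0 for j in range(n2+1)] for i in range(n1+1)]
        for i in range(1,n1+1):
            for j in range(1,n2+1):
                if text1[i-1] == text2[j-1]:
                    # Last characters match: extend the LCS of the shorter prefixes.
                    dp[i][j] = dp[i-1][j-1]+1
                else:
                    # Drop one of the two last characters and keep the better result.
                    dp[i][j] = max(dp[i][j-1],dp[i-1][j])
        return dp[n1][n2]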
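The problem only asks for the length, but the same table also encodes one actual LCS, such as the “ace” quoted in Example 1. The sketch below is an illustration only: `lcs_string` is a hypothetical helper that rebuilds the table and then walks back from dp[n1][n2]. When several LCSs of the same length exist, which one it returns depends on the tie-breaking rule assumed here.

```python
def lcs_string(text1, text2):
    """Return one longest common subsequence (hypothetical helper for illustration)."""
    n1, n2 = len(text1), len(text2)
    dp = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if text1[i - 1] == text2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i][j - 1], dp[i - 1][j])

    # Walk back from dp[n1][n2], collecting matched characters in reverse.
    chars = []
    i, j = n1, n2
    while i > 0 and j > 0:
        if text1[i - 1] == text2[j - 1]:
            chars.append(text1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1  # tie-break (assumption): prefer dropping from text1
        else:
            j -= 1
    return "".join(reversed(chars))


print(lcs_string("abcde", "ace"))  # ace
print(lcs_string("abc", "def"))    # prints an empty line
```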