DP Problems: Understanding Them Through the Longest Common Subsequence Example

Zih_An

Published 2020-03-04 22:59:12

Category: Programming (Algorithms)

Article link: https://blog.csdn.net/HANZY72/article/details/104662460

I. Problem Statement

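Given two strings s and t of lengths n and m, find the length of their longest common subsequence: a sequence of characters s[i1], s[i2], …, s[ik] with i1 < i2 < … < ik that also appears in t, again at increasing indices. Every character must occur in both s and t, the indices must increase in each string, and the chosen characters need not be adjacent.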
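To make the definition concrete, here is a minimal sketch of a checker for it. The helper names `isSubsequence` and `isCommonSubsequence` are made up for this illustration and are not part of the solution further down.

```cpp
#include <string>

// Returns true if `candidate` can be obtained from `text` by deleting
// zero or more characters without reordering the remaining ones.
bool isSubsequence(const std::string& candidate, const std::string& text) {
    std::size_t matched = 0;  // characters of `candidate` matched so far
    for (std::size_t k = 0; k < text.size() && matched < candidate.size(); ++k) {
        if (text[k] == candidate[matched]) {
            ++matched;  // indices in `text` only move forward, so order is kept
        }
    }
    return matched == candidate.size();
}

// A common subsequence of s and t is a subsequence of both strings.
bool isCommonSubsequence(const std::string& candidate,
                         const std::string& s, const std::string& t) {
    return isSubsequence(candidate, s) && isSubsequence(candidate, t);
}
```

For example, "BCBA" is a common subsequence of "ABCBDAB" and "BDCABA", while "ABD" is not, because no D follows an A and a B in "BDCABA".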

II. Approach

Problem classification

This is a DP (dynamic programming) problem.
About DP problems

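1. Recognizing the problem: DP problems usually ask for an optimal solution (minimize or maximize).
2. Plain brute force: enumerate every possibility. The enumeration recomputes the same intermediate results many times, which wastes a lot of time.
3. DP approach: keep the intermediate results in some storage (a matrix, an array, and so on) to save computation time. This requires splitting the problem correctly into subproblems that are all solved by the same repeatable step. The storage holds the answer to each subproblem, and the subproblems depend on one another.
4. Requirement for DP: the dependencies between subproblems must be acyclic. In other words, the subproblems and their dependencies form a DAG (directed acyclic graph), which has a topological order.
5. The hard part of DP: splitting the problem into subproblems and sorting out the dependencies between them.
6. Forms a DP solution can take:
   1. top down (memoize, recursive): start from the top-level subproblem, call the lower-level subproblems it depends on, and record each subproblem's result along the way.
   2. bottom up (loop): start from the lowest-level subproblems and fill in the results upward; each level depends on the ones below and reads their stored results directly.
  // Note (top down): thanks to memoization, each subproblem is computed only once; later recursive calls return the stored result instead of recomputing it.
7. Steps for solving a DP problem:
   1. define the subproblems
   2. guess (part of the solution)
   3. relate subproblem solutions
   4. recurse & memoize, or build the DP table bottom up
   5. solve the original problem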
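The two forms in item 6 can be sketched on Fibonacci numbers, the simplest case where brute-force recursion repeats work. This is only an illustration under the assumption n >= 0; the function names are invented for this sketch.

```cpp
#include <cstdint>
#include <vector>

// Top down: recurse from n and store each answer the first time it is computed.
std::uint64_t fibMemo(int n, std::vector<std::uint64_t>& memo) {
    if (n <= 1) return static_cast<std::uint64_t>(n);  // base cases F(0)=0, F(1)=1
    if (memo[n] != 0) return memo[n];  // F(k) > 0 for k >= 1, so 0 means "not computed yet"
    memo[n] = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);
    return memo[n];
}

std::uint64_t fibTopDown(int n) {
    std::vector<std::uint64_t> memo(n + 2, 0);
    return fibMemo(n, memo);
}

// Bottom up: fill the table in topological order (0, 1, 2, ..., n),
// so every entry a step depends on is already filled in.
std::uint64_t fibBottomUp(int n) {
    std::vector<std::uint64_t> table(n + 2, 0);
    table[1] = 1;
    for (int k = 2; k <= n; ++k) {
        table[k] = table[k - 1] + table[k - 2];
    }
    return table[n];
}
```

Both versions do O(n) additions, whereas the unmemoized recursion takes exponential time. The bottom-up version avoids recursion depth limits; the top-down version only touches the subproblems it actually needs.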

// Unpolished draft notes on DP problems and on the approach to this problem
// Reference: MIT 6.006 Introduction to Algorithms, Fall 2011

/*
1. subproblem:
In the end I want the longest common subsequence itself; for now I only want its length.
Restrict s to its first i characters and t to its first j characters: maxLength[i][j]
- maxLength[i][j] = length of the longest common subsequence of s[0..i-1] and t[0..j-1], for 0<=i<=n, 0<=j<=m
- an empty prefix gives maxLength[0][j]=maxLength[i][0]=0
- there are about n*m subproblems in total

2. guess: whether the last characters of the two prefixes, s[i-1] and t[j-1], are added to the subsequence or not.

3.
if s[i-1]==t[j-1] (the pair can be added to the subsequence):
maxLength[i][j]=maxLength[i-1][j-1]+1
else: # the subsequence cannot be extended here
maxLength[i][j]=max(maxLength[i][j-1],maxLength[i-1][j])
(maxLength[i-1][j-1] never exceeds either of these two, so it does not need to be listed)
-- try every possibility for each character (added or not), subject to the matching condition.

4. top down or bottom up

5. solve the original problem: the answer is maxLength[n][m].

*/
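The notes above aim for the subsequence itself in the end, not just its length. One common way to get it, sketched here as an assumption rather than as the author's method, is to fill the length table bottom up and then walk back from the corner. The function name `longestCommonSubsequence` and the table name `len` are invented for this sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns one longest common subsequence of s and t (there may be several).
std::string longestCommonSubsequence(const std::string& s, const std::string& t) {
    const std::size_t n = s.size(), m = t.size();
    // len[i][j] = LCS length of the first i chars of s and the first j chars of t.
    std::vector<std::vector<int>> len(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            if (s[i - 1] == t[j - 1]) {
                len[i][j] = len[i - 1][j - 1] + 1;
            } else {
                len[i][j] = std::max(len[i - 1][j], len[i][j - 1]);
            }
        }
    }
    // Walk back from (n, m), collecting matched characters in reverse order.
    std::string result;
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        if (s[i - 1] == t[j - 1]) {
            result.push_back(s[i - 1]);
            --i;
            --j;
        } else if (len[i - 1][j] >= len[i][j - 1]) {
            --i;  // the optimum is also reachable without s[i-1]
        } else {
            --j;  // the optimum is also reachable without t[j-1]
        }
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

When several subsequences share the maximum length, the tie-breaking in the walk-back decides which one is returned, so callers should rely only on its length and on it being a common subsequence.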




/*
5 steps to DP problems:
1. define subproblems -> (count how many subproblems there are)
-> in Fibonacci: each index 1,...,n is a subproblem, so there are n subproblems.
-> in shortest path (s,v): find the shortest path weight δk(s,v) for v∈V, 0<=k<|V|.
-- k means the path from s to v uses at most k edges; k is at most the number of vertices other than s.
-- there are about V^2 subproblems, because every value of k is combined with every vertex v.

2. guess (part of the solution) -> the choices in the problem (how many different possibilities)
==== define the thing you want to enumerate (brute force over every candidate for the guess) ====
-> in Fibonacci: nothing to guess (the formula is explicit)
-> in shortest path (s,v): indegree(v) choices in total.
-- the guess for (s,v) is the last edge (u,v) of the path => indegree(v) possibilities

3. relate subproblem solutions
-> in Fibonacci: the formula F(n)=F(n-1)+F(n-2)
-> in shortest path (s,v): δk(s,v)=min{δk-1(s,u)+W(u,v) | (u,v)∈E}
-- try every edge (u,v) that exists in the graph; W(u,v) plus the weight of the best path from s to u with at most k-1 edges must be minimal.
-- Combining this with the guess:
we try every edge (u,v) (step 2) and keep the one that minimizes the expression (step 3).

-> write a formula like the Fibonacci one.
-> maximize or minimize something (DP is mostly used for optimization).
write it as a recurrence; it can use 'max' and 'min'.

4. recurse & memoize, or build the DP table bottom up -> the actual solution to the problem
-> check: is the subproblem recurrence acyclic? It must form a DAG, which has a topological order.

5. solve the original problem
*/
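The shortest-path recurrence in step 3 can be turned into a bottom-up loop over k. The sketch below assumes a directed graph given as an edge list with no negative-weight cycle reachable from the source. `Edge`, `kInfinity` and `shortestPaths` are names invented here, and only the last two layers of the δk table are kept.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Edge {
    int from;
    int to;
    long long weight;
};

// Marker for "no path found"; divided by 4 so that adding a weight cannot overflow.
const long long kInfinity = std::numeric_limits<long long>::max() / 4;

// Returns, for every vertex v, the weight of the shortest path source -> v
// that uses at most vertexCount-1 edges (kInfinity if v is unreachable).
std::vector<long long> shortestPaths(int vertexCount,
                                     const std::vector<Edge>& edges, int source) {
    // previous[v] plays the role of delta_{k-1}(s, v); delta_0 is 0 at s, infinity elsewhere.
    std::vector<long long> previous(vertexCount, kInfinity);
    previous[source] = 0;
    for (int k = 1; k < vertexCount; ++k) {
        // "At most k edges" includes paths with fewer edges, so start from layer k-1.
        std::vector<long long> current = previous;
        for (const Edge& e : edges) {  // guess the last edge (u, v) of the path
            if (previous[e.from] != kInfinity) {
                current[e.to] = std::min(current[e.to], previous[e.from] + e.weight);
            }
        }
        previous = current;
    }
    return previous;
}
```

The loop over k is the topological order of the subproblems: layer k reads only layer k-1, so the dependency graph is acyclic even when the input graph has cycles.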

III. Summary

IV. Code

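#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;

// Both tables have MAXN*MAXN entries, so MAXN must stay small:
// a 100001 x 100001 int table would need roughly 40 GB.
const int MAXN=1001; // assumed bound: strings of length at most 1000
char s[MAXN],t[MAXN];
int n,m;
int maxLength[MAXN][MAXN]; // bottom-up table; globals start at 0, which covers row 0 and column 0
int memo[MAXN][MAXN];      // top-down table; -1 means "not computed yet"


//bottom up
//maxLength[i][j] = LCS length of the first i chars of s and the first j chars of t
void solve(){
	for(int i=1;i<=n;i++)
		for(int j=1;j<=m;j++)
			if(s[i-1]==t[j-1])
				maxLength[i][j]=maxLength[i-1][j-1]+1;
			else
				maxLength[i][j]=max(maxLength[i][j-1],maxLength[i-1][j]);
}

//memoization
int solve2(int i,int j){
	if(i<=0||j<=0){
		return 0; // an empty prefix has no common subsequence
	}
	if(memo[i][j]!=-1){
		return memo[i][j]; // 0 is a valid answer, so -1 marks "unknown"
	}
	int result;
	if(s[i-1]==t[j-1]) // strings are 0-indexed, prefixes are counted from 1
		result=solve2(i-1,j-1)+1;
	else
		result=max(solve2(i-1,j),solve2(i,j-1));
	memo[i][j]=result;
	return result;
}


int main(void){
	// input: n and m, then the two strings
	scanf("%d%d",&n,&m);
	scanf("%s%s",s,t);
	memset(memo,-1,sizeof(memo));
	//solve();
	//printf("%d\n",maxLength[n][m]);
	printf("%d\n",solve2(n,m));


	return 0;
}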
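A full n*m table limits how long the strings can be. When only the length is needed, each row of the table depends only on the row above it, so two rows are enough. The following is a minimal sketch of that idea, not part of the original solution; `lcsLengthTwoRows` is a name invented here and it takes `std::string` arguments instead of the global arrays.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// LCS length using O(m) extra memory instead of an O(n*m) table.
int lcsLengthTwoRows(const std::string& s, const std::string& t) {
    // previous = row i-1 of the table, current = row i; column 0 stays 0 in both.
    std::vector<int> previous(t.size() + 1, 0), current(t.size() + 1, 0);
    for (std::size_t i = 1; i <= s.size(); ++i) {
        for (std::size_t j = 1; j <= t.size(); ++j) {
            if (s[i - 1] == t[j - 1]) {
                current[j] = previous[j - 1] + 1;
            } else {
                current[j] = std::max(current[j - 1], previous[j]);
            }
        }
        std::swap(previous, current);  // row i becomes the "previous" row for i+1
    }
    return previous[t.size()];
}
```

The running time is still O(n*m), so this removes the memory limit but not the time limit. It also gives up the ability to walk back through the table and recover the subsequence itself.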