Algorithms for Beginners: Dynamic Programming (Part 1)

Author: qulay

Category: Algorithms

Article link: https://blog.csdn.net/klqulei123/article/details/52781626

Reference: http://www.avatar.se/molbioinfo2001/dynprog/dynamic.html

Dynamic Programming

The following is an example of global sequence alignment using Needleman/Wunsch techniques. For this example, the two sequences to be globally aligned are

G A A T T C A G T T A (sequence #1)
G G A T C G A (sequence #2)

So M = 11 and N = 7 (the lengths of sequence #1 and sequence #2, respectively).

A simple scoring scheme is assumed where

S_{i,j} = 1 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise
S_{i,j} = 0 (mismatch score)
w = 0 (gap penalty)
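
As a minimal sketch of this scoring scheme, the snippet below encodes the two sequences and a match/mismatch function. The names SEQ1, SEQ2, GAP and score are illustrative choices, not part of the original tutorial, and positions are 1-based to match the text.

```python
# The two example sequences from the text (names are illustrative).
SEQ1 = "GAATTCAGTTA"  # sequence #1, length M = 11
SEQ2 = "GGATCGA"      # sequence #2, length N = 7
GAP = 0               # w, the gap penalty


def score(i, j, seq1=SEQ1, seq2=SEQ2):
    """Return S_{i,j}: 1 on a match, 0 on a mismatch (1-based positions)."""
    return 1 if seq1[i - 1] == seq2[j - 1] else 0


print(score(1, 1))  # G vs G
print(score(2, 1))  # A vs G
```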

Three steps in dynamic programming

Initialization
Matrix fill (scoring)
Traceback (alignment)

Initialization Step

The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned.

Since this example assumes there is no gap opening or gap extension penalty, the first row and first column of the matrix can be initially filled with 0.
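
The initialization step could be sketched as follows, assuming the matrix is stored as a list of rows with sequence #2 along the rows and sequence #1 along the columns. The function name init_matrix is hypothetical. Filling everything with 0 is only appropriate here because the gap penalty is 0.

```python
def init_matrix(seq1, seq2):
    """Build an (N + 1) x (M + 1) matrix of zeros.

    Rows correspond to sequence #2 (length N), columns to sequence #1
    (length M). The extra row 0 and column 0 hold the initial scores.
    """
    rows = len(seq2) + 1
    cols = len(seq1) + 1
    return [[0] * cols for _ in range(rows)]


matrix = init_matrix("GAATTCAGTTA", "GGATCGA")
print(len(matrix), len(matrix[0]))  # number of rows, number of columns
```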

Matrix Fill Step

One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left-hand corner of the matrix and finding the maximal score M_{i,j} for each position in the matrix. To find M_{i,j} for any i,j, it is enough to know the scores at the matrix positions to the left of, above, and diagonal to i,j. In terms of matrix positions, these are M_{i-1,j}, M_{i,j-1} and M_{i-1,j-1}.

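For each position, M_{i,j} is defined to be the maximum score at position i,j, where i indexes sequence #1 (the columns) and j indexes sequence #2 (the rows). Note that M_{i,j} denotes a matrix entry here, not the sequence length M. That is,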

M_{i,j} = MAXIMUM[
     M_{i-1,j-1} + S_{i,j} (match/mismatch in the diagonal),
     M_{i,j-1} + w (gap in sequence #1),
     M_{i-1,j} + w (gap in sequence #2)]
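
The recurrence above can be sketched as a double loop over the matrix. This is an illustrative implementation, not the tutorial's own code: the name fill_matrix is hypothetical, the matrix is stored as m[j][i] (row j for sequence #2, column i for sequence #1), and the zero-filled first row and column assume w = 0 as in the text.

```python
def fill_matrix(seq1, seq2, w=0):
    """Fill the scoring matrix using the three-way maximum recurrence.

    m[j][i] holds M_{i,j}. Row 0 and column 0 are left at 0, which
    matches the text only because the gap penalty w is 0.
    """
    cols = len(seq1) + 1
    rows = len(seq2) + 1
    m = [[0] * cols for _ in range(rows)]
    for j in range(1, rows):
        for i in range(1, cols):
            s = 1 if seq1[i - 1] == seq2[j - 1] else 0   # S_{i,j}
            m[j][i] = max(
                m[j - 1][i - 1] + s,  # M_{i-1,j-1} + S_{i,j}: diagonal
                m[j - 1][i] + w,      # M_{i,j-1} + w: gap in sequence #1
                m[j][i - 1] + w,      # M_{i-1,j} + w: gap in sequence #2
            )
    return m


filled = fill_matrix("GAATTCAGTTA", "GGATCGA")
for row in filled:
    print(row)
```

Each cell depends only on cells in the previous row or earlier in the same row, so filling row by row from the upper left guarantees that all three neighbours are already known.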

Note that in the example, M_{i-1,j-1} will be red, M_{i,j-1} will be green and M_{i-1,j} will be blue.

Using this information, the score at position 1,1 in the matrix can be calculated. Since the first residue in both sequences is a G, S_{1,1} = 1, and by the assumptions stated at the beginning, w = 0. Thus, M_{1,1} = MAX[M_{0,0} + 1, M_{1,0} + 0, M_{0,1} + 0] = MAX[1, 0, 0] = 1.

A value of 1 is then placed in position 1,1 of the scoring matrix.
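
The same single-cell calculation can be checked with a few lines of code. The helper cell_score is a hypothetical name introduced only for this illustration; its arguments are the three neighbouring scores, the match score and the gap penalty.

```python
def cell_score(diagonal, above, left, s, w=0):
    """Return the maximum of the three candidate scores for one cell."""
    return max(diagonal + s, above + w, left + w)


# After initialization M_{0,0}, M_{1,0} and M_{0,1} are all 0,
# and S_{1,1} = 1 because both sequences start with G.
m_1_1 = cell_score(diagonal=0, above=0, left=0, s=1, w=0)
print(m_1_1)  # 1
```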

Since the gap penalty (w) is 0, the rest of row 1 and column 1 can be filled in with the value 1. Take the example of row 1. At column 2, the value is the max of 0 (for a mismatch), 0 (for a vertical gap) or 1 (horizontal gap). The rest of row 1 can be filled out similarly until we get to column 8. At this point, there is a G in both sequences (light blue). Thus, the value for the cell at row 1 column 8 is the maximum of 1 (for a match), 0 (for a vertical gap) or 1 (horizontal gap). The value will again be 1. The rest of row 1 and column 1 can be filled with 1 using the above reasoning.
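
The reasoning for row 1 can be reproduced with a short loop. This is a sketch under the same assumptions as the text (match = 1, mismatch = 0, w = 0); the variable names are illustrative.

```python
seq1 = "GAATTCAGTTA"  # sequence #1, along the columns
seq2 = "GGATCGA"      # sequence #2, along the rows
w = 0                 # gap penalty

row0 = [0] * (len(seq1) + 1)  # initialized first row
row1 = [0]                    # column 0 of row 1 is also initialized to 0

for i in range(1, len(seq1) + 1):
    s = 1 if seq1[i - 1] == seq2[0] else 0  # compare with the first G of sequence #2
    row1.append(max(
        row0[i - 1] + s,   # match/mismatch (diagonal)
        row0[i] + w,       # vertical gap
        row1[i - 1] + w,   # horizontal gap
    ))

print(row1)  # [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Once the match at column 1 produces a 1, the free horizontal gap carries that value across the rest of the row, including column 8 where the second G matches.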

Now let's look at column 2. The location at row 2 w
