DP31 取大小游戏中的最优策略(附:如何对DP[0][n]进行对角线递推) Optimal Strategy for a Game @geeksforgeeks...

Problem statement: Consider a row of n coins of values v1 . . . vn, where n is even. We play a game against an opponent by alternating turns. In each turn, a player selects either the first or last coin from the row,removes it from the row permanently, and receives the value of the coin. Determine the maximum possible amount of money we can definitely win if we move first.

Note: The opponent is as clever as the user.

Let us understand the problem with few examples:

1.5, 3, 7, 10 : The user collects maximum value as 15(10 + 5)

2.8, 15, 3, 7 : The user collects maximum value as 22(7 + 15)

Does choosing the best at each move give an optimal solution?

No. In the second example, this is how the game can finish:

1.
…….User chooses 8.
…….Opponent chooses 15.
…….User chooses 7.
…….Opponent chooses 3.
Total value collected by user is 15(8 + 7)

2.
…….User chooses 7.
…….Opponent chooses 8.
…….User chooses 15.
…….Opponent chooses 3.
Total value collected by user is 22(7 + 15)

So if the user follows the second game state, maximum value can be collected although the first move is not the best.

There are two choices:
1.The user chooses the ith coin with value Vi: The opponent either chooses (i+1)th coin or jth coin. The opponent intends to choose the coin which leaves the user with minimum value.
i.e. The user can collect the value Vi + min(F(i+2, j), F(i+1, j-1) )
coinGame1

2.The user chooses the jth coin with value Vj: The opponent either chooses ith coin or (j-1)th coin. The opponent intends to choose the coin which leaves the user with minimum value.
i.e. The user can collect the value Vj + min(F(i+1, j-1), F(i, j-2) )
coinGame2

Following is recursive solution that is based on above two choices. We take the maximum of two choices.

F(i, j)  represents the maximum value the user can collect from 
         i'th coin to j'th coin.

    F(i, j)  = Max(Vi + min(F(i+2, j), F(i+1, j-1) ), 
                   Vj + min(F(i+1, j-1), F(i, j-2) )) 
Base Cases
    F(i, j)  = Vi           If j == i
    F(i, j)  = max(Vi, Vj)  If j == i+1

Why Dynamic Programming?
The above relation exhibits overlapping sub-problems. In the above relation, F(i+1, j-1) is calculated twice.


很有意思的博弈问题,重要的一点是贪心法在这里是不行的,前面已经举了反证。

另外很重要的一点是对求dp[0][n]而不是dp[m][m]的问题,如何写for循环。文章里说要从对角线开始向右上角递推。

比如:

The below table represents the stored values for the string abcde.

a b c d e
----------
0 1 2 3 4
0 0 1 2 3 
0 0 0 1 2 
0 0 0 0 1 
0 0 0 0 0

How to fill the table?
The table should be filled in diagonal fashion. For the string abcde, 0….4, the following should be order in which the table is filled:

Gap = 1:
(0, 1) (1, 2) (2, 3) (3, 4)

Gap = 2:
(0, 2) (1, 3) (2, 4)

Gap = 3:
(0, 3) (1, 4)

Gap = 4:
(0, 4)

可以用以下代码得到:

用一个gap变量作为l和h之间的距离,然后l和h一起前进,直到h碰到边界n,因为l在左上方,h在右下方,所以肯定h先碰到右边界!

// Fill the table
    for (gap = 1; gap < n; ++gap)
        for (l = 0, h = gap; h < n; ++l, ++h)
            table[l][h] = (str[l] == str[h])? table[l+1][h-1] :
                          (min(table[l][h-1], table[l+1][h]) + 1);



package DP;

public class OptimalGameStrategy {

	public static void main(String[] args) {
		int[] A = {8, 15, 3, 7};
		int[] B = {2, 2, 2, 2};
		int[] C = {20, 30, 2, 2, 2, 10};
		System.out.println(optimalStrategyOfGameRec(A, 0, A.length-1));
		System.out.println(optimalStrategyOfGame(A, A.length));
		System.out.println(optimalStrategyOfGameRec(B, 0, B.length-1));
		System.out.println(optimalStrategyOfGame(B, B.length));
		System.out.println(optimalStrategyOfGameRec(C, 0, C.length-1));
		System.out.println(optimalStrategyOfGame(C, C.length));
		
	}
	
	
	public static int optimalStrategyOfGameRec(int[] A, int begin, int end){
		if(begin == end){		// 只剩下一枚硬币时,只能取那枚
			return A[begin];
		}
		if(begin+1 == end){	// 只剩下两枚时,取大的那枚
			return Math.max(A[begin], A[end]);
		}
		
		// 如果我去开头那枚,因为对手会拿较大那枚,所以留下的是较小的选择,看图
		int a = A[begin] + Math.min(optimalStrategyOfGameRec(A, begin+2, end), 
											   optimalStrategyOfGameRec(A, begin+1, end-1));
		// 如果我去拿结尾那枚,同理
		int b = A[end] + Math.min(optimalStrategyOfGameRec(A, begin+1, end-1), 
											 optimalStrategyOfGameRec(A, begin, end-2));
		return Math.max(a, b);		// 做最合算的选择
	}
	
	
	// Returns optimal value possible that a player can collect from
	// an array of coins of size n. Note than n must be even
	public static int optimalStrategyOfGame(int[] A, int n){
		int[][] dp = new int[n][n];	//dp[begin][end]保存从序号为begin到end的coin序列能取得的最大值
		
		// Fill table using above recursive formula. Note that the table
	    // is filled in diagonal fashion (similar to http://goo.gl/PQqoS),
	    // from diagonal elements to table[0][n-1] which is the result.
		for(int gap=0; gap<n; gap++){
			int begin=0, end=gap;
			while(end < n){
				// 检测条件为右上角
				int x = begin+2<=end ? dp[begin+2][end] : 0;
				int y = begin+1<=end-1 ? dp[begin+1][end-1] : 0;
				int z = begin<end-2 ? dp[begin][end-2] : 0;
				
				dp[begin][end] = Math.max(A[begin]+Math.min(x, y), 
										   A[end]+Math.min(y, z));
				begin++;
				end++;
			}
		}
		
		return dp[0][n-1];
	}

}


http://www.geeksforgeeks.org/dynamic-programming-set-31-optimal-strategy-for-a-game/

http://www.geeksforgeeks.org/dynamic-programming-set-28-minimum-insertions-to-form-a-palindrome/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值