Predict the Winner问题及解法

最新推荐文章于 2018-09-04 17:16:09 发布

我们要爱学习

最新推荐文章于 2018-09-04 17:16:09 发布

阅读量285

点赞数

分类专栏： LeetCode算法练习（Medium）文章标签： LeetCode Medium DP Minimax

本文链接：https://blog.csdn.net/u011809767/article/details/77892859

版权

LeetCode算法练习（Medium）专栏收录该内容

238 篇文章 1 订阅

订阅专栏

问题描述：

Given an array of scores that are non-negative integers. Player 1 picks one of the numbers from either end of the array followed by the player 2 and then player 1 and so on. Each time a player picks a number, that number will not be available for the next player. This continues until all the scores have been chosen. The player with the maximum score wins.

Given an array of scores, predict whether player 1 is the winner. You can assume each player plays to maximize his score.

示例:

Input: [1, 5, 2]
Output: False
Explanation: Initially, player 1 can choose between 1 and 2. 
If he chooses 2 (or 1), then player 2 can choose from 1 (or 2) and 5. If player 2 chooses 5, then player 1 will be left with 1 (or 2). 
So, final score of player 1 is 1 + 2 = 3, and player 2 is 5. 
Hence, player 1 will never be the winner and you need to return False.

Input: [1, 5, 233, 7]
Output: True
Explanation: Player 1 first chooses 1. Then player 2 have to choose between 5 and 7. No matter which number player 2 choose, player 1 can choose 233.
Finally, player 1 has more score (234) than player 2 (12), so you need to return True representing player1 can win.

问题分析：

直接上大牛的英文解释：

①Let first outline the DP state:

dp[i][j] means that for a sub-game in between [i, j] inclusive, the maximum score that Player 1 could get.

Our final goal is to find out whether Player 1 could score more than half of the total score in the game between [0, n-1], or in other words the dp[0][n-1]. Another thing to notice is that, because Player 1 and Player 2 pick numbers one after each other, this means:

If dp[i][j] means maximum score Player 1 could get between [i, j] then dp[i-1][j] could mean the maximum score Player 2 could get between [i-1, j], and same thing for dp[i][j-1].

Another more important thing based on the above statement is that:

The sum[i-1][j] - dp[i-1][j] means the maximum score Player 1 can get between [i-1, j] after he picks nums[i] in between [i, j]. Also the same rule applies to dp[i][j-1].

Thus we have the following induction rule for this DP solution:

pickLeft = nums[i] + sum[i-1][j] - dp[i-1][j] //if left number is picked
pickRight = nums[j] + sum[i][j-1] - dp[i][j-1] //if right number is picked
dp[i][j] = max(pickLeft, pickRight)

② If we do more deduction, we can eliminate the sum(i, j) from the formula :

Instead of storing the maximum score that player 1 can get in each sub array, we can store the diff between player1 and player 2. For example: if player 1 get A, player 2 get B, we can use dp' to store A-B.

if A = dp(i, j), then B = sum(i, j) - dp(i, j)

So dp'(i, j) = dp(i, j) - ( sum(i, j) - dp(i, j) ) = 2*dp(i, j) - sum(i, j), so
2*dp(i, j) = dp'(i, j) + sum(i, j) (this will be used below)

dp'(i, j) = dp(i, j) - ( sum(i, j) - dp(i, j) ) = 2dp(i, j) - sum(i, j)
= 2 * max( sum(i, j) - dp(i, j-1), sum(i, j) - dp(i+1, j) ) - sum(i, j)
= max(sum(i, j) - 2*dp(i, j-1), sum(i, j) - 2*dp(i+1, j) )
= max(sum(i, j) - ( dp'(i, j-1) + sum(i, j-1) ), sum(i, j) - ( dp'(i+1, j) + sum(i+1, j)))
= max(sum(i, j) - sum(i, j-1) - dp'(i, j-1), sum(i, j) - sum(i+1, j) - dp'(i+1, j))
= max(nums[j] - dp'(i, j-1), nums[i] - dp'(i+1, j))

Final formula: dp(i, j) = max(nums[j] - dp(i, j-1), nums[i] - dp(i+1, j))

过程详见代码：

class Solution {
public:
    bool PredictTheWinner(vector<int>& nums) {
        int n = nums.size();
        vector<vector<int>> dp(n, vector<int>(n)); // use to keep the score gap between player1 and player2
        for (int i = 0; i < n; i++) dp[i][i] = nums[i];
        for (int i = 1; i < n; i++) {
            for (int j = 0; j+i < n; j++) {
                dp[j][j+i] = max(nums[j+i]-dp[j][j+i-1], nums[j]-dp[j+1][j+i]);
            }
        }
        return dp[0][n-1] >= 0; // player1 get more score points than player2
    }
    
};