CareerCup Pots of gold game: who ends up with more money

Latest recommended article published on 2024-10-01 18:52:56

taoqick

Categories: algorithms, careercup, c++. Tags: dynamic programming

Problem description:

Pots of gold game: Two players A & B. There are pots of gold arranged in a line, each containing some gold coins (the players can see how many coins are there in each gold pot - perfect information). They get alternating turns in which the player can pick a pot from one of the ends of the line. The winner is the player which has a higher number of coins at the end. The objective is to "maximize" the number of coins collected by A, assuming B also plays optimally. A starts the game.
The idea is to find an optimal strategy that makes A win knowing that B is playing optimally as well. How would you do that?

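In short, a number of pots of gold coins are lined up in a row and two players take turns collecting them. On each turn a player may only take a pot from one end of the line, so there are exactly two choices. A moves first, and the question is what strategy A should follow to collect as much as possible. B is just as smart and also makes the "optimal" decision on every turn.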

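Solution:

Quite neatly, dynamic programming applies here.

On each turn A has only two choices: take the pot at the head of the line or the pot at the tail. Suppose A takes the head. Now it is B's turn, and B also has two choices: the current "head" or the current "tail". Note that an optimal solution to the problem contains optimal solutions to its subproblems. This property is fairly evident, so I won't elaborate on it; think of it as a "wise" decision being made up of a series of smaller "wise" decisions.

So we can solve it with dynamic programming: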

   1:  function max_coin( int *coin, int start, int end ):
   2:      if start > end:
   3:          return 0
   4:
   5:      int a = coin[start] + max( max_coin( coin, start+2, end ), max_coin( coin, start+1, end-1 ) )
   6:      int b = coin[end] + max( max_coin( coin, start+1, end-1 ), max_coin( coin, start, end-2 ) )
   7:
   8:      return max(a,b)

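Take a look at this code and study it carefully: is anything wrong with it?

Let's analyze it by focusing on line 5 and restating it in plain language. A takes the pot at the head of the line; there are then two cases depending on what B takes, and the code keeps the better of the two cases for A. The better one, the best...

Having got this far, do you sense the problem? If not, let's look again.

A picks the best case! What does that mean? It means B's choice has no effect on A: whatever B would actually choose, the code assumes the outcome that suits A. In effect, A is choosing B's move as well.

That is clearly unreasonable!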
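To make the flaw concrete, here is a minimal Python sketch of the max-based recurrence above, run on a small hypothetical input `[1, 100, 1]` (the function name `max_coin_flawed` and the input are my own choices for illustration). In real play A must take a 1, B then takes the 100, and A is left with the other 1, so A ends with 2 coins.

```python
def max_coin_flawed(coin, start, end):
    """Flawed recurrence: uses max for B's reply, so A effectively picks B's move."""
    if start > end:
        return 0
    # A takes the head, then the best (for A) of B's two possible replies is assumed.
    a = coin[start] + max(max_coin_flawed(coin, start + 2, end),
                          max_coin_flawed(coin, start + 1, end - 1))
    # A takes the tail, with the same over-optimistic assumption about B.
    b = coin[end] + max(max_coin_flawed(coin, start + 1, end - 1),
                        max_coin_flawed(coin, start, end - 2))
    return max(a, b)


coins = [1, 100, 1]  # hypothetical example input
flawed_result = max_coin_flawed(coins, 0, len(coins) - 1)
print(flawed_result)  # prints 101
```

The result 101 would require B to leave the 100-coin pot for A, which an optimally playing B would never do.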

So the correct code should be

   1:  function max_coin( int *coin, int start, int end ):
   2:      if start > end:
   3:          return 0
   4:
   5:      int a = coin[start] + min( max_coin( coin, start+2, end ), max_coin( coin, start+1, end-1 ) )
   6:      int b = coin[end] + min( max_coin( coin, start+1, end-1 ), max_coin( coin, start, end-2 ) )
   7:
   8:      return max(a,b)
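The pseudocode above recomputes the same `(start, end)` ranges many times. A minimal runnable sketch in Python, assuming memoization via `functools.lru_cache` (the wrapper name `max_coin` taking only the list, and the inner helper `best`, are my own choices), could look like this:

```python
from functools import lru_cache


def max_coin(coin):
    """Coins A can guarantee from `coin` when B always leaves A the worse subgame."""
    coin = tuple(coin)  # immutable copy so the cached helper stays consistent

    @lru_cache(maxsize=None)
    def best(start, end):
        if start > end:
            return 0
        # A takes the head; B then picks whichever reply leaves A less.
        a = coin[start] + min(best(start + 2, end), best(start + 1, end - 1))
        # A takes the tail; same reasoning for B's reply.
        b = coin[end] + min(best(start + 1, end - 1), best(start, end - 2))
        return max(a, b)

    return best(0, len(coin) - 1)


print(max_coin([8, 15, 3, 7]))  # prints 22
print(max_coin([1, 100, 1]))    # prints 2
```

With the cache there are only O(n^2) distinct `(start, end)` pairs to evaluate. The recursion depth still grows with the list length, so very long inputs would need an iterative table instead.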

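Looking at this code, you may feel something is off, because merely switching to min does not obviously show that B now influences A's choice. It looks plausible, but is it correct?

To be honest, I can't prove it rigorously.

For now, think of it this way: B adopts a strategy of thwarting A at every turn, always choosing so that the total A can collect from then on is as small as possible. Since the total number of coins is fixed, making A's total as small as possible naturally makes B's as large as possible.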
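Short of a rigorous proof, one way to build confidence is to compare the min-based recurrence against a second model in which each player simply maximizes their own coins, with no notion of "thwarting" the opponent. The sketch below does this on random inputs; all names (`a_total_min`, `a_total_selfish`, `models_agree`) and the test parameters are hypothetical choices for illustration.

```python
import random
from functools import lru_cache


def a_total_min(coin):
    """A's total under the min-based recurrence from the article."""
    coin = tuple(coin)

    @lru_cache(maxsize=None)
    def best(start, end):
        if start > end:
            return 0
        a = coin[start] + min(best(start + 2, end), best(start + 1, end - 1))
        b = coin[end] + min(best(start + 1, end - 1), best(start, end - 2))
        return max(a, b)

    return best(0, len(coin) - 1)


def a_total_selfish(coin):
    """A's total when each player just maximizes their own coins."""
    coin = tuple(coin)

    @lru_cache(maxsize=None)
    def play(start, end):
        # Returns (coins for the player to move, coins for the other player).
        if start > end:
            return (0, 0)
        # After a move the roles swap: the subgame's mover is the opponent.
        opp_left, me_left = play(start + 1, end)
        opp_right, me_right = play(start, end - 1)
        take_left = coin[start] + me_left
        take_right = coin[end] + me_right
        if take_left >= take_right:
            return (take_left, opp_left)
        return (take_right, opp_right)

    return play(0, len(coin) - 1)[0]


def models_agree(trials=200, max_len=8, seed=0):
    """Compare both models on random pot lists; True if they never differ."""
    rng = random.Random(seed)
    for _ in range(trials):
        coin = [rng.randint(0, 20) for _ in range(rng.randint(0, max_len))]
        if a_total_min(coin) != a_total_selfish(coin):
            return False
    return True
```

Agreement on random inputs is evidence rather than proof, but it matches the fixed-total argument: whatever B gains is exactly what A loses, so "B maximizes B" and "B minimizes A" select the same moves in value.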

There is also this piece of code on Quora, which is nice and worth a look:

http://www.quora.com/Dynamic-Programming/How-do-you-solve-the-pots-of-gold-game

 pots = [...]
 
 cache = {}
 def optimal(left, right, player):
     if left > right:
         return 0
     if (left, right, player) in cache:
         return cache[(left, right, player)]
     if player == 'A':
         result = max(optimal(left + 1, right, 'B') + pots[left],
                      optimal(left, right - 1, 'B') + pots[right])
     else:
         result = min(optimal(left + 1, right, 'A'),
                      optimal(left, right - 1, 'A'))
     cache[(left, right, player)] = result
     return result
 
 answer = optimal(0, len(pots)-1, 'A')
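The snippet above returns only A's total, while the original question asks for a strategy. As a sketch under my own assumptions (the wrapper name `first_move`, the boolean `a_to_move` in place of `'A'`/`'B'`, and the concrete pots in the usage line are not from the Quora answer), the same recurrence can also report which end A should take first:

```python
from functools import lru_cache


def first_move(pots):
    """Return (A's guaranteed total, 'left' / 'right' / None) for the given pots."""
    pots = tuple(pots)
    if not pots:
        return 0, None

    @lru_cache(maxsize=None)
    def optimal(left, right, a_to_move):
        # Same recurrence as the snippet above: the value is always A's coins.
        if left > right:
            return 0
        if a_to_move:
            return max(optimal(left + 1, right, False) + pots[left],
                       optimal(left, right - 1, False) + pots[right])
        # B's turn: B gains nothing for A and leaves A the worse subgame.
        return min(optimal(left + 1, right, True),
                   optimal(left, right - 1, True))

    last = len(pots) - 1
    take_left = pots[0] + optimal(1, last, False)
    take_right = pots[last] + optimal(0, last - 1, False)
    if take_left >= take_right:  # ties are broken toward the left end
        return take_left, "left"
    return take_right, "right"


print(first_move([8, 15, 3, 7]))  # prints (22, 'right')
```

Calling `first_move` again on the remaining pots after each of B's moves yields A's whole line of play.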