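Some notes on the poker-related parts of "UCT-RAVE Algorithm Applied to Multi-player Games with Imperfect Information", which mainly discusses combining UCT with RAVE. Several of the methods have already appeared in earlier papers, so there is little that is new. One point stands out, though: the paper uses repeated Monte-Carlo sampling to convert the imperfect-information situation into a perfect-information one. The original text reads: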
The combination of UCT-RAVE with Monte-Carlo sampling method embodies generation of perfect information situation in initialization course of searching. When UCTRAVE makes one search gap, first of all apply Monte-Carlo sampling to transfer imperfect information into perfect information situation then UCT-RAVE algorithm makes search of path and expansion of nodes as per the said situation. Next search shall base on another perfect information situation generated from Monte-Carlo sampling and nodes generated from all searches are kept in one search tree, the winning rate of every node in tree shall represent performance in average of all possible situations.
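There is not much substance in the details, and nothing is said about how this works in practice. The idea is iterative: each search swaps in a new ROOT node (a freshly sampled perfect-information situation), and because all searches share one tree, the resulting value estimate is an average over many sampled instances of the imperfect-information position.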
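Since the paper gives no implementation, here is a minimal Python sketch of the sampling idea only, under loud assumptions: the one-decision card game, the 10-card deck, the payoff values, and every name (`sample_hidden_card`, `payoff`, `ucb1`, `search`) are hypothetical and not taken from the paper. The tree is flattened to the root's actions and the RAVE term is left out, so this shows plain UCB1 selection over shared statistics, not the full UCT-RAVE algorithm.

```python
import math
import random

DECK = list(range(1, 11))   # hypothetical 10-card deck, higher card wins
ACTIONS = ("bet", "fold")   # hypothetical root actions


def sample_hidden_card(my_card, rng):
    """Monte-Carlo sampling step: draw one possible opponent card."""
    return rng.choice([c for c in DECK if c != my_card])


def payoff(action, my_card, opp_card):
    """Outcome in [0, 1] once the hidden card is fixed (perfect information)."""
    if action == "fold":
        return 0.4  # assumed fixed value for giving up the hand
    return 1.0 if my_card > opp_card else 0.0


def ucb1(stats, action, total_visits, c=1.4):
    """UCB1 score used by UCT to pick an action; unvisited actions go first."""
    visits, wins = stats[action]
    if visits == 0:
        return math.inf
    return wins / visits + c * math.sqrt(math.log(total_visits) / visits)


def search(my_card, iterations=2000, seed=0):
    rng = random.Random(seed)
    # One set of statistics shared by every sampled situation, so each
    # win rate is an average over all hidden cards drawn so far.
    stats = {a: [0, 0.0] for a in ACTIONS}
    for i in range(1, iterations + 1):
        opp_card = sample_hidden_card(my_card, rng)  # new sample per search
        action = max(ACTIONS, key=lambda a: ucb1(stats, a, i))
        stats[action][0] += 1
        stats[action][1] += payoff(action, my_card, opp_card)
    best = max(ACTIONS, key=lambda a: stats[a][0])  # most-visited action
    return best, stats
```

Calling `search(9)` returns the most-visited action together with the shared statistics. With a strong card the averaged win rate of `bet` should come out above the assumed fold value, and with a weak card below it, which is the "performance in average of all possible situations" the quoted passage describes.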