Deep learning for chess

cumei1658

Original link: https://www.pybloggers.com/2017/02/deep-learning-for-chess/

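Summary (auto-generated by CSDN): The author built a chess AI using deep learning and Theano. The model was trained on 100 million games downloaded from the FICS games database, learning an evaluation function f(p) from the principle that players choose the best or a near-best move. The model is a neural network 3 layers deep and 2048 units wide, and it can play against existing chess engines such as Sunfish. Although it is slower, it can hold its own against Sunfish under certain conditions, which suggests that learning an evaluation function directly from raw data is feasible.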

by Erik Bernhardsson | February 2, 2017

About Erik: Dad and CTO (Chief Troll Officer) at a fintech startup in NYC. Ex-Spotify, co-organizing NYC ML meetup, open source sometimes (Luigi, Annoy), blogs random stuff

Deep learning for… chess

I’ve been meaning to learn Theano for a while and I’ve also wanted to build a chess AI at some point. So why not combine the two? That’s what I thought, and I ended up spending way too much time on it.

What’s the theory?

Chess is a game with a finite number of states, meaning if you had infinite computing capacity, you could actually solve chess. Every position in chess is either a win for white, a win for black, or a forced draw for both players. We can denote this by the function f(position). If we had an infinitely fast machine we could compute this as follows:

1. Assign all the final positions the value −1, 0, or 1 depending on who wins.
2. Use the recursive rule

$$f(p) = \max_{p \rightarrow p'} -f(p')$$

where $p \rightarrow p'$ ranges over all the legal moves from position $p$. The minus sign is there because the players alternate turns: if it is white’s turn in position $p$, then it is black’s turn in position $p'$ (and vice versa), so a good position for one player is an equally bad one for the other. This is the same thing as minimax.
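To make the recursive rule concrete, here is a minimal sketch on a game small enough to solve completely. The game is an assumption for illustration and is not from the original post: a pile of stones, each player removes 1, 2, or 3 stones per turn, and whoever takes the last stone wins. It has no draws, so only the values −1 and 1 show up, but the recursion is the same one as in the formula above.

```python
from functools import lru_cache

# Hypothetical toy game (not from the original post): players alternate
# removing 1, 2, or 3 stones; whoever takes the last stone wins.
MOVES = (1, 2, 3)


@lru_cache(maxsize=None)
def f(stones):
    """Value of a position for the player to move: +1 is a win, -1 a loss."""
    if stones == 0:
        # Final position: the previous player took the last stone,
        # so the player to move has lost.
        return -1
    # Recursive rule: f(p) = max over legal moves p -> p' of -f(p').
    # The sign flips because p' is evaluated from the opponent's side.
    return max(-f(stones - k) for k in MOVES if k <= stones)


# Values for piles of 0 through 8 stones.
values = [f(n) for n in range(9)]
print(values)
```

Every pile size that is a multiple of 4 comes out as a loss for the player to move, and everything else as a win. The same function would solve chess in principle; the only thing that changes is the number of positions it would have to visit.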

There are approximately $10^{43}$ positions, so there’s no way we can compute this. We need to resort to approximations to f(p).
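For a rough sense of scale, here is a back-of-the-envelope calculation. The evaluation rate of one billion positions per second is an assumed figure chosen for illustration, not a number from the original post.

```python
# Position count quoted in the text.
POSITIONS = 10**43

# Assumption for illustration only: a machine that evaluates
# one billion positions per second.
POSITIONS_PER_SECOND = 10**9

SECONDS_PER_YEAR = 365 * 24 * 3600

# Time needed just to visit every position once.
years = POSITIONS / (POSITIONS_PER_SECOND * SECONDS_PER_YEAR)
print(f"{years:.1e} years")
```

Under that assumption, visiting each position just once would take on the order of $10^{26}$ years, which is why the exact recursion is out of reach.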
