Tic-Tac-Toe is played on a 3*3 grid: the two players take turns placing pieces, and the first to get three of their own pieces in a line wins.
The reference code is below; I only removed a few parts that are not used:
#######################################################################
# Copyright (C)                                                       #
# 2016 - 2018 Shangtong Zhang(zhangshangtong.cpp@gmail.com)           #
# 2016 Jan Hakenberg(jan.hakenberg@gmail.com)                         #
# 2016 Tian Jun(tianjun.cpp@gmail.com)                                #
# 2016 Kenta Shimada(hyperkentakun@gmail.com)                         #
# Permission given to modify the code as long as you keep this        #
# declaration at the top                                              #
#######################################################################
# https://www.cnblogs.com/pinard/p/9385570.html
# Reinforcement Learning (1): Model Basics
import numpy as np
import pickle

BOARD_ROWS = 3
BOARD_COLS = 3
BOARD_SIZE = BOARD_ROWS * BOARD_COLS
The State class
In brief: each state is identified by a custom hash value. The key entry points are get_all_states (run once to enumerate every reachable state) and next_state (place one piece and return the resulting state).
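Before the class itself, here is a tiny worked example (my own, not part of the reference code) of that hash: the nine cells are read as the digits of a base-3 number, with -1 remapped to 2 first.

import numpy as np

board = np.array([[1, -1, 0],
                  [0,  1, 0],
                  [0,  0, -1]])
h = 0
for v in board.reshape(9):
    if v == -1:
        v = 2              # remap -1 to 2 so every cell is a valid base-3 digit
    h = h * 3 + v
print(int(h))              # one unique integer per board configuration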
class State:
    def __init__(self):
        # the board is represented by an n * n array,
        # 1 represents a chessman of the player who moves first,
        # -1 represents a chessman of another player
        # 0 represents an empty position
        self.data = np.zeros((BOARD_ROWS, BOARD_COLS))
        self.winner = None
        self.hash_val = None
        self.end = None

    # compute the hash value for one state, it's unique
    def hash(self):
        if self.hash_val is None:
            self.hash_val = 0
            for i in self.data.reshape(BOARD_ROWS * BOARD_COLS):
                # -1 is remapped to 2 so every cell is a base-3 digit (0, 1 or 2)
                if i == -1:
                    i = 2
                self.hash_val = self.hash_val * 3 + i
        return int(self.hash_val)

    # check whether a player has won the game, or it's a tie
    def is_end(self):
        if self.end is not None:
            return self.end
        results = []
        # check rows
        for i in range(0, BOARD_ROWS):
            results.append(np.sum(self.data[i, :]))
        # check columns
        for i in range(0, BOARD_COLS):
            results.append(np.sum(self.data[:, i]))
        # check diagonals
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, i]
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, BOARD_ROWS - 1 - i]
        for result in results:
            if result == 3:
                self.winner = 1
                self.end = True
                return self.end
            if result == -3:
                self.winner = -1
                self.end = True
                return self.end
        # whether it's a tie
        sum_abs = np.sum(np.abs(self.data))
        if sum_abs == BOARD_ROWS * BOARD_COLS:
            self.winner = 0
            self.end = True
            return self.end
        # game is still going on
        self.end = False
        return self.end

    # @symbol: 1 or -1
    # put chessman symbol in position (i, j)
    def next_state(self, i, j, symbol):
        new_state = State()
        new_state.data = np.copy(self.data)
        new_state.data[i, j] = symbol
        return new_state

    # print the board
    def print(self):
        for i in range(0, BOARD_ROWS):
            print('-------------')
            out = '|'
            for j in range(0, BOARD_COLS):
                if self.data[i, j] == 1:
                    token = '*'
                if self.data[i, j] == 0:
                    token = '0'
                if self.data[i, j] == -1:
                    token = 'x'
                out += token + '|'
            print(out)
        print('-------------')


def get_all_states_impl(current_state, current_symbol, all_states):
    '''all_states: dict keyed by the state hash, value is (state, is_end)'''
    for i in range(0, BOARD_ROWS):
        for j in range(0, BOARD_COLS):
            if current_state.data[i][j] == 0:
                newState = current_state.next_state(i, j, current_symbol)
                newHash = newState.hash()
                if newHash not in all_states.keys():
                    isEnd = newState.is_end()
                    all_states[newHash] = (newState, isEnd)
                    # if the game is not over, the other player moves next
                    if not isEnd:
                        get_all_states_impl(newState, -current_symbol, all_states)


def get_all_states():
    current_symbol = 1
    current_state = State()
    all_states = dict()
    all_states[current_state.hash()] = (current_state, current_state.is_end())
    get_all_states_impl(current_state, current_symbol, all_states)
    return all_states


# all possible board configurations
all_states = get_all_states()
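As a quick sanity check (my addition, not in the original post), you can print how many positions the recursive enumeration found; it should be a few thousand (5478 is the commonly cited count of legal Tic-Tac-Toe positions).

print(len(all_states))     # number of distinct reachable board configurations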
The Judger: it supervises the two players taking turns. Its key methods are alternate (switch to the other player) and play (run one game; the important call inside play is each player's act method, covered below).
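The alternation itself is just an infinite generator. A standalone sketch of the pattern (hypothetical names, not from the post):

def alternate(p1, p2):
    # yield the two players forever, taking turns
    while True:
        yield p1
        yield p2

turns = alternate('first', 'second')
print([next(turns) for _ in range(4)])     # ['first', 'second', 'first', 'second']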
class Judger:
    # @player1: the player who will move first, its chessman will be 1
    # @player2: another player with a chessman -1
    # @feedback: if True, both players will receive rewards when game is end
    def __init__(self, player1, player2):
        self.p1 = player1
        self.p2 = player2
        self.p1_symbol = 1
        self.p2_symbol = -1
        self.p1.set_symbol(self.p1_symbol)
        self.p2.set_symbol(self.p2_symbol)
        self.current_state = State()

    def reset(self):
        self.p1.reset()
        self.p2.reset()

    def alternate(self):
        while True:
            yield self.p1
            yield self.p2

    # @print_state: if True, print each board during the game
    def play(self, print_state=False):
        alternator = self.alternate()
        self.reset()
        current_state = self.current_state
        self.p1.set_state(current_state)
        self.p2.set_state(current_state)
        while True:
            player = next(alternator)
            if print_state:
                current_state.print()
            [i, j, symbol] = player.act()
            next_state_hash = current_state.next_state(i, j, symbol).hash()
            current_state, is_end = all_states[next_state_hash]
            self.p1.set_state(current_state)
            self.p2.set_state(current_state)
            if is_end:
                if print_state:
                    current_state.print()
                return current_state.winner
The AI player: estimations holds a value for every state, and the player uses these values to pick its next state; the greedy flags mark which moves were chosen greedily, so that random exploratory moves are excluded from the value updates.
Key methods: set_symbol (initialize each state's value for this player), backup (update the state values: if the next state has a higher value, the current state's value is raised, so long-term outcomes feed back into earlier positions), and act (pick the coordinates of the next move).
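backup is a plain TD(0) update on state values, V(s_t) <- V(s_t) + alpha * (V(s_{t+1}) - V(s_t)), applied backwards over one game and skipped for exploratory moves. A minimal sketch with made-up numbers (hypothetical values, just to show the direction of the update):

step_size = 0.1
V = {0: 0.5, 1: 0.5, 2: 1.0}    # hypothetical state values; state 2 is a won terminal state
trajectory = [0, 1, 2]          # hypothetical hashes of the states visited in one game
greedy = [True, True, True]     # a random exploratory move would be False and contribute nothing
for t in reversed(range(len(trajectory) - 1)):
    s, s_next = trajectory[t], trajectory[t + 1]
    V[s] += step_size * greedy[t] * (V[s_next] - V[s])
print(V)                        # earlier states are pulled toward the value of the winning final state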
class Player:
    # @step_size: the step size to update estimations
    # @epsilon: the probability to explore
    def __init__(self, step_size=0.1, epsilon=0.1):
        self.estimations = dict()
        self.step_size = step_size
        self.epsilon = epsilon
        self.states = []
        self.greedy = []

    def reset(self):
        self.states = []
        self.greedy = []

    def set_state(self, state):
        self.states.append(state)
        self.greedy.append(True)

    def set_symbol(self, symbol):
        self.symbol = symbol
        # initialize the state values: a won terminal state is worth 1.0,
        # a lost one 0, a tie 0.5, and every non-terminal state starts at 0.5
        for hash_val in all_states.keys():
            (state, is_end) = all_states[hash_val]
            if is_end:
                if state.winner == self.symbol:
                    self.estimations[hash_val] = 1.0
                elif state.winner == 0:
                    # we need to distinguish between a tie and a lose
                    self.estimations[hash_val] = 0.5
                else:
                    self.estimations[hash_val] = 0
            else:
                self.estimations[hash_val] = 0.5

    # update value estimation
    def backup(self):
        # for debug
        # print('player trajectory')
        # for state in self.states:
        #     state.print()
        self.states = [state.hash() for state in self.states]
        # update the trajectory backwards, from the end of the game
        for i in reversed(range(len(self.states) - 1)):
            state = self.states[i]
            td_error = self.greedy[i] * (self.estimations[self.states[i + 1]] - self.estimations[state])
            self.estimations[state] += self.step_size * td_error

    # choose an action based on the state
    def act(self):
        # the current (latest) state
        state = self.states[-1]
        # hashes of the possible next states
        next_states = []
        # coordinates of the possible next moves
        next_positions = []
        for i in range(BOARD_ROWS):
            for j in range(BOARD_COLS):
                if state.data[i, j] == 0:
                    next_positions.append([i, j])
                    next_states.append(state.next_state(i, j, self.symbol).hash())
        # with a small probability, explore with a random move
        if np.random.rand() < self.epsilon:
            action = next_positions[np.random.randint(len(next_positions))]
            action.append(self.symbol)
            # exploratory moves do not take part in the value updates
            self.greedy[-1] = False
            return action
        # otherwise act greedily on the highest estimated value
        values = []
        for hash_val, pos in zip(next_states, next_positions):
            values.append((self.estimations[hash_val], pos))
        values.sort(key=lambda x: x[0], reverse=True)
        action = values[0][1]
        action.append(self.symbol)
        return action

    def save_policy(self):
        with open('policy_%s.bin' % ('first' if self.symbol == 1 else 'second'), 'wb') as f:
            pickle.dump(self.estimations, f)

    def load_policy(self):
        with open('policy_%s.bin' % ('first' if self.symbol == 1 else 'second'), 'rb') as f:
            self.estimations = pickle.load(f)
The human player: its act method asks the person at the keyboard for a move.
# human interface
# input a key to put a chessman
# | q | w | e |
# | a | s | d |
# | z | x | c |
class HumanPlayer:
    def __init__(self, **kwargs):
        self.symbol = None
        self.keys = ['q', 'w', 'e', 'a', 's', 'd', 'z', 'x', 'c']
        self.state = None

    def reset(self):
        pass

    def set_state(self, state):
        self.state = state

    def set_symbol(self, symbol):
        self.symbol = symbol

    def backup(self, _):
        pass

    def act(self):
        self.state.print()
        key = input("Input your position: ")
        data = self.keys.index(key)
        i = data // BOARD_COLS
        j = data % BOARD_COLS
        return (i, j, self.symbol)
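For reference (my own example, not from the post), this is how the key layout maps to board coordinates: the keys q/w/e, a/s/d, z/x/c mirror the 3x3 grid, and the index in the list gives the row and column.

keys = ['q', 'w', 'e', 'a', 's', 'd', 'z', 'x', 'c']
key = 'd'
idx = keys.index(key)       # 5
i, j = idx // 3, idx % 3    # row 1, column 2 (0-based)
print(i, j)                 # the right-most cell of the middle row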
Training:
def train(epochs):
    player1 = Player(epsilon=0.01)
    player2 = Player(epsilon=0.01)
    judger = Judger(player1, player2)
    player1_win = 0.0
    player2_win = 0.0
    for i in range(1, epochs + 1):
        winner = judger.play(print_state=False)
        if winner == 1:
            player1_win += 1
        if winner == -1:
            player2_win += 1
        # print both players' winning rates; towards the end nearly every game is a tie
        if i % 100 == 0:
            print('Epoch %d, player 1 win %.02f, player 2 win %.02f' % (i, player1_win / i, player2_win / i))
        player1.backup()
        player2.backup()
    # save the state values; what training really produces is the value of
    # every state from each player's point of view
    player1.save_policy()
    player2.save_policy()
AI self-play test:
def compete(turns):
    # no exploratory (random) moves allowed
    player1 = Player(epsilon=0)
    player2 = Player(epsilon=0)
    judger = Judger(player1, player2)
    player1.load_policy()
    player2.load_policy()
    player1_win = 0.0
    player2_win = 0.0
    for i in range(0, turns):
        winner = judger.play()
        if winner == 1:
            player1_win += 1
        if winner == -1:
            player2_win += 1
        # judger.reset()
    print('%d turns, player 1 win %.02f, player 2 win %.02f' % (turns, player1_win / turns, player2_win / turns))
Human vs. AI:
def play():
    while True:
        player1 = HumanPlayer()
        player2 = Player(epsilon=0)
        judger = Judger(player1, player2)
        player2.load_policy()
        winner = judger.play()
        if winner == player2.symbol:
            print("You lose!")
        elif winner == player1.symbol:
            print("You win!")
        else:
            print("It is a tie!")
Let's run it!
if __name__ == '__main__':
    train(int(1e4))
    compete(int(1e3))
    play()
After training, the final tally is Epoch 10000, player 1 win 0.08, player 2 win 0.03. The remaining wins only happen because a little exploration (epsilon = 1%) is still switched on during training.
In the AI self-play test, exploration is removed, and the result is 1000 turns, player 1 win 0.00, player 2 win 0.00: every single game is a tie.
After that comes the human-vs-AI mode, and you simply cannot beat this AI; a draw is the best you can hope for.
This post walked through implementing Tic-Tac-Toe with reinforcement learning. A State class represents each board position, and the AI players learn a value for every state with a temporal-difference update, gradually converging towards an optimal strategy. The code covers enumerating and hashing the game states, checking the end-of-game rules, and choosing actions. After training, with exploration switched off, the AI always reaches at least a draw.