How to Build a Smart Gomoku AI with Monte Carlo Tree Search (MCTS) [Source Code Included]_01

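In recent years, with the arrival of Google's AlphaGo and AlphaGo Zero, Monte Carlo Tree Search (MCTS), a search algorithm that needs no domain-specific prior knowledge, has attracted growing attention. Given only the rules for simulating the game and the terminal states, and no other knowledge, it can arrive at very good strategies. Because the search is blind, however, its running time and memory requirements are among the main measures of its performance. As computing power has grown, MCTS can now perform well on certain problems with a relatively small state space.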
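To make the four MCTS phases concrete, here is a minimal sketch on a toy game rather than Gomoku, so that it stays short. It is not the article's MCTS_01.java. The game (Nim: take 1 to 3 stones, whoever takes the last stone wins), the class name NimMctsSketch, and the UCT form Q/n + Cp * sqrt(2 ln N / n) are all assumptions chosen for illustration. Note that the search uses only the move rules and the terminal test, with no game-specific heuristics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy MCTS on Nim (hypothetical example, not the article's MCTS_01.java).
// Rules: players alternately take 1..3 stones; whoever takes the last stone wins.
class NimMctsSketch {
    static final double CP = 2.0;               // exploration constant (assumed UCT form below)
    static final Random RNG = new Random(42);   // fixed seed so runs are repeatable

    static class Node {
        final int stones;                       // stones left after the move into this node
        final int move;                         // stones taken by that move (0 for the root)
        final Node parent;
        final List<Node> children = new ArrayList<>();
        int visits;
        double reward;                          // summed from the view of the player who moved into this node

        Node(int stones, int move, Node parent) {
            this.stones = stones;
            this.move = move;
            this.parent = parent;
        }

        boolean fullyExpanded() {
            return children.size() == Math.min(3, stones);
        }
    }

    // UCT score: mean reward plus an exploration bonus.
    static double uct(Node c) {
        return c.reward / c.visits + CP * Math.sqrt(2.0 * Math.log(c.parent.visits) / c.visits);
    }

    // Requires stones >= 1 and iterations >= 1.
    static int bestMove(int stones, int iterations) {
        Node root = new Node(stones, 0, null);
        for (int i = 0; i < iterations; i++) {
            Node node = root;
            // 1. Selection: follow the best UCT child while the node is fully expanded.
            while (node.stones > 0 && node.fullyExpanded()) {
                Node best = node.children.get(0);
                for (Node c : node.children) {
                    if (uct(c) > uct(best)) best = c;
                }
                node = best;
            }
            // 2. Expansion: add the next untried move (moves are tried in the order 1, 2, 3).
            if (node.stones > 0) {
                int move = node.children.size() + 1;
                Node child = new Node(node.stones - move, move, node);
                node.children.add(child);
                node = child;
            }
            // 3. Simulation: random playout using only the rules and the terminal test.
            int left = node.stones;
            boolean moverTookLast = true;       // the player who moved into `node` played last
            while (left > 0) {
                left -= 1 + RNG.nextInt(Math.min(3, left));
                moverTookLast = !moverTookLast;
            }
            double reward = moverTookLast ? 1.0 : -1.0;   // +1 for a win, -1 for a loss
            // 4. Backpropagation: flip the sign at each level because the players alternate.
            for (Node n = node; n != null; n = n.parent) {
                n.visits++;
                n.reward += reward;
                reward = -reward;
            }
        }
        // Final choice: the most visited child of the root.
        Node best = root.children.get(0);
        for (Node c : root.children) {
            if (c.visits > best.visits) best = c;
        }
        return best.move;
    }

    public static void main(String[] args) {
        System.out.println("Move chosen from 5 stones: " + bestMove(5, 20000));
    }
}
```

With 5 stones the winning move is to take 1, leaving a multiple of 4, and the search should settle on it given enough iterations. For Gomoku the same loop would apply, with the node state replaced by a board and the playout replaced by random legal moves until a win or a full board.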

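Gomoku is a common example for beginners getting familiar with MCTS.

In the figures, red marks the candidate positions for the next move according to the simulation results. The more opaque a red stone is, the higher the estimated probability of winning at that position.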
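One way to turn simulation scores into the opacity described above is min-max normalization over all candidate positions. The sketch below is an assumption about a reasonable mapping, not the article's drawing code; alphaFor and OpacitySketch are hypothetical names. Normalizing against both the minimum and the maximum keeps the result valid even when every score is negative.

```java
// Hypothetical helper: map a candidate's score to an alpha value in [0, 255].
class OpacitySketch {
    // min and max are the lowest and highest scores among all candidate positions.
    static int alphaFor(double score, double min, double max) {
        if (max <= min) {
            return 255;                         // all candidates score the same: draw them fully opaque
        }
        double t = (score - min) / (max - min); // 0 for the worst candidate, 1 for the best
        t = Math.max(0.0, Math.min(1.0, t));    // clamp in case score lies outside [min, max]
        return (int) Math.round(255 * t);
    }

    public static void main(String[] args) {
        System.out.println(alphaFor(0.6, -1.0, 1.0));
    }
}
```

The returned value could then be passed as the fourth argument of `new java.awt.Color(255, 0, 0, alpha)` to draw a semi-transparent red stone.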


Parameters:

1. Board size: 7*7

2. Simulations from the root node: one million

3. Time per run: about 40 seconds

4. Reward: +1 for a win, -1 for a loss

5. Cp = 2
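To see what Cp = 2 means next to rewards of +1 and -1, the following sketch computes a UCT score for two children of the same node. It assumes the common form Q/n + Cp * sqrt(2 ln N / n); the article does not state which variant it uses, and UctScoreSketch and its parameters are hypothetical names.

```java
// Hypothetical helper showing how the exploration constant Cp weighs against the mean reward.
class UctScoreSketch {
    // totalReward: sum of +1/-1 results seen through this child
    // visits: times this child was visited; parentVisits: times its parent was visited
    static double uct(double totalReward, int visits, int parentVisits, double cp) {
        if (visits == 0) {
            return Double.POSITIVE_INFINITY;    // unvisited children are tried first
        }
        double mean = totalReward / visits;     // lies in [-1, 1] with the rewards above
        double bonus = cp * Math.sqrt(2.0 * Math.log(parentVisits) / visits);
        return mean + bonus;
    }

    public static void main(String[] args) {
        double cp = 2.0;
        // Child A: 900 visits, mean reward +0.5. Child B: 20 visits, mean reward -0.2.
        System.out.printf("A: %.3f%n", uct(450, 900, 1000, cp));
        System.out.printf("B: %.3f%n", uct(-4, 20, 1000, cp));
    }
}
```

Under these assumptions child B scores higher than child A (roughly 1.46 versus 0.75) despite its worse mean, because its exploration bonus is larger than the whole gap between the means. A large Cp therefore spreads simulations widely, which is one possible reason a high simulation count is needed.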

[No optimization has been done yet, so there is still plenty of room to improve performance; see later versions.]


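The distribution is essentially symmetric.


In the figure below, White's move clearly blocks Black while also forming a line with the existing white stones.

Black plays between the two white stones while also connecting with its existing black stones.

White plays at one of the two ends of Black's three in a row, choosing to kill one end of the three directly.

There is still a bug here: White did not block the black stones on the left that were about to become four in a row.

Black also did not extend its own stones that were about to become four in a row. I suspect the win-detection function is faulty.

Black found that its existing position could not be extended, so it expanded toward the surrounding area.

White blocks Black from winning.

A bug appeared: all scores are negative.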
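Since the notes above suspect the win-detection function, here is a minimal sketch of one way to check for five in a row around a given stone. It is not the article's implementation: WinCheckSketch, wins and run are hypothetical names, and only the constants BLACK, WHITE, NOCHESS and WINLENGTH mirror those declared in DrawChessBoard.java.

```java
// Hypothetical win check: counts same-colored stones through (row, col) in four directions.
class WinCheckSketch {
    static final int BLACK = 1;
    static final int NOCHESS = 0;
    static final int WHITE = -1;
    static final int WINLENGTH = 5;

    // Horizontal, vertical, diagonal, anti-diagonal.
    static final int[][] DIRS = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};

    // True if the stone at (row, col) belongs to a line of at least WINLENGTH stones.
    static boolean wins(int[][] board, int row, int col) {
        int color = board[row][col];
        if (color == NOCHESS) {
            return false;
        }
        for (int[] d : DIRS) {
            // Count the stone itself plus the runs on both sides of it.
            int count = 1
                    + run(board, row, col, d[0], d[1], color)
                    + run(board, row, col, -d[0], -d[1], color);
            if (count >= WINLENGTH) {
                return true;
            }
        }
        return false;
    }

    // Number of consecutive stones of `color` starting one step away from (r, c).
    static int run(int[][] b, int r, int c, int dr, int dc, int color) {
        int n = 0;
        r += dr;
        c += dc;
        while (r >= 0 && r < b.length && c >= 0 && c < b[r].length && b[r][c] == color) {
            n++;
            r += dr;
            c += dc;
        }
        return n;
    }
}
```

Checking only the lines through the last move keeps the test cheap, which matters when it runs after every move of every random playout. Whether this matches the bug seen in the figures is not known; it is offered only as a reference point for comparison.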



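The code still has some bugs. It is released below for now and will be fixed in later versions.

The code is as follows:

1. MCTSwuziqi.java: the main program, which creates the board

2. DrawChessBoard.java: draws the board, listens for events, and calls MCTS

3. MCTS_01.java: runs the MCTS simulations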

MCTSwuziqi.java

package bwjiang;


public class MCTSwuziqi {

	public static void main(String[] args) {
		// TODO Auto-generated method stub
		System.out.println("main init");
		DrawChessBoard chessBoard = new DrawChessBoard();
		//chessBoard.boardFrame.setVisible(true);
		
		//System.out.println(chessBoard.boardFrame.getChessmans()[0][0].getColor());
	}

}

DrawChessBoard.java

package bwjiang;
import javax.imageio.ImageIO;
import javax.swing.*;
import javax.swing.border.*;



import java.awt.*;
import java.awt.event.*;
import java.awt.geom.*;
import java.io.*;


public class DrawChessBoard {
	public static final int BLACKWIN = 1;
	public static final int WHITEWIN = 2;
	public static final int NOTWIN = 0;
	public static final int ALLFILLED = -1;
	public static final int BLACK = 1;
	public static final int NOCHESS = 0;
	public static final int WHITE = -1;
	public static final int WINLENGTH = 5;
	
	public BoardFrame boardFrame;
	public int rows = 9;
	JTextField rowsText;
	JLabel nowChessColor;
	
	
	public DrawChessBoard() {
		// Board window
		this.boardFrame = new BoardFrame();

		boardFrame.setVisible(true);
	}
	// Board contents
	class BoardPanel extends JPanel implements MouseListener{
		public Image boardImage;// board border image

		public int lastChessColor = WHITE;
		// Records every stone placed; the first index is the row, the second is the column
		public int[][] chessmans = new int[rows][rows];
		public int[][] predictChessmans = new int[rows][rows];
		public int maxReward = 0;
		public int nextChessColor = BLACK;// controls whose turn it is
		

		int FrameWidth;// window size
		int FrameHeight;
		int chessBoardX;// top-left corner of the board border
		int chessBoardY;
		int realChessBoardX;// top-left corner of the playable grid
		int realChessBoardY;
		int deltaX;// grid spacing
		int deltaY;
		
		public int[][] getChessmans(){
			return this.chessmans;
		}
		
		
		// Board border
		public BoardPanel() {
			try {
				boardImage = ImageIO.read(new File("res/boar
