Python alpha-beta pruning: a personal understanding of minimax search (alpha-beta pruning)

The minimax search procedure:

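The main idea is fairly simple, but explaining it clearly is not so easy. The core idea is to look ahead by brute-force searching future states and to analyse the positions that can arise. In theory, if the search can reach the terminal states, every move from that point on is already determined. (I am still a little unsure about this point.)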

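Estimating how good a position is first requires a reasonable evaluation function. The reason is that a real search can almost never cover every possibility; if it could, 0 and 1 would be enough to say whether the current position is won or lost. So positions need a reasonably sound evaluation. Each side wants the value of the final position to be as good as possible for itself (in theory there is exactly one final position value), so a single score can serve both sides: the larger (more positive) the score, the better one side's winning chances, and the smaller (more negative) the score, the better the other side's.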
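To make the single signed score concrete, here is a minimal sketch of an evaluation function for tic-tac-toe. The game, the board encoding (a 9-character string) and the "open lines" heuristic are all assumptions chosen for illustration; the post itself does not name a game or a heuristic.

```python
# All eight winning lines of a 3x3 board, as index triples into a 9-character string.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]

def evaluate(board):
    """Score a position from X's point of view.

    board is a string of 9 characters: 'X', 'O' or '.' for an empty cell.
    A line is "open" for a side if the other side has no piece on it.
    Positive scores favour X, negative scores favour O.
    """
    x_open = sum(1 for line in LINES if all(board[i] != "O" for i in line))
    o_open = sum(1 for line in LINES if all(board[i] != "X" for i in line))
    return x_open - o_open

print(evaluate("........."))   # empty board: symmetric
print(evaluate("....X...."))   # X in the centre blocks four of O's lines
```

One score is enough for both players because the game is zero-sum: whatever X gains on this scale, O loses.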

This is what gives rise to the minimax search algorithm.

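Start from the simplest case. Our side wants to maximise the evaluation of the final position, not of the current one, so we need to predict how our next position will be evaluated. The move before that position, however, belongs to the opponent, who will naturally pick whatever is best for them, that is, the move leading to the final evaluation most favourable to the opponent. Our move is therefore decided by how the opponent would reply in each of the possible next states. The final evaluation is determined by both sides together.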

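(It is easy to get confused by the Min and Max states. Don't worry about that; take the time to understand what they mean, and it becomes clear once you write the code. In fact Min() returns the minimum value over the last foreseeable states reachable from the current position, and Max() returns the maximum.)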
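The reasoning above can be checked on a tiny hand-built game tree. The nested-list representation and the leaf values are made up for this sketch: an inner list is a position whose children are the positions after each legal move, and an integer is the evaluation of a leaf.

```python
def minimax(node, maximizing):
    """Return the minimax value of a nested-list game tree."""
    if isinstance(node, int):          # leaf: already evaluated
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# We move at the root, the opponent replies at the second layer,
# and the third layer holds the evaluations of the resulting positions.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

replies = [minimax(child, False) for child in tree]
print(replies)               # what the opponent allows after each of our moves
print(minimax(tree, True))   # the best we can guarantee
```

The third move looks tempting because it contains the leaf 14, but the opponent would answer with the reply worth 2, so the first move (guaranteeing 3) is the one to play.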

The pseudocode is as follows:

int MinMax(int depth) {
    if (SideToMove() == WHITE) {   // White is the maximizing side
        return Max(depth);
    } else {                       // Black is the minimizing side
        return Min(depth);
    }
}

int Max(int depth) {
    int best = -INFINITY;
    if (depth <= 0) {
        return Evaluate();
    }
    GenerateLegalMoves();
    while (MovesLeft()) {
        MakeNextMove();
        int val = Min(depth - 1);
        UnmakeMove();
        if (val > best) {
            best = val;
        }
    }
    return best;
}

int Min(int depth) {
    int best = INFINITY;           // note: differs from Max
    if (depth <= 0) {
        return Evaluate();
    }
    GenerateLegalMoves();
    while (MovesLeft()) {
        MakeNextMove();
        int val = Max(depth - 1);
        UnmakeMove();
        if (val < best) {          // note: differs from Max
            best = val;
        }
    }
    return best;
}
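For readers who want something runnable, here is a minimal Python sketch of the same Max/Min pair. The game is an assumption chosen for brevity (a Nim variant: players alternately take 1 to 3 stones, and whoever takes the last stone wins), and the `Nim` class with its `make`/`unmake` methods is a hypothetical stand-in for `MakeNextMove()`/`UnmakeMove()`. Unlike the pseudocode, it also stops when the game is over.

```python
import math

class Nim:
    """Toy game: take 1-3 stones per turn; taking the last stone wins."""

    def __init__(self, stones):
        self.stones = stones
        self.white_to_move = True
        self.history = []

    def legal_moves(self):
        return [k for k in (1, 2, 3) if k <= self.stones]

    def make(self, k):
        self.stones -= k
        self.white_to_move = not self.white_to_move
        self.history.append(k)

    def unmake(self):
        self.stones += self.history.pop()
        self.white_to_move = not self.white_to_move

    def evaluate(self):
        """Score from White's point of view: +1 win, -1 loss, 0 unknown."""
        if self.stones == 0:
            # The side to move has nothing left to take, so the other side won.
            return -1 if self.white_to_move else 1
        return 0  # search horizon reached before the game ended

def max_value(game, depth):
    if depth <= 0 or game.stones == 0:
        return game.evaluate()
    best = -math.inf
    for k in game.legal_moves():
        game.make(k)
        val = min_value(game, depth - 1)
        game.unmake()
        best = max(best, val)
    return best

def min_value(game, depth):
    if depth <= 0 or game.stones == 0:
        return game.evaluate()
    best = math.inf
    for k in game.legal_moves():
        game.make(k)
        val = max_value(game, depth - 1)
        game.unmake()
        best = min(best, val)
    return best

def minimax(game, depth):
    """White maximizes, Black minimizes, as in MinMax() above."""
    return max_value(game, depth) if game.white_to_move else min_value(game, depth)

print(minimax(Nim(4), 10))  # White to move with 4 stones
print(minimax(Nim(5), 10))  # White to move with 5 stones
print(minimax(Nim(5), 1))   # same position, but searched only one ply deep
```

The last call shows why an evaluation function matters: with a depth of 1 the search cannot see the end of the game and has to fall back on the neutral score.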

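A more compact implementation flips the sign of the score at every level, which cuts the amount of code and makes it easier to maintain. This only works if Evaluate() returns the score from the point of view of the side to move.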

int NegaMax(int depth) {
    int best = -INFINITY;
    if (depth <= 0) {
        return Evaluate();
    }
    GenerateLegalMoves();
    while (MovesLeft()) {
        MakeNextMove();
        int val = -NegaMax(depth - 1);   // note the minus sign here
        UnmakeMove();
        if (val > best) {
            best = val;
        }
    }
    return best;
}
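The sign trick can be tried on a small example. The sketch below assumes a game tree written as nested lists, with leaf scores stored from the maximizing side's point of view; the `color` parameter (+1 or -1) is a helper introduced here to convert a leaf score to the point of view of the side to move, which is what NegaMax expects from Evaluate().

```python
def negamax(node, color):
    """Negamax over a nested-list tree.

    color is +1 when the maximizing side is to move at this node, -1 otherwise.
    The return value is always from the point of view of the side to move.
    """
    if isinstance(node, int):
        return color * node            # convert the stored score to "my" view
    best = float("-inf")
    for child in node:
        # The child's value is from the opponent's view, hence the minus sign.
        best = max(best, -negamax(child, -color))
    return best

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(negamax(tree, 1))     # maximizing side moves first
print(-negamax(tree, -1))   # minimizing side moves first, converted back
```

A single function replaces the Max/Min pair because minimizing a score is the same as maximizing its negation.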

There are many pruning methods; the most classic is alpha-beta pruning.

1) Beta cutoff:

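Put plainly: suppose it is our move, and the evaluations of our next positions are already known (by recursion), i.e. the third layer of the game tree has been computed. The second layer in between holds the opponent's positions, and the opponent will certainly play the move that minimises the final evaluation. So the move we choose now should maximise the evaluation that remains after the opponent's minimising reply. In other words, at the first layer we pick the move leading to the second-layer position whose minimum over the opponent's replies is the largest.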

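In an implementation, the pruning is not done from the first layer's point of view; it happens while the second layer is being searched. The best value already found at the first layer therefore has to be carried down, and from the point of view of the second-layer player, we are the enemy. Seen from above, alpha is the largest final evaluation the current player is already assured of, and beta is the smallest final evaluation that the opponent, at the previous position, is already assured of (that previous move has not actually been played; the bound is simply kept for it). So if a move from the current position reaches a value at least as large as beta (val >= beta in the pseudocode), the search returns immediately: the opponent already has an alternative that is no worse for them and will never let the game reach this position, so the remaining moves here need not be examined.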

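2) Alpha cutoff: the mirror-image case. At a minimizing node, once the value drops to alpha or below, the maximizing player already has a better alternative and the rest of the node can be skipped.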
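Both cutoffs can be watched on a small example. The sketch below assumes a nested-list game tree with invented leaf values, and uses the explicit Max/Min formulation rather than the sign-flipping one; the `visited` list is a helper added here to record which leaves were actually evaluated.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta over a nested-list tree, recording every evaluated leaf."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:      # beta cutoff: the minimizer above avoids this node
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if beta <= alpha:          # alpha cutoff: the maximizer above avoids this node
            break
    return best

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

seen_max = []
print(alphabeta(tree, -math.inf, math.inf, True, seen_max), seen_max)

seen_min = []
print(alphabeta(tree, -math.inf, math.inf, False, seen_min), seen_min)
```

With the maximizer at the root, the second subtree is abandoned after its first leaf (2), because the maximizer already has 3 in hand: an alpha cutoff. With the minimizer at the root, the third subtree is abandoned after the leaf 14, because the minimizer already has 6 in hand: a beta cutoff.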

The pseudocode is as follows:

int AlphaBeta(int depth, int alpha, int beta) {
    if (depth == 0) {
        return Evaluate();
    }
    GenerateLegalMoves();
    while (MovesLeft()) {
        MakeNextMove();
        int val = -AlphaBeta(depth - 1, -beta, -alpha);
        UnmakeMove();
        if (val >= beta) {
            return beta;
        }
        if (val > alpha) {
            alpha = val;
        }
    }
    return alpha;
}
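As a rough check that pruning does not change the result, the sketch below runs plain negamax and the alpha-beta version side by side. The game is an assumption chosen for brevity (a Nim variant: take 1 to 3 stones per turn, and whoever takes the last stone wins), and the one-element `counter` list is a helper added here to count visited nodes. Scores are from the point of view of the side to move, as the sign-flipping formulation requires.

```python
import math

def negamax(stones, depth, counter):
    """Plain negamax, without pruning."""
    counter[0] += 1
    if stones == 0:
        return -1                  # side to move lost: the opponent took the last stone
    if depth == 0:
        return 0                   # horizon reached, outcome unknown
    best = -math.inf
    for take in (1, 2, 3):
        if take > stones:
            break
        best = max(best, -negamax(stones - take, depth - 1, counter))
    return best

def alphabeta(stones, depth, alpha, beta, counter):
    """Same search with the alpha-beta window of the pseudocode above."""
    counter[0] += 1
    if stones == 0:
        return -1
    if depth == 0:
        return 0
    for take in (1, 2, 3):
        if take > stones:
            break
        val = -alphabeta(stones - take, depth - 1, -beta, -alpha, counter)
        if val >= beta:
            return beta            # the opponent will not allow this position
        if val > alpha:
            alpha = val
    return alpha

full, pruned = [0], [0]
v_full = negamax(13, 13, full)
v_pruned = alphabeta(13, 13, -math.inf, math.inf, pruned)
print(v_full, v_pruned)            # the two searches agree on the value
print(full[0], pruned[0])          # nodes visited without and with pruning
```

The root call uses the widest possible window, (-infinity, +infinity), so nothing is cut off at the root itself; the savings come from the narrower windows passed down to deeper nodes.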

