Python Gomoku AI Implementation (4): Heuristic Evaluation

Heuristic Evaluation

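What matters most for alpha-beta pruning efficiency is move ordering: the earlier the high-scoring positions are searched, the sooner pruning can take effect.
For example, Figure 1 shows the example from the previous article. With this structure, only the second child of node C is pruned.
Figure 1
With heuristic evaluation, we can first estimate the scores of nodes A, B, and C. Assuming the estimates match the actual results, we get node B > node C > node A. By changing the order in which child nodes are visited when the game tree is generated, pruning happens sooner.
Figure 2 shows the same game tree after its child nodes are reordered by estimated score. Now the second children of both node C and node A are pruned, which speeds up the search.
Figure 2
To achieve this, we need to estimate a score for every playable position and put the positions with higher estimates first. The current estimation method takes an empty position, places a stone of each color there in turn, collects the patterns formed along the four directions through that point, and scores them.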
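
The effect of ordering can be checked on a toy tree. The sketch below is not the search code from this article: it is a plain minimax with alpha-beta pruning over nested lists, and the leaf values for nodes A, B, and C are made up so that B > C > A, as described for the figures. It counts how many leaves are evaluated under each ordering.

```python
def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax with alpha-beta pruning over a nested-list tree.

    A leaf is a plain number; an inner node is a list of children.
    Every evaluated leaf is appended to `visited`.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # remaining children cannot change the result
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best


# Hypothetical leaf values, chosen so that B (8) > C (6) > A (3).
A, B, C = [3, 5], [8, 9], [6, 7]


def count_leaves(children):
    """Return (root value, number of leaves evaluated)."""
    visited = []
    value = alphabeta(children, float("-inf"), float("inf"), True, visited)
    return value, len(visited)


original = count_leaves([A, B, C])   # children in their original order
reordered = count_leaves([B, C, A])  # children sorted by estimated score
print(original, reordered)
```

With these assumed values, the original order evaluates 5 leaves and the sorted order evaluates 4, while the root value stays at 8 in both cases.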

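Code Implementation

Compared with the previous articles, genmove has changed how it obtains position scores. For each empty position it calls evaluatePointScore to get the score of the patterns formed when either our own stone or the opponent's stone is placed there. High-scoring positions (five in a row, live four, or chong four) are added to separate lists, because when such positions exist, lower-scoring positions need not be considered. Specifically, the function returns only the fives if there are any, otherwise only our own live fours, otherwise the opponent's live fours together with our own chong fours.

At the end of the function there is a cap on the number of moves, AI_LIMITED_MOVE_NUM. Since every playable empty position has already been evaluated and sorted by score, the positions at the tail are very unlikely to be the best move, so they can be dropped early to reduce the number of nodes searched in the game tree. The cap is applied only when the search depth is greater than 2.

Getting position scores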

AI_LIMITED_MOVE_NUM = 20

	# get all positions near chess
	def genmove(self, board, turn):
		fives = []
		mfours, ofours = [], []
		msfours, osfours = [], []
		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
			mine = 1
			opponent = 2
		else:
			mine = 2
			opponent = 1

		moves = []
		radius = 1

		for y in range(self.len):
			for x in range(self.len):
				if board[y][x] == 0 and self.hasNeighbor(board, x, y, radius):
					mscore, oscore = self.evaluatePointScore(board, x, y, mine, opponent)
					point = (max(mscore, oscore), x, y)

					if mscore >= SCORE_FIVE or oscore >= SCORE_FIVE:
						fives.append(point)
					elif mscore >= SCORE_FOUR:
						mfours.append(point)
					elif oscore >= SCORE_FOUR:
						ofours.append(point)
					elif mscore >= SCORE_SFOUR:
						msfours.append(point)
					elif oscore >= SCORE_SFOUR:
						osfours.append(point)

					moves.append(point)

		if len(fives) > 0: return fives

		if len(mfours) > 0: return mfours

		if len(ofours) > 0:
			if len(msfours) == 0:
				return ofours
			else:
				return ofours + msfours

		moves.sort(reverse=True)

		# FIXME: decrease think time: only consider limited moves with higher scores
		if self.maxdepth > 2 and len(moves) > AI_LIMITED_MOVE_NUM:
			moves = moves[:AI_LIMITED_MOVE_NUM]
		return moves
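
One detail of the sort is worth spelling out: each move is a tuple (score, x, y), so sorting in reverse orders by score first and breaks ties by x and then y, both descending. The sketch below shows this in isolation; order_and_limit and LIMIT are hypothetical names standing in for the last lines of genmove and for AI_LIMITED_MOVE_NUM, and the sample scores are made up.

```python
LIMIT = 3  # hypothetical cap, plays the role of AI_LIMITED_MOVE_NUM


def order_and_limit(moves, limit):
    """Sort (score, x, y) tuples from best to worst and keep the top `limit`."""
    ordered = sorted(moves, reverse=True)  # tuple comparison: score, then x, then y
    return ordered[:limit]


# Made-up candidates: (estimated score, x, y)
moves = [(10, 7, 7), (100, 6, 8), (10, 8, 6), (2, 5, 5), (108, 9, 9)]
top = order_and_limit(moves, LIMIT)
print(top)
```

The two moves with score 10 are ordered by their coordinates, which is an arbitrary but deterministic tie-break.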

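Evaluation function for a single position

evaluatePointScore places our own stone and then the opponent's stone at the position. For each resulting board it calls evaluatePoint to scan the four directions through that position and count the patterns, then calls getPointScore to obtain the score for playing our own or the opponent's stone there. The position is reset to empty before the function returns.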

	# evaluate score of point, to improve pruning efficiency
	def evaluatePointScore(self, board, x, y, mine, opponent):
		for i in range(len(self.count)):
			for j in range(len(self.count[0])):
				self.count[i][j] = 0
				
		board[y][x] = mine
		self.evaluatePoint(board, x, y, mine, opponent, self.count[mine-1])
		mine_count = self.count[mine-1]
		board[y][x] = opponent
		self.evaluatePoint(board, x, y, opponent, mine, self.count[opponent-1])
		opponent_count = self.count[opponent-1]
		board[y][x] = 0

		mscore = self.getPointScore(mine_count)
		oscore = self.getPointScore(opponent_count)

		return (mscore, oscore)
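
The place, evaluate, restore pattern used here can be tried on its own. The sketch below is a simplified stand-in, not the pattern analysis from this article: instead of classifying patterns, the hypothetical helper longest_run only measures the longest line of same-colored stones through the point, and probe (also a hypothetical name) tries both colors and then empties the cell again.

```python
DIRECTIONS = [(1, 0), (0, 1), (1, 1), (1, -1)]  # same four directions as dir_offset


def longest_run(board, x, y, color):
    """Longest line of `color` stones through (x, y); assumes board[y][x] == color."""
    size = len(board)
    best = 0
    for dx, dy in DIRECTIONS:
        run = 1  # the stone at (x, y) itself
        for sign in (1, -1):  # walk both ways along the direction
            cx, cy = x + sign * dx, y + sign * dy
            while 0 <= cx < size and 0 <= cy < size and board[cy][cx] == color:
                run += 1
                cx += sign * dx
                cy += sign * dy
        best = max(best, run)
    return best


def probe(board, x, y, mine, opponent):
    """Try both colors on an empty cell and always restore it afterwards."""
    assert board[y][x] == 0, "probe expects an empty cell"
    try:
        board[y][x] = mine
        my_run = longest_run(board, x, y, mine)
        board[y][x] = opponent
        op_run = longest_run(board, x, y, opponent)
    finally:
        board[y][x] = 0  # leave the board as it was found
    return my_run, op_run


board = [[0] * 7 for _ in range(7)]
for x in (1, 2, 3):
    board[3][x] = 1  # three stones of player 1 in row 3
board[1][4] = 2
board[2][4] = 2      # two stones of player 2 in column 4

result = probe(board, 4, 3, mine=1, opponent=2)
print(result, board[3][4])
```

On this made-up board, playing at (4, 3) would give player 1 a line of four and player 2 a line of three, and the cell is empty again after the call.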

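Pattern scoring function for a single position

getPointScore adds up the scores of all patterns. The gaps between pattern scores are made large so that several low-priority patterns are unlikely to add up to more than one higher-priority pattern, which would cause a misjudgment. If the patterns include five in a row or a live four, the function returns immediately. A lone chong four is about as effective as a live three, so it gets the same score as a live three. Scores for patterns at or below live three are simply added up, except that two or more live threes count as 5 * SCORE_THREE.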

SCORE_FIVE, SCORE_FOUR, SCORE_SFOUR = 100000, 10000, 1000
SCORE_THREE, SCORE_STHREE, SCORE_TWO, SCORE_STWO = 100, 10, 8, 2

	def getPointScore(self, count):
		score = 0
		if count[FIVE] > 0:
			return SCORE_FIVE

		if count[FOUR] > 0:
			return SCORE_FOUR
		
		# FIXME: the score of one chong four and no live three should be low, set it to live three
		if count[SFOUR] > 1:
			score += count[SFOUR] * SCORE_SFOUR
		elif count[SFOUR] > 0 and count[THREE] > 0:
			score += count[SFOUR] * SCORE_SFOUR
		elif count[SFOUR] > 0:
			score += SCORE_THREE 

		if count[THREE] > 1:
			score += 5 * SCORE_THREE
		elif count[THREE] > 0:
			score += SCORE_THREE

		if count[STHREE] > 0:
			score += count[STHREE] * SCORE_STHREE
		if count[TWO] > 0:
			score += count[TWO] * SCORE_TWO
		if count[STWO] > 0:
			score += count[STWO] * SCORE_STWO

		return score
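
A few worked cases make the special handling of the chong four easier to see. The block below is a standalone copy of the scoring logic above, renamed get_point_score so it runs without the class; the pattern indexes are taken from the CHESS_TYPE enum in the full listing, and make_count is a hypothetical helper that builds a count list from keyword arguments.

```python
# Pattern indexes (from CHESS_TYPE) and scores, copied from this article.
STWO, TWO, STHREE, THREE, SFOUR, FOUR, FIVE = 1, 2, 3, 4, 5, 6, 7
CHESS_TYPE_NUM = 8
SCORE_FIVE, SCORE_FOUR, SCORE_SFOUR = 100000, 10000, 1000
SCORE_THREE, SCORE_STHREE, SCORE_TWO, SCORE_STWO = 100, 10, 8, 2

INDEX = {"STWO": STWO, "TWO": TWO, "STHREE": STHREE, "THREE": THREE,
         "SFOUR": SFOUR, "FOUR": FOUR, "FIVE": FIVE}


def make_count(**patterns):
    """Hypothetical helper: make_count(SFOUR=1, THREE=1) -> count list."""
    count = [0] * CHESS_TYPE_NUM
    for name, n in patterns.items():
        count[INDEX[name]] = n
    return count


def get_point_score(count):
    """Standalone copy of getPointScore."""
    if count[FIVE] > 0:
        return SCORE_FIVE
    if count[FOUR] > 0:
        return SCORE_FOUR

    score = 0
    if count[SFOUR] > 1 or (count[SFOUR] > 0 and count[THREE] > 0):
        score += count[SFOUR] * SCORE_SFOUR
    elif count[SFOUR] > 0:
        score += SCORE_THREE  # lone chong four is scored like a live three

    if count[THREE] > 1:
        score += 5 * SCORE_THREE
    elif count[THREE] > 0:
        score += SCORE_THREE

    score += count[STHREE] * SCORE_STHREE
    score += count[TWO] * SCORE_TWO
    score += count[STWO] * SCORE_STWO
    return score


lone_sfour = get_point_score(make_count(SFOUR=1))
sfour_and_three = get_point_score(make_count(SFOUR=1, THREE=1))
double_three = get_point_score(make_count(THREE=2))
three_and_twos = get_point_score(make_count(THREE=1, TWO=2))
print(lone_sfour, sfour_and_three, double_three, three_and_twos)
```

A lone chong four scores 100, but the same chong four combined with a live three scores 1100, which is enough to reach the SCORE_SFOUR threshold checked in genmove.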

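AI search time

With the search depth AI_SEARCH_DEPTH set to 4, both the average search time and the maximum search time seen during testing are much better than in the previous article's test. Search time stays around 2 seconds and the maximum is under 5 seconds; in the sample log below, the average is about 1.6 seconds and the slowest move takes 2.81 seconds. In each log line, alpha is the number of candidate moves generated during the search and belta is the number of moves actually searched.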
time[0.78] (8, 8), score[-10] alpha[4023] belta[905]
time[1.69] (10, 8), score[-394] alpha[7833] belta[1870]
time[0.90] (8, 7), score[-398] alpha[3431] belta[1007]
time[0.96] (9, 6), score[-400] alpha[4675] belta[792]
time[0.74] (5, 9), score[-396] alpha[2644] belta[535]
time[0.50] (10, 4), score[-398] alpha[1667] belta[337]
time[2.71] (10, 5), score[-4] alpha[11497] belta[1258]
time[1.61] (10, 7), score[-2] alpha[5080] belta[819]
time[1.48] (11, 4), score[-6] alpha[4381] belta[767]
time[1.73] (12, 4), score[-6] alpha[5090] belta[853]
time[1.51] (13, 4), score[-8] alpha[4139] belta[796]
time[1.52] (8, 5), score[-16] alpha[4068] belta[778]
time[2.08] (7, 6), score[-408] alpha[4769] belta[1215]
time[1.68] (7, 9), score[-402] alpha[2620] belta[1229]
time[1.00] (2, 4), score[-392] alpha[1512] belta[628]
time[2.47] (8, 9), score[-16] alpha[6789] belta[1187]
time[2.45] (8, 10), score[-20] alpha[6682] belta[1113]
time[2.81] (11, 7), score[-18] alpha[7496] belta[1166]
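
The figures quoted for this log can be recomputed from the text itself. The sketch below parses the lines in the format printed by findBestChess with a regular expression; the names LOG, PATTERN, and records are assumptions made for this example and are not part of the original program.

```python
import re

# The 18 log lines from this article, pasted verbatim.
LOG = """\
time[0.78] (8, 8), score[-10] alpha[4023] belta[905]
time[1.69] (10, 8), score[-394] alpha[7833] belta[1870]
time[0.90] (8, 7), score[-398] alpha[3431] belta[1007]
time[0.96] (9, 6), score[-400] alpha[4675] belta[792]
time[0.74] (5, 9), score[-396] alpha[2644] belta[535]
time[0.50] (10, 4), score[-398] alpha[1667] belta[337]
time[2.71] (10, 5), score[-4] alpha[11497] belta[1258]
time[1.61] (10, 7), score[-2] alpha[5080] belta[819]
time[1.48] (11, 4), score[-6] alpha[4381] belta[767]
time[1.73] (12, 4), score[-6] alpha[5090] belta[853]
time[1.51] (13, 4), score[-8] alpha[4139] belta[796]
time[1.52] (8, 5), score[-16] alpha[4068] belta[778]
time[2.08] (7, 6), score[-408] alpha[4769] belta[1215]
time[1.68] (7, 9), score[-402] alpha[2620] belta[1229]
time[1.00] (2, 4), score[-392] alpha[1512] belta[628]
time[2.47] (8, 9), score[-16] alpha[6789] belta[1187]
time[2.45] (8, 10), score[-20] alpha[6682] belta[1113]
time[2.81] (11, 7), score[-18] alpha[7496] belta[1166]
"""

PATTERN = re.compile(
    r"time\[([\d.]+)\] \((\d+), (\d+)\), score\[(-?\d+)\] "
    r"alpha\[(\d+)\] belta\[(\d+)\]"
)

# One (seconds, generated, searched) tuple per move.
records = [
    (float(m.group(1)), int(m.group(5)), int(m.group(6)))
    for m in PATTERN.finditer(LOG)
]

times = [seconds for seconds, _, _ in records]
mean_time = sum(times) / len(times)
max_time = max(times)

# Share of generated candidate moves that were actually searched.
generated = sum(g for _, g, _ in records)
searched = sum(s for _, _, s in records)
searched_share = searched / generated

print(f"moves={len(records)} mean={mean_time:.2f}s max={max_time:.2f}s "
      f"searched={searched_share:.1%}")
```

The last figure gives a rough idea of how much work pruning saves: only a fraction of the generated candidate moves end up being searched.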

Full code

There are three files: main.py, GameMap.py, and ChessAI.py. Only ChessAI.py was modified this time; for the first two files, see the code in the previous articles.

ChessAI.py

from GameMap import *
from enum import IntEnum
from random import randint
import time

AI_SEARCH_DEPTH = 4
AI_LIMITED_MOVE_NUM = 20


class CHESS_TYPE(IntEnum):
	NONE = 0
	SLEEP_TWO = 1
	LIVE_TWO = 2
	SLEEP_THREE = 3
	LIVE_THREE = 4
	CHONG_FOUR = 5
	LIVE_FOUR = 6
	LIVE_FIVE = 7
	
CHESS_TYPE_NUM = 8

FIVE = CHESS_TYPE.LIVE_FIVE.value
FOUR, THREE, TWO = CHESS_TYPE.LIVE_FOUR.value, CHESS_TYPE.LIVE_THREE.value, CHESS_TYPE.LIVE_TWO.value
SFOUR, STHREE, STWO = CHESS_TYPE.CHONG_FOUR.value, CHESS_TYPE.SLEEP_THREE.value, CHESS_TYPE.SLEEP_TWO.value

SCORE_MAX = 0x7fffffff
SCORE_MIN = -1 * SCORE_MAX
SCORE_FIVE, SCORE_FOUR, SCORE_SFOUR = 100000, 10000, 1000
SCORE_THREE, SCORE_STHREE, SCORE_TWO, SCORE_STWO = 100, 10, 8, 2

class ChessAI():
	def __init__(self, chess_len):
		self.len = chess_len
		# [horizon, vertical, left diagonal, right diagonal]
		self.record = [[[0,0,0,0] for x in range(chess_len)] for y in range(chess_len)]
		self.count = [[0 for x in range(CHESS_TYPE_NUM)] for i in range(2)]
		
	def reset(self):
		for y in range(self.len):
			for x in range(self.len):
				for i in range(4):
					self.record[y][x][i] = 0

		for i in range(len(self.count)):
			for j in range(len(self.count[0])):
				self.count[i][j] = 0

	
	def click(self, map, x, y, turn):
		map.click(x, y, turn)
		
	def isWin(self, board, turn):
		return self.evaluate(board, turn, True)
	
	# evaluate score of point, to improve pruning efficiency
	def evaluatePointScore(self, board, x, y, mine, opponent):
		for i in range(len(self.count)):
			for j in range(len(self.count[0])):
				self.count[i][j] = 0
				
		board[y][x] = mine
		self.evaluatePoint(board, x, y, mine, opponent, self.count[mine-1])
		mine_count = self.count[mine-1]
		board[y][x] = opponent
		self.evaluatePoint(board, x, y, opponent, mine, self.count[opponent-1])
		opponent_count = self.count[opponent-1]
		board[y][x] = 0

		mscore = self.getPointScore(mine_count)
		oscore = self.getPointScore(opponent_count)

		return (mscore, oscore)

	# check whether any non-empty position lies within `radius` of (x, y)
	def hasNeighbor(self, board, x, y, radius):
		start_x, end_x = (x - radius), (x + radius)
		start_y, end_y = (y - radius), (y + radius)

		for i in range(start_y, end_y+1):
			for j in range(start_x, end_x+1):
				if i >= 0 and i < self.len and j >= 0 and j < self.len:
					if board[i][j] != 0:
						return True
		return False

	# get all positions near chess
	def genmove(self, board, turn):
		fives = []
		mfours, ofours = [], []
		msfours, osfours = [], []
		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
			mine = 1
			opponent = 2
		else:
			mine = 2
			opponent = 1

		moves = []
		radius = 1

		for y in range(self.len):
			for x in range(self.len):
				if board[y][x] == 0 and self.hasNeighbor(board, x, y, radius):
					mscore, oscore = self.evaluatePointScore(board, x, y, mine, opponent)
					point = (max(mscore, oscore), x, y)

					if mscore >= SCORE_FIVE or oscore >= SCORE_FIVE:
						fives.append(point)
					elif mscore >= SCORE_FOUR:
						mfours.append(point)
					elif oscore >= SCORE_FOUR:
						ofours.append(point)
					elif mscore >= SCORE_SFOUR:
						msfours.append(point)
					elif oscore >= SCORE_SFOUR:
						osfours.append(point)

					moves.append(point)

		if len(fives) > 0: return fives

		if len(mfours) > 0: return mfours

		if len(ofours) > 0:
			if len(msfours) == 0:
				return ofours
			else:
				return ofours + msfours

		moves.sort(reverse=True)

		# FIXME: decrease think time: only consider limited moves with higher scores
		if self.maxdepth > 2 and len(moves) > AI_LIMITED_MOVE_NUM:
			moves = moves[:AI_LIMITED_MOVE_NUM]
		return moves
	
	def __search(self, board, turn, depth, alpha = SCORE_MIN, beta = SCORE_MAX):
		score = self.evaluate(board, turn)
		if depth <= 0 or abs(score) >= SCORE_FIVE: 
			return score

		moves = self.genmove(board, turn)
		bestmove = None
		self.alpha += len(moves)
		
		# if there are no moves, just return the score
		if len(moves) == 0:
			return score

		for _, x, y in moves:
			board[y][x] = turn
			
			if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
				op_turn = MAP_ENTRY_TYPE.MAP_PLAYER_TWO
			else:
				op_turn = MAP_ENTRY_TYPE.MAP_PLAYER_ONE

			score = - self.__search(board, op_turn, depth - 1, -beta, -alpha)

			board[y][x] = 0
			self.belta += 1

			# alpha/beta pruning
			if score > alpha:
				alpha = score
				bestmove = (x, y)
				if alpha >= beta:
					break

		if depth == self.maxdepth and bestmove:
			self.bestmove = bestmove
				
		return alpha

	def search(self, board, turn, depth = 4):
		self.maxdepth = depth
		self.bestmove = None
		score = self.__search(board, turn, depth)
		x, y = self.bestmove
		return score, x, y
		
	def findBestChess(self, board, turn):
		time1 = time.time()
		self.alpha = 0
		self.belta = 0
		score, x, y = self.search(board, turn, AI_SEARCH_DEPTH)
		time2 = time.time()
		print('time[%.2f] (%d, %d), score[%d] alpha[%d] belta[%d]' % ((time2-time1), x, y, score, self.alpha, self.belta))
		return (x, y)
	
	def getPointScore(self, count):
		score = 0
		if count[FIVE] > 0:
			return SCORE_FIVE

		if count[FOUR] > 0:
			return SCORE_FOUR
		
		# FIXME: the score of one chong four and no live three should be low, set it to live three
		if count[SFOUR] > 1:
			score += count[SFOUR] * SCORE_SFOUR
		elif count[SFOUR] > 0 and count[THREE] > 0:
			score += count[SFOUR] * SCORE_SFOUR
		elif count[SFOUR] > 0:
			score += SCORE_THREE 

		if count[THREE] > 1:
			score += 5 * SCORE_THREE
		elif count[THREE] > 0:
			score += SCORE_THREE

		if count[STHREE] > 0:
			score += count[STHREE] * SCORE_STHREE
		if count[TWO] > 0:
			score += count[TWO] * SCORE_TWO
		if count[STWO] > 0:
			score += count[STWO] * SCORE_STWO

		return score

	# calculate score, FIXME: May Be Improved
	def getScore(self, mine_count, opponent_count):
		mscore, oscore = 0, 0
		if mine_count[FIVE] > 0:
			return (SCORE_FIVE, 0)
		if opponent_count[FIVE] > 0:
			return (0, SCORE_FIVE)
				
		if mine_count[SFOUR] >= 2:
			mine_count[FOUR] += 1
		if opponent_count[SFOUR] >= 2:
			opponent_count[FOUR] += 1
				
		if mine_count[FOUR] > 0:
			return (9050, 0)
		if mine_count[SFOUR] > 0:
			return (9040, 0)
			
		if opponent_count[FOUR] > 0:
			return (0, 9030)
		if opponent_count[SFOUR] > 0 and opponent_count[THREE] > 0:
			return (0, 9020)
			
		if mine_count[THREE] > 0 and opponent_count[SFOUR] == 0:
			return (9010, 0)
			
		if (opponent_count[THREE] > 1 and mine_count[THREE] == 0 and mine_count[STHREE] == 0):
			return (0, 9000)

		if opponent_count[SFOUR] > 0:
			oscore += 400

		if mine_count[THREE] > 1:
			mscore += 500
		elif mine_count[THREE] > 0:
			mscore += 100
			
		if opponent_count[THREE] > 1:
			oscore += 2000
		elif opponent_count[THREE] > 0:
			oscore += 400

		if mine_count[STHREE] > 0:
			mscore += mine_count[STHREE] * 10
		if opponent_count[STHREE] > 0:
			oscore += opponent_count[STHREE] * 10
			
		if mine_count[TWO] > 0:
			mscore += mine_count[TWO] * 6
		if opponent_count[TWO] > 0:
			oscore += opponent_count[TWO] * 6
				
		if mine_count[STWO] > 0:
			mscore += mine_count[STWO] * 2
		if opponent_count[STWO] > 0:
			oscore += opponent_count[STWO] * 2
		
		return (mscore, oscore)

	def evaluate(self, board, turn, checkWin=False):
		self.reset()
		
		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
			mine = 1
			opponent = 2
		else:
			mine = 2
			opponent = 1
		
		for y in range(self.len):
			for x in range(self.len):
				if board[y][x] == mine:
					self.evaluatePoint(board, x, y, mine, opponent)
				elif board[y][x] == opponent:
					self.evaluatePoint(board, x, y, opponent, mine)
		
		mine_count = self.count[mine-1]
		opponent_count = self.count[opponent-1]
		if checkWin:
			return mine_count[FIVE] > 0
		else:	
			mscore, oscore = self.getScore(mine_count, opponent_count)
			return (mscore - oscore)
	
	def evaluatePoint(self, board, x, y, mine, opponent, count=None):
		dir_offset = [(1, 0), (0, 1), (1, 1), (1, -1)] # direction from left to right
		ignore_record = True
		if count is None:
			count = self.count[mine-1]
			ignore_record = False
		for i in range(4):
			if self.record[y][x][i] == 0 or ignore_record:
				self.analysisLine(board, x, y, i, dir_offset[i], mine, opponent, count)

	
	# line is fixed len 9: XXXXMXXXX
	def getLine(self, board, x, y, dir_offset, mine, opponent):
		line = [0 for i in range(9)]
		
		tmp_x = x + (-5 * dir_offset[0])
		tmp_y = y + (-5 * dir_offset[1])
		for i in range(9):
			tmp_x += dir_offset[0]
			tmp_y += dir_offset[1]
			if (tmp_x < 0 or tmp_x >= self.len or 
				tmp_y < 0 or tmp_y >= self.len):
				line[i] = opponent # set out of range as opponent chess
			else:
				line[i] = board[tmp_y][tmp_x]
						
		return line
		
	def analysisLine(self, board, x, y, dir_index, dir, mine, opponent, count):
		# record line range [left, right] as analyzed
		def setRecord(self, x, y, left, right, dir_index, dir_offset):
			tmp_x = x + (-5 + left) * dir_offset[0]
			tmp_y = y + (-5 + left) * dir_offset[1]
			for i in range(left, right+1):
				tmp_x += dir_offset[0]
				tmp_y += dir_offset[1]
				self.record[tmp_y][tmp_x][dir_index] = 1
	
		empty = MAP_ENTRY_TYPE.MAP_EMPTY.value
		left_idx, right_idx = 4, 4
		
		line = self.getLine(board, x, y, dir, mine, opponent)

		while right_idx < 8:
			if line[right_idx+1] != mine:
				break
			right_idx += 1
		while left_idx > 0:
			if line[left_idx-1] != mine:
				break
			left_idx -= 1
		
		left_range, right_range = left_idx, right_idx
		while right_range < 8:
			if line[right_range+1] == opponent:
				break
			right_range += 1
		while left_range > 0:
			if line[left_range-1] == opponent:
				break
			left_range -= 1
		
		chess_range = right_range - left_range + 1
		if chess_range < 5:
			setRecord(self, x, y, left_range, right_range, dir_index, dir)
			return CHESS_TYPE.NONE
		
		setRecord(self, x, y, left_idx, right_idx, dir_index, dir)
		
		m_range = right_idx - left_idx + 1
		
		# M:mine chess, P:opponent chess or out of range, X: empty
		if m_range >= 5:
			count[FIVE] += 1
		
		# Live Four : XMMMMX 
		# Chong Four : XMMMMP, PMMMMX
		if m_range == 4:
			left_empty = right_empty = False
			if line[left_idx-1] == empty:
				left_empty = True			
			if line[right_idx+1] == empty:
				right_empty = True
			if left_empty and right_empty:
				count[FOUR] += 1
			elif left_empty or right_empty:
				count[SFOUR] += 1
		
		# Chong Four : MXMMM, MMMXM, the two types can both exist
		# Live Three : XMMMXX, XXMMMX
		# Sleep Three : PMMMX, XMMMP, PXMMMXP
		if m_range == 3:
			left_empty = right_empty = False
			left_four = right_four = False
			if line[left_idx-1] == empty:
				if line[left_idx-2] == mine: # MXMMM
					setRecord(self, x, y, left_idx-2, left_idx-1, dir_index, dir)
					count[SFOUR] += 1
					left_four = True
				left_empty = True
				
			if line[right_idx+1] == empty:
				if line[right_idx+2] == mine: # MMMXM
					setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
					count[SFOUR] += 1
					right_four = True 
				right_empty = True
			
			if left_four or right_four:
				pass
			elif left_empty and right_empty:
				if chess_range > 5: # XMMMXX, XXMMMX
					count[THREE] += 1
				else: # PXMMMXP
					count[STHREE] += 1
			elif left_empty or right_empty: # PMMMX, XMMMP
				count[STHREE] += 1
		
		# Chong Four: MMXMM, only check right direction
		# Live Three: XMXMMX, XMMXMX the two types can both exist
		# Sleep Three: PMXMMX, XMXMMP, PMMXMX, XMMXMP
		# Live Two: XMMX
		# Sleep Two: PMMX, XMMP
		if m_range == 2:
			left_empty = right_empty = False
			left_three = right_three = False
			if line[left_idx-1] == empty:
				if line[left_idx-2] == mine:
					setRecord(self, x, y, left_idx-2, left_idx-1, dir_index, dir)
					if line[left_idx-3] == empty:
						if line[right_idx+1] == empty: # XMXMMX
							count[THREE] += 1
						else: # XMXMMP
							count[STHREE] += 1
						left_three = True
					elif line[left_idx-3] == opponent: # PMXMMX
						if line[right_idx+1] == empty:
							count[STHREE] += 1
							left_three = True
						
				left_empty = True
				
			if line[right_idx+1] == empty:
				if line[right_idx+2] == mine:
					if line[right_idx+3] == mine:  # MMXMM
						setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
						count[SFOUR] += 1
						right_three = True
					elif line[right_idx+3] == empty:
						#setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
						if left_empty:  # XMMXMX
							count[THREE] += 1
						else:  # PMMXMX
							count[STHREE] += 1
						right_three = True
					elif left_empty: # XMMXMP
						count[STHREE] += 1
						right_three = True
						
				right_empty = True
			
			if left_three or right_three:
				pass
			elif left_empty and right_empty: # XMMX
				count[TWO] += 1
			elif left_empty or right_empty: # PMMX, XMMP
				count[STWO] += 1
		
		# Live Two: XMXMX, XMXXMX only check right direction
		# Sleep Two: PMXMX, XMXMP
		if m_range == 1:
			left_empty = right_empty = False
			if line[left_idx-1] == empty:
				if line[left_idx-2] == mine:
					if line[left_idx-3] == empty:
						if line[right_idx+1] == opponent: # XMXMP
							count[STWO] += 1
				left_empty = True

			if line[right_idx+1] == empty:
				if line[right_idx+2] == mine:
					if line[right_idx+3] == empty:
						if left_empty: # XMXMX
							#setRecord(self, x, y, left_idx, right_idx+2, dir_index, dir)
							count[TWO] += 1
						else: # PMXMX
							count[STWO] += 1
				elif line[right_idx+2] == empty:
					if line[right_idx+3] == mine and line[right_idx+4] == empty: # XMXXMX
						count[TWO] += 1
						
		return CHESS_TYPE.NONE