Genetic Algorithm Basics and Tic-Tac-Toe

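I. Understanding the Algorithm

1) Algorithm overview:

When solving a problem, a genetic algorithm starts from an initial population of candidate solutions: the N individuals in the population are initialized randomly, and the fitness of each individual is computed. If the optimization criterion is not met, a new generation is computed: individuals are selected according to their fitness, recombined (crossover), mutated with a certain probability, re-evaluated, and used to replace the previous generation. This cycle repeats until the optimization criterion is satisfied.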

2) Basic framework of a genetic algorithm (pseudocode):

def GA0():
    Initialize the population P randomly
    Calculate the fitness of each individual in P
    while the termination condition is not met:
        Generate the new population newP from P, selecting individuals by their fitness
        Calculate the fitness of each individual in newP
        P = newP
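As a rough illustration of the framework above, here is a minimal runnable sketch. The function `run_ga`, its parameters, and the toy OneMax problem (maximize the number of 1 bits in a list) are assumptions made for this example and are not taken from the tic-tac-toe program; parents are drawn in proportion to fitness with the standard library's `random.choices`.

```python
import random

def run_ga(fitness, random_individual, breed, pop_size=20,
           max_generations=200, target=None, rng=None):
    """Generic GA loop: evaluate, select by fitness, breed, repeat."""
    rng = rng or random.Random()
    population = [random_individual(rng) for _ in range(pop_size)]
    scores = [fitness(ind) for ind in population]
    for _ in range(max_generations):
        # Termination condition: the best individual reached the target.
        if target is not None and max(scores) >= target:
            break
        # Fitness-proportional parent choice; the small offset keeps the
        # total weight positive even if every score is zero.
        weights = [s + 1e-9 for s in scores]
        new_population = []
        for _ in range(pop_size):
            p1, p2 = rng.choices(population, weights=weights, k=2)
            new_population.append(breed(p1, p2, rng))
        population = new_population
        scores = [fitness(ind) for ind in population]
    best_index = max(range(pop_size), key=scores.__getitem__)
    return population[best_index], scores[best_index]

def onemax_breed(p1, p2, rng):
    """Single-point crossover followed by a 2% per-bit mutation."""
    cut = rng.randrange(1, len(p1))
    child = p1[:cut] + p2[cut:]
    return [bit ^ 1 if rng.random() < 0.02 else bit for bit in child]

best, score = run_ga(
    fitness=sum,
    random_individual=lambda rng: [rng.randint(0, 1) for _ in range(16)],
    breed=onemax_breed,
    target=16,
    rng=random.Random(0),
)
print(best, score)
```

The loop has no elitism, so the best individual of one generation can be lost in the next; the tic-tac-toe program below behaves the same way and simply stops as soon as a perfect individual appears.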

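3) The Simple Genetic Algorithm (SGA) consists of the following components:

a) Encoding: fixed-length binary strings.
b) Fitness (evaluation) function: returns a non-negative value. Its design should mainly satisfy the following conditions: single-valued, continuous, non-negative, and to be maximized; reasonable and consistent; cheap to compute; broadly applicable.
c) Initial population: generated randomly.
d) Population size: the individuals of the initial population are generated randomly; their number should be moderate, neither too small nor too large.
e) Selection, crossover, and mutation operators: only these three genetic operators are used. Selection uses the roulette-wheel method, crossover is single-point crossover, and mutation is basic bit mutation.
f) Termination condition: a specified number of iterations, or finding the optimal (or a near-optimal) solution.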
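The three SGA operators listed in e) can be sketched for fixed-length binary strings as follows. The function names and the string representation are assumptions chosen for illustration; the tic-tac-toe program later in this article uses a dictionary encoding and a per-gene crossover instead.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)  # degenerate case: no fitness signal
    r = rng.uniform(0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if r < running:
            return individual
    return population[-1]  # guard against floating-point round-off

def single_point_crossover(a, b, rng):
    """Swap the tails of two equal-length strings at one random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if c == "0" else "0") if rng.random() < rate else c
        for c in bits
    )

rng = random.Random(1)
child_a, child_b = single_point_crossover("000000", "111111", rng)
mutated = bit_flip_mutation(child_a, rate=0.05, rng=rng)
parent = roulette_select(["000", "111"], [0.0, 3.0], rng)
print(child_a, child_b, mutated, parent)
```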

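4) Applying this to tic-tac-toe:

On the 3*3 board, every position has 8 equivalent forms: starting from any one of them, rotation and mirroring produce the other seven.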
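To make the eight equivalent forms concrete, the sketch below derives the index permutations by rotating and mirroring a 3*3 grid of cell indices. The helper names are assumptions for illustration, and the board characters (`X`, `-`) are placeholders because the output format of `ToString()` is not shown in this article. The resulting set of permutations is the same as the hard-coded `EQUIVALENT` table in the source code, though in a different order.

```python
def rotate_cw(grid):
    """Rotate a square grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def mirror(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]

def symmetries():
    """Return the 8 index permutations of the 3*3 board."""
    grid = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    perms = []
    for g in (grid, mirror(grid)):
        for _ in range(4):  # four rotations of each orientation
            perms.append([i for row in g for i in row])
            g = rotate_cw(g)
    return perms

def equivalents(board):
    """Apply every permutation to a 9-character board string."""
    return ["".join(board[i] for i in perm) for perm in symmetries()]

for s in sorted(set(equivalents("X--------"))):
    print(s)  # a corner mark maps onto the four corners
```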
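One iteration:
First the current population is cloned, so that individuals already replaced in this generation do not affect later selections.
For each slot in the population, a random number is drawn. If it falls within the replication probability, a selected individual is copied into the new generation unchanged; otherwise two selected individuals are crossed over to produce a new individual, which is then mutated. The whole population is then evaluated with the fitness function.
PROB stores the cumulative normalized fitness of the individuals. The maximum and average fitness of the population are recorded, along with the fittest individual in MAX_INDIVIDUAL; when the best individual's fitness reaches the maximum value of 1, the iteration stops early.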
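One generation as described above could be sketched roughly like this. The function name `next_generation` and the idea of passing `select`, `crossover`, and `mutate` in as callables are assumptions for illustration; the source code keeps them as module-level functions working on global lists.

```python
import copy
import random

def next_generation(population, select, crossover, mutate, p_replicate, rng):
    """Build a new population from a frozen copy of the current one."""
    snapshot = copy.deepcopy(population)  # parents come from this copy only
    new_population = []
    for _ in range(len(population)):
        if rng.random() < p_replicate:
            # Replication: copy a selected individual unchanged.
            child = copy.deepcopy(select(snapshot))
        else:
            # Crossover of two selected parents, then in-place mutation.
            child = crossover(select(snapshot), select(snapshot))
            mutate(child)
        new_population.append(child)
    return new_population

# Toy usage: individuals are lists, selection always returns the first one.
rng = random.Random(0)
pop = [[1, 2], [3, 4], [5, 6]]
copied = next_generation(pop, lambda s: s[0], lambda a, b: a + b,
                         lambda c: None, p_replicate=1.0, rng=rng)
crossed = next_generation(pop, lambda s: s[0], lambda a, b: a + b,
                          lambda c: None, p_replicate=0.0, rng=rng)
print(copied, crossed)
```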
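Crossover:
For each gene position of d1 and d2, the crossover probability decides at random whether the child takes the gene from d1 or from d2.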
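A minimal sketch of this per-gene crossover on dictionary individuals follows. The state keys and moves in the usage lines are made up for the example, and the parameter name `p_from_d1` is an assumption; in the source code the same role is played by `prob_crossover = 0.15`, so a child there takes roughly 15% of its genes from d1 and the rest from d2.

```python
import random

def crossover(d1, d2, p_from_d1, rng):
    """For each gene (state key), take d1's move with probability
    p_from_d1, otherwise take d2's move."""
    return {
        key: (d1[key] if rng.random() < p_from_d1 else d2[key])
        for key in d1
    }

# Hypothetical parents: two states, each mapped to a (row, col) move.
d1 = {"state_a": (0, 0), "state_b": (1, 1)}
d2 = {"state_a": (2, 2), "state_b": (0, 1)}
child = crossover(d1, d2, p_from_d1=0.15, rng=random.Random(3))
print(child)
```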
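Computing the fitness of an individual:
In this example, evaluating an individual must take into account how it plays both as the first and as the second player. Counting the games it loses and the total number of games it plays gives its loss rate, and from that its win rate (strictly, its non-loss rate, since draws count as not lost). This rate is used as its fitness value.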
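The fitness formula can be written out as a small helper. The function name and the sample counts are assumptions for illustration; the source code accumulates the same two totals in `LOST_NUM` and `PLAY_NUM`.

```python
def fitness_from_counts(lost_first, played_first, lost_second, played_second):
    """Fitness = 1 - (games lost) / (games played), over both roles."""
    lost = lost_first + lost_second
    played = played_first + played_second
    if played == 0:
        raise ValueError("no games were played")
    return 1 - lost / played

# Hypothetical counts: 2 of 10 games lost moving first, 3 of 30 moving second.
print(fitness_from_counts(2, 10, 3, 30))  # 1 - 5/40 = 0.875
```

A fitness of exactly 1 therefore means the individual lost none of the enumerated games in either role, which is the early-exit condition of the main loop.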
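GenRandomIndividual picks, for every state, one move at random from all legal moves, i.e. it randomly chooses one child node. Select draws one individual at random from the population, weighted by the cumulative fitness.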
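Selection over cumulative fitness can also be done with a binary search instead of a linear scan. The sketch below is an alternative formulation under assumed names (`cumulative_probabilities`, `select_index`), not the code used in this article; it relies on `itertools.accumulate` and `bisect.bisect_left` from the standard library.

```python
import bisect
import itertools
import random

def cumulative_probabilities(fitnesses):
    """Normalized running sums of the fitness values (last entry is 1.0)."""
    total = sum(fitnesses)
    return [c / total for c in itertools.accumulate(fitnesses)]

def select_index(cum_probs, rng):
    """Index of the first entry with cum_probs[i] >= r, for r in [0, 1)."""
    r = rng.random()
    i = bisect.bisect_left(cum_probs, r)
    return min(i, len(cum_probs) - 1)  # guard against round-off at the end

cum = cumulative_probabilities([1, 1, 2])
print(cum)  # [0.25, 0.5, 1.0]

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(1000):
    counts[select_index(cum, rng)] += 1
print(counts)  # the third individual is chosen about half the time
```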

II. Results (screenshot)

[Screenshot of the run output; image not included]

III. Source Code

# -*- coding: utf-8 -*-
"""
Created on Wed Nov 28 21:23:04 2018

@author: duxiaoqin
Functions:
    (1) GA for TicTacToe, evolving a perfect strategy which never loses a game.
"""

from graphics import *
from tictactoe import *
from tttdraw import *
from tttinput import *
import sys
import time
from random import *
import numpy as np
import matplotlib.pyplot as plt
import pickle
import copy

EQUIVALENT = [
             [0,1,2,3,4,5,6,7,8],
             [6,3,0,7,4,1,8,5,2],
             [8,7,6,5,4,3,2,1,0],
             [2,5,8,1,4,7,0,3,6],
             [6,7,8,3,4,5,0,1,2],
             [8,5,2,7,4,1,6,3,0],
             [2,1,0,5,4,3,8,7,6],
             [0,3,6,1,4,7,2,5,8]]
generation_num = 3000
population_num = 800
prob_crossover = 0.15
prob_replicate = 0.10
prob_mutation = 0.001
INDIVIDUAL_TEMPLATE = {}
STATE = {}
POPULATION = []
FITNESS = [0]*population_num
PROB = [0]*population_num
T = range(generation_num)
BEST_FITNESS = [1]*generation_num
MAX_FITNESS = [0]*generation_num
AVERAGE_FITNESS = [0]*generation_num
MAX_INDIVIDUAL = [0]*generation_num

def GenEquivalent(ttt_str):
    TTT_STR = []
    for index in EQUIVALENT:
        TTT_STR.append(''.join([ttt_str[i] for i in index]))
    return TTT_STR

def GenEquivalentMove(base_str, base_move, equ_str):
    TTT_STR = GenEquivalent(base_str)
    move = base_move[0]*3+base_move[1]
    equ_index = TTT_STR.index(equ_str)
    move_index = EQUIVALENT[equ_index].index(move)
    return (move_index // 3, move_index % 3)

def Init():
    global INDIVIDUAL_TEMPLATE, STATE
    try:
        template_file = open('IndividualTemplate.dat', 'rb')
        INDIVIDUAL_TEMPLATE = pickle.load(template_file)
        template_file.close()
        
        state_file = open('State.dat', 'rb')
        STATE = pickle.load(state_file)
        state_file.close()
    except FileNotFoundError:
        INDIVIDUAL_TEMPLATE = {}
        STATE = {}
        ttt = TicTacToe()
        GenerateIndividualTemplate(ttt)
        
        template_file = open('IndividualTemplate.dat', 'wb')
        pickle.dump(INDIVIDUAL_TEMPLATE, template_file)
        template_file.close()
        
        state_file = open('State.dat', 'wb')
        pickle.dump(STATE, state_file)
        state_file.close()
        
    items = INDIVIDUAL_TEMPLATE.items()
    print(len(items))
    for i in range(population_num):
        individual = GenRandomIndividual(items)
        POPULATION.append(individual)
    fitness_sum = CalculateFitness()
    PROB[0] = FITNESS[0]/fitness_sum
    for i in range(1, len(FITNESS)):
        PROB[i] = PROB[i-1]+FITNESS[i]/fitness_sum

def GenerateIndividualTemplate(ttt):
    if ttt.isGameOver() != None:
        return
    
    moves = ttt.getAllMoves()
    ttt_str = ttt.ToString()
    if STATE.get(ttt_str) == None:
        for equ_str in GenEquivalent(ttt_str):
            STATE[equ_str] = ttt_str #base state
        INDIVIDUAL_TEMPLATE[ttt_str] = moves
    for move in moves:
        node = ttt.clone()
        node.play(*move)
        GenerateIndividualTemplate(node)
    
def GenRandomIndividual(items):
    seed()
    individual = {}
    for ttt_str, moves in items:
        individual[ttt_str] = moves[randint(0, len(moves)-1)]
    return individual
        
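def Select(population):
    r = random()
    for i in range(len(PROB)):
        if r <= PROB[i]:
            return copy.deepcopy(population[i])
    #Round-off can leave PROB[-1] slightly below 1; fall back to the last individual
    return copy.deepcopy(population[-1])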

#d1, d2: two individuals
def Crossover(d1, d2):
    d = {}
    for key in d1.keys():
        r = random()
        if r <= prob_crossover:
            d[key] = d1[key]
        else:
            d[key] = d2[key]
    return d
            
#d: individual, a dict mapping each base state string to the chosen move (row, col)
#Each gene is re-drawn from the legal moves with probability prob_mutation
def Mutate(d):
    for key in d.keys():
        if random() <= prob_mutation:
            moves = INDIVIDUAL_TEMPLATE[key]
            d[key] = moves[randint(0, len(moves)-1)]
            
def CalculateFitness():
    PLAY_NUM = [0]*population_num
    LOST_NUM = [0]*population_num
    for i in range(population_num):
        ttt = TicTacToe()
        lost_num, play_num = PlayGameAsFirst(ttt, POPULATION[i])
        LOST_NUM[i] += lost_num
        PLAY_NUM[i] += play_num
        ttt = TicTacToe()
        lost_num, play_num = PlayGameAsSecond(ttt, POPULATION[i])
        LOST_NUM[i] += lost_num
        PLAY_NUM[i] += play_num
    fitness_sum = 0
    for i in range(population_num):
        FITNESS[i] = 1 - LOST_NUM[i]/PLAY_NUM[i]
        fitness_sum += FITNESS[i]
    return fitness_sum

def PlayGameAsFirst(ttt, d):
    all_lost_num = 0
    all_play_num = 0
    result = ttt.isGameOver()
    if result != None:
        if result == TicTacToe.WHITEWIN:
            return 1, 1
        else:
            return 0, 1
    ttt_str = ttt.ToString()
    base_str = STATE[ttt_str]
    move = GenEquivalentMove(base_str, d[base_str], ttt_str)
    ttt.play(*move)
    result = ttt.isGameOver()
    if result != None:
        if result == TicTacToe.WHITEWIN:
            return 1, 1
        else:
            return 0, 1
        
    moves = ttt.getAllMoves()
    for move in moves:
        node = ttt.clone()
        node.play(*move)
        result = node.isGameOver()
        if result != None:
            if result == TicTacToe.WHITEWIN:
                all_lost_num += 1
            all_play_num += 1
        else:
            lost_num, play_num = PlayGameAsFirst(node, d)
            all_lost_num += lost_num
            all_play_num += play_num

    return all_lost_num, all_play_num

def PlayGameAsSecond(ttt, d):
    all_lost_num = 0
    all_play_num = 0
    result = ttt.isGameOver()
    if result != None:
        if result == TicTacToe.BLACKWIN:
            return 1, 1
        else:
            return 0, 1
        
    moves = ttt.getAllMoves()
    for move in moves:
        node = ttt.clone()
        node.play(*move)
        result = node.isGameOver()
        if result != None:
            if result == TicTacToe.BLACKWIN:
                all_lost_num += 1
            all_play_num += 1
        else:
            ttt_str = node.ToString()
            base_str = STATE[ttt_str]
            move = GenEquivalentMove(base_str, d[base_str], ttt_str)
            node.play(*move)
            result = node.isGameOver()
            if result != None:
                if result == TicTacToe.BLACKWIN:
                    all_lost_num += 1
                all_play_num += 1
            else:
                lost_num, play_num = PlayGameAsSecond(node, d)
                all_lost_num += lost_num
                all_play_num += play_num

    return all_lost_num, all_play_num

def GetBestIndividual():
    max_fitness = -sys.maxsize
    max_individual = None
    for i in range(population_num):
        if max_fitness < FITNESS[i]:
            max_fitness = FITNESS[i]
            max_individual = copy.deepcopy(POPULATION[i])
    return max_individual

def main():
    global INDIVIDUAL_TEMPLATE, STATE, MAX_FITNESS, AVERAGE_FITNESS, MAX_INDIVIDUAL
    try:
        best_file = open('BestIndividual.dat', 'rb')
        best_individual = pickle.load(best_file)
        best_file.close()

        template_file = open('IndividualTemplate.dat', 'rb')
        INDIVIDUAL_TEMPLATE = pickle.load(template_file)
        template_file.close()
        
        state_file = open('State.dat', 'rb')
        STATE = pickle.load(state_file)
        state_file.close()
        
        maxfitness_file = open('MaxFitness.dat', 'rb')
        MAX_FITNESS = pickle.load(maxfitness_file)
        maxfitness_file.close()

        avgfitness_file = open('AverageFitness.dat', 'rb')
        AVERAGE_FITNESS = pickle.load(avgfitness_file)
        avgfitness_file.close()

        maxindividual_file = open('MaxIndividual.dat', 'rb')
        MAX_INDIVIDUAL = pickle.load(maxindividual_file)
        maxindividual_file.close()
        
    except FileNotFoundError:
        Init()
        for t in range(generation_num):
            P_TMP = copy.deepcopy(POPULATION)
            for i in range(population_num):
                seed()
                if random() <= prob_replicate:
                    POPULATION[i] = Select(P_TMP)
                else:
                    d1 = Select(P_TMP)
                    d2 = Select(P_TMP)
                    d = Crossover(d1, d2)
                    Mutate(d)
                    POPULATION[i] = d
                
            fitness_sum = CalculateFitness()
                
            #Update the statistics of population
            PROB[0] = FITNESS[0]/fitness_sum
            for i in range(1, len(FITNESS)):
                PROB[i] = PROB[i-1]+FITNESS[i]/fitness_sum
            
            MAX_FITNESS[t] = max(FITNESS)
            AVERAGE_FITNESS[t] = fitness_sum/population_num
            MAX_INDIVIDUAL[t] = GetBestIndividual()
            print('t = ', t, ' Average Fitness = ', AVERAGE_FITNESS[t], \
                  ' Max Fitness = ', MAX_FITNESS[t])
            if MAX_FITNESS[t] == 1.0:
                break
                
        best_individual = GetBestIndividual()
        
        best_file = open('BestIndividual.dat', 'wb')
        pickle.dump(best_individual, best_file)
        best_file.close()
        
        maxfitness_file = open('MaxFitness.dat', 'wb')
        pickle.dump(MAX_FITNESS, maxfitness_file)
        maxfitness_file.close()

        avgfitness_file = open('AverageFitness.dat', 'wb')
        pickle.dump(AVERAGE_FITNESS, avgfitness_file)
        avgfitness_file.close()

        maxindividual_file = open('MaxIndividual.dat', 'wb')
        pickle.dump(MAX_INDIVIDUAL, maxindividual_file)
        maxindividual_file.close()
        
        plt.plot(T, BEST_FITNESS)
        plt.plot(T, MAX_FITNESS)
        plt.plot(T, AVERAGE_FITNESS)
        plt.show()
        
    win = GraphWin('GA for TicTacToe', 600, 600, autoflush=False)
    ttt = TicTacToe()
    tttdraw = TTTDraw(win)
    tttinput = TTTInput(win)
    tttdraw.draw(ttt)
    
    while win.checkKey() != 'Escape':
        if ttt.getPlayer() == TicTacToe.WHITE:
            ttt_str = ttt.ToString()
            base_str = STATE[ttt_str]
            move = GenEquivalentMove(base_str, best_individual[base_str], ttt_str)
            if move != ():
                ttt.play(*move)
        tttinput.input(ttt)
        tttdraw.draw(ttt)
        if ttt.isGameOver() != None:
            time.sleep(1)
            ttt.reset()
            tttdraw.draw(ttt)
            #win.getMouse()
    win.close()
    
if __name__ == '__main__':
    main()

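IV. Summary

A genetic algorithm (GA) is a computational model of biological evolution that simulates the natural selection of Darwin's theory of evolution and the mechanisms of genetics; it searches for optimal solutions by imitating the process of natural evolution. Its core elements are parameter encoding, the choice of the initial population, the design of the fitness function, the design of the genetic operations, and the control parameters. Selection, crossover, and mutation make up the genetic operations.
Compared with traditional optimization algorithms: a traditional algorithm iterates from a single initial value toward an optimum and can easily get trapped in a local optimum. A genetic algorithm starts its search from a set of strings (a population), covers more of the search space, and is therefore better suited to global optimization. Genetic algorithms are also self-organizing, self-adaptive, and self-learning: when the information gained during evolution is used to organize the search, individuals with higher fitness have a higher probability of survival and obtain gene structures better adapted to the environment. Genetic algorithms have shortcomings as well. They need a good encoding, yet an encoding is not guaranteed to be well-formed and may represent solutions inaccurately. They are also usually less efficient than traditional optimization methods and need more time to train.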

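References:

The GitHub repository of instructor Du at Wuhan Textile University
This article was written after completing Du Xiaoqin's course; parts of its content are borrowed from that course.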
