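Learning to Solve the TSP with a Genetic Algorithm
(part of my series on learning by annotating code borrowed from others)
A genetic algorithm (GA) is a heuristic evolutionary algorithm. In short, it is survival of the fittest: individuals in the population are screened by their fitness, new individuals are produced through crossover and mutation, and over successive generations the population is driven toward better solutions of the optimization problem. A GA does not guarantee the true optimum; the aim is a population of very good solutions.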
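Genetic algorithm modeling workflow
1. Population initialization and encoding (code and decode): Translate the description of each individual in the real problem into a representation the computer can work with, such as a binary encoding or whatever comes out of some other preprocessing step. The encoding should not be too complicated, otherwise decoding becomes a pain as well, and it should make the genetic operators easy to apply.
2. Fitness function (fitness): In survival of the fittest, the fitness function decides who counts as strong. This is where you bring in the factors you think matter for the problem, for example the constraints of a multi-objective problem, and you can also add penalty terms.
3. Evolution operators (evolution): selection, crossover, mutation. Survival of the fittest, inheritance and evolution.
- Selection: It is easy to mistake this for a greedy "pick the best" step. If you always picked the best, the selection step would become deterministic. Instead, good and bad individuals should all have a chance of being chosen, with the good ones having a higher chance. The first ingredient is a probability assignment method, such as fitness-proportionate or rank-based assignment; individuals are then drawn according to the resulting distribution. Common schemes are roulette wheel selection and elitism (elite preservation), and the two are usually combined, which keeps the best individuals from being destroyed while preserving genetic diversity.
- Crossover: one-point, two-point, and multi-point crossover. (Sob, this is the hardest part of the code for me; I feel like such a beginner.)
- Mutation: There are many mutation methods, depending on the problem. The mutation probability should not be too high; its effect is to help the search avoid getting stuck in a local optimum.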
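To make the "roulette wheel plus elitism" combination from the selection step concrete, here is a minimal sketch. It is my own illustration, not part of the tutorial code further down: the function name `select_with_elitism` and the parameter `n_elite` are made up for this example, and it assumes all fitness values are positive.

```python
import numpy as np

def select_with_elitism(pop, fitness, n_elite=1, rng=None):
    """Roulette wheel selection that also copies the n_elite fittest rows unchanged.

    Hypothetical helper for illustration; assumes every fitness value is > 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    fitness = np.asarray(fitness, dtype=float)
    probs = fitness / fitness.sum()                       # fitness-proportionate probabilities
    elite_idx = np.argsort(fitness)[len(pop) - n_elite:]  # indices of the n_elite best individuals
    drawn_idx = rng.choice(len(pop), size=len(pop) - n_elite, replace=True, p=probs)
    # elites come first, followed by the individuals drawn on the roulette wheel
    return np.vstack([pop[elite_idx], pop[drawn_idx]])

pop = np.array([[0, 1, 2], [2, 1, 0], [1, 0, 2], [1, 2, 0]])
fitness = np.array([1.0, 4.0, 2.0, 3.0])
new_pop = select_with_elitism(pop, fitness, n_elite=1, rng=np.random.default_rng(0))
```

The elite row is copied as-is, so the best route found so far cannot be lost, while the remaining rows are still drawn at random and weaker individuals keep a chance of surviving.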
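The TSP
Next, the TSP itself. The Traveling Salesman Problem (TSP) is one of the famous problems in mathematics. Suppose a traveling salesman has to visit n cities. He must choose a route, with the constraint that each city is visited exactly once and that he finally returns to the city he started from. The goal is to find the route whose total length is the smallest among all possible routes.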
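As a small illustration of the problem statement, the sketch below computes the length of a closed tour and finds the shortest one by brute force. The names `tour_length` and `brute_force_tsp` and the four-city square are assumptions made for this example. With the start city fixed, the search walks through (n-1)! orderings, so it is only usable for a handful of cities, which is exactly why heuristics such as a genetic algorithm are attractive.

```python
import itertools
import numpy as np

def tour_length(order, coords):
    """Length of the closed tour that visits coords in the given order and returns to the start."""
    pts = coords[list(order)]
    legs = np.roll(pts, -1, axis=0) - pts  # vector from each city to the next; the last one wraps to the first
    return float(np.sqrt((legs ** 2).sum(axis=1)).sum())

def brute_force_tsp(coords):
    """Try every ordering with city 0 fixed as the start; feasible only for very small n."""
    n = len(coords)
    best_order, best_len = None, float("inf")
    for rest in itertools.permutations(range(1, n)):
        order = (0,) + rest
        length = tour_length(order, coords)
        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len

# Four cities on the corners of a unit square (made-up data)
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
best_order, best_len = brute_force_tsp(square)
```

For the unit square the shortest closed tour walks around the perimeter, so `best_len` comes out as 4.0.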
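Now consider the questions raised above:
1. Encoding: The problem consists of a start point, an end point, and a number of intermediate cities to pass through, each with its own coordinates. A direct encoding is used: the order of the points represents a route. Decoding means looking up the (x, y) coordinates of each point.
2. Fitness: The goal is to make the distance traveled as short as possible. The coordinates of every point and the visiting order are known, so summing the distances between consecutive points gives the total route length. A minimization problem is usually converted into a maximization problem, but after taking the reciprocal the fitness values of different individuals differ only slightly, so the code raises e to a scaled reciprocal, exp(2 * DNA_size / total_distance), to widen the gaps.
3. Evolution operators: Selection uses fitness-proportionate assignment: each individual's selection probability is proportional to its fitness and is obtained by normalization, and roulette wheel selection then picks the parents of the next generation. For crossing two routes, Morvan (莫烦) randomly picks some positions of one route to form a new array and takes the remaining cities from a different route, skipping duplicates so that every city is still visited. Mutation randomly picks two points and swaps them.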
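To see why the exponential is used in the fitness step, the sketch below compares the selection probabilities produced by a plain reciprocal with those produced by exp(2 * n_cities / total_distance), the form used in the tutorial code. The three route lengths are made-up numbers and `selection_probs` is a hypothetical helper.

```python
import numpy as np

def selection_probs(fitness):
    """Normalize fitness values into roulette wheel probabilities."""
    return fitness / fitness.sum()

n_cities = 20
total_distance = np.array([8.0, 10.0, 12.0])  # made-up route lengths for three individuals

reciprocal_fitness = 1.0 / total_distance
exp_fitness = np.exp(n_cities * 2 / total_distance)  # same form as the fitness in the tutorial code

p_reciprocal = selection_probs(reciprocal_fitness)
p_exp = selection_probs(exp_fitness)
```

With these numbers the shortest route is selected with probability about 0.41 under the reciprocal and about 0.64 under the exponential, so the exponential puts noticeably more selection pressure on good routes.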
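Talking about it is easy; the panic starts the moment you have to write it. I'm working through the code slowly and using blog posts to force myself along. Maybe reading it a few more times will make it click, right? Yeah, right.
Code source: https://github.com/MorvanZhou/Evolutionary-Algorithm/blob/master/tutorial-contents/Genetic%20Algorithm/Travel%20Sales%20Person.py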
import numpy as np
import matplotlib.pyplot as plt
N_CITIES = 20
CROSS_RATE = 0.1
MUTATE_RATE = 0.04
POP_SIZE = 500
N_GENERATIONS = 500
class GA(object):
    def __init__(self, DNA_size, cross_rate, mutation_rate, pop_size):  # __init__ is the class constructor
        # attach the settings to the GA instance
        self.DNA_size = DNA_size            # DNA length (number of cities)
        self.cross_rate = cross_rate        # DNA crossover probability
        self.mutation_rate = mutation_rate  # DNA mutation probability
        self.pop_size = pop_size            # size of the population
        # initialize the DNA of every individual in the population
        self.pop = np.vstack([np.random.permutation(DNA_size) for _ in range(pop_size)])
        # np.random.permutation(DNA_size): a random ordering of 0..DNA_size-1, i.e. one random route
        # for _ in range(pop_size): repeat pop_size times; np.vstack stacks the rows into a
        # (pop_size, DNA_size) matrix, which is the first generation
        """
        example:
        np.vstack([np.random.permutation(4) for _ in range(3)])
        # output (random, so yours will differ):
        array([[1, 2, 0, 3],
               [1, 0, 3, 2],
               [1, 2, 0, 3]])
        """

    def translate_DNA(self, DNA, city_position):
        '''
        look up the coordinates (city_position) of the cities in each route (DNA), in visiting order;
        with the coordinates, fitness can be calculated easily
        '''
        line_x = np.empty_like(DNA, dtype=np.float64)
        line_y = np.empty_like(DNA, dtype=np.float64)  # uninitialized arrays, filled in below
        for i, d in enumerate(DNA):  # i: index of the individual, d: its route (a row of city indices)
            city_coord = city_position[d]  # (DNA_size, 2) array of (x, y) rows from env.city_position, in route order
            line_x[i, :] = city_coord[:, 0]  # split into x and y so get_fitness can compute the total distance
            line_y[i, :] = city_coord[:, 1]
        return line_x, line_y

    def get_fitness(self, line_x, line_y):
        '''
        calculate the total distance of every route and turn it into a fitness value
        '''
        total_distance = np.empty((line_x.shape[0],), dtype=np.float64)
        for i, (xs, ys) in enumerate(zip(line_x, line_y)):  # again, enumerate is a neat way to walk through the rows
            # np.diff gives the DNA_size - 1 legs between consecutive cities;
            # note that the leg from the last city back to the first is not included here
            total_distance[i] = np.sum(np.sqrt(np.square(np.diff(xs)) + np.square(np.diff(ys))))
        fitness = np.exp(self.DNA_size * 2 / total_distance)
        # We want the route with the smallest total_distance. A minimization problem is usually turned
        # into a maximization problem by taking the reciprocal or the negative.
        # After taking the reciprocal the fitness values differ only slightly, so e raised to
        # (2 * DNA_size / total_distance) is used as the fitness to widen the differences.
        return fitness, total_distance

    def select(self, fitness):
        '''
        (Writing comments in English is really hard; this is my best attempt.)
        "survival of the fittest"
        roulette wheel selection
        select the parents of the next population
        according to fitness: a bigger value means a higher probability of being chosen
        '''
        idx = np.random.choice(np.arange(self.pop_size), size=self.pop_size, replace=True, p=fitness / fitness.sum())
        # np.random.choice randomly draws values from the given array
        # p: probability of selection, obtained by normalization (fitness / fitness.sum()); this implements the roulette wheel
        # WARNING: probabilities can't be negative
        # replace=True: sampling with replacement (bootstrap sampling)
        return self.pop[idx]

    def crossover(self, parent, pop):
        '''
        genetic operator: crossover
        key parameters: cross_rate=0.1, cross_points
        '''
        if np.random.rand() < self.cross_rate:  # a neat way to apply crossover with probability cross_rate (0.1)
            i_ = np.random.randint(0, self.pop_size, size=1)  # pick another individual from pop as the mate
            cross_points = np.random.randint(0, 2, size=self.DNA_size).astype(bool)  # random 0/1 sequence such as [0, 1, 0, 1, 0, 1, 1], converted to a boolean array like [False, True, ...]
            keep_city = parent[~cross_points]  # cities the parent keeps: cross_points marks the positions to replace, so ~ selects the rest (the mask is random, so the negation is only a naming convention)
            swap_city = pop[i_, np.isin(pop[i_].ravel(), keep_city, invert=True)]
            # parent[~cross_points]: a random subset of positions taken from this parent (think of it as the father)
            # pop[i_]: row i_ of pop (the mother), which supplies the cities that replace the rest
            # pop[i_].ravel(): flatten that row into a 1-D array
            # np.isin(): checks which of the mother's cities already appear in keep_city; returns a boolean array
            # invert=True: flips the result so that only cities NOT in keep_city are taken, which avoids
            # duplicates and guarantees every city is visited exactly once
            parent[:] = np.concatenate((keep_city, swap_city))  # join the two arrays along the existing axis to form a new individual
            # Boolean arrays can be used for indexing arrays; they must have the same length as the indexed axis.
            # Boolean arrays can be combined with operators such as &, ^ and ~.
        return parent

    def mutate(self, child):
        '''
        genetic operator: mutation
        method: randomly choose two points and swap their positions
        '''
        for point in range(self.DNA_size):  # the for loop walks over every position as a candidate first point
            if np.random.rand() < self.mutation_rate:
                swap_point = np.random.randint(0, self.DNA_size)  # randomly choose the second point
                swapA, swapB = child[point], child[swap_point]
                child[point], child[swap_point] = swapB, swapA  # swap the two cities
        return child

    def evolve(self, fitness):
        '''
        perform one generation of evolution by calling the genetic operators
        '''
        pop = self.select(fitness)
        pop_copy = pop.copy()
        for parent in pop:  # for every parent
            child = self.crossover(parent, pop_copy)
            child = self.mutate(child)
            parent[:] = child
        self.pop = pop
class TravelSalesPerson(object):
    def __init__(self, n_cities):
        self.city_position = np.random.rand(n_cities, 2)  # random (x, y) coordinates in [0, 1) for every city
        plt.ion()  # turn on interactive plotting mode

    def plotting(self, lx, ly, total_d):
        plt.cla()
        plt.scatter(self.city_position[:, 0].T, self.city_position[:, 1].T, s=100, c='k')
        plt.plot(lx.T, ly.T, 'r-')
        plt.text(-0.05, -0.05, "Total distance=%.2f" % total_d, fontdict={'size': 20, 'color': 'red'})
        plt.xlim((-0.1, 1.1))
        plt.ylim((-0.1, 1.1))
        plt.pause(0.01)
ga = GA(DNA_size=N_CITIES, cross_rate=CROSS_RATE, mutation_rate=MUTATE_RATE, pop_size=POP_SIZE)
env = TravelSalesPerson(N_CITIES)  # generate the city positions and set up plotting
for generation in range(N_GENERATIONS):
    lx, ly = ga.translate_DNA(ga.pop, env.city_position)
    fitness, total_distance = ga.get_fitness(lx, ly)
    ga.evolve(fitness)
    best_idx = np.argmax(fitness)
    print('Gen:', generation, '| best fit: %.2f' % fitness[best_idx])
    env.plotting(lx[best_idx], ly[best_idx], total_distance[best_idx])
plt.ioff()  # turn off interactive plotting mode
plt.show()
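Since the crossover step is the part of the listing that is hardest to read, here is a standalone sketch of the same idea on a single pair of routes, followed by a swap mutation. It is a simplified illustration rather than the tutorial's exact code: the function names, the eight-city routes and the use of `np.random.default_rng` are my own assumptions, and the mutation here swaps exactly one pair instead of testing every position against a mutation rate.

```python
import numpy as np

def crossover_keep_order(parent, other, rng):
    """Keep a random subset of parent's cities, then append the missing ones in the order they appear in other."""
    mask = rng.integers(0, 2, size=parent.size).astype(bool)   # True marks a position to be replaced
    keep_city = parent[~mask]                                  # cities kept from the parent
    swap_city = other[np.isin(other, keep_city, invert=True)]  # cities of other that are not kept yet
    return np.concatenate((keep_city, swap_city))

def swap_mutation(route, rng):
    """Return a copy of route with two randomly chosen positions swapped."""
    child = route.copy()
    i, j = rng.integers(0, child.size, size=2)
    child[i], child[j] = child[j], child[i]
    return child

rng = np.random.default_rng(42)
parent = np.arange(8)        # route 0, 1, ..., 7
other = rng.permutation(8)   # a second, random route
child = swap_mutation(crossover_keep_order(parent, other, rng), rng)

# The child must still visit every city exactly once
is_valid_route = sorted(child.tolist()) == list(range(8))
```

Because the kept cities and the appended cities are disjoint and together cover all cities, the child is always a valid permutation, and swapping two positions cannot break that.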
Original article: https://blog.csdn.net/weixin_39812065/article/details/111370719