Evolution Strategies From Scratch in Python

Latest recommended article published on 2023-09-21 18:13:58

Together_CZ


Original article: https://machinelearningmastery.com/evolution-strategies-from-scratch-in-python/


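[Translated from: Evolution Strategies From Scratch in Python]

[Note: I am a big fan of Jason Brownlee PhD's articles, so in my spare time I translate them and work through the examples. This post is a record of that work, and I hope it helps anyone who needs it.]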

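Evolution strategies is a stochastic global optimization algorithm. It is an evolutionary algorithm related to others, such as the genetic algorithm, although it is designed specifically for continuous function optimization.

In this tutorial, you will discover how to implement the evolution strategies optimization algorithm. After completing this tutorial, you will know:

Evolution strategies is a stochastic global optimization algorithm inspired by the biological theory of evolution by natural selection.
There is a standard terminology for evolution strategies, and two common versions of the algorithm are referred to as (mu, lambda)-ES and (mu + lambda)-ES.
How to implement the evolution strategies algorithm and apply it to continuous objective functions.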

Tutorial Overview

This tutorial is divided into three parts; they are:

Evolution Strategies
Develop a (mu, lambda)-ES
Develop a (mu + lambda)-ES

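Evolution Strategies

Evolution strategies, sometimes referred to as evolution strategy (singular) or ES, is a stochastic global optimization algorithm. The technique was developed in the 1960s to be carried out manually by engineers searching for minimal-drag designs in a wind tunnel.

Evolution strategies is a type of evolutionary algorithm and is inspired by the biological theory of evolution by natural selection. Unlike other evolutionary algorithms, it does not use any form of crossover. Instead, modification of candidate solutions is limited to mutation operators. In this way, evolution strategies may be thought of as a type of parallel stochastic hill climbing.

The algorithm involves a population of candidate solutions that are initially generated at random. Each iteration first evaluates the population of solutions, then deletes all but a subset of the best solutions, which is referred to as truncation selection. The remaining solutions (the parents) are each used as the basis for generating a number of new candidate solutions (mutation). These children either replace the parents or compete with them for a position in the population considered in the next iteration of the algorithm (generation).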

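There are a number of variations of this procedure, and a standard terminology is used to summarize the algorithm. The size of the population is referred to as lambda, and the number of parents selected each iteration is referred to as mu. The number of children created from each parent is calculated as (lambda / mu), and the parameters should be chosen so that the division has no remainder.

mu: The number of parents selected each iteration.
lambda: The size of the population.
lambda / mu: The number of children generated from each selected parent.

A bracket notation is used to describe the algorithm configuration, e.g. (mu, lambda)-ES. For example, if mu = 5 and lambda = 20, it would be summarized as (5, 20)-ES. A comma (,) separating the mu and lambda parameters indicates that the children replace the parents directly in each iteration of the algorithm.

(mu, lambda)-ES: A version of evolution strategies where children replace parents.

A plus (+) separating the mu and lambda parameters indicates that the children and the parents together define the population for the next iteration.

(mu + lambda)-ES: A version of evolution strategies where children and parents are both added to the population.

A stochastic hill climbing algorithm can be implemented as an evolution strategy and would have the notation (1 + 1)-ES. This is the simple or canonical ES algorithm, and many extensions and variants are described in the literature. Now that we are familiar with evolution strategies, we can explore how to implement the algorithm.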
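
To make the (1 + 1)-ES notation concrete, here is a minimal sketch of that special case: one parent produces one child, and the better of the two survives. The sphere() test function, the one_plus_one_es() name and the chosen step size are assumptions made for illustration and do not come from the original tutorial.

```python
# minimal (1 + 1)-ES sketch: one parent, one child, the better of the two survives
from random import Random

# simple convex test function (an assumption for illustration), minimum at [0, 0]
def sphere(v):
    return sum(x ** 2.0 for x in v)

# hypothetical helper, not part of the original tutorial
def one_plus_one_es(objective, start, n_iter, step_size, rng):
    parent, parent_eval = list(start), objective(start)
    for _ in range(n_iter):
        # mutate every variable with Gaussian noise centred on the parent
        child = [x + rng.gauss(0.0, step_size) for x in parent]
        child_eval = objective(child)
        # plus selection: the child replaces the parent only if it is no worse
        if child_eval <= parent_eval:
            parent, parent_eval = child, child_eval
    return parent, parent_eval

rng = Random(1)
start = [3.0, -2.0]
best, best_eval = one_plus_one_es(sphere, start, 200, 0.1, rng)
print('f(%s) = %.5f' % (best, best_eval))
```

Because the parent is only ever replaced by a child that is at least as good, the evaluation of the surviving solution can never get worse from one iteration to the next.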

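Develop a (mu, lambda)-ES

In this section, we will develop a (mu, lambda)-ES, that is, a version of the algorithm where children replace parents. First, let's define a challenging optimization problem as the basis for implementing the algorithm. The Ackley function is an example of a multimodal objective function that has a single global optimum and multiple local optima in which a local search might get stuck. As such, a global optimization technique is required. It is a two-dimensional objective function that has a global optimum at [0,0], which evaluates to 0.0. The example below implements the Ackley function and creates a three-dimensional surface plot showing the global optimum and multiple local optima.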

# ackley multimodal function
from numpy import arange
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import meshgrid
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D

# objective function
def objective(x, y):
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
# sample input range uniformly at 0.1 increments
xaxis = arange(r_min, r_max, 0.1)
yaxis = arange(r_min, r_max, 0.1)
# create a mesh from the axis
x, y = meshgrid(xaxis, yaxis)
# compute targets
results = objective(x, y)
# create a surface plot with the jet color scheme
figure = pyplot.figure()
# gca() no longer accepts a projection argument in recent Matplotlib releases
axis = figure.add_subplot(projection='3d')
axis.plot_surface(x, y, results, cmap='jet')
# show the plot
pyplot.show()

Running the example creates a surface plot of the Ackley function showing the vast number of local optima.
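
As a quick numerical sanity check alongside the plot, the sketch below evaluates the same formula at the global optimum and at one point away from it. It uses only the standard library; the name ackley() is chosen here to avoid clashing with objective() and is not part of the original listing.

```python
# evaluate the Ackley formula at two points using only the standard library
from math import exp, sqrt, cos, e, pi

# same expression as objective(x, y) above, restated for scalar inputs
def ackley(x, y):
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# at the origin the terms cancel, up to floating-point rounding
print('f(0, 0) = %.6f' % ackley(0.0, 0.0))
# away from the origin the value is clearly positive
print('f(1, 1) = %.6f' % ackley(1.0, 1.0))
```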

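We will be generating random candidate solutions as well as modified versions of existing candidate solutions. It is important that all candidate solutions are within the bounds of the search problem.

To achieve this, we will develop a function that checks whether a candidate solution is within the bounds of the search. If it is not, we discard it and generate another solution.

The in_bounds() function below takes a candidate solution (point) and the definition of the bounds of the search space (bounds) and returns True if the solution is within the bounds of the search, or False otherwise.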

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True
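
The same check can also be written without an explicit loop. The sketch below is a hypothetical vectorised variant, not part of the original tutorial, and it assumes bounds is a NumPy array of shape (n_dims, 2) holding the lower and upper limit of each dimension.

```python
from numpy import asarray

# hypothetical vectorised variant of in_bounds(); assumes bounds has shape (n_dims, 2)
def in_bounds_vectorized(point, bounds):
    point = asarray(point)
    # every coordinate must lie between its lower and upper limit (inclusive)
    return bool(((point >= bounds[:, 0]) & (point <= bounds[:, 1])).all())

bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
print(in_bounds_vectorized([0.5, -4.9], bounds))  # True
print(in_bounds_vectorized([0.5, 5.1], bounds))   # False
```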

We can then use this function when generating the initial population of "lam" (i.e. lambda) random candidate solutions.

For example:

# initial population
population = list()
for _ in range(lam):
	candidate = None
	while candidate is None or not in_bounds(candidate, bounds):
		candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	population.append(candidate)

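Next, we can run the algorithm for a fixed number of iterations. Each iteration first involves evaluating each candidate solution in the population.

We will calculate the scores and store them in a separate parallel list.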

# evaluate fitness for the population
scores = [objective(c) for c in population]

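Next, we need to select the "mu" parents with the best scores, which are the lowest scores in this case because we are minimizing the objective function.

We will do this in two steps. First, we rank the candidate solutions by their scores in ascending order, so that the solution with the lowest score has a rank of 0, the next has a rank of 1, and so on. We can achieve this with a double call to the argsort function.

We then use the ranks and select those parents that have a rank below the value of "mu". This means that if mu is set to 5 to select 5 parents, only those parents with a rank between 0 and 4 are selected.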

# rank scores in ascending order
ranks = argsort(argsort(scores))
# select the indexes for the top mu ranked solutions
selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
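
The double argsort can be hard to read at first, so here is a small worked example. The four scores and mu = 2 are made-up values for illustration only.

```python
from numpy import argsort

# made-up scores for four candidates (illustration only)
scores = [3.2, 0.5, 7.1, 1.9]
mu = 2

# first argsort: indexes of the candidates ordered from best to worst
order = argsort(scores)
# second argsort: the rank of each candidate, in its original position
ranks = argsort(order)
# indexes of the candidates whose rank is below mu
selected = [i for i, _ in enumerate(ranks) if ranks[i] < mu]

print(order.tolist())   # [1, 3, 0, 2]
print(ranks.tolist())   # [2, 0, 3, 1]
print(selected)         # [1, 3]
```

The first mu entries of order contain the same indexes, so argsort(scores)[:mu] would be an equivalent way to pick the parents; the tutorial's rank-based form simply keeps them in population order.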

We can then create children for each selected parent. First, we must calculate the number of children to create per parent.

# calculate the number of children per parent
n_children = int(lam / mu)
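
Note that int(lam / mu) silently rounds down when lam is not a multiple of mu, which would leave the next population smaller than lam. The hypothetical helper below, which is not part of the original tutorial, is one way to make the remainder-free requirement explicit.

```python
# hypothetical helper (not from the original tutorial): children per parent
def children_per_parent(mu, lam):
    # divmod returns the quotient and the remainder in one step
    n_children, remainder = divmod(lam, mu)
    if remainder != 0:
        raise ValueError('lam (%d) must be a multiple of mu (%d)' % (lam, mu))
    return n_children

# a (5, 20)-ES creates 4 children from each of the 5 parents
print(children_per_parent(5, 20))  # 4
```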

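We can then iterate over each parent and create modified versions of it.

We will create children using a technique similar to the one used in stochastic hill climbing. Specifically, each variable is sampled from a Gaussian distribution with the current value as the mean and the standard deviation provided as a "step size" hyperparameter.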

# create children for parent
for _ in range(n_children):
	child = None
	while child is None or not in_bounds(child, bounds):
		child = population[i] + randn(len(bounds)) * step_size
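
The mutation step can also be isolated in a small function, which makes the resampling behaviour near a boundary easier to see. The sketch below is an assumption-laden variant: the mutate() name is hypothetical, and it uses NumPy's default_rng() generator rather than the randn() function used in the tutorial.

```python
from numpy import asarray
from numpy.random import default_rng

# hypothetical helper: Gaussian mutation that resamples until the child is in bounds
def mutate(parent, bounds, step_size, rng):
    lower, upper = bounds[:, 0], bounds[:, 1]
    while True:
        # add zero-mean Gaussian noise with standard deviation step_size
        child = parent + rng.standard_normal(len(bounds)) * step_size
        if ((child >= lower) & (child <= upper)).all():
            return child

rng = default_rng(1)
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# a parent close to the upper bound of the first dimension
parent = asarray([4.95, 0.0])
children = [mutate(parent, bounds, 0.15, rng) for _ in range(5)]
for child in children:
    print(child)
```

Rejecting and resampling keeps every child inside the search space, at the cost of extra random draws when a parent sits near a boundary.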

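We can also check whether each selected parent is better than the best solution seen so far, so that we can return the best solution at the end of the search.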

# check if this parent is the best solution ever seen
if scores[i] < best_eval:
	best, best_eval = population[i], scores[i]
	print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))

The created children can be added to a list, and at the end of each iteration we can replace the population with the list of children.

# replace population with children
population = children

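We can tie all of this together into a function named es_comma() that performs the comma version of the evolution strategies algorithm.

The function takes the name of the objective function, the bounds of the search space, the number of iterations, the step size, and the mu and lambda hyperparameters, and returns the best solution found during the search together with its evaluation.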

# evolution strategy (mu, lambda) algorithm
def es_comma(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]

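Next, we can apply this algorithm to our Ackley objective function.

We will run the algorithm for 5,000 iterations and use a step size of 0.15 in the search space. We will use a population size (lambda) of 100 and select 20 parents (mu). These hyperparameters were chosen after a little trial and error.

At the end of the search, we will report the best candidate solution found during the search.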

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu, lambda) search
best, score = es_comma(objective, bounds, n_iter, step_size, mu, lam)
print('Done!')
print('f(%s) = %f' % (best, score))

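Tying this together, the complete example of applying the comma version of the evolution strategies algorithm to the Ackley objective function is listed below.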

# evolution strategy (mu, lambda) of the ackley objective function
from numpy import asarray
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import argsort
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed

# objective function
def objective(v):
	x, y = v
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True

# evolution strategy (mu, lambda) algorithm
def es_comma(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]


# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu, lambda) search
best, score = es_comma(objective, bounds, n_iter, step_size, mu, lam)
print('Done!')
print('f(%s) = %f' % (best, score))

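Running the example reports the candidate solution and its score each time a better solution is found, then reports the best solution found at the end of the search.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that about 22 improvements in performance were seen during the search and that the best solution is close to the optimum.

This solution could then be provided as the starting point for a local search algorithm to be refined further, which is a common practice when using a global optimization algorithm like ES.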

0, Best: f([-0.82977995 2.20324493]) = 6.91249
0, Best: f([-1.03232526 0.38816734]) = 4.49240
1, Best: f([-1.02971385 0.21986453]) = 3.68954
2, Best: f([-0.98361735 0.19391181]) = 3.40796
2, Best: f([-0.98189724 0.17665892]) = 3.29747
2, Best: f([-0.07254927 0.67931431]) = 3.29641
3, Best: f([-0.78716147 0.02066442]) = 2.98279
3, Best: f([-1.01026218 -0.03265665]) = 2.69516
3, Best: f([-0.08851828 0.26066485]) = 2.00325
4, Best: f([-0.23270782 0.04191618]) = 1.66518
4, Best: f([-0.01436704 0.03653578]) = 0.15161
7, Best: f([0.01247004 0.01582657]) = 0.06777
9, Best: f([0.00368129 0.00889718]) = 0.02970
25, Best: f([ 0.00666975 -0.0045051 ]) = 0.02449
33, Best: f([-0.00072633 -0.00169092]) = 0.00530
211, Best: f([2.05200123e-05 1.51343187e-03]) = 0.00434
315, Best: f([ 0.00113528 -0.00096415]) = 0.00427
418, Best: f([ 0.00113735 -0.00030554]) = 0.00337
491, Best: f([ 0.00048582 -0.00059587]) = 0.00219
704, Best: f([-6.91643854e-04 -4.51583644e-05]) = 0.00197
1504, Best: f([ 2.83063223e-05 -4.60893027e-04]) = 0.00131
3725, Best: f([ 0.00032757 -0.00023643]) = 0.00115
Done!
f([ 0.00032757 -0.00023643]) = 0.001147
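
As a rough illustration of that refinement idea, the sketch below starts a simple stochastic hill climber from the final solution printed above. The hill climber is only a stand-in for a proper local search; the refine() name, the 1,000 iterations and the step size of 0.0001 are assumptions chosen for illustration, not values from the original tutorial.

```python
from math import exp, sqrt, cos, e, pi
from random import Random

# Ackley objective, restated with the standard library so the sketch is self-contained
def objective(v):
    x, y = v
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# hypothetical local refinement: accept a small Gaussian step only if it improves
def refine(objective, start, n_iter, step_size, rng):
    best, best_eval = list(start), objective(start)
    for _ in range(n_iter):
        candidate = [x + rng.gauss(0.0, step_size) for x in best]
        candidate_eval = objective(candidate)
        if candidate_eval < best_eval:
            best, best_eval = candidate, candidate_eval
    return best, best_eval

# final solution reported by the (mu, lambda)-ES run above (rounded as printed)
start = [0.00032757, -0.00023643]
start_eval = objective(start)
best, best_eval = refine(objective, start, 1000, 0.0001, Random(1))
print('start:   f(%s) = %f' % (start, start_eval))
print('refined: f(%s) = %f' % (best, best_eval))
```

Since only improving steps are accepted, the refined evaluation is never worse than the starting one; how much it improves depends on the step size and the random seed.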

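Now that we are familiar with how to implement the comma version of evolution strategies, let's look at how we might implement the plus version.

Develop a (mu + lambda)-ES

The plus version of the evolution strategies algorithm is very similar to the comma version. The main difference is that children and parents together make up the population at the end of each iteration, instead of just the children. This allows the parents to compete with the children for selection in the next iteration of the algorithm.

This can result in a greedier behavior by the search algorithm and potentially premature convergence to local optima (suboptimal solutions). The benefit is that the algorithm is able to exploit good candidate solutions that were found and focus intently on candidate solutions in that region, potentially finding further improvements.

We can implement the plus version of the algorithm by modifying the function to add parents to the population when creating the children.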

# keep the parent
children.append(population[i])
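
The effect of this one-line change on the population can be sketched in isolation. The next_population() helper and the string placeholders below are hypothetical and only illustrate the difference between the two replacement schemes.

```python
# hypothetical helper contrasting the two replacement schemes
def next_population(parents, children, keep_parents):
    # plus version: parents survive alongside their children
    if keep_parents:
        return list(parents) + list(children)
    # comma version: only the children survive
    return list(children)

# placeholder candidates for a (2, 4) configuration
parents = ['p0', 'p1']
children = ['c0', 'c1', 'c2', 'c3']
print(len(next_population(parents, children, False)))  # 4
print(len(next_population(parents, children, True)))   # 6
```

With the tutorial's settings of mu = 20 and lam = 100, this means the plus version evaluates 120 candidates per iteration after the first, rather than 100.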

The updated version of the function with this addition, and with the new name es_plus(), is listed below.

# evolution strategy (mu + lambda) algorithm
def es_plus(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# keep the parent
			children.append(population[i])
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]

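We can apply this version of the algorithm to the Ackley objective function with the same hyperparameters used in the previous section.

The complete example is listed below.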

# evolution strategy (mu + lambda) of the ackley objective function
from numpy import asarray
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import argsort
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed

# objective function
def objective(v):
	x, y = v
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True

# evolution strategy (mu + lambda) algorithm
def es_plus(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# keep the parent
			children.append(population[i])
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu + lambda) search
best, score = es_plus(objective, bounds, n_iter, step_size, mu, lam)
print('Done!')
print('f(%s) = %f' % (best, score))

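Running the example reports the candidate solution and its score each time a better solution is found, then reports the best solution found at the end of the search.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that about 24 improvements in performance were seen during the search. We can also see that a better final solution was found for this objective function, with an evaluation of 0.000532, compared to 0.001147 found by the comma version.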

0, Best: f([-0.82977995 2.20324493]) = 6.91249
0, Best: f([-1.03232526 0.38816734]) = 4.49240
1, Best: f([-1.02971385 0.21986453]) = 3.68954
2, Best: f([-0.96315064 0.21176994]) = 3.48942
2, Best: f([-0.9524528 -0.19751564]) = 3.39266
2, Best: f([-1.02643442 0.14956346]) = 3.24784
2, Best: f([-0.90172166 0.15791013]) = 3.17090
2, Best: f([-0.15198636 0.42080645]) = 3.08431
3, Best: f([-0.76669476 0.03852254]) = 3.06365
3, Best: f([-0.98979547 -0.01479852]) = 2.62138
3, Best: f([-0.10194792 0.33439734]) = 2.52353
3, Best: f([0.12633886 0.27504489]) = 2.24344
4, Best: f([-0.01096566 0.22380389]) = 1.55476
4, Best: f([0.16241469 0.12513091]) = 1.44068
5, Best: f([-0.0047592 0.13164993]) = 0.77511
5, Best: f([ 0.07285478 -0.0019298 ]) = 0.34156
6, Best: f([-0.0323925 -0.06303525]) = 0.32951
6, Best: f([0.00901941 0.0031937 ]) = 0.02950
32, Best: f([ 0.00275795 -0.00201658]) = 0.00997
109, Best: f([-0.00204732 0.00059337]) = 0.00615
195, Best: f([-0.00101671 0.00112202]) = 0.00434
555, Best: f([ 0.00020392 -0.00044394]) = 0.00139
2804, Best: f([3.86555110e-04 6.42776651e-05]) = 0.00111
4357, Best: f([ 0.00013889 -0.0001261 ]) = 0.00053
Done!
f([ 0.00013889 -0.0001261 ]) = 0.000532