
[Translated from: Evolution Strategies From Scratch in Python]

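[Note: I am a big fan of Jason Brownlee PhD's articles, so in my spare time I translate them and work through the examples. This post is a record of that work, and I hope it helps anyone who needs it.]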



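There is a standard terminology for Evolution Strategies, and there are two common versions of the algorithm, referred to as (mu, lambda)-ES and (mu + lambda)-ES.
How to implement the Evolution Strategies algorithm and apply it to continuous objective functions.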


This tutorial is divided into three parts; they are:

Evolution Strategies
Develop a (mu, lambda)-ES
Develop a (mu + lambda)-ES





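There are many variations of this procedure, and a standard terminology is used to summarize the algorithm. The population size is called lambda, and the number of parents selected in each iteration is called mu. The number of children created from each parent is calculated as (lambda / mu), and the parameters should be chosen so that the division has no remainder.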

lambda / mu: the number of children generated from each selected parent.
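Bracket notation is used to describe the algorithm configuration, e.g. (mu, lambda)-ES. For example, if mu = 5 and lambda = 20, the configuration is summarized as (5,20)-ES. A comma (,) separating the mu and lambda parameters indicates that the children directly replace the parents in each iteration of the algorithm.
(mu, lambda)-ES: a version of evolution strategies in which the children replace the parents.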
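A plus sign (+) separating the mu and lambda parameters indicates that the children and the parents together define the population for the next iteration.
(mu + lambda)-ES: a version of evolution strategies in which both the children and the parents are added to the population.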
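To make the notation concrete, here is a small sketch that builds the configuration string and the number of children per parent. The helper name describe_es and its plus flag are hypothetical and exist only for this illustration.

```python
# hypothetical helper: summarize an ES configuration in bracket notation
def describe_es(mu, lam, plus=False):
    # the parameters should divide evenly so every parent has the same number of children
    if lam % mu != 0:
        raise ValueError('lambda must be divisible by mu')
    n_children = lam // mu
    separator = '+' if plus else ','
    name = '(%d%s%d)-ES' % (mu, separator, lam)
    return name, n_children

# the mu = 5, lambda = 20 example from the text
name, n_children = describe_es(5, 20)
print(name, 'children per parent:', n_children)
```

With mu = 5 and lambda = 20, each selected parent produces four children.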
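The stochastic hill climbing algorithm can be implemented as an Evolution Strategy and would have the notation (1 + 1)-ES. This is the canonical ES algorithm, and many extensions and variants are described in the literature. Now that we are familiar with Evolution Strategies, we can explore how to implement the algorithm.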
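The (1 + 1)-ES mentioned above can be sketched in a few lines: one parent produces one child per iteration, and the better of the two survives. This is only an illustrative sketch. The sphere test function, the function name one_plus_one_es, and the starting point are assumptions chosen for the example and are not part of the tutorial.

```python
# minimal (1 + 1)-ES sketch: stochastic hill climbing with Gaussian steps
from numpy import asarray
from numpy.random import randn
from numpy.random import seed

# assumed test function: sum of squares, with its minimum of 0.0 at the origin
def sphere(v):
    return float((v ** 2).sum())

def one_plus_one_es(objective, start, n_iter, step_size):
    parent, parent_eval = start, objective(start)
    for _ in range(n_iter):
        # one child is created from the single parent
        child = parent + randn(len(parent)) * step_size
        child_eval = objective(child)
        # plus selection: the parent survives unless the child is at least as good
        if child_eval <= parent_eval:
            parent, parent_eval = child, child_eval
    return parent, parent_eval

seed(1)
start = asarray([3.0, -2.0])
best, best_eval = one_plus_one_es(sphere, start, 1000, 0.1)
print('f(%s) = %f' % (best, best_eval))
```

Because the parent is only ever replaced by a child that is at least as good, the score can never get worse from one iteration to the next.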


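In this section, we will develop a (mu, lambda)-ES, the version of the algorithm in which the children replace the parents. First, let's define a challenging optimization problem as the basis for implementing the algorithm. The Ackley function is an example of a multimodal objective function that has a single global optimum and multiple local optima in which a local search might get stuck. As such, a global optimization technique is required. It is a two-dimensional objective function whose global optimum is at [0,0], which evaluates to 0.0. The example below implements the Ackley function and creates a three-dimensional surface plot showing the global optimum and the multiple local optima.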

# ackley multimodal function
from numpy import arange
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import meshgrid
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D

# objective function
def objective(x, y):
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
# sample input range uniformly at 0.1 increments
xaxis = arange(r_min, r_max, 0.1)
yaxis = arange(r_min, r_max, 0.1)
# create a mesh from the axis
x, y = meshgrid(xaxis, yaxis)
# compute targets
results = objective(x, y)
# create a surface plot with the jet color scheme
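figure = pyplot.figure()
# add_subplot replaces figure.gca(projection='3d'), which newer Matplotlib releases no longer accept
axis = figure.add_subplot(projection='3d')
axis.plot_surface(x, y, results, cmap='jet')
# show the plot
pyplot.show()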
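As a quick numerical check of the claims about the surface, the sketch below evaluates the same objective at the global optimum [0,0] and at [1,0], a point close to one of the surrounding local optima. The choice of [1,0] as the second point is an assumption made for this illustration.

```python
# evaluate the ackley function at the global optimum and near a local optimum
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi

# objective function (same definition as in the plotting example)
def objective(x, y):
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

at_origin = objective(0.0, 0.0)
near_local = objective(1.0, 0.0)
print('f(0, 0) = %.5f' % at_origin)
print('f(1, 0) = %.5f' % near_local)
```

The value at the origin is zero up to floating point error, while the value near the local optimum is clearly worse. A search that only takes small steps can therefore stall there.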


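We will generate random candidate solutions as well as modified versions of existing candidate solutions. It is important that all candidate solutions are within the bounds of the search problem. The in_bounds() function below checks whether a candidate solution is within those bounds.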



# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True
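For illustration, the same check can also be written without an explicit loop by comparing the whole point against the lower and upper bounds at once. The name in_bounds_vectorized is hypothetical. It is an alternative sketch and not the version used in the rest of the tutorial.

```python
# loop-based and vectorized bounds checks side by side
from numpy import asarray

# check if a point is within the bounds of the search (version from the tutorial)
def in_bounds(point, bounds):
    for d in range(len(bounds)):
        if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
            return False
    return True

# hypothetical alternative: compare every dimension in a single expression
def in_bounds_vectorized(point, bounds):
    return bool(((point >= bounds[:, 0]) & (point <= bounds[:, 1])).all())

bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
inside = asarray([0.5, -4.9])
outside = asarray([0.5, 5.1])
print(in_bounds(inside, bounds), in_bounds_vectorized(inside, bounds))
print(in_bounds(outside, bounds), in_bounds_vectorized(outside, bounds))
```

Both versions treat the bounds as inclusive, so a point exactly on the boundary counts as valid.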

We can then use this function when generating the initial population of "lam" (i.e. lambda) random candidate solutions.


# initial population
population = list()
for _ in range(lam):
	candidate = None
	while candidate is None or not in_bounds(candidate, bounds):
		candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	# store the valid candidate in the population
	population.append(candidate)
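Since rand() draws values in [0, 1), scaling them to the range of each dimension already yields points inside the bounds. Under that assumption, the whole initial population can be sampled in a single call. This is a sketch of an alternative and not the approach used in the tutorial. The variable names lower, upper and samples are introduced here for illustration.

```python
# sample the entire initial population in one call
from numpy import asarray
from numpy.random import rand
from numpy.random import seed

seed(1)
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
lam = 100
lower, upper = bounds[:, 0], bounds[:, 1]
# one row per candidate, one column per dimension
samples = lower + rand(lam, len(bounds)) * (upper - lower)
# keep the population as a list of points, as in the tutorial
population = list(samples)
print(len(population), population[0])
```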

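Next, we can run the algorithm for a fixed number of iterations. Each iteration first involves evaluating every candidate solution in the population.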


# evaluate fitness for the population
scores = [objective(c) for c in population]


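Next, we need to select the mu parents with the best scores. We will do this in two steps. First, we rank the candidate solutions by score in ascending order, so that the solution with the lowest score has rank 0, the next has rank 1, and so on. We can achieve this with two calls to the argsort function.

Then we use the ranks to select those parents whose rank is below the value of mu. This means that if mu is set to 5 to select 5 parents, only the parents with a rank between 0 and 4 are selected.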

# rank scores in ascending order
ranks = argsort(argsort(scores))
# select the indexes for the top mu ranked solutions
selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
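The double argsort can be hard to picture, so here is a small worked example. The five scores and the value mu = 2 are made up for the illustration.

```python
# worked example of ranking with two calls to argsort
from numpy import argsort

scores = [3.2, 0.5, 1.7, 9.0, 0.1]
mu = 2
# first call: indexes that would sort the scores in ascending order
order = argsort(scores)
# second call: the rank of each score at its original position
ranks = argsort(order)
# select the indexes of the mu best (lowest) scores
selected = [i for i, _ in enumerate(ranks) if ranks[i] < mu]
print(order.tolist())
print(ranks.tolist())
print(selected)
```

The first call gives [4, 1, 2, 0, 3], the positions of the scores from lowest to highest. The second call gives [3, 1, 2, 4, 0], the rank of each score where it sits in the list. With mu = 2, the candidates at indexes 1 and 4 are selected.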


# calculate the number of children per parent
n_children = int(lam / mu)


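We will create children using a technique similar to the one used in stochastic hill climbing. Specifically, each variable is sampled from a Gaussian distribution with the current value as the mean and with the standard deviation provided as the "step size" hyperparameter.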

# create children for parent
for _ in range(n_children):
	child = None
	while child is None or not in_bounds(child, bounds):
		child = population[i] + randn(len(bounds)) * step_size
	# store the valid child
	children.append(child)
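The child creation step can be isolated into a small function to see the rejection loop at work. The name mutate and the parent placed near the corner of the search space are assumptions made for this sketch.

```python
# Gaussian mutation with rejection of out-of-bounds children
from numpy import asarray
from numpy.random import randn
from numpy.random import seed

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
    for d in range(len(bounds)):
        if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
            return False
    return True

# hypothetical helper: resample until the child falls inside the bounds
def mutate(parent, bounds, step_size):
    child = None
    while child is None or not in_bounds(child, bounds):
        child = parent + randn(len(bounds)) * step_size
    return child

seed(1)
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# a parent close to the corner, so that some samples are rejected
parent = asarray([4.95, -4.95])
children = [mutate(parent, bounds, 0.15) for _ in range(5)]
for child in children:
    print(child)
```

Resampling keeps every child valid, at the cost of extra draws when a parent sits close to the edge of the search space.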


# check if this parent is the best solution ever seen
if scores[i] < best_eval:
	best, best_eval = population[i], scores[i]
	print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))


# replace population with children
population = children

We can tie all of this together into a function named es_comma() that performs the comma version of the Evolution Strategies algorithm.


# evolution strategy (mu, lambda) algorithm
def es_comma(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		# store the valid candidate in the population
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				# store the valid child
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]


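We will run the algorithm for 5,000 iterations and use a step size of 0.15 in the search space. We will use a population size (lambda) of 100 and select 20 parents (mu). These hyperparameters were chosen after a little trial and error.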


# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu, lambda) search
best, score = es_comma(objective, bounds, n_iter, step_size, mu, lam)
print('f(%s) = %f' % (best, score))

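Tying this together, the complete example of applying the comma version of the Evolution Strategies algorithm to the Ackley objective function is listed below.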

# evolution strategy (mu, lambda) of the ackley objective function
from numpy import asarray
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import argsort
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed

# objective function
def objective(v):
	x, y = v
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True

# evolution strategy (mu, lambda) algorithm
def es_comma(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		# store the valid candidate in the population
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				# store the valid child
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu, lambda) search
best, score = es_comma(objective, bounds, n_iter, step_size, mu, lam)
print('f(%s) = %f' % (best, score))


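Running the example reports each new best candidate solution and its score as it is found, then reports the best solution found at the end of the search.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.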



0, Best: f([-0.82977995 2.20324493]) = 6.91249
0, Best: f([-1.03232526 0.38816734]) = 4.49240
1, Best: f([-1.02971385 0.21986453]) = 3.68954
2, Best: f([-0.98361735 0.19391181]) = 3.40796
2, Best: f([-0.98189724 0.17665892]) = 3.29747
2, Best: f([-0.07254927 0.67931431]) = 3.29641
3, Best: f([-0.78716147 0.02066442]) = 2.98279
3, Best: f([-1.01026218 -0.03265665]) = 2.69516
3, Best: f([-0.08851828 0.26066485]) = 2.00325
4, Best: f([-0.23270782 0.04191618]) = 1.66518
4, Best: f([-0.01436704 0.03653578]) = 0.15161
7, Best: f([0.01247004 0.01582657]) = 0.06777
9, Best: f([0.00368129 0.00889718]) = 0.02970
25, Best: f([ 0.00666975 -0.0045051 ]) = 0.02449
33, Best: f([-0.00072633 -0.00169092]) = 0.00530
211, Best: f([2.05200123e-05 1.51343187e-03]) = 0.00434
315, Best: f([ 0.00113528 -0.00096415]) = 0.00427
418, Best: f([ 0.00113735 -0.00030554]) = 0.00337
491, Best: f([ 0.00048582 -0.00059587]) = 0.00219
704, Best: f([-6.91643854e-04 -4.51583644e-05]) = 0.00197
1504, Best: f([ 2.83063223e-05 -4.60893027e-04]) = 0.00131
3725, Best: f([ 0.00032757 -0.00023643]) = 0.00115
f([ 0.00032757 -0.00023643]) = 0.001147
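The note above suggests running the example several times and comparing the average result. One way to do that is sketched below. It uses a quiet copy of the comma algorithm without the progress printing, only 100 iterations per run to keep it fast, and seeds 1 to 5. The function name es_comma_quiet, the shorter run length and the choice of seeds are all assumptions made for this illustration.

```python
# repeat the (mu, lambda) search with several seeds and average the final scores
from numpy import asarray, exp, sqrt, cos, e, pi, argsort, mean, std
from numpy.random import randn, rand, seed

# objective function
def objective(v):
    x, y = v
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
    return bool(((point >= bounds[:, 0]) & (point <= bounds[:, 1])).all())

# hypothetical quiet variant of es_comma(): same search, no printing
def es_comma_quiet(objective, bounds, n_iter, step_size, mu, lam):
    best, best_eval = None, 1e+10
    n_children = int(lam / mu)
    # initial population
    population = list()
    for _ in range(lam):
        candidate = None
        while candidate is None or not in_bounds(candidate, bounds):
            candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
        population.append(candidate)
    # perform the search
    for epoch in range(n_iter):
        scores = [objective(c) for c in population]
        ranks = argsort(argsort(scores))
        selected = [i for i, _ in enumerate(ranks) if ranks[i] < mu]
        children = list()
        for i in selected:
            # track the best solution seen so far
            if scores[i] < best_eval:
                best, best_eval = population[i], scores[i]
            for _ in range(n_children):
                child = None
                while child is None or not in_bounds(child, bounds):
                    child = population[i] + randn(len(bounds)) * step_size
                children.append(child)
        # children replace the parents
        population = children
    return [best, best_eval]

bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
final_scores = list()
for s in range(1, 6):
    seed(s)
    _, score = es_comma_quiet(objective, bounds, 100, 0.15, 20, 100)
    final_scores.append(score)
print('mean=%.5f std=%.5f' % (mean(final_scores), std(final_scores)))
```

Comparing the mean and standard deviation across seeds gives a fairer picture of a configuration than a single run does.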


Develop a (mu + lambda)-ES

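The plus version of the Evolution Strategies algorithm is very similar to the comma version. The main difference is that the children and the parents together make up the population at the end of each iteration, rather than just the children. This lets the parents compete with the children for selection in the next iteration of the algorithm. It can make the search greedier and may cause premature convergence to a local optimum (a suboptimal solution). The benefit is that the algorithm can exploit good candidate solutions that have already been found and focus on candidates in that region, possibly finding further improvements. We can implement the plus version by modifying the function so that each parent is added to the population when its children are created.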

# keep the parent
children.append(population[i])
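The difference between the two versions comes down to which candidates enter the next iteration. The sketch below uses plain labels instead of real candidate solutions. The helper name next_population is hypothetical.

```python
# which candidates form the next population under comma and plus selection
def next_population(parents, children, plus):
    # comma: the children replace the parents
    # plus: the parents are kept and compete with their children
    if plus:
        return list(parents) + list(children)
    return list(children)

parents = ['p1', 'p2']
children = ['c1', 'c2', 'c3', 'c4']
comma_pool = next_population(parents, children, plus=False)
plus_pool = next_population(parents, children, plus=True)
print(comma_pool)
print(plus_pool)
```

In the plus version the pool is larger by the number of selected parents, and a good parent can be selected again in the next iteration.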


# evolution strategy (mu + lambda) algorithm
def es_plus(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		# store the valid candidate in the population
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# keep the parent
			children.append(population[i])
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				# store the valid child
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]



# evolution strategy (mu + lambda) of the ackley objective function
from numpy import asarray
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import argsort
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed

# objective function
def objective(v):
	x, y = v
	return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# check if a point is within the bounds of the search
def in_bounds(point, bounds):
	# enumerate all dimensions of the point
	for d in range(len(bounds)):
		# check if out of bounds for this dimension
		if point[d] < bounds[d, 0] or point[d] > bounds[d, 1]:
			return False
	return True

# evolution strategy (mu + lambda) algorithm
def es_plus(objective, bounds, n_iter, step_size, mu, lam):
	best, best_eval = None, 1e+10
	# calculate the number of children per parent
	n_children = int(lam / mu)
	# initial population
	population = list()
	for _ in range(lam):
		candidate = None
		while candidate is None or not in_bounds(candidate, bounds):
			candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
		# store the valid candidate in the population
		population.append(candidate)
	# perform the search
	for epoch in range(n_iter):
		# evaluate fitness for the population
		scores = [objective(c) for c in population]
		# rank scores in ascending order
		ranks = argsort(argsort(scores))
		# select the indexes for the top mu ranked solutions
		selected = [i for i,_ in enumerate(ranks) if ranks[i] < mu]
		# create children from parents
		children = list()
		for i in selected:
			# check if this parent is the best solution ever seen
			if scores[i] < best_eval:
				best, best_eval = population[i], scores[i]
				print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))
			# keep the parent
			children.append(population[i])
			# create children for parent
			for _ in range(n_children):
				child = None
				while child is None or not in_bounds(child, bounds):
					child = population[i] + randn(len(bounds)) * step_size
				# store the valid child
				children.append(child)
		# replace population with children
		population = children
	return [best, best_eval]

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0], [-5.0, 5.0]])
# define the total iterations
n_iter = 5000
# define the maximum step size
step_size = 0.15
# number of parents selected
mu = 20
# the number of children generated by parents
lam = 100
# perform the evolution strategy (mu + lambda) search
best, score = es_plus(objective, bounds, n_iter, step_size, mu, lam)
print('f(%s) = %f' % (best, score))


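Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.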

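In this case, we can see that about 24 improvements in performance were seen during the search. We can also see that, for this objective function, the final solution found is better, with an evaluation of 0.000532 compared with the 0.001147 found by the comma version.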

0, Best: f([-0.82977995 2.20324493]) = 6.91249
0, Best: f([-1.03232526 0.38816734]) = 4.49240
1, Best: f([-1.02971385 0.21986453]) = 3.68954
2, Best: f([-0.96315064 0.21176994]) = 3.48942
2, Best: f([-0.9524528 -0.19751564]) = 3.39266
2, Best: f([-1.02643442 0.14956346]) = 3.24784
2, Best: f([-0.90172166 0.15791013]) = 3.17090
2, Best: f([-0.15198636 0.42080645]) = 3.08431
3, Best: f([-0.76669476 0.03852254]) = 3.06365
3, Best: f([-0.98979547 -0.01479852]) = 2.62138
3, Best: f([-0.10194792 0.33439734]) = 2.52353
3, Best: f([0.12633886 0.27504489]) = 2.24344
4, Best: f([-0.01096566 0.22380389]) = 1.55476
4, Best: f([0.16241469 0.12513091]) = 1.44068
5, Best: f([-0.0047592 0.13164993]) = 0.77511
5, Best: f([ 0.07285478 -0.0019298 ]) = 0.34156
6, Best: f([-0.0323925 -0.06303525]) = 0.32951
6, Best: f([0.00901941 0.0031937 ]) = 0.02950
32, Best: f([ 0.00275795 -0.00201658]) = 0.00997
109, Best: f([-0.00204732 0.00059337]) = 0.00615
195, Best: f([-0.00101671 0.00112202]) = 0.00434
555, Best: f([ 0.00020392 -0.00044394]) = 0.00139
2804, Best: f([3.86555110e-04 6.42776651e-05]) = 0.00111
4357, Best: f([ 0.00013889 -0.0001261 ]) = 0.00053
f([ 0.00013889 -0.0001261 ]) = 0.000532


