Real-valued Coded Evolutionary Algorithms
1 Binary Coded Genetic Algorithms
2 Key Concepts in Binary Coding
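- schema: a template that identifies a subset of strings with similarities at certain string positions.
- For chromosomes of length L there are 3^L possible schemata (0, 1 or * at each position); a single chromosome is an instance of 2^L of them.
- A population of size M therefore represents at most M·2^L schemata at once (and never more than the 3^L that exist).
- implicit parallelism: while explicitly evaluating only M chromosomes, the GA implicitly searches for fitter schemata among all the schemata those chromosomes represent. (broadens the search)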
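The schema counts can be checked by brute force for a small L. The sketch below enumerates every template over {0, 1, *} and counts how many are matched by one chromosome and by a small population. The helper names `matches` and `all_schemata` and the example strings are illustrative assumptions, not part of the notes.

```python
from itertools import product

def matches(schema: str, chromosome: str) -> bool:
    """True if every fixed (non-*) position of the schema equals the chromosome's bit."""
    return all(s == "*" or s == c for s, c in zip(schema, chromosome))

def all_schemata(length: int) -> list[str]:
    """Every template of the given length over the alphabet {0, 1, *}."""
    return ["".join(p) for p in product("01*", repeat=length)]

L = 3
schemata = all_schemata(L)
matched_by_101 = [s for s in schemata if matches(s, "101")]

population = ["101", "100"]  # hypothetical population of size M = 2
covered = {s for s in schemata if any(matches(s, c) for c in population)}

print(len(schemata))        # number of possible schemata for length L
print(len(matched_by_101))  # schemata that one chromosome is an instance of
print(len(covered))         # schemata represented by the whole population
```

Because the two example chromosomes share some schemata, the population count comes out below M times the per-chromosome count.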
3 Drawbacks of Binary Coding
- 【1】Hamming cliff problem
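- Hamming distance: the number of positions at which two strings of equal length differ.
- Hamming cliff problem: comparing binary with decimal, 000 and 100 differ by a single flipped bit (one-bit mutation) yet their values differ by 4 (large jump), while 011 and 100 differ in three bits (multi-bit mutation) yet their values differ by only 1 (small jump).
- In other words, the change in genotypic space does not reflect the change in phenotypic space.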
- Solution to Hamming cliff problem: Gray encoding
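- Keep the leftmost bit of the standard binary code; then, moving right, XOR each bit with its left neighbour. Every resulting Gray bit is 0 or 1.
- The equation is as follows: g_1 = b_1, and g_i = b_(i-1) XOR b_i for i > 1, where b is the standard binary code and g is the Gray code.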
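A minimal sketch of Gray encoding on integers, assuming the usual binary-reflected Gray code (the shift-and-XOR form of the rule above). The function names are illustrative. It prints the 3-bit table and compares the Hamming distance between 3 and 4 in both codes.

```python
def binary_to_gray(n: int) -> int:
    """Gray code of n: every bit is XORed with its left neighbour."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the encoding by accumulating XORs of successive right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# value, standard binary, Gray code
for value in range(8):
    print(value, format(value, "03b"), format(binary_to_gray(value), "03b"))

# 3 (011) and 4 (100): distance in standard binary vs. distance in Gray code
print(hamming(3, 4), hamming(binary_to_gray(3), binary_to_gray(4)))
```

Adjacent integers always differ in exactly one Gray bit. The converse does not hold: flipping one Gray bit can still cause a large jump in value (Gray 000 and 100 decode to 0 and 7), so Gray encoding softens the Hamming cliff rather than removing every large jump.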
- 【2】problem in discrete search spaces: redundancy problem
- redundancy problem: when the variables belong to a finite discrete set whose cardinality is not a power of two, some binary strings are redundant and correspond to infeasible solutions.
- Example: take the set A = {0, 2, 3}, whose cardinality is 3. We need a binary string of length 2 bits to represent it (which encodes 0, 1, 2, 3). Of these, 1 is infeasible; this is the redundancy.
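The example can be enumerated directly. This small sketch follows the notes' setup (each 2-bit string is read as the integer it encodes) and lists which strings are redundant; the variable names are illustrative.

```python
from itertools import product

feasible = {0, 2, 3}  # the set A from the example
n_bits = 2            # 2 bits encode the values 0..3

for bits in product("01", repeat=n_bits):
    string = "".join(bits)
    value = int(string, 2)
    status = "feasible" if value in feasible else "redundant (infeasible)"
    print(string, value, status)

# Binary strings that decode to a value outside A
redundant = [format(v, f"0{n_bits}b") for v in range(2 ** n_bits) if v not in feasible]
print(redundant)
```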
- 【3】problem in continuous search spaces: precision problem
- When a 0/1 string is decoded to a real value, higher precision requires a longer string.
- In other words, the precision depends on the string length L.
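To make the dependence on L concrete, here is a sketch that assumes the common linear decoding of an L-bit string onto an interval [lower, upper]. The notes do not give the decoding formula, so this mapping, the function names and the interval [-5, 5] are assumptions.

```python
def decode(bits: str, lower: float, upper: float) -> float:
    """Map a binary string linearly onto [lower, upper] (assumed decoding)."""
    return lower + int(bits, 2) * (upper - lower) / (2 ** len(bits) - 1)

def precision(length: int, lower: float, upper: float) -> float:
    """Gap between two adjacent representable values for a given string length."""
    return (upper - lower) / (2 ** length - 1)

# Longer strings give a finer grid over the same interval
for length in (4, 8, 16, 32):
    print(length, precision(length, -5.0, 5.0))

# The all-zeros and all-ones strings decode to the two ends of the interval
print(decode("0000", -5.0, 5.0), decode("1111", -5.0, 5.0))
```

Under this decoding, each extra bit roughly halves the gap between representable values, so a target precision fixes a minimum string length per variable.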
- Solution to continuous optimisation problems: Real-valued vector representation
- no difference between genotypes and phenotypes
- each element (also called a gene) of the vector is a real value and represents one real-valued variable of the problem.
- not limited by the string length used for encoding and decoding
- Some evolutionary algorithms based on real-valued vector representation: Evolution Strategies, Evolutionary Programming and Differential Evolution.
- Summary of advantages: (1) natural, simple and faster (no need to encode and decode); (2) better precision and easy to handle high-dimensional problems
4 Real-valued Vector Operators
- Recap of real-valued vector representation
- it’s a solution to continuous optimisation problems
- it’s simple, natural and faster (no need to encode and decode)
- and it has better precision and handles high-dimensional problems easily
- Real-valued mutation
- With probability p_m, randomly select a parent for mutation, then randomly select a gene c_i from that parent
- different forms of mutation operators
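- Uniform mutation: generate a number uniformly at random within the interval bounds of variable x_i and use it to replace gene c_i.
- Gaussian mutation: draw a number from a Gaussian distribution whose mean is gene c_i (the standard deviation is determined by the interval bounds) and use it to replace gene c_i. The min and max functions in the formula ensure that the result does not fall outside the interval bounds.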
- Non-uniform mutation:
- v_i is the upper bound and u_i is the lower bound
- ❓ Question: why does the formula for Δ look like this? What’s the intuition?
➡ If t/g_m is small, Δ is large. This means that mutation in the early generations produces large variation (a large standard deviation), which helps exploration at the start. After more generations the mutation steps become smaller, which is exploitation.
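The three mutation operators can be sketched as follows. The notes' formulas are not reproduced here, so several details are assumptions: the Gaussian standard deviation is taken as `sigma_frac` times the interval width, and the non-uniform step uses the commonly cited form Δ(t, y) = y · (1 − r^((1 − t/g_m)^b)) with a shape parameter `b` and r uniform in [0, 1). Function and parameter names are illustrative.

```python
import random

def uniform_mutation(parent, bounds, i, rng=random):
    """Replace gene i with a uniform draw from its interval [u_i, v_i]."""
    u, v = bounds[i]
    child = list(parent)
    child[i] = rng.uniform(u, v)
    return child

def gaussian_mutation(parent, bounds, i, sigma_frac=0.1, rng=random):
    """Replace gene i with a draw from N(c_i, sigma_i), clipped to [u_i, v_i].

    sigma_i = sigma_frac * (v_i - u_i) is an assumed rule, not taken from the notes.
    """
    u, v = bounds[i]
    child = list(parent)
    child[i] = min(max(rng.gauss(parent[i], sigma_frac * (v - u)), u), v)
    return child

def delta(t, y, g_m, b=5.0, rng=random):
    """Step size in [0, y] that shrinks towards 0 as generation t approaches g_m."""
    r = rng.random()
    return y * (1.0 - r ** ((1.0 - t / g_m) ** b))

def non_uniform_mutation(parent, bounds, i, t, g_m, b=5.0, rng=random):
    """Move gene i towards v_i or u_i (fair coin flip) by a generation-dependent step."""
    u, v = bounds[i]
    child = list(parent)
    if rng.random() < 0.5:
        child[i] = parent[i] + delta(t, v - parent[i], g_m, b, rng)
    else:
        child[i] = parent[i] - delta(t, parent[i] - u, g_m, b, rng)
    return child

# Usage: mutate one randomly chosen gene at an early, middle and late generation
rng = random.Random(0)
parent = [0.5, -1.0, 2.0]
bounds = [(-5.0, 5.0)] * 3
i = rng.randrange(len(parent))
for t in (0, 50, 99):
    print(t, non_uniform_mutation(parent, bounds, i, t, g_m=100, rng=rng))
```

With this form of Δ the step is bounded by the distance to the chosen bound, so the mutated gene stays inside [u_i, v_i] without extra clipping, and the typical step size decays as t grows, which matches the exploration-then-exploitation intuition.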
- Real-valued crossover
- Select two parents, then apply a crossover operator
- Some crossover operators
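- Flat crossover: for each gene position, the offspring's gene is a value chosen at random from the interval between the two parents' genes at that position
- Simple crossover: choose a random crossover point and swap all genes after that point, which yields two new offspring
- Whole arithmetical crossover: compute the two new offspring from the formula, a weighted average of the two parents with a single weight α
- Local arithmetical crossover: same as above, but here α is a vector rather than a single number
- Single arithmetical crossover: change only one gene and copy the others directly from the parents
- BLX-α crossover: computes a single new offspring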
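The crossover operators above can be sketched on plain Python lists. The notes' formulas are not reproduced here, so the sketch assumes the usual textbook forms: complementary weighted averages for the arithmetical variants, and for BLX-α a uniform draw from the parents' interval widened by α times its width on each side. Function names, default values and the example parents are illustrative.

```python
import random

def flat_crossover(p1, p2, rng=random):
    """One offspring; each gene is uniform in the interval spanned by the parents' genes."""
    return [rng.uniform(min(a, b), max(a, b)) for a, b in zip(p1, p2)]

def simple_crossover(p1, p2, rng=random):
    """Swap every gene after a random crossover point (needs at least 2 genes)."""
    k = rng.randrange(1, len(p1))
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

def whole_arithmetical_crossover(p1, p2, alpha):
    """Two complementary weighted averages using one scalar alpha."""
    h1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    h2 = [alpha * b + (1 - alpha) * a for a, b in zip(p1, p2)]
    return h1, h2

def local_arithmetical_crossover(p1, p2, alphas):
    """Same as the whole variant, but with one alpha per gene."""
    h1 = [w * a + (1 - w) * b for w, a, b in zip(alphas, p1, p2)]
    h2 = [w * b + (1 - w) * a for w, a, b in zip(alphas, p1, p2)]
    return h1, h2

def single_arithmetical_crossover(p1, p2, alpha, rng=random):
    """Blend one randomly chosen gene; copy all other genes from the parents."""
    k = rng.randrange(len(p1))
    h1, h2 = list(p1), list(p2)
    h1[k] = alpha * p1[k] + (1 - alpha) * p2[k]
    h2[k] = alpha * p2[k] + (1 - alpha) * p1[k]
    return h1, h2

def blx_alpha_crossover(p1, p2, alpha=0.5, rng=random):
    """One offspring; each gene is uniform in the parents' interval extended by alpha * width."""
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        ext = alpha * (hi - lo)
        child.append(rng.uniform(lo - ext, hi + ext))
    return child

# Usage with two hypothetical parents
rng = random.Random(0)
p1, p2 = [1.0, 2.0, 3.0, 4.0], [2.0, 0.0, 5.0, 4.0]
print(flat_crossover(p1, p2, rng))
print(simple_crossover(p1, p2, rng))
print(whole_arithmetical_crossover(p1, p2, alpha=0.25))
print(blx_alpha_crossover(p1, p2, alpha=0.5, rng=rng))
```

Note that this BLX-α sketch can produce genes outside the parents' range by design and does not clip them to the variable bounds; a real implementation would probably add that step.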
- ⚠ For both mutation operators and crossover operators, it is fine to start by trying the simplest one.