Genetic Algorithms in Search and Optimization

转载 2004年07月04日 01:44:00


By Richard Baker

The past decade has witnessed a flurry of interest within the financial industry regarding artificial intelligence technologies, including neural networks, fuzzy systems, and genetic algorithms. In many ways, genetic algorithms, and the extension of genetic programming, offer an outstanding combination of flexibility, robustness, and simplicity. The following discussion highlights some of the key features of genetic algorithms (GAs), and illustrates an application of a particular GA in the search and estimation of global optima.

Virtually every technical discipline, from science and engineering to finance and economics, frequently encounters problems of optimization. Although optimization techniques abound, such techniques often involve identifying, in some fashion, the values of a sequence of explanatory parameters associated with the best performance of an underlying dependent, scalar-valued objective function. For example, in simple 3-D space, this amounts to finding the (x,y) point associated with the optimal z value above or below the x-y plane, where the scalar-valued z is a surface identified by the objective function f(x,y). Or it may involve estimating a large number of parameters of a more elaborate econometric model. For example, we might wish to estimate the coefficients of a generalized auto-regressive conditional heteroskedastic (GARCH) model, in which the log-likelihood function is the objective to maximize. In each case, the unknowns may be thought of as a parameter vector, V, and the objective function, z = f(V), as a transformation of a vector-valued input to a scalar-valued performance metric z.

Optimization may take the form of a minimization or maximization procedure. Throughout this article, optimization will refer to maximization without loss of generality, because maximizing f(V) is the same as minimizing -f(V). My preference for maximization is simply intuitive: Genetic algorithms are based on evolutionary processes and Darwin's concept of natural selection. In a GA context, the objective function is usually referred to as a fitness function, and the phrase survival of the fittest implies a maximization procedure.

Motivation for the Use of GAs

A natural question for those unfamiliar with GAs might be: Given all the powerful numerical optimization methods available, such as deterministic gradient-based and simplex-based search methods, why would I be interested in a stochastic alternative like a GA?

The answer is, at least in part, dependent upon the application. If the optimization problems you encounter are reasonably well-behaved, then conventional deterministic techniques are clearly the best choice. But what if the problem is ill-behaved? Or, put another way, what are some of the features that make an optimization problem difficult?

Well, difficulties arise when the estimation involves many parameters that interact in highly non-linear ways. Objective functions characterized by many local optima, expansive flat planes in multi-dimensional space, points at which gradients are undefined, or when the objective function is discontinuous, pose difficulty for traditional mathematical techniques. In these situations, heuristic methods like GAs offer a powerful alternative, and can greatly enhance the set of tools available to researchers. Throughout the following discussion, try to think of GAs as enhancements to a suite of optimization techniques, regarding them as alternatives to more traditional methods, and not as replacements for conventional numerical optimizers. I other words, GAs should be used to complement familiar numerical techniques, and are not meant to replace or subsume them.

Common Features of GAs

It's important to note that, despite all the surrounding hype, the successful design and use of a GA is as much art as science. GA theory is elegant, fascinating, and full of subtle variations that may - or may not - be useful in certain applications. Since Holland's seminal work (Holland, 1975), numerous variations of the conventional GA have been proposed. Most of the variations are well beyond the scope of this article, so I will only discuss some the most common features of GAs. Interested readers are encouraged to consult the references listed below.

One of the most powerful features of GAs is their elegant simplicity. In fact, the simple power of a genetic approach may be harnessed with little more than a few lines of computer code and a random number generator. There are no closed form gradients to derive, and no differentiability or continuity requirements to satisfy. A GA is, in essence, nothing more than a repeated application of a few simple operations. Moreover, a GA knows nothing about the problem at hand, nor does it care. In fact, GAs need not know how to solve a problem, they just need to recognize a good solution when they see it!

GAs begin by randomly generating, or seeding, an initial population of candidate solutions. Each candidate is an individual member of a large population of size M. For the purposes of this discussion, think of each individual as a row vector composed of N elements. In GA parlance, individuals are often referred to as chromosomes, and the vector elements as genes. Each gene will provide storage for, and may be associated with, a specific parameter of the search space. As a simple example of two unknowns, we may think of each individual as a parameter vector V, in which the each V contains a point in the x-y plane, V = [x y]. With this example in mind, the entire population may be stored and manipulated as two-column matrix: The first column represents a population of x-axis values, and the second column a population of y-axis values, and each of the M rows of the matrix is a solution vector V of length N = 2.

The individuals (chromosomes) in genetic algorithms are usually constant-length character sequences (vectors V of constant size). In the traditional GA, these vectors are usually sequences of zeros and ones, but in practice may be anything, including a mix of integers and real numbers, or even a mix of numbers and character strings. The actual data encoded in the vectors is called the representation scheme. To keep the discussion simple and concrete, the chromosomes in this article will be real, continuous parameters with two elements, V = [x y]. Given these two-element chromosomes, the objective is to search for the (x,y) point that maximizes the scalar-valued fitness function z = f(V) = f(x,y). Starting with the initial random population of vectors, a GA then applies a sequence of operations to the population, guided only by the relative fitness of the individuals, and allows the population to evolve over a number of generations. The goal of the evolutionary process is to continually improve the fitness of the best solution vector, as well as the average population fitness, until some termination criteria is satisfied.

Conventional GAs usually apply a sequence of operations to the population based on the relative fitness of the members. The operations typically involve reproduction, in which individuals are randomly selected to survive into the next generation. In reproduction, highly fit individuals are more likely to be selected than unfit members. The idea behind reproduction is to allow relatively fit members to survive and procreate, with the hope that their genes are truly associated with better performance. Note that reproduction is asexual, in that only a single individual is involved.

The next operation, crossover, is a sexual operation involving two (or even more!) individuals. In crossover, highly fit individuals are more likely to be selected to mate and produce children than unfit members. In this manner, highly fit vectors are allowed to breed, with the hope that they will produce ever more fit offspring. Although the crossover operation may take many forms, it typically involves splitting each parent chromosome at a randomly-selected point within the interior of the chromosome, and rearranging the fragments so as to produce offspring of the same size as the parents. The children usually differ from each other, and are usually different from the parents. The effect of crossover is to build upon the success of the past, yet still explore new areas of the search space. Crossover is a slice-and-dice operation which efficiently shuffles genetic information from one generation the next.

The next operation is mutation, in which individuals are slightly changed. In our case, this means that the (x,y) parameters of the chromosome are slightly perturbed with some small probability. Mutation is an asexual operation, and usually plays a relatively minor role in the population. The idea behind mutation is to restore genetic diversity lost during the application of reproduction and crossover, both of which place relentless pressure on the population to converge.

The relentless pressure towards convergence is driven by fitness, and offers a convenient termination criteria. In fact, after many generations of evolution via the repeated application of reproduction, crossover, and mutation, the individuals in the population will often begin to look alike, so to speak. At this point, the GA typically terminates because additional evolution will produce little improvement in fitness. Many termination criteria may be used, in which the most simple is to just stop after some predetermined number of generations. See what I mean about part art, part science?

Pros & Cons of GAs

In contrast to more traditional numerical techniques, which iteratively refine a single solution vector as they search for optima in a multi-dimensional landscape, genetic algorithms operate on entire populations of candidate solutions in parallel. In fact, the parallel nature of a GA's stochastic search is one of the main strengths of the genetic approach. This parallel nature implies that GAs are much more likely to locate a global peak than traditional techniques, because they are much less likely to get stuck at local optima. Also, due to the parallel nature of the stochastic search, the performance is much less sensitive to initial conditions, and hence and a GA's convergence time is rather predictable. In fact, the problem of finding a local optimum is greatly minimized because GAs, in effect, make hundreds, or even thousands, of initial guesses. This implies that a GA's performance is at least as good as a purely random search. In fact, by simply seeding an initial population and stopping there, a GA without any evolutionary progression is essentially a Monte Carlo simulation.

As appealing as a GA may seem, the parallel nature of the stochastic search is not without consequences. Although the prospects of finding global optima make it robust, the convergence of a GA is usually slower than traditional techniques. In fact, with a good initial guess close to the global optimum, a numerical technique will likely be much faster, and more accurate, than a genetic search because, in essence, the GA will be wasting time testing the fitness of sub-optimal solutions.

Furthermore, due to the stochastic nature of a GA, the solution, although more likely to estimate the global optimum, will only be an estimate. Users must realize that GAs will only by chance find an exact optimum, whereas traditional methods will find it exactly . assuming, of course, they find it all. The user must then determine whether the solution found by a GA is close enough. In may cases it will be, but the question of 'How close is close enough?' is somewhat arbitrary and application-dependent.

An Optimization Example

To build on the notation developed earlier, this example involves the optimization of a real valued fitness function of two variables, z = f(V) = f(x,y). That is, we are to estimate the 2-D parameter vector V = [x y] of the (x,y) point associated with the maximum z value. This problem is very instructive, because it allows the user to visualize the entire evolutionary process as an animated sequence in 3-D space, and provides a very intuitive understanding of the adaptive, self-improving nature of a GA in action.

Furthermore, the example is motivated by a desire to help users integrate the stochastic search capabilities of a GA with traditional gradient-based search methods. Specifically, I want to emphasize the complementary nature of the two fundamentally different approaches.

The entire optimization example was developed in MATLAB. Although the analysis could have been performed with any of a host of software packages, MATLAB's integrated graphics and matrix-based processing capabilities make it ideal for GA work. In fact, since GAs are inherently parallel, simultaneously processing populations of parameter vectors in MATLAB is almost trivial and significantly reduces the development time and the amount of code required.

The particular GA I chose, specifically designed to optimize fitness functions of continuous parameters, was co-developed by Ken Price and Rainer Storn. Although the details of this particular GA cannot be addressed in this article, I strongly refer interested readers to the April 1997 edition of Dr. Dobb's Journal.

The fitness function for this example is the MATLAB demonstration function PEAKS shown in Figure 1; PEAKS is a function of two variables obtained by translating and scaling several Gaussian distributions. Although PEAKS is a smooth function of only two variables, it nevertheless has three local maxima, and thus represents a simple function for which the parameters estimated by a gradient-based search method are sensitive to the initial guess of the solution vector. The gradient-based optimization function used is the FMINU function of the MATLAB Optimization Toolbox. Note that FMINU is a minimization routine, so to maximize the PEAKS function, FMINU actually needs to minimize -PEAKS.

Figure 1 MATLAB PEAKS as the Fitness Function

To illustrate the example, set an initial vector estimate V0 = [-1 -1]. FMINU returns a solution vector V = [-0.4601 -0.6292] associated with the local maximum z = f(-0.4601,-0.6292) = 3.7766. If we start at V0 = [1 0], FMINU returns a solution vector V = [1.2857 -0.0048] associated with the local maximum z = f(1.2857,-0.0048) = 3.5925. Finally, if we start at V0 = [2 2], FMINU returns the solution vector V = [-0.0093 1.5814] associated with the true global maximum z = f(-0.0093,1.5814]) = 8.1062.

The gradient-based method works extremely well, provided it knows where to start. In this simple case we can visually determine the location of the true global peak, but in problems of much higher dimension there is frequently no such luxury.

Now consider first running the Price/Storn GA to provide FMINU with an initial estimate. The initial population was seeded with only 20 individual chromosomes (20 (x,y) points randomly generated such that -3 < x,y < 3), and allowed to evolve for only 10 generations, producing the parameter estimate V0 = [0.1205 1.6551]. This estimate is then passed to FMINU. The result, as expected, is now the true global maximum z = f(-0.0093,1.5814]) = 8.1062.


Stochastic techniques can expand the optimization tools now available to researchers in a way that enhances and complements the familiar, traditional methods. Genetic approaches, in particular, are now available to optimize difficult, ill-behaved objective functions that often prove difficult for many conventional methods. Furthermore, these genetic approaches are often simple to design and easy to code, and can be used in concert with traditional methods to greatly increase the probability of finding the true global optimum of multi-dimensional functions.


1. Dhar, V, and Stein, R.(1997). Seven Methods for Transforming Corporate Data into Business Intelligence.. Prentice-Hall.
2. Epstein, J.M., and Axtell, R. (1996). Growing Artificial Societies: Social Science from the Bottom Up. MIT Press.
3. Goldberg, D.E (1989). Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley Longman.
4. Holland, J.H. (1992). Adaptation in Natural and Artificial Systems. MIT Press.
5. Koza, J.R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.
6. Koza, J.R. (1994). Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press.
7. Price, K., and Storn, R (April 1997). Differential Evolution: A Simple Evolution Strategy for Fast Optimization. Dr. Dobb's Journal, 264, pp. 18-24.

一种优化的Genetic Algorithm —— Python实现

一种优化的Genetic Algorithm —— Python实现优化内容:1、 加入精英保护机制,种群内最优秀个体将被保留,精英更迭采用“打擂机制”,加快收敛。2、 变异基因数服从泊松分布,模拟每...
  • JayRoxis
  • JayRoxis
  • 2017年06月22日 13:58
  • 685

业界 | OpenAI提出强化学习近端策略优化,可替代策略梯度法

选自OpenAI 机器之心编辑部 参与:蒋思源、Smith 近日,OpenAI 发布了一种新型的强化学习算法,近端策略优化(Proximal Policy Optimiz...
  • AMDS123
  • AMDS123
  • 2017年07月21日 13:28
  • 665

简单的遗传算法(Genetic algorithms)-吃豆人

遗传算法简介:一直都在收听卓老板聊科技这个节目,最近播出了一起人工智能的节目,主要讲的是由霍兰提出的遗传算法,在目中详细阐述了一个有趣的小实验:吃豆人。首先简单介绍下遗传算法: 1:为了解决某个具体...
  • u013564276
  • u013564276
  • 2016年12月05日 20:52
  • 1721


在深度学习领域,对于具有上百万个连接的多层深度神经网络(DNN),现在往往通过随机梯度下降(SGD)算法进行常规训练。许多人认为,SGD 算法有效计算梯度的能力对于这种训练能力而言至关重要。但是,Ub...
  • Uwr44UOuQcNsUQb60zk2
  • Uwr44UOuQcNsUQb60zk2
  • 2017年12月23日 06:41
  • 189

SEO全称:Search Engine Optimization,即搜索引擎优化

SEO全称:Search Engine Optimization,即搜索引擎优化。是指为了从搜索引擎中获得更多的免费流量,从网站结构、内容建设方案、用户互动传播、页面等角度进行合理规划,使网站更适合搜...
  • u012959846
  • u012959846
  • 2013年11月26日 11:59
  • 394

Genetic Algorithm遗传算法学习

参考资料:冒昧的用了链接下的几张图) 百度百科:http://baike....
  • Androidlushangderen
  • Androidlushangderen
  • 2015年03月03日 18:34
  • 4422

关于 最优化/Optimization 的一些概念解释

转载请注明出处:   以下是我曾在学习“最优化”理论与实践中遇到的一些概念,我刚开始学的时候,有些东西看了很多遍都还觉得很别扭、晦涩难懂,在...
  • shenxiaolong188
  • shenxiaolong188
  • 2015年12月09日 12:19
  • 902

Grokking Algorithms 算法图解 一本基于Python的算法科普读本

Grokking Algorithms算法图解是由美国软件工程师Aditya Bhargava写的一本算法科普读物,由图灵教育引进,组织翻译之后17年三月份正式出版,目前来说知名度还不是很高。 毕...
  • GeneralLi95
  • GeneralLi95
  • 2017年06月01日 10:26
  • 602

数值优化(Numerical Optimization)学习系列-目录

概述 数值优化对于最优化问题提供了一种迭代算法思路,通过迭代逐渐接近最优解,分别对无约束最优化问题和带约束最优化问题进行求解。 该系列教程可以参考的资料有 1. 《Numerical Opti...
  • fangqingan_java
  • fangqingan_java
  • 2015年12月27日 19:07
  • 4051


Machine Learning with Spark-Packt Publishing(2014) epub  Feature hashing Feature hashing is a tech...
  • cteng
  • cteng
  • 2016年02月19日 17:27
  • 1454
您举报文章:Genetic Algorithms in Search and Optimization