Multi-objective optimization (without using weights)

A big challenge in today's engineering is how to optimize something whose fitness criteria involve multiple factors that contradict one another.

(For example, when we build a bridge we want it to be both stable and light. These two goals are contradictory: stability requires more material for the supporting structures, which makes the bridge heavier.)

Say we are studying the car market in order to design a new car model, and we focus on two factors, "Efficiency" and "Power". There is generally a trade-off between them: high-power cars get poor mileage (miles per gallon, MPG), and efficient cars are not very powerful. This means we cannot achieve both high efficiency and high power at once.

The idea is that every car on the market has some efficiency and some power, varying from model to model, so we can plot them all in this figure (shown as A, B, C, etc.).

Now suppose we only have A, B, C, and D. Would you ever want to buy model D? (Assume all four models cost the same.)

The answer is definitely no: nobody wants D, a car that is both less powerful and less efficient than B. From a market perspective, we say B dominates D. In general, one solution dominates another when it is at least as good in every factor and strictly better in at least one.
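That definition of dominance translates directly into code. A minimal sketch, with the (power, MPG) numbers invented purely for illustration:

```python
def dominates(a, b):
    """True if design a dominates design b: a is at least as good in
    every factor and strictly better in at least one.  Each design is
    a tuple of factor values where higher is better."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

B = (140, 32)   # hypothetical (power, MPG) values
D = (120, 28)
print(dominates(B, D))   # B beats D on both factors
```

Note that dominance is only a partial order: when each car wins on a different factor, neither dominates the other, and that is exactly why we cannot collapse the comparison to a single number without weights.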

What are we trying to say? We could have a million car models scattered across this figure, but the only cars that matter are those on the frontier, like A, B, and C, rather than D.

Anything under the step function formed by A, B, and C is dead, with no market. We call this frontier the Pareto front. Every product aims to be on the Pareto front; if it is not, it has no market.
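The Pareto front is simply the set of points that nothing else dominates. A brute-force sketch (the car coordinates are made up in the spirit of the figure):

```python
def pareto_front(points):
    """Return the non-dominated points: the step-function frontier.
    Each point is a (power, efficiency) pair; higher is better on both."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    return [p for p in points
            if not any(dominates(q, p) for q in points)]

cars = [(100, 40), (140, 32), (180, 22), (120, 28)]   # A, B, C, D
front = pareto_front(cars)   # D falls out: it is dominated by B
```

This O(n²) scan is fine for small populations; real multi-objective libraries use faster non-dominated sorting, but the definition is the same.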

Notice that if we add model E to the figure, we find that E is dominated by both A and B. If we had to compare D and E, we would say D is better than E. Why? Because D is dominated by one model, while E is dominated by two.

Note that inside the step function, the darker the blue region, the worse a product in that area is. Up to this point, we have never used weights.

If we are the designers of model E, how should we improve it? If we improve only power, there is a long way to go along the x-axis; if we improve only efficiency, there is also a long way to go along the y-axis. The best strategy is to improve power and efficiency a little at the same time. If E keeps improving along the red arrow, it will soon dominate a large area. This is an important insight, and we want our Evolutionary Algorithm to do the same: evolve an individual a little, and change its dominance relationships a lot.


What can we do with this theory in an Evolutionary Algorithm?

In the selection process, we select individuals based on how many layers away from the front they are.
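Counting layers means repeatedly peeling off the Pareto front of whatever remains. A sketch of that layered sorting:

```python
def pareto_layers(points):
    """Peel off successive Pareto fronts: layer 0 is the front itself,
    layer 1 is the front of whatever remains, and so on.  Selection can
    then prefer individuals from earlier layers.  All objectives are
    maximized."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    remaining, layers = list(points), []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(front)
        remaining = [p for p in remaining if p not in front]
    return layers

pareto_layers([(1, 5), (5, 1), (3, 3), (2, 2), (1, 1)])
# layer 0 holds the three non-dominated points; deeper layers rank worse
```

This is the same idea as the non-dominated sorting used in NSGA-style algorithms, written here in the most direct way possible.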

Sometimes we may have too many individuals on the front; then we can thin them out and keep only a sample.

One thing to note: previously we always fixed the population size, but in Pareto-based optimization this value varies:

Sometimes only a handful of individuals sit on the Pareto front; they will then dominate and take over the population. The population starts very small, but those individuals spawn many offspring, and the Pareto front becomes crowded. (Everything on the Pareto front goes on to the next generation, and in the end we get a set of good solutions scattered along that front.)

This is why we want to compare algorithms by the number of evaluations rather than the number of generations: when population size varies, comparison by generations no longer makes sense.

 

For a symbolic regression problem we get a set of solutions, but we should keep in mind that an overly complex expression means we are overfitting, while an overly simple expression has a large error. The true solution lies somewhere in between.

The interesting thing is: most of the time, the true solution sits where there is a big drop in error without much increase in expression size (complexity).
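One crude way to locate that point on the complexity-error trade-off curve is to look for the single largest error drop. A sketch under that assumption, with invented error values just to illustrate the shape:

```python
def knee_point(curve):
    """curve: (complexity, error) pairs sorted by increasing complexity,
    e.g. the trade-off front of a symbolic regression run.  A crude
    heuristic: return the point reached by the single largest error
    drop, i.e. where extra complexity last paid off in a big way."""
    drops = [curve[i - 1][1] - curve[i][1] for i in range(1, len(curve))]
    return curve[drops.index(max(drops)) + 1]

# made-up (complexity, error) values mimicking the figure
curve = [(1, 10.0), (2, 9.5), (3, 2.0), (4, 1.9), (5, 1.85)]
print(knee_point(curve))   # → (3, 2.0): the big drop, then diminishing returns
```

A real implementation might normalize both axes or look at second differences, but this captures the "big drop, then flat" pattern described above.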

You can see that past the red point, all the overfit solutions grow in complexity but yield sharply diminishing returns in error reduction.

If we try to solve a problem that actually has no solution, such as finding an expression for random noise, we get a smooth trade-off like this:


 

The blue dots are data and the colored lines are candidate functions. We find that the red line has large error on the right part of the data, while the green line has large error on the left part. In other words, their errors are not correlated.

So the degree to which two functions' errors fall in the same places is a metric for novelty. A function that suddenly reduces the error where no other function can is something novel, something we want to keep.


Checking the age of a solution also gives us information about novelty: the younger the solution, the better.

So in the age-Pareto method, one criterion is the inverse of age, which we call youthfulness, and the other is whatever objective we want. We throw them together and inject some random solutions at each generation. (A random solution is protected as the youngest solution: it has age 0, so nothing can dominate it on that axis.)
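A toy sketch of one such generation, not the full published algorithm: individuals carry an age, we treat youthfulness (negative age) as one maximized objective alongside a problem-specific `fitness` function (assumed given), and a fresh random individual enters at age 0:

```python
import random

def age_pareto_generation(pop, fitness):
    """One toy generation of age-Pareto selection (a sketch).
    Individuals are dicts with 'genome' and 'age'.  We maximize
    youthfulness (-age) together with `fitness` and keep only the
    non-dominated set; one fresh random individual enters at age 0,
    so nothing can dominate it on the age axis."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    for ind in pop:
        ind["age"] += 1                       # everyone gets older
    pop = pop + [{"genome": random.random(), "age": 0}]   # protected newcomer
    score = {i: (-ind["age"], fitness(ind["genome"]))
             for i, ind in enumerate(pop)}
    return [ind for i, ind in enumerate(pop)
            if not any(dominates(score[j], score[i]) for j in score)]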


Imagine we are solving symbolic regression. One axis will be error (objective: accuracy), and for the other axis we have many objectives to choose from. Which one is better?

 

We simply evolve with every combination, and the results tell us something important:

Comparing EC and EN, we see that diversity is really significant: rewarding simplicity does not reduce error as much as rewarding diversity does.

And looking at EA, we see a huge difference, which means an age metric inherently increases diversity. Remember what we said before: younger solutions are protected.

We see even more improvement for EAN, so diversity and age really are the factors that matter.

  

 

 

 
