Advanced Optimization Theory and Methods (8)

Global Search Methods

The methods so far all require the gradient of the objective function and an initial point $x_0$ supplied by the user. This section introduces several heuristic algorithms of a different flavor.

Nelder–Mead Simplex

Def: a "simplex" in $\mathbb{R}^n$ is an object determined by an assembly of $n+1$ points $P_0,\cdots,P_n$ satisfying

$$\det\begin{bmatrix} P_0 & P_1 & \cdots & P_n \\ 1 & 1 & \cdots & 1 \end{bmatrix}\neq 0$$

Initialize: $P_0,\cdots,P_n\in \mathbb{R}^n$, e.g. $P_i=P_0+\alpha_i e_i$ with $\alpha_i\in \mathbb{R}$.

Note: this is one feasible initialization; $e_i$ denotes the $n$-dimensional vector whose $i$-th component is 1 and all other components are 0.

Update: replace the point $P_i$ with the largest value $f(P_i)$ by a new point.

Repeat until the termination conditions are satisfied.

2-dimensional case: label the three points $P_s, P_{nl}, P_l$ so that $f(P_s)\leq f(P_{nl})\leq f(P_l)$.

Note: in two dimensions there are three initial points; sorting them by function value yields $P_s, P_{nl}, P_l$ (smallest, next-largest, largest).

In general, sort the points so that

$$f(P_0)\leq f(P_1)\leq \cdots \leq f(P_n)$$

Centroid of all points except the worst:

$$P_g=\frac{1}{n} \sum_{i=0}^{n-1} P_i$$

Reflection:

$$P_r=P_g+\rho (P_g-P_l) \quad [\text{typical: } \rho=1]$$

The case analysis below uses the two-dimensional setting as a running example.

Case 1

$$f(P_s)\leq f(P_r)\leq f(P_{nl})$$

Replace $P_l$ by $P_r$ and go to the next iteration.

Case 2

$$f(P_r)<f(P_s)$$

Expansion: $P_e=P_g+\lambda (P_g-P_l)$ [typical: $\lambda=2$]

Case 2.1

$$f(P_e)\leq f(P_r)$$

Replace $P_l$ by $P_e$.

Case 2.2

Otherwise, replace $P_l$ by $P_r$.

Case 3

$$f(P_r)>f(P_{nl})$$

Case 3.1

$f(P_l)>f(P_r)$: contract on the reflected side, $P_c=P_g+r(P_r-P_g)$ [typical: $r=\frac{1}{2}$]

Case 3.2

Otherwise: contract on the worst point's side, $P_c=P_g+r(P_l-P_g)$ [typical: $r=\frac{1}{2}$].
If $f(P_c)<f(P_l)$, replace $P_l$ by $P_c$ and go to the next iteration.
Otherwise, shrink the simplex toward the best point: $\forall i$, $P_i \leftarrow P_s+\delta(P_i-P_s)$ [typical: $\delta=\frac{1}{2}$].
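The case analysis above can be sketched in Python. This is a minimal illustration rather than the lecture's code: the shrink step moves the two non-best points toward $P_s$ (my reading of the shrinkage rule), and the quadratic test function is an arbitrary choice.

```python
import numpy as np

def nelder_mead_step(points, f, rho=1.0, lam=2.0, r=0.5, delta=0.5):
    """One update of the 2-D simplex following the case analysis above."""
    P_s, P_nl, P_l = sorted(points, key=f)        # best, next-largest, largest
    P_g = (P_s + P_nl) / 2.0                      # centroid of all but the worst
    P_r = P_g + rho * (P_g - P_l)                 # reflection
    if f(P_s) <= f(P_r) <= f(P_nl):               # Case 1: keep the reflection
        P_l = P_r
    elif f(P_r) < f(P_s):                         # Case 2: try an expansion
        P_e = P_g + lam * (P_g - P_l)
        P_l = P_e if f(P_e) <= f(P_r) else P_r    # Cases 2.1 / 2.2
    else:                                         # Case 3: contraction
        P_c = (P_g + r * (P_r - P_g) if f(P_l) > f(P_r)   # Case 3.1
               else P_g + r * (P_l - P_g))                # Case 3.2
        if f(P_c) < f(P_l):
            P_l = P_c
        else:                                     # shrink toward the best point
            P_nl = P_s + delta * (P_nl - P_s)
            P_l = P_s + delta * (P_l - P_s)
    return [P_s, P_nl, P_l]

f = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2   # arbitrary test quadratic
simplex = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for _ in range(60):
    simplex = nelder_mead_step(simplex, f)
best = min(simplex, key=f)
```

After 60 iterations the simplex has contracted around the minimizer of the test quadratic at $(1,-2)$.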

Simulated Annealing

Simulated annealing is a randomized search algorithm.

Def: the "neighborhood" of $x$: $N_{\epsilon}(x)=\{x': d(x,x')\leq \epsilon\}$

Naive Random Search

  1. $k:=0$, initialize $x^0$
  2. Pick a point $z^k$ at random from $N_{\epsilon}(x^k)$
  3. If $f(z^k)<f(x^k)$, then $x^{k+1}=z^k$; else $x^{k+1}=x^k$
  4. If some stop criterion is satisfied, then stop
  5. $k$++; go to 2

Problem: the search can get trapped at a local optimum.
One remedy: enlarge $N_{\epsilon}(x)$.

Simulated Annealing

  1. Modify step 3 above: toss a coin with probability of HEAD equal to $p(k,f(x^k),f(z^k))$. If HEAD, then $x^{k+1}=z^k$; else $x^{k+1}=x^k$

$$p(k,f(x^k),f(z^k))=\min\left\{1,\exp\left(-\frac{f(z^k)-f(x^k)}{T_k}\right)\right\}$$

where $T_k$ is a positive sequence (the "temperature"), e.g.

$$T_k=\frac{r}{\log(k+2)}, \quad r>0,$$

which decreases monotonically to 0.

$$\begin{cases} f(z^k)<f(x^k): & x^{k+1}=z^k \ \text{(with probability 1)} \\ f(z^k)\geq f(x^k): & x^{k+1}=z^k \ \text{(with probability } \exp(-\frac{f(z^k)-f(x^k)}{T_k})\text{)} \end{cases}$$

As $k\to \infty$, the "escape" probability decreases.
Note: the coin toss lets the search occasionally move uphill, which addresses the naive random search's tendency to get stuck at local minima.
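The scheme above can be sketched in Python. This is my own minimal illustration: the double-well objective, the step size, and $r=2$ are arbitrary, and the best point seen so far is tracked so the returned value never regresses.

```python
import math
import random

def simulated_annealing(f, x0, eps=1.0, r=2.0, iters=3000, seed=0):
    """Naive random search plus the acceptance coin above,
    with cooling schedule T_k = r / log(k + 2) decreasing to 0."""
    rng = random.Random(seed)
    x, best = x0, x0
    for k in range(iters):
        z = x + rng.uniform(-eps, eps)          # sample z^k from N_eps(x^k)
        T = r / math.log(k + 2)
        d = f(z) - f(x)
        # accept with probability min{1, exp(-(f(z)-f(x))/T_k)}
        if d < 0 or rng.random() < math.exp(-d / T):
            x = z
        if f(x) < f(best):                      # remember the best point seen
            best = x
    return best

# Double-well objective: local minimum near x = 1, global minimum near x = -1
f = lambda x: (x ** 2 - 1) ** 2 + 0.3 * x
x_star = simulated_annealing(f, x0=1.0)
```

Started at the local minimum $x_0=1$, the uphill moves accepted at high temperature let the search cross the barrier toward the global basin.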

Particle Swarm Optimization (PSO)

Swarm of $m$ particles: $|P|=m$.

$\forall i$: $p_i^{best}$ is the best position particle $i$ has visited.

$g^{best}$: the globally best position found so far.

basic PSO

  1. $k:=0$; generate initial random positions and velocities $\langle p_i^0,v_i^0\rangle$; set $p_i^{best}=p_i^0$, $g^{best}=\arg\min_i f(p_i^0)$
  2. For $i=1,\cdots,m$: generate random vectors $r_i^k, s_i^k$ with components drawn uniformly from $[0,1]$; with $\omega<1$ and $c_1,c_2\approx2$, set $v_i^{k+1}=\omega v_i^k+c_1r_i^k(p_i^{best,k}-p_i^k)+c_2s_i^k(g^{best,k}-p_i^k)$ and $p_i^{k+1}=p_i^k+v_i^{k+1}$
  3. For $i=1,\cdots,m$: if $f(p_i^{k+1})<f(p_i^{best,k})$, then $p_i^{best,k+1}=p_i^{k+1}$; else $p_i^{best,k+1}=p_i^{best,k}$
  4. If $\exists i\in \{1,\cdots,m\}$ with $f(p_i^{k+1})<f(g^{best,k})$, then $g^{best,k+1}=p_i^{k+1}$; else $g^{best,k+1}=g^{best,k}$
  5. If some stop criterion is satisfied, then stop
  6. $k$++; go to 2
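Steps 1-6 can be sketched in Python. This is my own illustration: the lecture suggests $c_1,c_2\approx2$, but here the common "constriction" values $\omega=0.729$, $c_1=c_2=1.494$ are used to keep velocities stable; the swarm size, search box, and quadratic test function are arbitrary.

```python
import random

def pso(f, dim=2, m=20, iters=200, w=0.729, c1=1.494, c2=1.494, seed=0):
    """Basic PSO; r_i^k, s_i^k have components uniform on [0, 1]."""
    rng = random.Random(seed)
    p = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(m)]
    v = [[0.0] * dim for _ in range(m)]            # zero initial velocities
    pbest = [pi[:] for pi in p]
    gbest = min(p, key=f)[:]
    for _ in range(iters):
        for i in range(m):
            for d in range(dim):                   # componentwise update (step 2)
                r, s = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r * (pbest[i][d] - p[i][d])
                           + c2 * s * (gbest[d] - p[i][d]))
                p[i][d] += v[i][d]
            if f(p[i]) < f(pbest[i]):              # step 3
                pbest[i] = p[i][:]
            if f(p[i]) < f(gbest):                 # step 4
                gbest = p[i][:]
    return gbest

f = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2   # arbitrary test quadratic
g = pso(f)
```

On this smooth unimodal objective the swarm collapses quickly onto the minimizer $(1,-2)$.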

Genetic Algorithms

A genetic algorithm needs a representation scheme plus three operators: ① selection ② crossover ③ mutation.

Algorithm outline:

  1. Initialize the population $P_0$
  2. Selection $\rightarrow M_k$
  3. Crossover
  4. Mutation
  5. If some stop criterion is satisfied, then stop
  6. Go to 2

For ease of exposition, assume here that we maximize rather than minimize.

Selection

Population set: $|P(k)|=N$, $P(k)=\{x_1,\cdots,x_N\}$; mating pool: $|M(k)|=N$.

Note: the purpose of selection is to pick $N$ elements from the population set $P$ of size $N$ to form $M$.

Roulette-Wheel Scheme

$$\mathrm{Prob}(x_i\to M(k))=\frac{f(x_i)}{F(k)}, \quad F(k)=\sum_{i=1}^N f(x_i)$$

Tournament Scheme

Draw two elements $x_i,x_j$ at random; if $f(x_i)>f(x_j)$, select $x_i$ into $M$.

Crossover

Draw two elements $x_i,x_j$ at random and join the front half of $x_i$ with the back half of $x_j$ to form a new element.

Mutation

With low probability, mutate one position of an element $x_i$.
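The whole loop can be sketched in Python for bit-string individuals. This is my own illustration: tournament selection, the half-and-half crossover described above, and bitwise mutation; the integer encoding and the toy objective (maximize the encoded value itself) are arbitrary choices.

```python
import random

def ga_maximize(f, n_bits=16, N=30, gens=60, p_mut=0.02, seed=0):
    """Maximize f over n_bits-bit integers: selection -> crossover -> mutation."""
    rng = random.Random(seed)
    decode = lambda bits: int("".join(map(str, bits)), 2)
    fit = lambda bits: f(decode(bits))
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(N)]
    for _ in range(gens):
        # Tournament selection: draw x_i, x_j at random, keep the fitter one
        mating = [max(rng.choice(pop), rng.choice(pop), key=fit)
                  for _ in range(N)]
        # Crossover: front half of x_i joined with back half of x_j
        cut = n_bits // 2
        children = []
        for i in range(0, N - 1, 2):
            xi, xj = mating[i], mating[i + 1]
            children += [xi[:cut] + xj[cut:], xj[:cut] + xi[cut:]]
        # Mutation: flip each bit with low probability p_mut
        pop = [[b ^ 1 if rng.random() < p_mut else b for b in c]
               for c in children]
    return max(pop, key=fit)

best = ga_maximize(lambda x: x)   # toy objective: maximize the integer value
```

On this toy problem the all-ones string is optimal, and selection pressure fixes the high-order bits within a few generations.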

Constrained Optimization

min $f(x)$
s.t. $x\in \Omega$

Linear Programming(LP)

min/max $f(x)=c^Tx=\sum_{i=1}^n c_ix_i$, where $c\in \mathbb{R}^n$, $x \in \mathbb{R}^n$
s.t. $\begin{cases} a_{11}x_1+\cdots+a_{1n}x_n>b_1\\ a_{21}x_1+\cdots+a_{2n}x_n\leq b_2\\ \cdots\\ a_{m1}x_1+\cdots+a_{mn}x_n\geq b_m \end{cases}$
with $b_i\in\mathbb{R}$ for all $1\leq i\leq m$ and $a_{ij}\in\mathbb{R}$ for all $1\leq i\leq m$, $1\leq j\leq n$.

Simplex Method

LP Standard Form

min $c^Tx$
s.t. $Ax\geq b$

Normal Form

min $c^Tx$
s.t. $Ax=b$
$x\geq 0$

Note: to satisfy $x\geq0$, if $x_i$ is required to be $\leq 0$, substitute $-x_i$ for $x_i$; if $x_i$ is unrestricted, set $x_i=u-v$ with $u,v\geq0$.

Example

max $x_2-x_1$
s.t. $3x_1=x_2-5$
$|x_2|\leq 2$
$x_1\leq 0$

① min $x_1-x_2$
② $x_1\leftarrow-x_1$
③ $|x_2|\leq2 \Rightarrow x_2\leq 2,\ x_2\geq -2$
④ $x_2=u-v,\ u,v\geq 0$

min $-x_1-(u-v)$
s.t. $-3x_1=u-v-5$
$u-v\leq 2$
$u-v\geq -2$
$x_1,u,v\geq0$

min $-x_1-u+v$
s.t. $3x_1+u-v=5$
$u-v+y=2$
$u-v-z=-2$
$x_1,u,v,y,z\geq0$
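As a sanity check, a brute-force grid search (my own illustration, not the lecture's) confirms that the normal form attains the negation of the original maximum. The original optimum is $x_2=2$, $x_1=-1$ with value $3$, since the equality constraint gives $x_2-x_1=(2x_2+5)/3$.

```python
def original_max():
    """Maximize x2 - x1 s.t. 3 x1 = x2 - 5, |x2| <= 2, x1 <= 0,
    by gridding x2 (the equality constraint determines x1)."""
    best = float("-inf")
    for i in range(-200, 201):
        x2 = i / 100.0                      # x2 ranges over [-2, 2]
        x1 = (x2 - 5.0) / 3.0               # from 3 x1 = x2 - 5
        if x1 <= 0:
            best = max(best, x2 - x1)
    return best

def normal_form_min():
    """Minimize -x1 - u + v over the normal-form constraints,
    gridding x2 = u - v in [-2, 2] (slacks y, z stay nonnegative)."""
    best = float("inf")
    for i in range(-200, 201):
        x2 = i / 100.0
        u, v = max(x2, 0.0), max(-x2, 0.0)  # u - v = x2 with u, v >= 0
        x1 = (5.0 - (u - v)) / 3.0          # from 3 x1 + u - v = 5
        if x1 >= 0:
            best = min(best, -x1 - u + v)
    return best
```

Both searches land on the same optimum up to sign, as the transformation guarantees.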

Theorem

For each LP, there exists an equivalent LP in normal form.

Summary

This lecture first introduced several global search methods: the Nelder–Mead simplex algorithm, simulated annealing, particle swarm optimization, and genetic algorithms (the last only briefly; see my other blog post for more). These are all heuristic algorithms with relatively weak theoretical foundations, so the lecture did not expand on them much after the introductions.

This is week eight, the midpoint of the semester. The first half covered unconstrained optimization; the second half turns to constrained optimization, starting with the comparatively simple case of linear programming and the simplex method. The lecture proved that any linear program can be converted to an equivalent normal form, which will simplify the solution methods to come.
