Convex Optimization Algorithms: Coordinate Descent Method & Block Coordinate Descent Method


Coordinate Descent Method

Required condition: the objective function is differentiable (smooth).

The coordinate descent method is not a gradient-based method. At each iteration, it finds a local minimum along one coordinate direction. With all the other coordinates $x_1,\dots,x_{i-1},x_{i+1},\dots,x_n$ held fixed, it minimizes the objective function with respect to $x_i$ (e.g., by solving $\frac{\partial f}{\partial x_i} = 0$), and in this way optimizes all coordinates one by one.

The iterative formula of CDM is
$$x_i^{(k)} = \argmin_{x_i} f \left(x_1^{(k)},\dots,x_{i-1}^{(k)},x_i,x_{i+1}^{(k-1)},\dots,x_n^{(k-1)} \right),$$
which can be written out as
$$\begin{aligned} x_1^{(k)} &= \argmin_{x_1} f \left( x_1,x_2^{(k-1)},x_3^{(k-1)},\dots,x_n^{(k-1)} \right), \\ x_2^{(k)} &= \argmin_{x_2} f \left( x_1^{(k)},x_2, x_3^{(k-1)},\dots,x_n^{(k-1)} \right), \\ &\;\;\vdots \\ x_n^{(k)} &= \argmin_{x_n} f \left( x_1^{(k)},x_2^{(k)},\dots,x_{n-1}^{(k)},x_n \right). \end{aligned}$$
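For a concrete instance (an illustrative example of mine, not from the original post), take the quadratic $f(x)=\frac{1}{2} x^\top A x - b^\top x$ with $A$ symmetric positive definite. Solving $\frac{\partial f}{\partial x_i}=0$ with the other coordinates fixed gives a closed-form coordinate update:

$$x_i^{(k)} = \frac{1}{A_{ii}}\left(b_i - \sum_{j<i} A_{ij}x_j^{(k)} - \sum_{j>i} A_{ij}x_j^{(k-1)}\right),$$

which is exactly the Gauss–Seidel iteration for the linear system $Ax=b$.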

Unlike the gradient descent method, which must compute the gradient of the objective function, the coordinate descent method performs a line search along one coordinate direction at a time.

The pseudo-code of the coordinate descent method is shown in the following figure:
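As a minimal runnable sketch (my own example, not the original post's figure), the method can be implemented for the quadratic $f(x)=\frac{1}{2} x^\top A x - b^\top x$, where each coordinate minimization has a closed form:

```python
import numpy as np

def coordinate_descent(A, b, x0, n_iters=100):
    """Exact coordinate descent for f(x) = 0.5 x^T A x - b^T x,
    with A symmetric positive definite.
    Setting df/dx_i = 0 gives x_i = (b_i - sum_{j != i} A_ij x_j) / A_ii."""
    x = x0.astype(float).copy()
    for _ in range(n_iters):          # outer sweeps
        for i in range(len(x)):       # optimize one coordinate at a time
            # A[i] @ x includes the A_ii * x_i term; add it back to exclude j = i
            x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return x

# Example problem: the exact minimizer solves A x = b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = coordinate_descent(A, b, np.zeros(2))
```

For this small SPD system the sweeps converge quickly to the solution of $Ax=b$, i.e., $(1/11,\ 7/11)$.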

The following points should be noted.

  • If the coordinate descent method is applied to a non-smooth objective function, it may stop at points that are not stationary (critical) points.
  • The method does not handle high-dimensional problems well.

Block Coordinate Descent Method

To better handle high-dimensional problems, we can introduce the Block Coordinate Descent Method (BCDM).

The idea is to split the variables into blocks, e.g., $f(\mathbf{x},\mathbf{y})$. Plain coordinate descent alternately optimizes $x_1,\dots,x_N,y_1,\dots,y_N$ one by one, whereas block coordinate descent alternately optimizes one block while fixing the other, e.g., optimizing $\mathbf{x}^{(k)}$ with $\mathbf{y}^{(k-1)}$ fixed, and then $\mathbf{y}^{(k)}$ with $\mathbf{x}^{(k)}$ fixed.

If we split the problem into two blocks, the two sub-problems at iteration $k$ become
$$\mathbf{x}^{(k)} = \argmin_{\mathbf{x}} f\left(\mathbf{x}, \mathbf{y}^{(k-1)}\right), \qquad \mathbf{y}^{(k)} = \argmin_{\mathbf{y}} f\left(\mathbf{x}^{(k)}, \mathbf{y}\right).$$
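The two-block scheme can be sketched on the same quadratic objective (again my own illustrative example): with $A$ partitioned into blocks matching $\mathbf{x}$ and $\mathbf{y}$, each block sub-problem reduces to a small linear solve.

```python
import numpy as np

def block_coordinate_descent(A, b, x0, y0, n_iters=100):
    """Two-block coordinate descent for f(x, y) = 0.5 z^T A z - b^T z,
    z = [x; y], with A symmetric positive definite and partitioned
    into blocks matching x and y."""
    nx = len(x0)
    Axx, Axy = A[:nx, :nx], A[:nx, nx:]
    Ayx, Ayy = A[nx:, :nx], A[nx:, nx:]
    bx, by = b[:nx], b[nx:]
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    for _ in range(n_iters):
        # Minimize over the x-block with y fixed: solve Axx x = bx - Axy y
        x = np.linalg.solve(Axx, bx - Axy @ y)
        # Minimize over the y-block with x fixed: solve Ayy y = by - Ayx x
        y = np.linalg.solve(Ayy, by - Ayx @ x)
    return x, y

# Example: a 4x4 SPD system split into two 2-dimensional blocks.
A = np.array([[4., 1., 0., 0.],
              [1., 3., 1., 0.],
              [0., 1., 5., 1.],
              [0., 0., 1., 4.]])
b = np.array([1., 2., 3., 4.])
x, y = block_coordinate_descent(A, b, np.zeros(2), np.zeros(2))
```

Each iteration solves two exact block sub-problems, so the pair $(\mathbf{x}, \mathbf{y})$ converges to the minimizer of $f$, i.e., the solution of $Az = b$.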

