Understanding the ADMM algorithm

Copyright notice: this is the blogger's original article; reproduction without the blogger's permission is prohibited. https://blog.csdn.net/weixin_42444045/article/details/83512560

ADMM

This post explains the procedure of the Alternating Direction Method of Multipliers (ADMM), starting from fundamental knowledge about optimization problems.

1. What is an optimization problem?

The following represents the most common and simplest form of an optimization problem:
$$\min_x f(x) \qquad (1)$$

Here $x$ is the variable to be optimized: by changing the value of $x$, we make the objective function $f(x)$ achieve its minimum value.

If we have only this function, without any constraints on the variable $x$, the problem above is the simplest kind of optimization problem. In many circumstances, however, there are constraints on $x$: it may need to satisfy equality or inequality constraints, or lie in a given set, e.g. $Ax = b$ or $Ax > c$. An optimization problem with an equality constraint therefore looks like this:
$$\min_x f(x) \qquad (2)$$
$$\text{s.t. } Ax = b$$
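To make (2) concrete, here is a minimal NumPy sketch (the random data and the choice $f(x) = \frac{1}{2}\|x - x_0\|^2$ are illustrative assumptions, not from the original problem) that solves a small equality-constrained quadratic directly via its KKT system:

```python
import numpy as np

# Illustrative instance of problem (2): min_x (1/2)||x - x0||^2  s.t.  Ax = b
rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x0 = rng.standard_normal(n)

# For this quadratic f, the KKT conditions x - x0 + A^T y = 0 and Ax = b
# form one linear system in (x, y):
#   [ I   A^T ] [x]   [x0]
#   [ A   0   ] [y] = [b ]
K = np.block([[np.eye(n), A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([x0, b]))
x_star, y_star = sol[:n], sol[n:]

print("constraint residual ||Ax*-b|| =", np.linalg.norm(A @ x_star - b))
```

The dual variable $y^*$ that falls out of this solve is exactly the Lagrange multiplier that the methods below iterate toward.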

2. Dual Gradient Ascent

The alternating direction method of multipliers (ADMM) is an algorithm that solves convex optimization problems by breaking them into smaller pieces, each of which is then easier to handle.

Consider problem (2), with $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, and $f: \mathbb{R}^n \to \mathbb{R}$ a convex function. The Lagrangian is:
$$L(x,y) = f(x) + y^T(Ax - b) \qquad (3)$$
The corresponding dual function is:
$$g(y) = \inf_x L(x,y) = -f^*(-A^T y) - b^T y \qquad (4)$$
where $y$ is the Lagrange multiplier (the dual variable) and $f^*$ is the conjugate function of $f$.
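For completeness, (4) follows from (3) by the definition of the conjugate, $f^*(v) = \sup_x \left( v^T x - f(x) \right)$:

$$g(y) = \inf_x \left\{ f(x) + y^T(Ax - b) \right\} = -\sup_x \left\{ (-A^T y)^T x - f(x) \right\} - b^T y = -f^*(-A^T y) - b^T y$$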

Assume the problem satisfies strong duality, so the optimal value of the original problem equals that of the dual problem. If $x^*$ is the optimal solution of the original problem and $y^*$ that of the dual problem, then:
$$x^* = \arg\min_x L(x, y^*) \qquad (5)$$
We solve the dual problem with gradient ascent; the dual gradient ascent update is:
$$x^{k+1} = \arg\min_x L(x, y^k)$$
$$y^{k+1} = y^k + a_k (A x^{k+1} - b) \qquad (6)$$
where $a_k > 0$ is the gradient ascent step size.
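A minimal sketch of update (6), again assuming the illustrative quadratic $f(x) = \frac{1}{2}\|x - x_0\|^2$ from above so that the $x$-minimization has a closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x0 = rng.standard_normal(n)

y = np.zeros(m)
# constant step size a_k = 1/||A||^2, safe for this quadratic dual
alpha = 1.0 / np.linalg.norm(A, 2) ** 2
for k in range(500):
    # x-update: argmin_x L(x, y) = x0 - A^T y for this choice of f
    x = x0 - A.T @ y
    # dual gradient ascent step on y
    y = y + alpha * (A @ x - b)

print("||Ax - b|| =", np.linalg.norm(A @ x - b))
```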

Now assume the objective function is separable:
$$f(x) = \sum_{i=1}^N f_i(x_i) \qquad (7)$$
where $x = (x_1, x_2, \dots, x_N)$ and $x_i \in \mathbb{R}^{n_i}$ are subvectors of $x$. Splitting the matrix $A$ conformably,
$$A = [A_1, A_2, \dots, A_N] \qquad (8)$$
then,
$$Ax = \sum_{i=1}^{N} A_i x_i \qquad (9)$$
The Lagrangian then decomposes as:
$$L(x,y) = \sum_{i=1}^N L_i(x_i, y) = \sum_{i=1}^N \left( f_i(x_i) + y^T A_i x_i - (1/N)\, y^T b \right) \qquad (10)$$
The dual gradient ascent update becomes:
$$x_i^{k+1} = \arg\min_{x_i} L_i(x_i, y^k)$$
$$y^{k+1} = y^k + a_k (A x^{k+1} - b) \qquad (11)$$
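The payoff of separability is that the $x_i$-updates in (11) are independent of one another and can run in parallel. A minimal sketch, assuming scalar blocks and the illustrative choice $f_i(x_i) = \frac{1}{2}(x_i - c_i)^2$ (neither assumption is from the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
m, N = 3, 5
A = rng.standard_normal((m, N))   # column i plays the role of block A_i
b = rng.standard_normal(m)
c = rng.standard_normal(N)

y = np.zeros(m)
alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step size for this quadratic
for k in range(500):
    # x_i-update: argmin_{x_i} (1/2)(x_i - c_i)^2 + y^T A_i x_i = c_i - A_i^T y;
    # this loop is embarrassingly parallel across the blocks
    x = np.array([c[i] - A[:, i] @ y for i in range(N)])
    # gathered dual update, as in (11)
    y = y + alpha * (A @ x - b)

print("||Ax - b|| =", np.linalg.norm(A @ x - b))
```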

3. Augmented Lagrangian Method

To increase the robustness of the dual ascent method and relax its strict convexity requirements, we introduce the augmented Lagrangian:
$$L_\rho(x,y) = f(x) + y^T(Ax - b) + (\rho/2)\|Ax - b\|_2^2 \qquad (12)$$
where $\rho > 0$ is the penalty parameter. The augmented Lagrangian method in effect adds a penalty term, reconstructing the original problem as:
$$\min_x f(x) + (\rho/2)\|Ax - b\|_2^2 \qquad (13)$$
$$\text{s.t. } Ax = b$$
The dual ascent update is:
$$x^{k+1} = \arg\min_x L_\rho(x, y^k)$$
$$y^{k+1} = y^k + \rho (A x^{k+1} - b) \qquad (14)$$
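A minimal sketch of iteration (14) for the same illustrative quadratic $f(x) = \frac{1}{2}\|x - x_0\|^2$ as before. Note that the $x$-update now requires solving a system coupled through $\rho A^T A$, which is exactly what breaks separability, as discussed in the next section:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x0 = rng.standard_normal(n)

rho = 1.0
y = np.zeros(m)
for k in range(100):
    # x-update: minimizing L_rho(x, y) for this f solves the linear system
    # (I + rho A^T A) x = x0 - A^T y + rho A^T b
    x = np.linalg.solve(np.eye(n) + rho * A.T @ A,
                        x0 - A.T @ y + rho * A.T @ b)
    # multiplier update, with the penalty parameter rho as the step size
    y = y + rho * (A @ x - b)

print("||Ax - b|| =", np.linalg.norm(A @ x - b))
```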

4. ADMM

The method in section 3 updates with the augmented Lagrangian, and the step size is now $\rho$, which replaces $a_k$; this is called the method of multipliers. It converges under much weaker conditions than ordinary dual ascent, but the added quadratic penalty term destroys the separability of the original problem. To address this, the alternating direction method of multipliers (ADMM) is introduced.

For the optimization problem:
$$\min_{x,z} f(x) + g(z) \qquad (15)$$
$$\text{s.t. } Ax + Bz = c$$
with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p \times n}$, $B \in \mathbb{R}^{p \times m}$, $c \in \mathbb{R}^p$, the augmented Lagrangian is:
$$L_\rho(x,z,y) = f(x) + g(z) + y^T(Ax + Bz - c) + (\rho/2)\|Ax + Bz - c\|_2^2 \qquad (16)$$
The ADMM variable updates are:
$$x^{k+1} = \arg\min_x L_\rho(x, z^k, y^k)$$
$$z^{k+1} = \arg\min_z L_\rho(x^{k+1}, z, y^k)$$
$$y^{k+1} = y^k + \rho (A x^{k+1} + B z^{k+1} - c)$$
With $\rho > 0$, this is the basic iteration structure of ADMM.
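As a worked instance of these updates (a sketch, not from the original post), take the lasso: $f(x) = \frac{1}{2}\|Dx - d\|^2$, $g(z) = \lambda\|z\|_1$, with $A = I$, $B = -I$, $c = 0$ in (15). The $x$-update is then a ridge-like linear solve and the $z$-update is elementwise soft-thresholding:

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise soft-thresholding: the proximal operator of kappa*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

# Lasso via ADMM: min (1/2)||Dx - d||^2 + lam*||z||_1  s.t.  x - z = 0.
# All data below is synthetic and purely illustrative.
rng = np.random.default_rng(0)
p, n = 20, 10
D = rng.standard_normal((p, n))
d = rng.standard_normal(p)
lam, rho = 0.5, 1.0

x = np.zeros(n); z = np.zeros(n); y = np.zeros(n)
DtD_rhoI = D.T @ D + rho * np.eye(n)   # formed once, reused every iteration
Dtd = D.T @ d
for k in range(200):
    # x-update: minimize L_rho over x (a ridge-regression-like solve)
    x = np.linalg.solve(DtD_rhoI, Dtd - y + rho * z)
    # z-update: minimize L_rho over z (soft-thresholding)
    z = soft_threshold(x + y / rho, lam / rho)
    # dual update with step size rho
    y = y + rho * (x - z)

print("||x - z|| =", np.linalg.norm(x - z))
```

The alternation is the whole point: $x$ and $z$ are updated one after the other rather than jointly, so each subproblem stays simple even though the penalty term couples them.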
