Week 2: Convex Functions
1 Definition of convex function
1.1 Basic definition
A function
f
:
R
d
→
R
f: \mathbb{R}^d\rightarrow \mathbb{R}
f:Rd→R is convex if for
∀
\forall
∀
x
1
,
x
2
x_1,x_2
x1,x2 and
∀
\forall
∀
λ
∈
[
0
,
1
]
\lambda \in [0,1]
λ∈[0,1]:
f
(
λ
x
1
+
(
1
−
λ
)
x
2
)
≤
λ
f
(
x
1
)
+
(
1
−
λ
)
f
(
x
2
)
f(\lambda x_1+(1-\lambda)x_2)\leq \lambda f(x_1)+(1-\lambda)f(x_2)
f(λx1+(1−λ)x2)≤λf(x1)+(1−λ)f(x2)
1.2 Definition of differentiable function
- If f f f is differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x and y y y, f ( y ) ≥ f ( x ) + ∇ f ( x ) T ( y − x ) f(y)\geq f(x)+\nabla f(x)^T(y-x) f(y)≥f(x)+∇f(x)T(y−x)
- If f f f is twice differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x, ∇ 2 f ( x ) ≥ 0 \nabla^2f(x)\geq 0 ∇2f(x)≥0
1.3 Definition with monotone gradient
- If f f f is differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x and y y y, ( ∇ f ( x ) − ∇ f ( y ) ) ( x − y ) ≥ 0 (\nabla f(x)-\nabla f(y))(x-y)\geq 0 (∇f(x)−∇f(y))(x−y)≥0
2 Operations preserving convexity
2.1 Non-negative sum
f = a 1 f 1 ( x ) + a 2 f 2 ( x ) f=a_1f_1(x)+a_2f_2(x) f=a1f1(x)+a2f2(x)
2.2 Affine map
f = f 1 ( A x + b ) f=f_1(Ax+b) f=f1(Ax+b)
2.3 Pointwise max
f = max { f 1 ( x ) , f 2 ( x ) } f=\max\{f_1(x),f_2(x)\} f=max{f1(x),f2(x)}
2.4 Minimizing out a subset of vars
f ( x ) = min y f 1 ( x , y ) f(x)=\min_y f_1(x,y) f(x)=yminf1(x,y)
3 Examples
3.1 Linear, Exponential, Negative entrophy (use hessian)
f ( x ) = a x + b f(x)=ax+b f(x)=ax+b f ( x ) = e a T x f(x)=e^{a^T x} f(x)=eaTx f ( x ) = x log x f(x)=x \log x f(x)=xlogx
3.2 All norms
f ( x ) = ∣ x ∣ p , p ≥ 1 f(x)=|x|_p,p\geq 1 f(x)=∣x∣p,p≥1
3.3 Maximum singular value of X X X
f ( X ) = max λ of X f(X)=\text{max } \lambda \text{ of } X f(X)=max λ of X