Definition of a Convex Set
A set $S$ in a vector space (such as $\mathbb{R}^n$, the space of n-dimensional real vectors) is called convex if, for any two points within the set, every point on the straight line segment that connects these two points also lies within the set. Mathematically, this can be expressed as: for any $x, y \in S$ and any $0 \leq \lambda \leq 1$, the point $z = \lambda x + (1-\lambda)y$ must also be in $S$.
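As a quick numerical illustration (my own sketch, using the closed unit ball in $\mathbb{R}^2$ as the example convex set), a segment test confirms the definition:

```python
import numpy as np

def in_unit_ball(p, tol=1e-12):
    """Membership test for the closed unit ball {p : ||p|| <= 1}, a convex set."""
    return np.linalg.norm(p) <= 1.0 + tol

x = np.array([0.6, 0.0])      # two points inside the ball
y = np.array([-0.3, 0.8])
for lam in np.linspace(0.0, 1.0, 101):
    z = lam * x + (1 - lam) * y     # z = lambda*x + (1-lambda)*y
    assert in_unit_ball(z)          # every point of the segment stays in the set
```

The same test run on a non-convex set (say, the ball with its center removed) would fail for some segment, which is exactly what the definition rules out.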
Convex Constraints
- Convex Constraint: A constraint is convex if it defines a convex set. For an inequality constraint $g(x) \leq 0$, $g$ must be a convex function. For an equality constraint $h(x) = 0$, $h$ must be affine (i.e., $h(x) = Ax + b$, where $A$ is a matrix and $b$ is a vector).
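As a sketch (the disk and line below are my own example constraints), the feasible set cut out by a convex inequality plus an affine equality passes the segment test:

```python
import numpy as np

# Feasible set: {x in R^2 : g(x) <= 0, h(x) = 0} with
#   g(x) = ||x||^2 - 1   (convex inequality -> a disk)
#   h(x) = x1 - x2       (affine equality -> a line)
def feasible(p, tol=1e-9):
    g = p @ p - 1.0
    h = p[0] - p[1]
    return g <= tol and abs(h) <= tol

x = np.array([0.5, 0.5])      # both points satisfy g <= 0 and h = 0
y = np.array([-0.6, -0.6])
for lam in np.linspace(0.0, 1.0, 21):
    z = lam * x + (1 - lam) * y
    assert feasible(z)         # the whole segment stays feasible
```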
Convex Functions
Determining if a function is convex is essential in various fields like optimization, economics, and machine learning. Convex functions have important properties that make optimization problems easier to solve because a local minimum of a convex function is also a global minimum. Here are several methods to determine if a function is convex:
1. Definition
A function $f: \mathbb{R}^n \to \mathbb{R}$ is convex on a convex set $D$ if for all $x, y \in D$ and for all $\lambda$ such that $0 \leq \lambda \leq 1$, the following inequality holds:

$$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y)$$
This definition states that the line segment connecting any two points on the graph of the function lies above or on the graph itself.
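This inequality can be sanity-checked numerically (a sketch with hand-picked test functions: $f(t)=t^2$ is convex, $f(t)=-t^2$ is not):

```python
import numpy as np

def jensen_holds(f, x, y, lam, tol=1e-12):
    """Check f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y)."""
    return f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y) + tol

rng = np.random.default_rng(0)
convex_f = lambda t: t**2
for _ in range(1000):
    x, y = rng.uniform(-5, 5, 2)
    lam = rng.uniform(0, 1)
    assert jensen_holds(convex_f, x, y, lam)   # holds for every random sample

# A single counterexample is enough to reject convexity of -t^2:
assert not jensen_holds(lambda t: -t**2, 1.0, -1.0, 0.5)   # f(0) = 0 > -1
```

Note that passing random samples does not prove convexity; only a violation is conclusive, which is why the definition is usually applied analytically.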
2. First Derivative Test (for functions of one variable)
A function $f$ of one variable is convex on an interval if its first derivative $f'$ is monotonically non-decreasing on that interval. This means that as $x$ increases, $f'(x)$ does not decrease.
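This test can be sketched with central finite differences (the example functions and tolerances below are my own choices):

```python
import numpy as np

def derivative_nondecreasing(f, a, b, n=1000, h=1e-6):
    """Numerically test whether f' is non-decreasing on [a, b]."""
    xs = np.linspace(a, b, n)
    d = (f(xs + h) - f(xs - h)) / (2 * h)   # central-difference estimate of f'(x)
    return np.all(np.diff(d) >= -1e-6)       # successive values never drop

assert derivative_nondecreasing(np.exp, -2, 2)      # e^x is convex: f' = e^x rises
assert not derivative_nondecreasing(np.sin, 0, 3.0) # sin is concave on (0, pi)
```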
3. Second Derivative Test
A function $f$ is convex on an interval if its second derivative $f''(x)$ is non-negative for all $x$ in that interval. This is a straightforward test because it only involves checking the sign of the second derivative:

$$f''(x) \geq 0 \quad \text{for all } x \in D$$
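Numerically, the sign check can be sketched with the central second-difference formula (example functions are my own choices):

```python
import numpy as np

def second_derivative_nonnegative(f, a, b, n=1000, h=1e-4):
    """Check f''(x) >= 0 on [a, b] via (f(x+h) - 2f(x) + f(x-h)) / h^2."""
    xs = np.linspace(a, b, n)
    f2 = (f(xs + h) - 2 * f(xs) + f(xs - h)) / h**2
    return np.all(f2 >= -1e-4)   # small tolerance absorbs floating-point noise

assert second_derivative_nonnegative(lambda x: x**4, -2, 2)  # f'' = 12x^2 >= 0
assert not second_derivative_nonnegative(np.cos, -1, 1)      # f'' = -cos(x) < 0 here
```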
4. Hessian Matrix (for multivariable functions)
For functions of several variables, $f: \mathbb{R}^n \to \mathbb{R}$, the function is convex on a convex set if its Hessian matrix $H_f(x)$ is positive semidefinite for all $x$ in the domain of $f$. The Hessian matrix of $f$ at $x$ is given by:

$$H_f(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
A symmetric matrix is positive semidefinite if all its eigenvalues are non-negative.
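For example (a sketch assuming the quadratic $f(x) = x^\top Q x$, whose Hessian is the constant matrix $2Q$), NumPy's `eigvalsh` makes the eigenvalue check direct:

```python
import numpy as np

# f(x) = x^T Q x with symmetric Q; its Hessian is 2Q at every point x.
Q = np.array([[2.0, 1.0],
              [1.0, 2.0]])
H = 2 * Q

eigenvalues = np.linalg.eigvalsh(H)   # eigvalsh: for symmetric/Hermitian matrices
print(eigenvalues)                    # [2. 6.] -> all non-negative
assert np.all(eigenvalues >= 0)       # Hessian is PSD everywhere, so f is convex
```

`eigvalsh` is preferred over the general `eig` here because the Hessian is symmetric, which guarantees real eigenvalues and is cheaper to compute.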
Convex Approximation
Convex approximation involves using a convex function to approximate a non-convex function.
1. Linearization (First-Order Taylor Expansion)
The simplest form of convex approximation is to perform a first-order Taylor expansion around a point $x_0$:

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0)$$

Here, $f'(x_0)$ is the derivative of the function at $x_0$. This linear approximation is usually only valid within a small neighborhood of $x_0$.
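A minimal sketch of this behavior (using $f(x)=e^x$ around $x_0 = 0$ as my own example; note $\frac{d}{dx}e^x = e^x$):

```python
import numpy as np

f = np.exp      # function to approximate
df = np.exp     # its derivative
x0 = 0.0

def linearization(x):
    """First-order Taylor approximation of f around x0."""
    return f(x0) + df(x0) * (x - x0)

# Near x0 the approximation is tight; farther away it degrades.
assert abs(f(0.01) - linearization(0.01)) < 1e-3
assert abs(f(1.0) - linearization(1.0)) > 0.5
```

A linear function is both convex and concave, which is why linearization is the workhorse of sequential convex approximation schemes.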
2. Second-Order Taylor Expansion
If the Hessian matrix is positive semidefinite at the expansion point, a second-order Taylor expansion can be used:
$$f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0)$$

where $H(x_0)$ is the Hessian matrix at $x_0$. If $H(x_0)$ is positive semidefinite, this approximation is convex.
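A small sketch of such a convex quadratic model (the example function $f(x)=\log(1+\|x\|^2)$ and its hand-derived gradient and Hessian at the origin are my own choices):

```python
import numpy as np

x0 = np.zeros(2)

def f(x):
    """Non-convex far from 0, but locally well-modeled by a quadratic."""
    return np.log(1.0 + x @ x)

grad_x0 = np.zeros(2)       # gradient 2x/(1+||x||^2) evaluated at x0 = 0
H_x0 = 2.0 * np.eye(2)      # Hessian at 0 is 2I: positive semidefinite

def quadratic_model(x):
    """Second-order Taylor model of f around x0 (convex since H_x0 is PSD)."""
    d = x - x0
    return f(x0) + grad_x0 @ d + 0.5 * d @ H_x0 @ d

# Near x0 the convex quadratic model is accurate.
x = np.array([0.1, -0.1])
assert abs(f(x) - quadratic_model(x)) < 1e-3
```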
3. Substitution and Relaxation
Sometimes, convex approximations can be created by substituting non-convex expressions or relaxing constraints. For instance, a non-smooth absolute value term $|x|$ can be replaced by its square $|x|^2 = x^2$, which is smooth as well as convex.
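A minimal sketch of that substitution (my own example on $[-1, 1]$):

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)
abs_term = np.abs(x)     # |x|: convex but non-differentiable at 0
surrogate = x**2         # |x|^2: smooth and convex everywhere

# On [-1, 1] the squared surrogate never exceeds |x| and matches it at the ends.
assert np.all(surrogate <= abs_term + 1e-12)
assert abs(surrogate[0] - abs_term[0]) < 1e-12   # at x = -1 both equal 1
```

The surrogate trades exactness for differentiability, which is often the point of a substitution: gradient-based solvers can then be applied directly.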