Index of links to previous articles
Table of contents
- Optimization problem
- Optimization Categories
- Different initialization brings different optimum (if not convex)
- Affine sets
- Affine combination
- Affine hull
- Convex Sets
- Convex combination
- Convex hull
- Cones
- Hyperplanes and halfspaces
- Polyhedra
- Linearly Independent v.s. Affinely Independent
- Simplexes
- What is the key distinction between a convex hull and a simplex?
- Convex Functions
- First-order conditions
- Second-order conditions
- Examples of Convex and Concave Functions
Optimization problem
All optimization problems can be written as:
![](https://img-blog.csdnimg.cn/20200327230137113.png)
Optimization Categories
- convex vs. non-convex: training a deep neural network is a non-convex problem.
- continuous vs. discrete: most problems have continuous variables; a tree structure is discrete.
- constrained vs. unconstrained: we add a prior to make it a constrained problem.
- smooth vs. non-smooth: most problems are smooth.
Different initialization brings different optimum (if not convex)
Idea: give up on the global optimum and find a good local optimum.
- Purpose of pre-training: find a good initialization to start training from, and then reach a better local optimum.
- Relaxation: convert the problem into a convex optimization problem.
- Brute force: if the problem is small, we can simply use brute force.
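To make the effect of initialization concrete, here is a minimal sketch using plain gradient descent on a one-dimensional non-convex function. The function `f`, the step size, and the starting points are all illustrative assumptions, not taken from the article; the point is only that two different starting values end up in two different local minima.

```python
# Illustrative non-convex function (an assumption for this sketch):
# f(x) = x^4 - 3x^2 + x has two local minima, one on each side of zero.
def f(x):
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Derivative of f, computed by hand.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    """Run fixed-step gradient descent from the starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


x_left = gradient_descent(-2.0)   # starts left of the origin
x_right = gradient_descent(2.0)   # starts right of the origin

print("from x0=-2:", round(x_left, 4), "f =", round(f(x_left), 4))
print("from x0=+2:", round(x_right, 4), "f =", round(f(x_right), 4))
```

Both runs stop at a stationary point, but only the run started at -2 reaches the lower of the two minima. With this particular function, a better initialization is the whole difference between the global optimum and a worse local one.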
Affine sets
A set $C \subseteq \mathbf R^n$ is affine if the line through any two distinct points in $C$ lies in $C$, i.e., if for any $x_1, x_2 \in C$ and $\theta \in \mathbf R$, we have $\theta x_1 + (1-\theta) x_2 \in C$.
Note: The line passing through $x_1$ and $x_2$ is $y = \theta x_1 + (1-\theta) x_2$, where $\theta$ ranges over all of $\mathbf R$.
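The definition can be checked numerically on a simple affine set. The sketch below assumes, as an example not taken from the article, the hyperplane $\{x \mid a^T x = b\}$ in $\mathbf R^3$; the helper names `on_hyperplane` and `line_point` are hypothetical. It picks two points in the set and confirms that points on the line through them stay in the set, including for $\theta$ outside $[0, 1]$.

```python
def on_hyperplane(x, a, b, tol=1e-9):
    """Return True if a^T x equals b up to a small tolerance."""
    return abs(sum(ai * xi for ai, xi in zip(a, x)) - b) <= tol


def line_point(x1, x2, theta):
    """Point theta * x1 + (1 - theta) * x2 on the line through x1 and x2."""
    return [theta * p + (1 - theta) * q for p, q in zip(x1, x2)]


# Example hyperplane x + 2y - z = 3 (an assumption for this sketch).
a, b = [1.0, 2.0, -1.0], 3.0
x1 = [3.0, 0.0, 0.0]  # 3 + 0 - 0 = 3
x2 = [0.0, 2.0, 1.0]  # 0 + 4 - 1 = 3

# theta is deliberately allowed to leave [0, 1]: the whole line must stay in C.
thetas = [-2.0, 0.0, 0.5, 1.0, 3.5]
points = [line_point(x1, x2, t) for t in thetas]
all_on = all(on_hyperplane(p, a, b) for p in points)
print(all_on)
```

A finite check like this is not a proof, but it shows what the definition asks for: the entire line, not only the segment between the two points.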
Affine combination
We refer to a point of the form $\theta_1 x_1 + \theta_2 x_2 + \dots + \theta_k x_k$, where $\theta_1 + \theta_2 + \dots + \theta_k = 1$, as an affine combination of the points $x_1, x_2, \dots, x_k$. An affine set contains every affine combination of its points.
Affine hull
The set of all affine combinations of points in some set $C \subseteq \mathbf R^n$ is called the affine hull of $C$, and denoted $\mathbf{aff}\, C$:
$$\mathbf{aff}\, C = \{\theta_1 x_1 + \theta_2 x_2 + \dots + \theta_k x_k \mid x_1, x_2, \dots, x_k \in C,\ \theta_1 + \theta_2 + \dots + \theta_k = 1\}.$$
The affine hull is the smallest affine set that contains $C$, in the following sense: if $S$ is any affine set with $C \subseteq S$, then $\mathbf{aff}\, C \subseteq S$.
Affine dimension: We define the affine dimension of a set $C$ as the dimension of its affine hull.
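For a finite set of points, the affine dimension can be computed as the rank of the differences $x_i - x_0$, since the affine hull is $x_0$ plus the span of those differences. The sketch below is one way to do this in pure Python with exact rational arithmetic; the function names `matrix_rank` and `affine_dimension` and the sample point sets are assumptions made for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [u - factor * v for u, v in zip(m[r], m[rank])]
        rank += 1
    return rank


def affine_dimension(points):
    """Affine dimension of a finite, non-empty set of points."""
    base = points[0]
    diffs = [[p - q for p, q in zip(pt, base)] for pt in points[1:]]
    return matrix_rank(diffs)


collinear = [(0, 0), (1, 1), (2, 2)]            # three points on one line
triangle = [(0, 0), (1, 0), (0, 1)]             # three points not on a line
unit_vectors = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # lie in the plane x + y + z = 1

print(affine_dimension(collinear))
print(affine_dimension(triangle))
print(affine_dimension(unit_vectors))
```

Note that the three unit vectors in $\mathbf R^3$ are linearly independent, yet their affine hull is only a plane, so the affine dimension of a set can be smaller than the dimension of its linear span.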
Convex Sets
A set $C$ is convex if the line segment between any two points in C