1. Definitions
1.1 Definition 1
- If $f[\alpha x_1 + (1 - \alpha) x_2] \ge \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds for all $x_1$, $x_2$ and all $0 \le \alpha \le 1$, then $f(x)$ is called a concave (upper-convex) function.
- If $f[\alpha x_1 + (1 - \alpha) x_2] \gt \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds for all $x_1$, $x_2$ with $x_1 \ne x_2$ and all $0 \lt \alpha \lt 1$, then $f(x)$ is called a strictly concave function.
1.2 Definition 2
- If $f[\alpha x_1 + (1 - \alpha) x_2] \le \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds for all $x_1$, $x_2$ and all $0 \le \alpha \le 1$, then $f(x)$ is called a convex (lower-convex) function.
- If $f[\alpha x_1 + (1 - \alpha) x_2] \lt \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds for all $x_1$, $x_2$ with $x_1 \ne x_2$ and all $0 \lt \alpha \lt 1$, then $f(x)$ is called a strictly convex function.
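The two defining inequalities can be spot-checked numerically. The sketch below is only an illustration, not a proof: the helper names, the sampling range $[0.1, 10]$, and the tolerance are assumptions chosen for this example, and passing on random samples does not establish concavity or convexity.

```python
import math
import random


def satisfies_concave_inequality(f, x1, x2, alpha, tol=1e-9):
    """Check f(a*x1 + (1-a)*x2) >= a*f(x1) + (1-a)*f(x2), up to a tolerance."""
    lhs = f(alpha * x1 + (1 - alpha) * x2)
    rhs = alpha * f(x1) + (1 - alpha) * f(x2)
    return lhs >= rhs - tol


def satisfies_convex_inequality(f, x1, x2, alpha, tol=1e-9):
    """Check f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2), up to a tolerance."""
    lhs = f(alpha * x1 + (1 - alpha) * x2)
    rhs = alpha * f(x1) + (1 - alpha) * f(x2)
    return lhs <= rhs + tol


def square(x):
    return x * x


# Hypothetical test data: random triples (x1, x2, alpha) with alpha in [0, 1).
rng = random.Random(0)
samples = [
    (rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0), rng.random())
    for _ in range(1000)
]

# log satisfies the concave inequality and x^2 the convex one on every sample.
log_is_concave_on_samples = all(
    satisfies_concave_inequality(math.log, x1, x2, a) for x1, x2, a in samples
)
square_is_convex_on_samples = all(
    satisfies_convex_inequality(square, x1, x2, a) for x1, x2, a in samples
)
print(log_is_concave_on_samples, square_is_convex_on_samples)
```

A single counterexample is enough to rule a property out: with $x_1 = 0$, $x_2 = 2$, $\alpha = 0.5$, the function $x^2$ gives $f(1) = 1 \lt 2$, so it fails the concave inequality.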
2. Jensen's inequality
- For a concave function, $f(E[X]) \ge E[f(X)]$, or $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \le f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$, where $\lambda_1, \lambda_2, \cdots, \lambda_q$ are positive real numbers with $\displaystyle \sum_{k=1}^q \lambda_k = 1$. Non-negative weights may be allowed as well: dropping the terms with $\lambda_i = 0$, which contribute nothing, reduces that case to the positive one, so the two formulations are equivalent. For a strictly concave function, equality holds if and only if $x_1 = x_2 = \cdots = x_q$ (when zero weights are allowed, this condition applies to the $x_k$ with $\lambda_k \gt 0$).
- For a convex function, $f(E[X]) \le E[f(X)]$, or $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \ge f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$, where $\lambda_1, \lambda_2, \cdots, \lambda_q$ are positive real numbers with $\displaystyle \sum_{k=1}^q \lambda_k = 1$. Non-negative weights may be allowed as well: dropping the terms with $\lambda_i = 0$, which contribute nothing, reduces that case to the positive one, so the two formulations are equivalent. For a strictly convex function, equality holds if and only if $x_1 = x_2 = \cdots = x_q$ (when zero weights are allowed, this condition applies to the $x_k$ with $\lambda_k \gt 0$).
The proofs follow.
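As a small numerical illustration of the finite form of the inequality, the sketch below evaluates both sides for one set of points and weights. The function name `jensen_sides` and the sample data are assumptions made up for this example.

```python
import math


def jensen_sides(f, xs, weights):
    """Return (f(sum_k l_k x_k), sum_k l_k f(x_k)) for weights l_k summing to 1."""
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    mean = sum(w * x for w, x in zip(weights, xs))
    return f(mean), sum(w * f(x) for w, x in zip(weights, xs))


def square(x):
    return x * x


# Hypothetical sample points and weights.
xs = [1.0, 2.0, 4.0]
weights = [0.5, 0.25, 0.25]

# Concave case: f(weighted mean) >= weighted mean of f.
log_sides = jensen_sides(math.log, xs, weights)

# Convex case: f(weighted mean) <= weighted mean of f.
square_sides = jensen_sides(square, xs, weights)

print(log_sides)
print(square_sides)  # (4.0, 5.5)
```

Here the weighted mean is $0.5 \cdot 1 + 0.25 \cdot 2 + 0.25 \cdot 4 = 2$, so the concave case compares $\ln 2$ with $0.75 \ln 2$, and the convex case compares $4$ with $5.5$.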
2.1 Concave functions
Proof: Every $\lambda_i$ is a positive real number and $\displaystyle \sum_{k=2}^q \lambda_k = 1 - \lambda_1$, so applying the definition with $\alpha = \lambda_1$ gives

$$f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr) = f\Bigl(\lambda_1 x_1 + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}\Bigr) \ge \lambda_1 f(x_1) + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}\Bigr)$$

$$= \lambda_1 f(x_1) + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\lambda_2}{\sum_{k=2}^q \lambda_k} x_2 + \frac{\sum_{k=3}^q \lambda_k}{\sum_{k=2}^q \lambda_k} \cdot \frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}\Bigr)$$

Applying the definition again, this time with $\displaystyle \alpha = \frac{\lambda_2}{\sum_{k=2}^q \lambda_k}$, and repeating the same step for $x_3, \cdots, x_{q-1}$:

$$\ge \lambda_1 f(x_1) + \lambda_2 f(x_2) + \Bigl(\sum_{k=3}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}\Bigr)$$

$$\ge \cdots \ge \sum_{k=1}^q \lambda_k f(x_k)$$
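The chain of inequalities in this proof can be traced numerically. The sketch below computes, for $j = 0, \cdots, q-1$, the intermediate value $\sum_{k \le j} \lambda_k f(x_k) + \bigl(\sum_{k \gt j} \lambda_k\bigr) f(\text{weighted mean of } x_{j+1}, \cdots, x_q)$, so the first entry is $f\bigl(\sum_k \lambda_k x_k\bigr)$ and the last is $\sum_k \lambda_k f(x_k)$. The name `jensen_chain` and the sample data are assumptions for illustration; the weights are assumed strictly positive, as in the proof.

```python
import math


def jensen_chain(f, xs, weights):
    """Intermediate values of the peeling argument, one entry per step.

    Entry j is sum_{k<=j} l_k f(x_k) + (sum_{k>j} l_k) * f(tail mean),
    using 1-based k as in the proof; weights must be strictly positive.
    """
    chain = []
    head = 0.0  # running sum of l_k * f(x_k) for the points already split off
    for j in range(len(xs)):
        tail_weight = sum(weights[j:])
        tail_mean = sum(w * x for w, x in zip(weights[j:], xs[j:])) / tail_weight
        chain.append(head + tail_weight * f(tail_mean))
        head += weights[j] * f(xs[j])
    return chain


# Hypothetical sample points and weights.
xs = [1.0, 2.0, 4.0]
weights = [0.5, 0.25, 0.25]

chain = jensen_chain(math.log, xs, weights)
# For a concave f the chain is non-increasing: q entries, q - 1 inequalities.
is_non_increasing = all(a >= b - 1e-12 for a, b in zip(chain, chain[1:]))
print(chain, is_non_increasing)
```

For this data the chain is $\ln 2$, then $0.5 \ln 3$, then $0.75 \ln 2$, matching the two $\ge$ steps needed for $q = 3$.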
2.2 Strictly concave functions
Proof: By definition, for a strictly concave function and $0 \lt \alpha \lt 1$, equality in $f[\alpha x_1 + (1 - \alpha) x_2] \ge \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds if and only if $x_1 = x_2$. In the derivation of $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \le f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$ for concave functions, every coefficient $\alpha$ used lies strictly between 0 and 1 because all $\lambda_k$ are positive. Hence, if the function is strictly concave:
- equality at the first $\ge$ holds if and only if $\displaystyle x_1 = \frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}$;
- equality at the second $\ge$ holds if and only if $\displaystyle x_2 = \frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}$;
- $\cdots$;
- equality at the $(q-1)$-th $\ge$ holds if and only if $\displaystyle x_{q-1} = \frac{\sum_{k=q}^q \lambda_k x_k}{\sum_{k=q}^q \lambda_k} = x_q$.

For all the equalities to hold, all of these conditions must be satisfied. Working backwards through them gives $x_{q-1} = x_q$; $\displaystyle x_{q-2} = \frac{\sum_{k=q-1}^q \lambda_k x_k}{\sum_{k=q-1}^q \lambda_k} = \frac{\lambda_{q-1} x_{q-1} + \lambda_q x_q}{\lambda_{q-1} + \lambda_q} = x_{q-1}$; $\cdots$; $x_1 = x_2$.
That is, equality in $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \le f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$ requires $x_1 = x_2 = \cdots = x_q$. Conversely, if $x_1 = x_2 = \cdots = x_q$, both sides equal $f(x_1)$, so equality holds if and only if $x_1 = x_2 = \cdots = x_q$.
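The role of strictness in the equality condition can be seen in a small experiment. The sketch below computes the gap $f\bigl(\sum_k \lambda_k x_k\bigr) - \sum_k \lambda_k f(x_k)$ for a strictly concave function ($\sqrt{x}$) and for a linear function, which is concave but not strictly concave. The name `jensen_gap`, the function `linear`, and the data are assumptions chosen for this illustration.

```python
import math


def jensen_gap(f, xs, weights):
    """Return f(sum_k l_k x_k) - sum_k l_k f(x_k); non-negative for concave f."""
    mean = sum(w * x for w, x in zip(weights, xs))
    return f(mean) - sum(w * f(x) for w, x in zip(weights, xs))


def linear(x):
    # Concave (and convex) but not strictly concave.
    return 2.0 * x + 1.0


# Hypothetical positive weights summing to 1.
weights = [0.5, 0.25, 0.25]

# Strictly concave f, all points equal: the gap vanishes.
gap_equal_points = jensen_gap(math.sqrt, [4.0, 4.0, 4.0], weights)

# Strictly concave f, distinct points: the gap is strictly positive.
gap_distinct_points = jensen_gap(math.sqrt, [1.0, 4.0, 9.0], weights)

# Non-strict case: the gap vanishes even though the points differ.
gap_linear = jensen_gap(linear, [1.0, 2.0, 4.0], weights)

print(gap_equal_points, gap_distinct_points, gap_linear)
```

The linear case shows why the "if and only if" statement needs strict concavity: without it, equality can occur for unequal $x_k$.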
2.3 Convex functions
Proof: Every $\lambda_i$ is a positive real number and $\displaystyle \sum_{k=2}^q \lambda_k = 1 - \lambda_1$, so applying the definition with $\alpha = \lambda_1$ gives

$$f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr) = f\Bigl(\lambda_1 x_1 + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}\Bigr) \le \lambda_1 f(x_1) + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}\Bigr)$$

$$= \lambda_1 f(x_1) + \Bigl(\sum_{k=2}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\lambda_2}{\sum_{k=2}^q \lambda_k} x_2 + \frac{\sum_{k=3}^q \lambda_k}{\sum_{k=2}^q \lambda_k} \cdot \frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}\Bigr)$$

Applying the definition again, this time with $\displaystyle \alpha = \frac{\lambda_2}{\sum_{k=2}^q \lambda_k}$, and repeating the same step for $x_3, \cdots, x_{q-1}$:

$$\le \lambda_1 f(x_1) + \lambda_2 f(x_2) + \Bigl(\sum_{k=3}^q \lambda_k\Bigr) \cdot f\Bigl(\frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}\Bigr)$$

$$\le \cdots \le \sum_{k=1}^q \lambda_k f(x_k)$$
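The convex proof mirrors the concave one with every inequality reversed. One way to see this is that $f$ satisfies the convex inequality exactly when $-f$ satisfies the concave one. The sketch below uses that reduction to check the convex form of the inequality; the helper names and sample data are assumptions for illustration only.

```python
import math


def weighted_mean(values, weights):
    """Return sum_k l_k * v_k."""
    return sum(w * v for w, v in zip(weights, values))


def concave_jensen_holds(g, xs, weights, tol=1e-12):
    """Check g(sum_k l_k x_k) >= sum_k l_k g(x_k), up to a tolerance."""
    lhs = g(weighted_mean(xs, weights))
    rhs = weighted_mean([g(x) for x in xs], weights)
    return lhs >= rhs - tol


def convex_jensen_holds(f, xs, weights, tol=1e-12):
    """Check the convex form by applying the concave check to -f."""
    return concave_jensen_holds(lambda x: -f(x), xs, weights, tol)


# Hypothetical sample points and positive weights summing to 1.
xs = [0.0, 1.0, 3.0]
weights = [0.2, 0.3, 0.5]

# exp is convex, so the convex form holds and the concave form fails here.
exp_convex_ok = convex_jensen_holds(math.exp, xs, weights)
exp_concave_ok = concave_jensen_holds(math.exp, xs, weights)
print(exp_convex_ok, exp_concave_ok)  # True False
```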
2.4 Strictly convex functions
Proof: By definition, for a strictly convex function and $0 \lt \alpha \lt 1$, equality in $f[\alpha x_1 + (1 - \alpha) x_2] \le \alpha f(x_1) + (1 - \alpha) f(x_2)$ holds if and only if $x_1 = x_2$. In the derivation of $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \ge f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$ for convex functions, every coefficient $\alpha$ used lies strictly between 0 and 1 because all $\lambda_k$ are positive. Hence, if the function is strictly convex:
- equality at the first $\le$ holds if and only if $\displaystyle x_1 = \frac{\sum_{k=2}^q \lambda_k x_k}{\sum_{k=2}^q \lambda_k}$;
- equality at the second $\le$ holds if and only if $\displaystyle x_2 = \frac{\sum_{k=3}^q \lambda_k x_k}{\sum_{k=3}^q \lambda_k}$;
- $\cdots$;
- equality at the $(q-1)$-th $\le$ holds if and only if $\displaystyle x_{q-1} = \frac{\sum_{k=q}^q \lambda_k x_k}{\sum_{k=q}^q \lambda_k} = x_q$.

For all the equalities to hold, all of these conditions must be satisfied. Working backwards through them gives $x_{q-1} = x_q$; $\displaystyle x_{q-2} = \frac{\sum_{k=q-1}^q \lambda_k x_k}{\sum_{k=q-1}^q \lambda_k} = \frac{\lambda_{q-1} x_{q-1} + \lambda_q x_q}{\lambda_{q-1} + \lambda_q} = x_{q-1}$; $\cdots$; $x_1 = x_2$.
That is, equality in $\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \ge f\Bigl(\sum_{k=1}^q \lambda_k x_k\Bigr)$ requires $x_1 = x_2 = \cdots = x_q$. Conversely, if $x_1 = x_2 = \cdots = x_q$, both sides equal $f(x_1)$, so equality holds if and only if $x_1 = x_2 = \cdots = x_q$.