Multiple Linear Regression Tips

I. Multiple Linear Regression Tips

1. Convex sets

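  • Definition of a convex set: let $D \subseteq R^{n}$. If for any $x, y \in D$ and any $a \in \left[0,1\right]$ we have $ax+\left(1-a\right)y \in D$, then $D$ is called a convex set.
  • Geometric meaning: if two points belong to the set, then every point on the line segment joining them also belongs to the set.
    [Figure: a convex set (left) and a non-convex set (right)]
  • In the two figures above, the left one is a convex set and the right one is a non-convex set.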
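The definition can be probed numerically. The sketch below is only an illustration: the two example sets (a unit disk and a ring) and the helper `find_violation` are hypothetical names chosen here, not part of the original notes. It samples pairs of points in a set and looks for a convex combination that falls outside it.

```python
import random


def in_disk(p):
    """Unit disk x^2 + y^2 <= 1 (a convex set)."""
    x, y = p
    return x * x + y * y <= 1.0


def in_annulus(p):
    """Ring 0.25 <= x^2 + y^2 <= 1 (not convex: it has a hole)."""
    x, y = p
    return 0.25 <= x * x + y * y <= 1.0


def find_violation(contains, trials=20000, seed=0):
    """Search for x, y in the set and a in [0, 1] with a*x + (1-a)*y outside it.

    Returns a counterexample (x, y, a), or None if none was found.
    None is only numerical evidence of convexity, not a proof.
    """
    rng = random.Random(seed)

    def sample():
        # Rejection sampling from the square [-1, 1]^2.
        while True:
            p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
            if contains(p):
                return p

    for _ in range(trials):
        x, y, a = sample(), sample(), rng.random()
        z = (a * x[0] + (1 - a) * y[0], a * x[1] + (1 - a) * y[1])
        if not contains(z):
            return x, y, a
    return None


print("disk counterexample:", find_violation(in_disk))
print("annulus counterexample found:", find_violation(in_annulus) is not None)
```

For the ring, a counterexample is easy to see by hand: $(-0.75, 0)$ and $(0.75, 0)$ both lie in the ring, but their midpoint $(0, 0)$ does not.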

2. Gradient

  • Definition of the gradient: let $f\left(x\right)$ be a function of $n$ variables, where $x=\left(x_{1},x_{2},\cdots,x_{n}\right)^T$. If the partial derivatives $\frac{\partial f\left(x\right)}{\partial x_{i}}\ \left(i=1,2,\cdots,n\right)$ with respect to every component $x_{i}$ all exist, then $f\left(x\right)$ is said to be first-order differentiable at $x$, and the vector
    $$\nabla f\left(x\right)=\begin{pmatrix} \frac{\partial f\left(x\right)}{\partial x_{1}}\\ \frac{\partial f\left(x\right)}{\partial x_{2}}\\ \vdots \\ \frac{\partial f\left(x\right)}{\partial x_{n}} \end{pmatrix}$$
    is called the first derivative, or gradient, of $f\left(x\right)$ at $x$, written $\nabla f\left(x\right)$ (a column vector).
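As a quick sanity check of the definition, the gradient can be approximated component by component with central differences. This is a minimal sketch; the test function `f` and the helper `numerical_gradient` are made up for illustration.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        # i-th partial derivative: vary only the i-th component of x.
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example: f(x1, x2) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


def grad_f(x):
    # Analytic gradient: (2*x1 + 3*x2, 3*x1)
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


point = [1.0, 2.0]
print("numerical:", numerical_gradient(f, point))
print("analytic: ", grad_f(point))
```

The two vectors should agree up to floating-point rounding, since the central difference has no truncation error for a quadratic function.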

3. Hessian matrix

  • Definition of the Hessian matrix: let $f\left(x\right)$ be a function of $n$ variables, where $x=\left(x_{1},x_{2},\cdots,x_{n}\right)^T$. If the second-order partial derivatives $\frac{\partial^2 f\left(x\right)}{\partial x_{i}\partial x_{j}}\ \left(i=1,2,\cdots,n;\ j=1,2,\cdots,n\right)$ all exist, then $f\left(x\right)$ is said to be second-order differentiable at $x$, and the matrix
    $$\nabla^2 f\left(x\right)=\begin{pmatrix} \frac{\partial^2 f\left(x\right)}{\partial x_{1}^2} & \frac{\partial^2 f\left(x\right)}{\partial x_{1}\partial x_{2}} & \cdots & \frac{\partial^2 f\left(x\right)}{\partial x_{1}\partial x_{n}}\\ \frac{\partial^2 f\left(x\right)}{\partial x_{2}\partial x_{1}} & \frac{\partial^2 f\left(x\right)}{\partial x_{2}^2} & \cdots & \frac{\partial^2 f\left(x\right)}{\partial x_{2}\partial x_{n}}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f\left(x\right)}{\partial x_{n}\partial x_{1}} & \frac{\partial^2 f\left(x\right)}{\partial x_{n}\partial x_{2}} & \cdots & \frac{\partial^2 f\left(x\right)}{\partial x_{n}^2} \end{pmatrix}$$
    is called the second derivative, or Hessian matrix, of $f\left(x\right)$ at $x$, written $\nabla^2 f\left(x\right)$. If all second-order partial derivatives of $f\left(x\right)$ are continuous, then $\frac{\partial^2 f\left(x\right)}{\partial x_{i}\partial x_{j}}=\frac{\partial^2 f\left(x\right)}{\partial x_{j}\partial x_{i}}$, and in that case $\nabla^2 f\left(x\right)$ is a symmetric matrix.
  • Supplement: for a function of two variables, if the second-order partial derivatives of $f\left(x,y\right)$ with respect to $x, y$ are all continuous, then ${f}''_{xy}={f}''_{yx}$.
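The symmetry claim can be checked numerically. The sketch below approximates each entry of the Hessian with a central second difference. The cubic test function and the helper name `numerical_hessian` are illustrative assumptions.

```python
def numerical_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x (a list of floats)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):

            def shifted(di, dj):
                # Evaluate f with x_i moved by di and x_j moved by dj.
                # When i == j the two shifts add up on the same coordinate.
                p = list(x)
                p[i] += di
                p[j] += dj
                return f(p)

            H[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4 * h * h)
    return H


def f(x):
    # Hypothetical example: f(x1, x2) = x1^3 + x1^2*x2 + 2*x2^2
    return x[0] ** 3 + x[0] ** 2 * x[1] + 2 * x[1] ** 2


def hessian_f(x):
    # Analytic Hessian: [[6*x1 + 2*x2, 2*x1], [2*x1, 4]]
    return [[6 * x[0] + 2 * x[1], 2 * x[0]], [2 * x[0], 4.0]]


point = [1.0, 2.0]
print("numerical:", numerical_hessian(f, point))
print("analytic: ", hessian_f(point))
```

The off-diagonal entries agree with each other, which is what continuity of the second-order partial derivatives guarantees.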

4. Convexity criterion for multivariate real-valued functions

  • Let $D\subset R^n$ be a non-empty open convex set and let $f:D\subset R^n \rightarrow R$ be twice continuously differentiable on $D$. If the Hessian matrix $\nabla^2 f\left(x\right)$ is positive definite at every point of $D$, then $f\left(x\right)$ is a strictly convex function on $D$.
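To see how this criterion might be used for linear regression, here is a hedged sketch. It assumes the least-squares objective $f(w)=\lVert y-Xw\rVert^2$, which does not appear in these notes but is the standard objective; its Hessian is the constant matrix $2X^TX$. Positive definiteness is tested by attempting a Cholesky factorisation, which succeeds for a symmetric matrix exactly when it is positive definite. The function names, the tolerance `tol` and the two small design matrices are illustrative only.

```python
import math


def is_positive_definite(A, tol=1e-9):
    """Attempt a Cholesky factorisation A = L L^T of a symmetric matrix A.

    Returns True if every pivot is larger than tol (numerically positive definite).
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # non-positive pivot: not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def least_squares_hessian(X):
    """Hessian 2 X^T X of the assumed objective f(w) = ||y - X w||^2."""
    m, n = len(X), len(X[0])
    return [
        [2 * sum(X[r][i] * X[r][j] for r in range(m)) for j in range(n)]
        for i in range(n)
    ]


X_full_rank = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # independent columns
X_collinear = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # second column = 2 * first

print(is_positive_definite(least_squares_hessian(X_full_rank)))
print(is_positive_definite(least_squares_hessian(X_collinear)))
```

With independent columns the Hessian is positive definite, so the theorem gives strict convexity. With collinear columns the Hessian is only positive semidefinite and the theorem does not apply.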

5. Convex sufficiency theorem

  • Let $f:R^n \rightarrow R$ be a convex function that is continuously differentiable. Then $x^*$ is a global solution (a global minimizer) if and only if $\nabla f\left(x^*\right)=\vec{0}$, where $\nabla f\left(x\right)$ is the first derivative (also called the gradient) of $f\left(x\right)$ with respect to $x$.
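A small illustration of the theorem, using a made-up convex function whose stationary point is known in closed form. The function `f`, its gradient `grad_f` and the sampling range are assumptions chosen for this sketch. The gradient vanishes at `x_star`, and no randomly sampled point achieves a smaller value.

```python
import random


def f(x):
    # Hypothetical convex function: f(x1, x2) = (x1 - 1)^2 + 2*(x2 + 2)^2
    return (x[0] - 1) ** 2 + 2 * (x[1] + 2) ** 2


def grad_f(x):
    # Analytic gradient: (2*(x1 - 1), 4*(x2 + 2))
    return [2 * (x[0] - 1), 4 * (x[1] + 2)]


# Solving grad_f(x) = 0 by hand gives x1 = 1, x2 = -2.
x_star = [1.0, -2.0]

rng = random.Random(0)
samples = [[rng.uniform(-10, 10), rng.uniform(-10, 10)] for _ in range(1000)]
no_better_point = all(f(p) >= f(x_star) for p in samples)

print("gradient at x_star:", grad_f(x_star))
print("no sampled point beats x_star:", no_better_point)
```

Random sampling only illustrates the statement. The theorem itself is what guarantees that the stationary point is a global minimizer.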

6. Matrix differentiation formulas for [scalar-by-vector] derivatives

$$\frac{\partial y}{\partial x}=\begin{pmatrix} \frac{\partial y}{\partial x_{1}} \\ \frac{\partial y}{\partial x_{2}} \\ \vdots \\ \frac{\partial y}{\partial x_{n}} \end{pmatrix} \qquad\qquad \frac{\partial y}{\partial x}=\begin{pmatrix} \frac{\partial y}{\partial x_{1}} & \frac{\partial y}{\partial x_{2}} & \cdots & \frac{\partial y}{\partial x_{n}} \end{pmatrix}$$

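  • The left formula is the denominator layout (used by default) and the right formula is the numerator layout, where $x=(x_{1},x_{2},\cdots,x_{n})^T$ is an $n$-dimensional column vector and $y$ is a scalar function of the $n$ components of $x$.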

  • From the [scalar-by-vector] matrix differentiation formula it follows that:
    $$\frac{\partial x^T\vec{a}}{\partial x}=\frac{\partial \vec{a}^Tx}{\partial x}=\begin{pmatrix} \frac{\partial \left(a_{1}x_{1}+a_{2}x_{2}+\cdots+a_{n}x_{n}\right)}{\partial x_{1}} \\ \frac{\partial \left(a_{1}x_{1}+a_{2}x_{2}+\cdots+a_{n}x_{n}\right)}{\partial x_{2}} \\ \vdots \\ \frac{\partial \left(a_{1}x_{1}+a_{2}x_{2}+\cdots+a_{n}x_{n}\right)}{\partial x_{n}} \end{pmatrix}=\begin{pmatrix} a_{1} \\ a_{2} \\ \vdots \\ a_{n} \end{pmatrix}=\vec{a}$$

  • In the same way one obtains: $\frac{\partial x^TBx}{\partial x}=\left(B+B^T\right)x$
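Both identities can be verified numerically in the denominator layout, where the derivative is a column vector with the same shape as $x$. This is a minimal sketch: the vector `a`, the deliberately non-symmetric matrix `B`, the point `x` and the helper names are arbitrary choices made for illustration.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


a = [1.0, -2.0, 0.5]
B = [[1.0, 2.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, 2.0]]  # not symmetric on purpose
x = [0.3, -1.2, 2.0]
n = len(x)


def linear(v):
    # a^T v
    return sum(a_i * v_i for a_i, v_i in zip(a, v))


def quadratic(v):
    # v^T B v
    return sum(v[i] * B[i][j] * v[j] for i in range(n) for j in range(n))


B_plus_Bt = [[B[i][j] + B[j][i] for j in range(n)] for i in range(n)]

print("d(a^T x)/dx   numerical:", numerical_gradient(linear, x), "formula:", a)
print("d(x^T B x)/dx numerical:", numerical_gradient(quadratic, x),
      "formula:", matvec(B_plus_Bt, x))
```

When $B$ is symmetric the second formula reduces to $2Bx$, which is the form that usually shows up when differentiating a squared-error objective.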
