Problem Set 1

1. Conditions for Normal Equation
Prove the following theorem: the matrix $A^TA$ is invertible if and only if the columns of $A$ are linearly independent.
1. First, we show that if the columns of $A$ are linearly independent, then $A^TA$ is invertible.

$A^TA$ is invertible if and only if the system $A^TAX = 0$ has the unique solution $X = 0$, so it suffices to show that linear independence of the columns of $A$ forces every solution of $A^TAX = 0$ to be $X = 0$.

Suppose $A^TAX = 0$. Left-multiplying by $X^T$ gives $X^TA^TAX = 0$, i.e. $(AX)^T(AX) = \|AX\|^2 = 0$, hence $AX = 0$.

Writing $A = [a_1, a_2, \dots, a_n]$ and $X = [x_1, x_2, \dots, x_n]^T$, this reads

$a_1x_1 + a_2x_2 + \dots + a_nx_n = 0. \quad (1)$

Since the columns of $A$ are linearly independent, (1) forces $[x_1, x_2, \dots, x_n] = [0, 0, \dots, 0]$, i.e. $X = 0$, which proves this direction.

Conversely, if the columns of $A$ are linearly dependent, then (1) has a nonzero solution $X$; left-multiplying $AX = 0$ by $A^T$ shows this nonzero $X$ also satisfies $A^TAX = 0$, so $A^TA$ is not invertible. Hence $A^TA$ is invertible if and only if the columns of $A$ are linearly independent.
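As a quick numerical sanity check of the theorem, we can compare ranks on two small example matrices (the matrices below are arbitrary illustrations, not part of the problem):

```python
import numpy as np

# Columns linearly independent -> A^T A has full rank (invertible).
A = np.array([[1., 0.],
              [1., 1.],
              [0., 2.]])
assert np.linalg.matrix_rank(A) == A.shape[1]        # independent columns
assert np.linalg.matrix_rank(A.T @ A) == A.shape[1]  # A^T A invertible

# Columns linearly dependent (second column = 2 * first) -> A^T A singular.
B = np.array([[1., 2.],
              [1., 2.],
              [0., 0.]])
assert np.linalg.matrix_rank(B) < B.shape[1]
assert abs(np.linalg.det(B.T @ B)) < 1e-10           # determinant ~ 0
```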
2. Newton's Method for Computing Least Squares
In this problem, we will prove that if we use Newton's method to solve the least squares optimization problem, then we only need one iteration to converge to the optimal parameter $\theta^*$.
(a) Find the Hessian of the cost function $J(\theta) = \frac{1}{2}\sum_{i=1}^{m}(\theta^Tx^{(i)} - y^{(i)})^2$.
(b) Show that the first iteration of Newton's method gives us $\theta^* = (X^TX)^{-1}X^T\vec{y}$, the solution to our least squares problem. ($\vec{y}$ denotes the vector of target values.)
2. (a) By the definition of the Hessian matrix:

$H_{i,j} = \frac{\partial^2 J(\theta)}{\partial\theta_i\,\partial\theta_j} = \frac{\partial}{\partial\theta_j}\left(\frac{\partial J(\theta)}{\partial\theta_i}\right) = \frac{\partial}{\partial\theta_j}\left(\sum_{k=1}^{m}(\theta^Tx^{(k)} - y^{(k)})x^{(k)}_i\right) = \sum_{k=1}^{m}x^{(k)}_ix^{(k)}_j$
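The closed form above can be cross-checked against a finite-difference estimate of the Hessian on random data (a sketch; the data shapes and seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # 20 examples, 3 features
y = rng.normal(size=20)

def J(theta):
    # Least-squares cost J(theta) = 1/2 * sum_k (theta^T x^(k) - y^(k))^2
    r = X @ theta - y
    return 0.5 * r @ r

# Analytic Hessian from part (a): H_{i,j} = sum_k x_i^(k) x_j^(k), i.e. X^T X.
H_analytic = X.T @ X

# Central finite differences of J around an arbitrary point.
eps = 1e-4
theta0 = rng.normal(size=3)
H_numeric = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        e_i, e_j = np.eye(3)[i], np.eye(3)[j]
        H_numeric[i, j] = (J(theta0 + eps*e_i + eps*e_j)
                           - J(theta0 + eps*e_i - eps*e_j)
                           - J(theta0 - eps*e_i + eps*e_j)
                           + J(theta0 - eps*e_i - eps*e_j)) / (4 * eps**2)

# J is quadratic, so the two agree up to floating-point error.
assert np.allclose(H_analytic, H_numeric, atol=1e-3)
```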
(b) Let $X = [x^{(1)}\;x^{(2)}\;\dots\;x^{(m)}]^T$ and $\vec{y} = [y^{(1)}\;y^{(2)}\;\dots\;y^{(m)}]^T$. From part (a),

$H = X^TX.$

For the gradient $\nabla_\theta J(\theta)$, by definition

$\nabla_\theta J(\theta) = \left[\frac{\partial J(\theta)}{\partial\theta_1}\;\frac{\partial J(\theta)}{\partial\theta_2}\;\dots\;\frac{\partial J(\theta)}{\partial\theta_n}\right]^T,$

and componentwise

$\nabla_\theta J(\theta)_i = \sum_{k=1}^{m}(\theta^Tx^{(k)} - y^{(k)})x^{(k)}_i = \sum_{k=1}^{m}\theta^Tx^{(k)}x^{(k)}_i - \sum_{k=1}^{m}y^{(k)}x^{(k)}_i = (X^TX\theta - X^T\vec{y})_i,$

so

$\nabla_\theta J(\theta) = X^TX\theta - X^T\vec{y}.$

Newton's method iterates

$\theta := \theta - H^{-1}\nabla_\theta J(\theta),$

so after the first iteration

$\theta^* = \theta - (X^TX)^{-1}(X^TX\theta - X^T\vec{y}) = (X^TX)^{-1}X^T\vec{y},$

which is exactly the least squares solution, regardless of the starting point $\theta$.
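The one-step convergence can be verified on synthetic data (a sketch; the data and starting point are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)

theta = rng.normal(size=4)               # arbitrary starting point

# One Newton step: theta := theta - H^{-1} grad, with H = X^T X.
grad = X.T @ X @ theta - X.T @ y
theta = theta - np.linalg.solve(X.T @ X, grad)

# After a single iteration we land exactly on the least-squares solution.
theta_star = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(theta, theta_star)
```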
3. Prediction using Linear Regression
The sales of a company (in million dollars) for each year are shown in the table below.

x (year)  | 2005 | 2006 | 2007 | 2008 | 2009
y (sales) |  12  |  19  |  29  |  37  |  45

(a) Find the least squares regression line $y = ax + b$.
(b) Use the least squares regression line as a model to estimate the sales of the company in 2012.
3. (a) By the theorem in Problem 1, the least squares problem has the unique solution

$\theta = (X^TX)^{-1}X^TY.$

From the problem, set

$X = \begin{bmatrix} 1 & 2005 \\ 1 & 2006 \\ 1 & 2007 \\ 1 & 2008 \\ 1 & 2009 \end{bmatrix}, \quad Y = \begin{bmatrix} 12 \\ 19 \\ 29 \\ 37 \\ 45 \end{bmatrix}.$

Substituting gives $\theta = [-16830.4;\;8.40]$, so the least squares regression line is

$y = 8.40x - 16830.4.$

(b) The model estimates the company's sales in 2012 as $8.40 \times 2012 - 16830.4 = 70.4$ million dollars.
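The numbers in (a) and (b) can be reproduced with a short script (a sketch; `np.linalg.lstsq` is used here instead of forming $(X^TX)^{-1}$ explicitly, which is numerically safer):

```python
import numpy as np

years = np.array([2005, 2006, 2007, 2008, 2009], dtype=float)
sales = np.array([12, 19, 29, 37, 45], dtype=float)

# Design matrix with an intercept column: each row is [1, year].
X = np.column_stack([np.ones_like(years), years])
theta, *_ = np.linalg.lstsq(X, sales, rcond=None)
b, a = theta                                         # line y = a*x + b

print(f"slope a = {a:.2f}, intercept b = {b:.1f}")   # a = 8.40, b = -16830.4
print(f"2012 estimate: {a * 2012 + b:.1f} million")  # 70.4 million
```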
4. Logistic Regression
Consider the average empirical loss for logistic regression:

$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\log\left(1 + e^{-y^{(i)}\theta^Tx^{(i)}}\right) = -\frac{1}{m}\sum_{i=1}^{m}\log\left(h_\theta(y^{(i)}x^{(i)})\right),$

where $y^{(i)} \in \{-1, 1\}$, $h_\theta(x) = g(\theta^Tx)$ and $g(z) = 1/(1+e^{-z})$. Find the Hessian $H$ of this function, and show that for any vector $z$, it holds true that $z^THz \geq 0$.
4. By the definition of the Hessian matrix, first compute the gradient:

$\frac{\partial J(\theta)}{\partial\theta_i} = -\frac{1}{m}\sum_{k=1}^{m}\left(1 - g(\theta^Ty^{(k)}x^{(k)})\right)y^{(k)}x^{(k)}_i,$

then differentiate again, using $g'(z) = g(z)(1 - g(z))$:

$H_{i,j} = \frac{\partial^2 J(\theta)}{\partial\theta_i\,\partial\theta_j} = \frac{1}{m}\sum_{k=1}^{m}g(\theta^Ty^{(k)}x^{(k)})\left(1 - g(\theta^Ty^{(k)}x^{(k)})\right)(y^{(k)})^2x^{(k)}_ix^{(k)}_j.$

Note that $y^{(k)} \in \{-1, 1\}$ gives $(y^{(k)})^2 = 1$, and since $g(-z) = 1 - g(z)$ we have $g(-z)(1 - g(-z)) = g(z)(1 - g(z))$. Hence, with $h_\theta(x) = g(\theta^Tx)$,

$g(\theta^Ty^{(k)}x^{(k)})\left(1 - g(\theta^Ty^{(k)}x^{(k)})\right) = h_\theta(x^{(k)})\left(1 - h_\theta(x^{(k)})\right)$

regardless of the sign of $y^{(k)}$, so

$H_{i,j} = \frac{1}{m}\sum_{k=1}^{m}h_\theta(x^{(k)})\left(1 - h_\theta(x^{(k)})\right)x^{(k)}_ix^{(k)}_j,$

or in matrix form

$H = \frac{1}{m}\sum_{k=1}^{m}h_\theta(x^{(k)})\left(1 - h_\theta(x^{(k)})\right)x^{(k)}x^{(k)T}.$

For any vector $z$,

$z^THz = \frac{1}{m}\sum_{k=1}^{m}h_\theta(x^{(k)})\left(1 - h_\theta(x^{(k)})\right)z^Tx^{(k)}x^{(k)T}z = \frac{1}{m}\sum_{k=1}^{m}h_\theta(x^{(k)})\left(1 - h_\theta(x^{(k)})\right)(z^Tx^{(k)})^2 \geq 0,$

since $h_\theta \in (0, 1)$ makes every weight $h_\theta(1 - h_\theta)$ positive. This completes the proof.
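A numerical sketch of the positive-semidefiniteness conclusion, on random synthetic data with labels in $\{-1, 1\}$ (sizes and seed are arbitrary assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
m, n = 100, 5
X = rng.normal(size=(m, n))
y = rng.choice([-1.0, 1.0], size=m)
theta = rng.normal(size=n)

# H = (1/m) * sum_k h(x^(k)) (1 - h(x^(k))) x^(k) x^(k)^T
h = sigmoid(X @ theta)
w = h * (1 - h)             # per-example weights, each in (0, 1/4]
H = (X.T * w) @ X / m       # weighted Gram matrix

# Every quadratic form z^T H z is non-negative: H is PSD.
for _ in range(10):
    z = rng.normal(size=n)
    assert z @ H @ z >= 0
assert np.linalg.eigvalsh(H).min() >= -1e-12
```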