Linear Regression - Normal Equation and Regularization

Luckyh6

Article link: https://blog.csdn.net/weixin_44545603/article/details/104115131

In linear regression, the parameters can be fitted in closed form with a method called the normal equation.
Suppose the training set is stacked row by row into a matrix:
$X = \left[\begin{matrix}(x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T\end{matrix}\right]$
where each example is a column vector of $n+1$ features:
$x^{(i)} = \left[\begin{matrix}x_0^{(i)} \\ x_1^{(i)} \\ \vdots \\ x_n^{(i)}\end{matrix}\right]$
and the labels form the vector:
$y = \left[\begin{matrix}y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)}\end{matrix}\right]$
We want to fit the parameters
$\theta = \left[\begin{matrix}\theta_0 \\ \theta_1 \\ \vdots \\ \theta_n\end{matrix}\right]$
so that the squared error
$||X\cdot\theta - y||^2$
attains its global minimum, that is:
$\theta = \mathop{\mathrm{argmin}}_{\theta} ||X\cdot\theta - y||^2 = (X^TX)^{-1}\cdot X^Ty$
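Let’s prove it.
Write $J(\theta) = ||X\cdot\theta - y||^2$ and take the partial derivative with respect to each parameter. At the minimum every partial derivative is zero, so for $\theta_j$ we find that:
$\frac{\partial J}{\partial \theta_j} = 2\sum_{i = 1}^{m} ((x^{(i)})^T\theta -y^{(i)})\cdot x_j^{(i)} = 0$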
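Rewriting this sum as a matrix product (any constant factor cancels, since the right-hand side is zero), we find that:
$\left[ \begin{matrix}x_j^{(1)} & x_j^{(2)} & \cdots & x_j^{(m)}\end{matrix}\right]X\cdot\theta = \left[ \begin{matrix}x_j^{(1)} & x_j^{(2)} & \cdots & x_j^{(m)}\end{matrix}\right]\cdot y$
The row vector on each side is the $j$th row of $X^T$. Stacking the $n+1$ equations for $j = 0, 1, \dots, n$, we find that:
$X^TX\theta = X^Ty$
and, provided $X^TX$ is invertible:
$\theta = (X^TX)^{-1}X^Ty$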
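
The result above can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original derivation. The function name `normal_equation`, the synthetic data, and the choice $x_0^{(i)} = 1$ (a column of ones, so that $\theta_0$ acts as an intercept) are assumptions made for illustration. It solves $X^TX\theta = X^Ty$ with `np.linalg.solve` rather than forming the inverse explicitly, then confirms that the gradient of $||X\theta - y||^2$ vanishes at the solution and that the answer agrees with NumPy's least-squares routine.

```python
import numpy as np


def normal_equation(X, y):
    """Solve X^T X theta = X^T y for theta (assumes X^T X is invertible)."""
    # Solving the linear system is equivalent to applying (X^T X)^{-1}
    # but avoids computing the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
m, n = 50, 3
features = rng.normal(size=(m, n))
# Assumption: x_0 = 1 for every example, so theta_0 is an intercept.
X = np.hstack([np.ones((m, 1)), features])
true_theta = np.array([1.0, 2.0, -3.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=m)

theta = normal_equation(X, y)

# Gradient of ||X theta - y||^2 is 2 X^T (X theta - y); it should be ~0.
gradient = 2 * X.T @ (X @ theta - y)
gradient_is_zero = np.allclose(gradient, 0.0, atol=1e-8)

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
matches_lstsq = np.allclose(theta, theta_lstsq)

print("theta:", theta)
print("gradient vanishes:", gradient_is_zero)
print("matches lstsq:", matches_lstsq)
```

When $X^TX$ is singular or badly conditioned, the linear solve can fail or lose accuracy; a least-squares solver such as `np.linalg.lstsq` is the more robust choice in that case.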

Next we introduce regularization, which means that we change the function $J$ to:
$||X\cdot\theta - y||^2 + \lambda \sum_{j=1}^n \theta_j^2$
where $\lambda$ is a constant called the regularization parameter.
Again we take the partial derivative with respect to each $\theta_j$. Note that the partial derivative with respect to $\theta_0$ is unchanged, because the penalty sum starts at $j = 1$ and does not contain $\theta_0$.
$\frac{\partial J}{\partial \theta_j} = 2\sum_{i = 1}^{m} ((x^{(i)})^T\theta -y^{(i)})\cdot x_j^{(i)} + 2\lambda\theta_j= 0 \ \ \ (\text{for}\ j>0)$
Dividing by 2 and writing the sum as a matrix product:
$\left[ \begin{matrix}x_j^{(1)} & x_j^{(2)} & \cdots & x_j^{(m)}\end{matrix}\right]X\cdot\theta + \lambda\theta_j= \left[ \begin{matrix}x_j^{(1)} & x_j^{(2)} & \cdots & x_j^{(m)}\end{matrix}\right]\cdot y$
The penalty term can be written as:
$\lambda\theta_j = \lambda e_j^T\theta$
where $e_j$ is the unit vector whose $j$th element is 1 and whose other elements are 0.
Stacking all $n+1$ equations (the unregularized one for $j = 0$ and the regularized ones for $j > 0$), we find:
$(X^TX+\lambda L)\theta = X^Ty$
$\theta = (X^TX + \lambda L)^{-1}X^Ty$
where
$L = \mathrm{diag}(0, 1, 1, \dots, 1)$
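
The regularized solution can be sketched in the same way. The code below is an illustration in Python with NumPy under stated assumptions, not the author's implementation: the helper names `penalty_matrix` and `regularized_normal_equation`, the synthetic data, the value `lam = 5.0`, and the column of ones for $x_0$ are all hypothetical choices. It builds $L = \mathrm{diag}(0, 1, \dots, 1)$, solves $(X^TX + \lambda L)\theta = X^Ty$, and checks that the gradient of the regularized $J$ vanishes at the result.

```python
import numpy as np


def penalty_matrix(n_params):
    """Return L = diag(0, 1, ..., 1) of size n_params x n_params."""
    L = np.eye(n_params)
    L[0, 0] = 0.0  # theta_0 is not penalized
    return L


def regularized_normal_equation(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y for theta."""
    L = penalty_matrix(X.shape[1])
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
m, n = 40, 3
# Assumption: x_0 = 1 for every example, so theta_0 is an intercept.
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])
y = X @ np.array([1.0, 2.0, -3.0, 0.5]) + 0.1 * rng.normal(size=m)

lam = 5.0
theta_reg = regularized_normal_equation(X, y, lam)
theta_plain = regularized_normal_equation(X, y, 0.0)  # lam = 0: no penalty

# Gradient of ||X theta - y||^2 + lam * sum_{j>=1} theta_j^2 at theta_reg.
L = penalty_matrix(X.shape[1])
gradient = 2 * X.T @ (X @ theta_reg - y) + 2 * lam * (L @ theta_reg)
gradient_is_zero = np.allclose(gradient, 0.0, atol=1e-8)

# The penalized coefficients theta_1..theta_n shrink relative to lam = 0.
shrunk = np.linalg.norm(theta_reg[1:]) < np.linalg.norm(theta_plain[1:])

print("regularized theta:", theta_reg)
print("gradient vanishes:", gradient_is_zero)
print("coefficients shrink:", shrunk)
```

Setting `lam = 0.0` reduces the system to the unregularized normal equation, which gives a quick consistency check between the two formulas.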
