Normal Equation (Machine Learning)

Latest recommended article published 2020-12-20 23:07:05

三省少年


Category: Machine Learning. Tags: Normal Equation

Article link: https://blog.csdn.net/xd15010130025/article/details/97101723


The design matrix X given in the example (bottom right of the slide) should have elements x with subscript 1 and superscripts varying from 1 to m, because across all m training examples there are only 2 features, $x_0$ and $x_1$. The X matrix is m by (n+1), NOT n by n.
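As a minimal sketch of the shape described above, assuming NumPy and a made-up single-feature data set (the values and the helper name `build_design_matrix` are illustrative, not taken from the slide), the design matrix can be built by prepending a column of ones for $x_0$:

```python
import numpy as np

def build_design_matrix(features):
    """Prepend the bias column x_0 = 1 to an (m, n) feature array."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features.reshape(-1, 1)  # treat a flat list as one feature
    m = features.shape[0]
    return np.hstack([np.ones((m, 1)), features])

# Hypothetical data: m = 4 training examples, n = 1 feature.
x1 = [2104.0, 1416.0, 1534.0, 852.0]
X = build_design_matrix(x1)
print(X.shape)  # (m, n + 1)
```

Each row of `X` is one training example, so the row count is m and the column count is n + 1 regardless of how many examples there are.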

Gradient descent gives one way of minimizing J. A second way performs the minimization explicitly, without resorting to an iterative algorithm. In the "Normal Equation" method, we minimize J by explicitly taking its derivatives with respect to the $\theta_j$'s and setting them to zero. This allows us to find the optimal $\theta$ without iteration. The normal equation formula is given below:
$\theta=(X^TX)^{-1}X^Ty$
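For readers who want the intermediate steps, here is a short derivation sketch. It assumes the usual least-squares cost for linear regression (the post does not write J out) and that $X^TX$ is invertible:

$$
J(\theta)=\frac{1}{2m}(X\theta-y)^T(X\theta-y)
$$

$$
\nabla_\theta J(\theta)=\frac{1}{m}X^T(X\theta-y)=0
\;\Longrightarrow\;
X^TX\theta=X^Ty
\;\Longrightarrow\;
\theta=(X^TX)^{-1}X^Ty
$$

The factor $\frac{1}{m}$ drops out when the gradient is set to zero, which is why it does not appear in the final formula.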
There is no need to do feature scaling with the normal equation.
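A minimal NumPy sketch of the formula, and of the feature-scaling remark, might look as follows. The data and the function name `normal_equation` are hypothetical, and `np.linalg.solve` is used in place of an explicit inverse, which is a numerical substitution rather than the formula exactly as written:

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y for theta.

    np.linalg.solve replaces the explicit inverse in the formula;
    both give the same theta when X^T X is invertible.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data: y = 1 + 2 * x1 exactly, with a large-valued feature.
x1 = np.array([1000.0, 2000.0, 3000.0, 4000.0])
y = 1.0 + 2.0 * x1

X_raw = np.column_stack([np.ones_like(x1), x1])
X_scaled = np.column_stack([np.ones_like(x1), (x1 - x1.mean()) / x1.std()])

theta_raw = normal_equation(X_raw, y)
theta_scaled = normal_equation(X_scaled, y)

# The parameter values differ, but the fitted predictions agree.
print(np.allclose(X_raw @ theta_raw, X_scaled @ theta_scaled))
```

Scaling changes the values in `theta` but not the fitted predictions, which is one way to read the statement that scaling is unnecessary here.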
The following is a comparison of gradient descent and the normal equation:

| Gradient Descent | Normal Equation |
| --- | --- |
| Need to choose alpha | No need to choose alpha |
| Needs many iterations | No need to iterate |
| $O(kn^2)$ | $O(n^3)$, need to calculate the inverse of $X^TX$ |
| Works well when n is large | Slow if n is very large |

With the normal equation, computing the inverse has complexity $O(n^3)$. So if we have a very large number of features, the normal equation will be slow. In practice, when n exceeds 10,000 it might be a good time to switch from the normal equation to an iterative process.
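To make the comparison concrete, the sketch below runs batch gradient descent next to the closed-form solution on a small synthetic problem. The data, the learning rate `alpha=0.1`, and the iteration count are assumptions chosen for illustration, not values from the course:

```python
import numpy as np

def gradient_descent(X, y, alpha, num_iters):
    """Batch gradient descent on J(theta) = (1 / 2m) * ||X theta - y||^2."""
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(num_iters):
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient
    return theta

# Hypothetical data: 50 examples, one feature, y = 3 + 0.5 * x1 exactly.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=50)
X = np.column_stack([np.ones_like(x1), x1])
y = 3.0 + 0.5 * x1

# Iterative route: needs alpha and many passes over the data.
theta_gd = gradient_descent(X, y, alpha=0.1, num_iters=5000)

# Closed-form route: one linear solve, no alpha, no iterations.
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(theta_gd, theta_ne, atol=1e-6))
```

On a problem this small both routes land on the same parameters. The trade-off in the table only starts to matter once the $(n+1) \times (n+1)$ system becomes expensive to solve.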
