In the PPT, Andrew tells us that we can choose new features, which means we can use x, x^2, and x^3 as features. (For example, the hypothesis = theta0 + theta1*x1 + theta2*x2 + ... can be changed into
hypothesis = theta0 + theta1*(x1^2) + theta2*(x2^2).)
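As a small sketch of this idea, the snippet below builds a design matrix with hand-chosen polynomial features (1, x, x^2, x^3) from hypothetical 1-D data; the data values here are made up for illustration.

```python
import numpy as np

# Hypothetical 1-D training data: inputs x and targets y.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 4.1, 9.3, 15.8])

# Build a design matrix whose columns are the chosen features:
# a constant term, x, x^2, and x^3.
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
print(X.shape)  # (4, 4): one row per example, one column per feature
```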
Besides gradient descent, today I will introduce another method to minimize the cost function, called the normal equation (it only applies to linear regression problems). In order to apply this method, we need a few definitions.
First we define :
![](https://i-blog.csdnimg.cn/blog_migrate/e0aa25ca36d6358fa3ad78a92467e410.png)
![](https://i-blog.csdnimg.cn/blog_migrate/62d3a81fbcbb2d67507f81d0f4f27e7e.png)
X is called the design matrix, and then we can use the following formula.
![](https://i-blog.csdnimg.cn/blog_migrate/0c6f2cbd06f3215b38383be097df7093.png)
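A minimal sketch of the normal equation theta = (X^T X)^(-1) X^T y, assuming noise-free toy data generated from a known line so the recovered parameters are easy to check:

```python
import numpy as np

# Toy data generated from the line y = 2 + 3x (no noise),
# so the normal equation should recover theta = [2, 3].
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x

# Design matrix: a column of ones for the intercept term, then x.
X = np.column_stack([np.ones_like(x), x])

# Normal equation: theta = (X^T X)^{-1} X^T y.
theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)  # close to [2., 3.]
```

In practice `np.linalg.pinv` is safer than `inv` when X^T X is singular or nearly so (e.g. redundant features).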