In the PPT, Andrew tells us that we can choose new features, which means we can use x, x^2, x^3 as our features. (For example, the hypothesis = theta0*x0 + theta1*x1 + theta2*x2 ..., where x0 = 1, can be changed into
hypothesis = theta0*x0 + theta1*(x1^2) + theta2*(x2^2).)
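As a rough illustration of building such features, here is a minimal sketch that turns a single input x into the columns x, x^2, x^3. The function name `polynomial_features` and the toy values are my own assumptions, not something from the slides.

```python
import numpy as np

def polynomial_features(x, degree=3):
    """Return the columns [x, x^2, ..., x^degree] for a 1-D array x.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    # Each power of x becomes its own feature column.
    return np.column_stack([x ** d for d in range(1, degree + 1)])

# Toy input values (made up for the example).
x = np.array([1.0, 2.0, 3.0])
features = polynomial_features(x, degree=3)
print(features)
```

Note that the powers of x can have very different ranges, so feature scaling becomes more important if these features are then used with gradient descent.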
Besides gradient descent, today I will introduce another method to minimize the cost function, called the normal equation. (It only applies to linear regression problems.) To apply this method, we need a few definitions.
First we define :
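A sketch of the usual definition, assuming m training examples with n features each and the convention x_0 = 1 (the symbols m, n, x^{(i)} and y^{(i)} follow the standard course notation and are an assumption here):

```latex
\[
X =
\begin{bmatrix}
(x^{(1)})^{T} \\
(x^{(2)})^{T} \\
\vdots \\
(x^{(m)})^{T}
\end{bmatrix}
\in \mathbb{R}^{m \times (n+1)},
\qquad
y =
\begin{bmatrix}
y^{(1)} \\
y^{(2)} \\
\vdots \\
y^{(m)}
\end{bmatrix}
\in \mathbb{R}^{m},
\qquad
x^{(i)} =
\begin{bmatrix}
x_0^{(i)} \\ x_1^{(i)} \\ \vdots \\ x_n^{(i)}
\end{bmatrix},
\quad x_0^{(i)} = 1 .
\]
```

Each row of X is one training example, and the first column is all ones.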
X is called the design matrix; with it we can write the following formula.
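Assuming the formula meant here is the usual closed-form solution theta = (X^T X)^(-1) X^T y, a minimal NumPy sketch could look like this. The function name `normal_equation` and the toy data are my own, and `np.linalg.pinv` is used in place of a plain inverse as a precaution for the case where X^T X is not invertible.

```python
import numpy as np

def normal_equation(X, y):
    """Compute theta = (X^T X)^(-1) X^T y for linear regression.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # The pseudo-inverse equals the ordinary inverse when X^T X is
    # invertible, and still returns a solution when it is singular.
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Toy data generated from y = 1 + 2*x (made up for the example).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones (x0 = 1) followed by the feature x.
X = np.column_stack([np.ones_like(x), x])

theta = normal_equation(X, y)
print(theta)  # should be close to [1, 2]
```

Unlike gradient descent, this needs no learning rate and no iterations, but it has to work with the (n+1) x (n+1) matrix X^T X, which gets slow when the number of features is very large.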