Multiple Features
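Linear regression with multiple variables is also known as “multivariate linear regression” (strictly speaking, multiple linear regression: there are several input features but still a single output).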
We now introduce notation for equations where we can have any number of input variables.
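A sketch of the notation conventionally used for this setting; the symbols below are assumed here, since they are not defined elsewhere in these notes:

```latex
\begin{align*}
x_j^{(i)} &= \text{value of feature } j \text{ in the } i\text{-th training example} \\
x^{(i)}   &= \text{vector of all feature values of the } i\text{-th training example} \\
m         &= \text{number of training examples} \\
n         &= \text{number of features}
\end{align*}
```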
The multivariable form of the hypothesis function accommodating these multiple features is as follows:
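In its standard form the hypothesis is h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ... + θ_n x_n, which becomes h_θ(x) = θᵀx under the convention x_0 = 1. A minimal Python sketch of that computation follows, assuming NumPy is available; the function name `hypothesis` and the example numbers are made up for illustration.

```python
import numpy as np

def hypothesis(theta, X):
    """Predict h_theta(x) = theta^T x for every row of X.

    X has shape (m, n) without the bias column; theta has shape (n + 1,).
    """
    m = X.shape[0]
    # Prepend x_0 = 1 to each example so theta[0] acts as the intercept.
    X_b = np.hstack([np.ones((m, 1)), X])
    return X_b @ theta

# Illustrative (made-up) data: 2 examples, 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
theta = np.array([0.5, 1.0, 2.0])
predictions = hypothesis(theta, X)
```

Stacking the examples as rows of a matrix lets one matrix-vector product evaluate the hypothesis for all m examples at once.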
Gradient Descent For Multiple Variables
The gradient descent update has the same form as in the single-variable case; we just repeat it for each of our n features:
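Written out under the usual squared-error cost (an assumption, since the cost function is not restated in this section) and with x_0^{(i)} = 1, the per-parameter updates would look like:

```latex
\begin{align*}
&\text{repeat until convergence: } \{ \\
&\quad \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)} \\
&\quad \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_1^{(i)} \\
&\quad \theta_2 := \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_2^{(i)} \\
&\quad \cdots \\
&\}
\end{align*}
```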
In other words:
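Stated compactly, the rule is θ_j := θ_j − α · (1/m) · Σ_i (h_θ(x^(i)) − y^(i)) · x_j^(i), applied simultaneously for j = 0, ..., n. The sketch below is one possible vectorized version of that rule in Python with NumPy, not a definitive implementation; the function name, the default `alpha` and `num_iters`, and the data are assumptions chosen for illustration.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent for linear regression with n features.

    X: (m, n) feature matrix without the bias column; y: (m,) targets.
    Returns theta of shape (n + 1,), with theta[0] the intercept.
    """
    m, n = X.shape
    X_b = np.hstack([np.ones((m, 1)), X])   # add x_0 = 1 to every example
    theta = np.zeros(n + 1)
    for _ in range(num_iters):
        errors = X_b @ theta - y             # h_theta(x^(i)) - y^(i) for all i
        gradient = (X_b.T @ errors) / m      # one partial derivative per theta_j
        theta = theta - alpha * gradient     # simultaneous update of every theta_j
    return theta

# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
theta = gradient_descent(X, y, alpha=0.1, num_iters=5000)
```

Computing the whole gradient before touching `theta` is what makes the update simultaneous: every θ_j is updated from the same set of errors.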
The following image compares gradient descent with one variable to gradient descent with multiple variables:
Gradient Descent in Practice I - Feature Scaling
We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ descends quickly along features with small ranges and slowly along features with large ranges, so it oscillates inefficiently on its way to the optimum when the feature ranges are very uneven.
The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same. Ideally:
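The target ranges are commonly stated as roughly −1 ≤ x_j ≤ 1 or −0.5 ≤ x_j ≤ 0.5; these are rules of thumb rather than exact requirements. One common way to get there, shown here only as a sketch, is mean normalization: subtract each feature's mean and divide by its range. The function name `mean_normalize` and the data are hypothetical.

```python
import numpy as np

def mean_normalize(X):
    """Rescale each column to x_j := (x_j - mean_j) / range_j.

    Returns the scaled matrix plus the per-feature mean and range, so the
    same transformation can be applied to new inputs later.
    """
    mu = X.mean(axis=0)
    spread = X.max(axis=0) - X.min(axis=0)
    # Guard against constant columns, whose range would be zero.
    spread = np.where(spread == 0, 1.0, spread)
    return (X - mu) / spread, mu, spread

# Made-up features on very different scales (e.g. floor area and room count).
X = np.array([[2000.0, 3.0],
              [1000.0, 2.0],
              [1500.0, 4.0]])
X_scaled, mu, spread = mean_normalize(X)
```

After this transformation every column has mean zero and spans a range of exactly one, so all values fall between −1 and 1. Dividing by the standard deviation instead of the range is an equally common variant.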
Gradient Descent in Practice II - Learning Rate
Features and Polynomial Regression
Normal Equation
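As a minimal sketch, the normal equation in its usual closed form, θ = (XᵀX)⁻¹Xᵀy, solves for θ directly instead of iterating. The version below assumes a design matrix with a leading column of ones and uses NumPy's `np.linalg.pinv` (the pseudoinverse) rather than a plain inverse; that is a choice made here so the computation still returns a result when XᵀX is singular. The function name and data are illustrative.

```python
import numpy as np

def normal_equation(X, y):
    """Solve for theta in closed form: theta = pinv(X^T X) X^T y.

    X: (m, n) feature matrix without the bias column; y: (m,) targets.
    """
    m = X.shape[0]
    X_b = np.hstack([np.ones((m, 1)), X])   # add x_0 = 1 to every example
    return np.linalg.pinv(X_b.T @ X_b) @ X_b.T @ y

# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
theta = normal_equation(X, y)
```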
Normal Equation Noninvertibility