Performance of Regression Models
1. Simple Linear Regression
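SSres = SUM(yi - yi^)^2 -> min (residual sum of squares; yi^ is the predicted value, and the regression minimizes this quantity)
SStot = SUM(yi - y_avg)^2 (total sum of squares; it is fixed by the data, so it is not minimized)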
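R^2 = 1 - SSres/SStot (between 0 and 1 for a least-squares fit with an intercept, evaluated on the training data; it can be negative on unseen data when the model predicts worse than y_avg)
R^2 measures the goodness of fit: the closer R^2 is to 1, the better the model fits.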
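As a minimal sketch of the formulas above, R^2 can be computed by hand with NumPy. The helper name r_squared and the data values are made up for illustration; the line is fitted with np.polyfit, which is one of several ways to get a least-squares fit.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SSres/SStot for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # SSres
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # SStot
    return 1.0 - ss_res / ss_tot

# Made-up data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares line (degree-1 polynomial fit).
slope, intercept = np.polyfit(x, y, 1)
r2_fit = r_squared(y, slope * x + intercept)

# Predicting y_avg for every point makes SSres equal to SStot, so R^2 is 0.
r2_mean = r_squared(y, np.full_like(y, y.mean()))

# Predictions that are worse than y_avg push R^2 below 0.
r2_bad = r_squared(y, y[::-1])

print(r2_fit, r2_mean, r2_bad)
```

The three cases cover a good fit (close to 1), the "always predict the mean" baseline (0), and a deliberately bad prediction (negative).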
2. Multiple Linear Regression
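For multiple linear regression, R^2 never decreases when a regressor is added, even if that regressor is useless, so we use Adjusted R^2 instead:
Adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)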
* p is the number of independent variables (regressors)
* n is the sample size
Adding a regressor never decreases R^2. Adj R^2 increases only when the new variable improves the fit by more than would be expected by chance; otherwise it decreases.
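The behaviour of R^2 and Adj R^2 can be checked with a small experiment. This is a sketch under assumptions: the data are synthetic, the helper names (r_squared, adjusted_r_squared, fit_and_score) are hypothetical, and the fit uses np.linalg.lstsq rather than any particular regression library. The adjusted score uses the standard formula Adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

```python
import numpy as np

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def fit_and_score(X, y):
    """Fit least squares with an intercept; return the in-sample R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return r_squared(y, A @ coef)

rng = np.random.default_rng(0)   # fixed seed, synthetic data
n = 50
x1 = rng.normal(size=n)
noise_col = rng.normal(size=n)   # unrelated to y by construction
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)

r2_one = fit_and_score(x1.reshape(-1, 1), y)                # p = 1
r2_two = fit_and_score(np.column_stack([x1, noise_col]), y)  # p = 2

adj_one = adjusted_r_squared(r2_one, n, p=1)
adj_two = adjusted_r_squared(r2_two, n, p=2)

print("R^2:    ", r2_one, r2_two)
print("Adj R^2:", adj_one, adj_two)
```

R^2 for the two-regressor model is never lower than for the one-regressor model. Adj R^2 carries a penalty for the extra regressor, so for an unrelated column it usually drops, although on any single random sample it can go either way.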
Regression Model Selection
How do I know which regression model to choose for a particular problem/dataset?
from sklearn.metrics import r2_score
# y_test: true target values of the test set; y_pred: the model's predictions for it
r2_score(y_test, y_pred)
The closer the score is to 1, the better. Fit each candidate model on the same training set, compute R^2 on the same test set, and select the model with the highest score for your regression problem.
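One way to carry out this comparison is sketched below. The dataset is synthetic (make_regression), and the three candidate models and their settings are arbitrary choices for illustration, not a recommended shortlist.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for a real dataset.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models; every one is trained and scored on the same split.
models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    scores[name] = r2_score(y_test, y_pred)

# Pick the model with the highest test-set R^2.
best_name = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```

A single train/test split can be noisy on small datasets, so the ranking may change with a different random_state.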