05.08.2021
Part 2
Regression
(1) Simple Linear Regression
Regression models (linear or non-linear) are used to predict a real, continuous value such as salary. If your independent variable is time, you are forecasting future values; otherwise, your model is predicting present but unknown values.
1. Import libraries
2. Import the dataset
3. Splitting the dataset into the training set and test set
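Steps 1 to 3 have no code in these notes, so here is a minimal sketch of what they might look like. The notes do not name the dataset file, so the data below is synthetic (generated with NumPy), and the values test_size=0.2 and random_state=0 are assumptions chosen for illustration only.

```python
# 1. Import libraries
import numpy as np
from sklearn.model_selection import train_test_split

# 2. "Import" the dataset
# Assumption: synthetic data stands in for the real file, which the notes do not name
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30)                           # years of experience
salary = 9000 * years + 27000 + rng.normal(0, 5000, size=30)  # noisy linear salary

X = years.reshape(-1, 1)  # features as a 2D array: one row per sample, one column
y = salary                # target as a 1D array

# 3. Split the dataset into the training set and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible, which helps when comparing runs.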
4. Training the Simple Linear Regression model on the Training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)  # .fit() will come up often from here on
5. Predicting the test set result
y_pred = regressor.predict(X_test)
6. Visualizing the training set result
plt.scatter(X_train, y_train, color="Red")
plt.plot(X_train, regressor.predict(X_train), color="Blue")
plt.xlabel("Years of experience")
plt.ylabel("Salary")
plt.show()
7. Visualizing the test set result
plt.scatter(X_test, y_test, color="Red")
plt.plot(X_train, regressor.predict(X_train), color="Blue")  # Still X_train here: training produces one regression equation, so the line is the same whether it is drawn from the training set or the test set
plt.xlabel("Years of experience")
plt.ylabel("Salary")
plt.show()
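Why step 7 can reuse X_train for the line: a fitted simple linear regression is a single straight line, so predictions for any inputs fall on it. A small numeric check, assuming synthetic data in place of the notes' dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: synthetic data, not the dataset used in these notes
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(30, 1))
y = 9000 * X.ravel() + 27000 + rng.normal(0, 5000, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression().fit(X_train, y_train)

# The fitted line: y = slope * x + intercept
slope, intercept = regressor.coef_[0], regressor.intercept_

# Predictions for training inputs and for test inputs lie on that same line
on_line_train = np.allclose(
    regressor.predict(X_train), slope * X_train.ravel() + intercept
)
on_line_test = np.allclose(
    regressor.predict(X_test), slope * X_test.ravel() + intercept
)
print(on_line_train, on_line_test)  # True True
```

Plotting regressor.predict(X_test) against X_test would therefore draw a segment of the same line, just over the range of the test inputs.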
Q: How to make a single prediction? (Salary of an employee with 12 years of experience)
print(regressor.predict([[12]]))
* 12 is a scalar, [12] is a 1D array, [[12]] is a 2D array
Note that the 'predict' method always expects a 2D array as its input: one row per sample, one column per feature.
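The scalar / 1D / 2D distinction can be checked directly with NumPy. This is a small sketch; the variable names are only for illustration.

```python
import numpy as np

scalar = np.array(12)      # no brackets: 0 dimensions
one_d = np.array([12])     # one pair of brackets: 1 dimension
two_d = np.array([[12]])   # two pairs of brackets: 2 dimensions, shape (1, 1)
print(scalar.ndim, one_d.ndim, two_d.ndim)  # 0 1 2

# To predict for several values at once, reshape a 1D array into a single column
years = np.array([5, 12, 20]).reshape(-1, 1)
print(years.shape)  # (3, 1)
```

An array like years above can be passed straight to regressor.predict, giving one prediction per row.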
Q: How to get the final linear regression equation with the values of the coefficients?
print(regressor.coef_)       # --> [9345.94244312]
print(regressor.intercept_)  # --> 26816.192244031183
# Salary = 9345.94 × YearsExperience + 26816.19
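Important Note: To get these coefficients we read the "coef_" and "intercept_" attributes of our regressor object. Attributes in Python are different from methods: they are accessed without parentheses and simply hold a value (here a single number or an array of values) instead of performing an action. In scikit-learn, a trailing underscore marks an attribute that is set when the model is fitted.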
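As a worked example, the equation can be evaluated by hand for 12 years of experience, using the coefficient and intercept printed above. The second half uses a tiny toy model (hypothetical data following y = 2x + 1, not the salary dataset) only to show the attribute/method difference.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Values printed in the notes above
coef = 9345.94244312
intercept = 26816.192244031183

# Salary = coef * YearsExperience + intercept, for 12 years of experience
salary_12 = coef * 12 + intercept
print(f"{salary_12:.2f}")  # 138967.50

# Attribute vs method on a toy model (hypothetical data: y = 2x + 1)
toy = LinearRegression().fit(np.array([[1], [2], [3]]), np.array([3, 5, 7]))
print(callable(toy.predict))  # True: a method, used with parentheses
print(callable(toy.coef_))    # False: an attribute holding a NumPy array
```

The hand-computed value should match what regressor.predict([[12]]) returns for the model in these notes, up to rounding of the printed coefficients.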