100 Days of ML Code series study notes | 100 Days of ML Code (Chinese translation) | 100 Days of ML Code (original English version)
Step 1: Import packages
#Step 1: Data Preprocessing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Step 2: Import the data
# 28 samples
dataset = pd.read_csv('D:/daily/机器学习100天/100-Days-Of-ML-Code-中文版本/100-Days-Of-ML-Code-master/datasets/studentscores.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : , 1 ].values
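The two `iloc` selections above return arrays of different shapes, which matters because scikit-learn estimators expect a 2-D feature matrix and a 1-D target. A minimal sketch, assuming a tiny in-memory table with hypothetical `Hours` and `Scores` columns and made-up values (the real contents of studentscores.csv are not shown in these notes):

```python
import io
import pandas as pd

# Hypothetical stand-in for studentscores.csv; column names and values are made up.
csv_text = "Hours,Scores\n1.0,10\n2.0,20\n3.0,30\n"
demo = pd.read_csv(io.StringIO(csv_text))

X_demo = demo.iloc[:, :-1].values  # every column except the last -> 2-D array
Y_demo = demo.iloc[:, 1].values    # second column only -> 1-D array

print(X_demo.shape)  # (3, 1)
print(Y_demo.shape)  # (3,)
```

Slicing with `:-1` keeps the column axis even when only one feature column remains, so `X` can be passed to `fit` without a reshape.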
Step 3: Split into training and test sets
#Step 3: Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split( X, Y, test_size = 1/4, random_state = 0)
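To see what `test_size = 1/4` and `random_state = 0` do, the split can be tried on placeholder data. This sketch assumes 28 samples, as noted in Step 2, and uses synthetic values rather than the real dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the sample count mentioned in the notes (28); values are synthetic.
X_demo = np.arange(28, dtype=float).reshape(-1, 1)
Y_demo = 2.0 * X_demo.ravel() + 1.0

X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=1/4, random_state=0
)
print(len(X_tr), len(X_te))  # 21 7

# Fixing random_state makes the shuffle reproducible: the same rows land in the test set.
X_tr2, X_te2, Y_tr2, Y_te2 = train_test_split(
    X_demo, Y_demo, test_size=1/4, random_state=0
)
print(np.array_equal(X_te, X_te2))  # True
```

With only 7 test samples, a metric computed on the test set can swing noticeably from one `random_state` to another.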
Step 4: Fit a simple linear regression
#Step 4: Fitting Simple Linear Regression Model to the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor = regressor.fit(X_train, Y_train)
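For a single feature, the fit amounts to ordinary least squares for one slope and one intercept. The sketch below compares the textbook closed-form solution with `LinearRegression` on a few made-up points. The data and variable names are illustrative assumptions, not taken from the dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up points that lie roughly on a straight line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 7.9, 11.2, 13.8, 17.0])

# Closed-form least squares: slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# scikit-learn expects a 2-D feature matrix, hence the reshape.
model = LinearRegression().fit(x.reshape(-1, 1), y)

print(slope, intercept)
print(model.coef_[0], model.intercept_)
```

The two printed pairs should agree up to floating-point error; after fitting, `coef_` and `intercept_` hold the learned parameters.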
Step 5: Predict
#Step 5: Predicting the Result
Y_pred = regressor.predict(X_test)
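`predict` simply evaluates the fitted line, intercept plus slope times the feature, on each row it is given. A small sketch on made-up data that lies exactly on y = 2x + 1 (an assumption chosen so the result is easy to check by hand):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data lying exactly on y = 2x + 1.
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
Y_demo = np.array([3.0, 5.0, 7.0, 9.0])
model = LinearRegression().fit(X_demo, Y_demo)

# New inputs must also be 2-D: one row per sample, one column per feature.
new_X = np.array([[5.0], [6.0]])
pred = model.predict(new_X)

# The same values computed directly from the learned parameters.
manual = model.intercept_ + model.coef_[0] * new_X.ravel()

print(pred)
print(np.allclose(pred, manual))  # True
```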
Step 6: Visualize the training set
#Step 6: Visualising the Training results
plt.scatter(X_train , Y_train, color = 'red')
plt.plot(X_train , regressor.predict(X_train), color ='blue')
plt.show()
Step 7: Visualize the test set
#Step 7: Visualizing the test results
plt.scatter(X_test , Y_test, color = 'red')
plt.plot(X_test , regressor.predict(X_test), color ='blue')
plt.show()
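Both plots draw the same fitted line, since the model was trained once on the training set; only the scattered points differ. One possible way to make this visible is to place the two panels side by side with a shared y-axis. The sketch below uses made-up data, generic axis labels, and `np.polyfit` as a stand-in for the fitted regressor; all of these are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up training and test points.
X_tr = np.array([[1.0], [2.0], [4.0], [5.0]])
Y_tr = np.array([12.0, 19.0, 41.0, 52.0])
X_te = np.array([[3.0], [6.0]])
Y_te = np.array([29.0, 63.0])

# Degree-1 polynomial fit on the training data only (stand-in for the regressor).
slope, intercept = np.polyfit(X_tr.ravel(), Y_tr, 1)
line_x = np.array([0.0, 7.0])

fig, axes = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
panels = [(axes[0], X_tr, Y_tr, "Training set"), (axes[1], X_te, Y_te, "Test set")]
for ax, X_part, Y_part, title in panels:
    ax.scatter(X_part, Y_part, color="red")                    # observed points
    ax.plot(line_x, intercept + slope * line_x, color="blue")  # same fitted line in both panels
    ax.set_title(title)
    ax.set_xlabel("X (feature)")
axes[0].set_ylabel("Y (target)")
fig.tight_layout()
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure.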
Step 8: Regression performance metric
#Step 8: regression evaluation
from sklearn.metrics import r2_score
# Reuse Y_pred from Step 5 instead of predicting again
print(r2_score(Y_test, Y_pred))
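Output: 0.30574547147699993
R² is the coefficient of determination (goodness of fit). The closer R² is to 1, the better the model fits; a model that does no better than always predicting the mean of Y scores 0, and a worse model can score below 0.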
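R² can be computed by hand as 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. A minimal pure-Python sketch with made-up numbers (the helper name `r2` is hypothetical and not part of scikit-learn):

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # squared prediction errors
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # squared deviations from the mean
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, [1.0, 2.0, 3.0, 4.0])   # predictions match exactly
baseline = r2(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean of y_true
bad = r2(y_true, [4.0, 3.0, 2.0, 1.0])       # predictions in reverse order

print(perfect)   # 1.0
print(baseline)  # 0.0
print(bad)       # -3.0
```

The third case shows that R² is not bounded below by 0: a model that fits worse than the constant mean prediction gets a negative score.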
Complete code:
#Day 2: Simple Linear Regression 2022/4/5
#Step 1: Data Preprocessing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#Step 2: Importing dataset
# 28 samples
dataset = pd.read_csv('D:/daily/机器学习100天/100-Days-Of-ML-Code-中文版本/100-Days-Of-ML-Code-master/datasets/studentscores.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : , 1 ].values
#Step 3: Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split( X, Y, test_size = 1/4, random_state = 0)
#Step 4: Fitting Simple Linear Regression Model to the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor = regressor.fit(X_train, Y_train)
#Step 5: Predicting the Result
Y_pred = regressor.predict(X_test)
#Step 6: Visualising the Training results
plt.scatter(X_train , Y_train, color = 'red')
plt.plot(X_train , regressor.predict(X_train), color ='blue')
plt.show()
#Step 7: Visualizing the test results
plt.scatter(X_test , Y_test, color = 'red')
plt.plot(X_test , regressor.predict(X_test), color ='blue')
plt.show()
#Step 8: regression evaluation
from sklearn.metrics import r2_score
# Reuse Y_pred from Step 5 instead of predicting again
print(r2_score(Y_test, Y_pred))