L2-2 Machine Learning: Linear Regression Model (Part 2)

今天补充能量了吗

Last modified: 2024-07-16 22:58:44

Tags: linear regression, algorithm, regression

First published: 2024-07-15 10:00:00

Article link: https://blog.csdn.net/weixin_43934209/article/details/140337406

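🍨 This post is a study log from the 🔗365天深度學習訓練營 (365-Day Deep Learning Training Camp)
🍖 Original author: K同学啊 | tutoring and custom projects available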

I. Multiple Linear Regression

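Multiple linear regression is, at its core, a first-degree (linear) equation in n variables: the target (the dependent variable) is modeled as depending on several factors (the independent variables) at once.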
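Written out, the model looks like the equation below. The notation is an assumption made here for illustration, since the original names no symbols: $y$ is the target, $x_1, \dots, x_n$ are the features, $\beta_0$ is the intercept, $\beta_1, \dots, \beta_n$ are the coefficients, and $\varepsilon$ is the error term.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon
```

For the iris task in this post, $n = 3$: three measurements are used to predict the petal length.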

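II. Code Implementation

As in the previous post, we use the iris dataset. The task is to predict petal length from the other three measurements (sepal length, sepal width, and petal width).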

Step 1: Import the required libraries

# Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

Step 2: Load the dataset

# Load the data; the file has no header row, so column names are passed explicitly
# (花萼 = sepal, 花瓣 = petal)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width', 'class']

df = pd.read_csv(url, names=names)
df.head()
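If the UCI URL is unreachable, one option is the copy of iris bundled with scikit-learn. The sketch below is an assumption and not part of the original workflow: it builds a frame called `df_local` (a hypothetical name) with the same column names. The bundled copy labels the classes `setosa`, `versicolor`, `virginica` rather than `Iris-setosa` and so on, and a few measurements may differ slightly from the UCI file.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the copy of iris that ships with scikit-learn (no network access needed)
iris = load_iris()

# Reuse the column names from the post so later steps work unchanged
names = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width', 'class']
df_local = pd.DataFrame(iris.data, columns=names[:4])

# Map the integer targets (0, 1, 2) to their species names
df_local['class'] = iris.target_names[iris.target]

print(df_local.shape)
print(df_local.head())
```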

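Step 3: Visualize the data

The plot shows how each independent variable (花萼-length, 花萼-width, 花瓣-width, i.e. sepal length, sepal width, petal width) relates to the dependent variable (花瓣-length, petal length).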

# Plot each feature against the target (petal length), one marker style per feature
plt.plot(df['花萼-length'], df['花瓣-length'], 'x', label='Sepal length')
plt.plot(df['花萼-width'], df['花瓣-length'], 'o', label='Sepal width')
plt.plot(df['花瓣-width'], df['花瓣-length'], 'v', label='Petal width')
plt.title('Iris dataset')
plt.xlabel('Feature value (cm)')
plt.ylabel('Petal length (cm)')
plt.legend()
plt.show()
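To put a number on what the scatter plot shows, one could compute the Pearson correlation between each feature and the petal length. This is a minimal sketch that assumes scikit-learn's bundled iris copy instead of the UCI download; `feature_cols` and `correlations` are hypothetical names introduced here.

```python
import numpy as np
from sklearn.datasets import load_iris

# Columns: sepal length, sepal width, petal length, petal width
data = load_iris().data
petal_length = data[:, 2]

# Column index of each feature, keyed by the column names used in the post
feature_cols = {'花萼-length': 0, '花萼-width': 1, '花瓣-width': 3}

# Pearson correlation of each feature with petal length
correlations = {
    name: float(np.corrcoef(data[:, col], petal_length)[0, 1])
    for name, col in feature_cols.items()
}

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

A value near 1 or -1 means a near-linear relationship with petal length; a value near 0 means the feature alone explains little of it.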

Step 4: Split the dataset

X consists of 花萼-length, 花萼-width, and 花瓣-width; y is 花瓣-length.

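# Split the data: columns 0, 1, 3 are the features, column 2 (花瓣-length) is the target
X = df.iloc[:, [0, 1, 3]].values
y = df.iloc[:, 2].values

# Hold out 20% of the samples for testing; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)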
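As a quick sanity check on the split sizes, the sketch below runs the same `train_test_split` call on placeholder arrays with the iris dimensions (150 samples, 3 features). The `_demo` arrays and the shortened variable names are hypothetical and only stand in for the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as the iris features and target
X_demo = np.zeros((150, 3))
y_demo = np.zeros(150)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

print(X_tr.shape, X_te.shape)  # (120, 3) (30, 3)
print(y_tr.shape, y_te.shape)  # (120,) (30,)
```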

Step 5: Train the multiple linear regression model

# Linear regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)
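After fitting, the learned intercept and coefficients are available as `intercept_` and `coef_`. The sketch below prints them and, as a cross-check, solves the same ordinary least-squares problem with `np.linalg.lstsq`. It assumes scikit-learn's bundled iris copy rather than the UCI download, and the names `A` and `beta` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Features: sepal length, sepal width, petal width; target: petal length
data = load_iris().data
X = data[:, [0, 1, 3]]
y = data[:, 2]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
print("intercept:", regressor.intercept_)
print("coefficients:", regressor.coef_)

# Same fit via least squares on a design matrix with a leading column of ones,
# so beta[0] is the intercept and beta[1:] are the feature coefficients
A = np.column_stack([np.ones(len(X_train)), X_train])
beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
print("least-squares solution:", beta)
```

The two results should agree up to floating-point error, which illustrates that `LinearRegression` fits an ordinary least-squares solution of the n-variable linear equation.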

Step 6: Predict on the test set

# Predict the test set
y_pred = regressor.predict(X_test)
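The original stops at prediction; a natural follow-up is to score the predictions numerically. The sketch below is one possible way to do that, using `mean_squared_error` and `r2_score` from `sklearn.metrics`. It assumes scikit-learn's bundled iris copy rather than the UCI download, so the exact numbers may differ slightly from a run on the UCI file.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Rebuild the same pipeline: features are columns 0, 1, 3; target is column 2
data = load_iris().data
X = data[:, [0, 1, 3]]
y = data[:, 2]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Mean squared error: average squared gap between prediction and truth (lower is better)
mse = mean_squared_error(y_test, y_pred)
# R^2: share of the variance in y_test explained by the model (1.0 is a perfect fit)
r2 = r2_score(y_test, y_pred)

print(f"MSE: {mse:.4f}")
print(f"R^2: {r2:.4f}")
```

`regressor.score(X_test, y_test)` returns the same R^2 value in a single call.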

Step 7: Visualize the predictions

# Plot the actual values as a line and the predictions as dots, in test-set order
plt.plot(y_test, color='blue', label='Actual values')
plt.plot(y_pred, 'o', color='red', label='Predicted values')
plt.title('Iris dataset')
plt.xlabel('Test sample index')
plt.ylabel('Petal length (cm)')
plt.legend()
plt.show()