Implementing 12 Regression Models with sklearn

(LinearRegression, KNN, SVR, Ridge, Lasso, MLP, DecisionTree, ExtraTree, RandomForest, AdaBoost, GradientBoosting, Bagging)

This post records my search for a suitable regression model for a project of mine, along with the implementation process; it is written mainly for my own future reference.
The main references for this post:

- the official scikit-learn website
- "Learn to Use scikit-learn in 30 Minutes" (30分钟学会用scikit-learn)
- "Predicting House Prices with Thirteen Regression Models" (十三种回归模型预测房价)

1. Data preparation

```python
import numpy as np
import pandas as pd

# Load the dataset and convert it to a NumPy array
data = pd.read_excel(r"data.xlsx")
data = np.array(data)

a = data[:, 3:8]  # feature matrix: columns 3-7
b = data[:, 2]    # target vector: column 2
```

2. Trying out the different regression models
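
One caveat before the experiments: each section below calls train_test_split again without a fixed random_state, so every model is scored on a different random split and the numbers are rough indications rather than a strictly fair comparison. Note also that score() returns the coefficient of determination R², so 1.0 would be a perfect fit. If you want comparable numbers, a small helper along these lines pins the split with a fixed seed (the evaluate function is my own sketch, not part of the original code):

```python
from sklearn.model_selection import train_test_split

def evaluate(model, name, a, b):
    """Fit `model` on a fixed 80/20 split of (a, b) and print R^2 scores."""
    x_train, x_test, y_train, y_test = train_test_split(
        a, b, test_size=0.2, random_state=42)  # fixed seed -> same split every run
    model.fit(x_train, y_train)
    print(f"{name} results:")
    print("Training set score:", model.score(x_train, y_train))
    print("Validation set score:", model.score(x_test, y_test))
```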

2.1 Linear regression model

```python
# LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = LinearRegression()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)  # predictions on the held-out split
print("LinearRegression results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

LinearRegression results:
Training set score: 0.591856113161297
Validation set score: 0.6214511243968527
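
As a side note, once fitted, LinearRegression exposes the learned model directly: coef_ holds one weight per feature and intercept_ holds the bias term. A quick sketch of inspecting them (variable names follow the code above):

```python
# After clf.fit(x_train, y_train):
print("weights per feature:", clf.coef_)  # shape (n_features,)
print("intercept:", clf.intercept_)
```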

2.2 KNN regression model

```python
# KNeighborsRegressor
from sklearn.neighbors import KNeighborsRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = KNeighborsRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("KNeighborsRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

KNeighborsRegressor results:
Training set score: 0.7216991832348424
Validation set score: 0.601773245289923
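
KNeighborsRegressor defaults to n_neighbors=5, but the right k is data-dependent, so a small grid search is usually worthwhile. A minimal sketch (the grid values here are arbitrary choices of mine):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Search over k and the distance-weighting scheme via 5-fold CV
grid = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": [3, 5, 7, 11, 15],
                "weights": ["uniform", "distance"]},
    cv=5)
grid.fit(x_train, y_train)
print(grid.best_params_, grid.best_score_)
```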

2.3 SVM regression model

```python
# SVR
from sklearn.svm import SVR

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = SVR()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("SVR results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

SVR results:
Training set score: 0.37625861753449674
Validation set score: 0.4536826131027402
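
The weak SVR scores may simply reflect unscaled features: the default RBF kernel is sensitive to feature magnitudes, so SVR is normally wrapped in a scaling pipeline. A sketch, assuming the same x_train/y_train as above (C and epsilon are just the sklearn defaults spelled out):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Standardize features, then fit the RBF-kernel SVR
model = make_pipeline(StandardScaler(), SVR(C=1.0, epsilon=0.1))
model.fit(x_train, y_train)
print("Validation set score:", model.score(x_test, y_test))
```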

2.4 Ridge regression model

```python
# Ridge
from sklearn.linear_model import Ridge

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = Ridge()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("Ridge results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

Ridge results:
Training set score: 0.5999728749276192
Validation set score: 0.5903386836435587
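
Ridge uses alpha=1.0 by default; RidgeCV picks the regularization strength by cross-validation over a candidate list, which is usually a better starting point. A minimal sketch (the alpha grid is my own choice):

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Try alphas spanning several orders of magnitude
model = RidgeCV(alphas=np.logspace(-3, 3, 13))
model.fit(x_train, y_train)
print("chosen alpha:", model.alpha_)
print("Validation set score:", model.score(x_test, y_test))
```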

2.5 Lasso regression model

```python
# Lasso
from sklearn.linear_model import Lasso

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = Lasso()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("Lasso results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

Lasso results:
Training set score: 0.5959277449910918
Validation set score: 0.6063097915792626
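
Likewise, Lasso's default alpha=1.0 can shrink coefficients quite aggressively; LassoCV tunes alpha along a regularization path. A sketch:

```python
from sklearn.linear_model import LassoCV

# 5-fold CV over an automatically chosen path of alphas
model = LassoCV(cv=5)
model.fit(x_train, y_train)
print("chosen alpha:", model.alpha_)
print("non-zero coefficients:", (model.coef_ != 0).sum())
```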

2.6 Multi-layer perceptron regression model

```python
# MLPRegressor
from sklearn.neural_network import MLPRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = MLPRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("MLPRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

MLPRegressor results:
Training set score: 0.6260209012837945
Validation set score: 0.6234879650836542
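
MLPRegressor with its defaults (a single hidden layer of 100 units, max_iter=200) often stops with a ConvergenceWarning, and like SVR it benefits from standardized inputs. A sketch with scaling and a larger iteration budget (the layer sizes are arbitrary choices of mine):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0))
model.fit(x_train, y_train)
print("Validation set score:", model.score(x_test, y_test))
```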

2.7 Decision tree regression model

```python
# DecisionTreeRegressor
from sklearn.tree import DecisionTreeRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = DecisionTreeRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("DecisionTreeRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

DecisionTreeRegressor results:
Training set score: 0.9029579714124932
Validation set score: 0.5789140015732428
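
The gap between the training score (≈0.90) and the validation score (≈0.58) is classic overfitting: an unconstrained tree grows until it nearly memorizes the training data. Limiting depth or leaf size usually narrows the gap; a sketch (the specific limits are guesses to tune, not recommendations):

```python
from sklearn.tree import DecisionTreeRegressor

# Constrain the tree so it cannot memorize the training set
model = DecisionTreeRegressor(max_depth=5, min_samples_leaf=10, random_state=0)
model.fit(x_train, y_train)
print("Training set score:", model.score(x_train, y_train))
print("Validation set score:", model.score(x_test, y_test))
```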

2.8 Extra-tree regression model

```python
# ExtraTreeRegressor
from sklearn.tree import ExtraTreeRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = ExtraTreeRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("ExtraTreeRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

ExtraTreeRegressor results:
Training set score: 0.9037680679611091
Validation set score: 0.46830247193588975
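
Note the naming trap here: sklearn.tree.ExtraTreeRegressor is a single extremely randomized tree (hence the large overfit above), while sklearn.ensemble.ExtraTreesRegressor (plural) averages a whole forest of them and typically generalizes much better. A sketch of the ensemble version:

```python
from sklearn.ensemble import ExtraTreesRegressor

# An ensemble of extremely randomized trees, analogous to a random forest
model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(x_train, y_train)
print("Validation set score:", model.score(x_test, y_test))
```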

2.9 Random forest regression model

```python
# RandomForestRegressor
from sklearn.ensemble import RandomForestRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = RandomForestRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("RandomForestRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

RandomForestRegressor results:
Training set score: 0.8692011534234785
Validation set score: 0.6943344063242647
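
One fringe benefit of the forest: after fitting, it exposes feature_importances_, which shows how much each of the five feature columns contributes to the splits. A quick sketch:

```python
# After clf.fit(x_train, y_train):
for i, imp in enumerate(clf.feature_importances_):
    print(f"feature {i}: importance {imp:.3f}")
```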

2.10 AdaBoost regression model

```python
# AdaBoostRegressor
from sklearn.ensemble import AdaBoostRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = AdaBoostRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("AdaBoostRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

AdaBoostRegressor results:
Training set score: 0.6384420177311011
Validation set score: 0.607339934856168
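
AdaBoostRegressor boosts shallow decision trees (max_depth=3 by default); the base tree's depth and the learning rate are the main knobs. A sketch (the values are my own guesses; note that in scikit-learn 1.2+ the keyword is estimator, while older versions use base_estimator):

```python
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

model = AdaBoostRegressor(
    estimator=DecisionTreeRegressor(max_depth=4),  # `base_estimator` before sklearn 1.2
    n_estimators=200, learning_rate=0.5, random_state=0)
model.fit(x_train, y_train)
print("Validation set score:", model.score(x_test, y_test))
```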

2.11 Gradient boosting regression model

```python
# GradientBoostingRegressor
from sklearn.ensemble import GradientBoostingRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = GradientBoostingRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("GradientBoostingRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

GradientBoostingRegressor results:
Training set score: 0.7484658216805864
Validation set score: 0.7122203061071664
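
GradientBoostingRegressor is the best performer here even with its defaults (100 trees of depth 3, learning_rate=0.1); n_estimators, learning_rate, and max_depth are the usual trade-off to tune. A small grid-search sketch (the grid values are my own):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor

grid = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300],
                "learning_rate": [0.05, 0.1],
                "max_depth": [2, 3, 4]},
    cv=5)
grid.fit(x_train, y_train)
print(grid.best_params_, grid.best_score_)
```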

2.12 Bagging regression model

```python
# BaggingRegressor
from sklearn.ensemble import BaggingRegressor

x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = BaggingRegressor()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print("BaggingRegressor results:")
print("Training set score:", clf.score(x_train, y_train))
print("Validation set score:", clf.score(x_test, y_test))
```

BaggingRegressor results:
Training set score: 0.8641707121920719
Validation set score: 0.6610529256307627
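
To wrap up: since every section above drew its own random split, here is a sketch of my own (not from the original post) that runs all twelve models on one fixed split so the R² scores are directly comparable. Model settings are the same defaults as above, except MLP gets a larger iteration budget to converge:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor, ExtraTreeRegressor
from sklearn.ensemble import (RandomForestRegressor, AdaBoostRegressor,
                              GradientBoostingRegressor, BaggingRegressor)

models = {
    "LinearRegression": LinearRegression(),
    "KNN": KNeighborsRegressor(),
    "SVR": SVR(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "MLP": MLPRegressor(max_iter=2000),
    "DecisionTree": DecisionTreeRegressor(),
    "ExtraTree": ExtraTreeRegressor(),
    "RandomForest": RandomForestRegressor(),
    "AdaBoost": AdaBoostRegressor(),
    "GradientBoosting": GradientBoostingRegressor(),
    "Bagging": BaggingRegressor(),
}

# One shared split so every model sees the same train/validation data
x_train, x_test, y_train, y_test = train_test_split(
    a, b, test_size=0.2, random_state=42)
for name, model in models.items():
    model.fit(x_train, y_train)
    print(f"{name}: validation R^2 = {model.score(x_test, y_test):.3f}")
```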