Machine Learning: sklearn Linear Regression with Taylor Expansion


KitaShin


Tags: machine learning, Taylor expansion, linear regression

Article link: https://blog.csdn.net/KitaShin/article/details/103383385



Version: Python 3.7

Let's get straight to it.

Import the dependencies

import numpy as np		# numerical computing
import pandas as pd		# data analysis helper

import matplotlib.pyplot as plt			# plotting
plt.rcParams['font.sans-serif'] = ['SimHei']   # font with CJK glyphs, so Chinese text is not garbled
plt.rcParams['axes.unicode_minus'] = False     # render minus signs correctly

from sklearn.linear_model import LinearRegression 		# sklearn linear model
from sklearn.metrics import mean_squared_error			# MSE (mean squared error, not its square root)

Construct the data

train = []  # training-set features
col = 1     # generate a single feature (easy to plot)
W = []      # coefficients
for i in range(col):
    train.append(2 * np.random.random(1000))  # 1000 uniform samples in [0, 2)

for i in range(col):  # loop over the generated features
    w = np.random.randint(1, 10)  # random integer coefficient in [1, 9]
    W.append(w)
    # Label rule: y = w * x^2 - x + 2
    # np.random.randn(*train[i].shape) * (w/4) is the added noise
    if i == 0:
        label = w * train[i] ** 2 - train[i] + 2 + np.random.randn(*train[i].shape) * (w/4)
    else:
        label += w * train[i] ** 2 - train[i] + 2 + np.random.randn(*train[i].shape) * (w/4)

# training-set features
X_train = np.array(train).T
# training-set labels
y_train = label
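
For the single-feature case, the loops above can be condensed into one vectorized helper. The following is only a minimal sketch: the function name `make_quadratic_data`, the `_demo` variable names, and the use of a seeded `np.random.default_rng` (for reproducibility) are assumptions for illustration and are not part of the original post.

```python
import numpy as np

def make_quadratic_data(w, n=1000, rng=None):
    """Return (X, y) with y = w*x^2 - x + 2 + noise, x uniform on [0, 2).

    Hypothetical helper, not from the original post.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = 2 * rng.random(n)                      # one feature, shape (n,)
    noise = rng.standard_normal(n) * (w / 4)   # same noise scale as the post
    y = w * x ** 2 - x + 2 + noise
    return x.reshape(-1, 1), y                 # X is 2-D, y is 1-D

rng = np.random.default_rng(0)                 # seed chosen arbitrarily
w_demo = int(rng.integers(1, 10))              # integer in [1, 9], like np.random.randint(1, 10)
X_demo, y_demo = make_quadratic_data(w_demo, rng=rng)
print(X_demo.shape, y_demo.shape)
```

Passing the same `w_demo` to a second call would produce a test set drawn from the same rule, which is what the post does by reusing `W`.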

Generate the test set in the same way

test = []  # test-set features
col = 1    # generate a single feature (easy to plot)
for i in range(col):
    test.append(2 * np.random.random(1000))  # 1000 uniform samples in [0, 2)

for i in range(col):  # loop over the generated features
    # Label rule: y = w * x^2 - x + 2, reusing the training coefficients W
    # np.random.randn(*test[i].shape) * (W[i]/4) is the added noise
    if i == 0:
        label = W[i] * test[i] ** 2 - test[i] + 2 + np.random.randn(*test[i].shape) * (W[i]/4)
    else:
        label += W[i] * test[i] ** 2 - test[i] + 2 + np.random.randn(*test[i].shape) * (W[i]/4)

# test-set features
X_test = np.array(test).T
# test-set labels
y_test = label
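
Because the noise is a standard normal sample scaled by `w/4`, its variance is `(w/4)**2`, and no model can be expected to reach a lower MSE than that on this data. The sketch below checks this numerically by scoring the noise-free rule against noisy labels. The coefficient `w_demo = 3`, the seed, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary seed, for reproducibility
w_demo = 3                         # hypothetical coefficient; the post draws it at random
x = 2 * rng.random(1000)

y_clean = w_demo * x ** 2 - x + 2                              # the generating rule
y_noisy = y_clean + rng.standard_normal(1000) * (w_demo / 4)   # rule plus noise

noise_floor = (w_demo / 4) ** 2                    # theoretical noise variance
mse_of_truth = np.mean((y_noisy - y_clean) ** 2)   # MSE of the true rule itself
print(noise_floor, mse_of_truth)
```

The two printed values should be close. An MSE well above `(w/4)**2` therefore suggests that a fitted model is missing structure in the data rather than just absorbing noise.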

Print the data shapes

print('X_train shape:', X_train.shape)
print('y_train shape:', y_train.shape)
print('X_test shape:', X_test.shape)
print('y_test shape:', y_test.shape)

X_train shape: (1000, 1)
y_train shape: (1000,)
X_test shape: (1000, 1)
y_test shape: (1000,)
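
The shapes matter because sklearn estimators expect the feature matrix as a 2-D array of shape `(n_samples, n_features)`, while the labels can stay 1-D. As a small sketch, the list-then-transpose construction used above gives the same result as a plain `reshape(-1, 1)`. The variable names here are illustrative only.

```python
import numpy as np

x = 2 * np.random.random(1000)        # 1-D feature values, shape (1000,)

X_from_list = np.array([x]).T         # the post's approach: stack columns, then transpose
X_from_reshape = x.reshape(-1, 1)     # equivalent for a single feature

print(X_from_list.shape, X_from_reshape.shape)
```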

Look at the distribution of the training data

plt.scatter(X_train[:, 0], y_train, c='r', marker='.')
plt.xlabel('feature')
plt.ylabel('label')
plt.show()

[Figure: scatter plot of the training set, feature on the x-axis and label on the y-axis]

Train a plain linear model

lr = LinearRegression()
lr.fit(X_train, y_train)

# mean_squared_error takes (y_true, y_pred); the value is the same in either order
print('train MSE:', mean_squared_error(y_train, lr.predict(X_train)))
print('test MSE:', mean_squared_error(y_test, lr.predict(X_test)))

train MSE: 0.1567120785212317