Plotting a Linear Regression Scatter Plot with Confidence Interval Lines in Python (Part 2)

空中旋转篮球

Last modified 2022-07-17 23:15:08

License: CC 4.0 BY-SA

Category: Python. Tags: python, linear regression, programming language

First published 2022-07-17 23:13:54

Article link: https://blog.csdn.net/soderayer/article/details/125838635

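This post generates random two-dimensional data and shows how to plot a linear regression model with Matplotlib and Seaborn, and how to compute and visualize a 95% confidence interval. The model is fitted and its R² score computed with sklearn, and the post also touches on splitting the data into training and test sets in real applications.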

As in the previous post, this one plots a linear regression line together with confidence interval lines. The libraries used:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

The data are generated randomly. Unlike the previous post, x and y here are both two-dimensional arrays.
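The shape of the target matters later, when the slope and intercept are read back from the fitted model. The short sketch below uses small made-up arrays (not the article's data) to show how scikit-learn's LinearRegression reports coef_ and intercept_ for a two-dimensional target versus a one-dimensional one; the names x_demo, y_2d and y_1d are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Small illustrative data set: y = 2x + 1 exactly
x_demo = np.arange(1, 11).reshape(-1, 1)   # 2-D features, shape (10, 1)
y_2d = 2 * x_demo + 1                      # 2-D target, shape (10, 1)
y_1d = y_2d.ravel()                        # 1-D target, shape (10,)

model_2d = LinearRegression().fit(x_demo, y_2d)
model_1d = LinearRegression().fit(x_demo, y_1d)

# With a 2-D target, coef_ is (n_targets, n_features) and intercept_ is (n_targets,)
print(model_2d.coef_.shape, model_2d.intercept_.shape)

# With a 1-D target, coef_ is (n_features,) and intercept_ is a scalar
print(model_1d.coef_.shape, np.ndim(model_1d.intercept_))
```

This is why a script that fits on a two-dimensional y reads the slope as coef_[0][0] and the intercept as intercept_[0].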

The complete code is as follows:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Build the data: x has shape (100, 1); y is built row by row and is also (100, 1)
np.random.seed(1000)
x = np.random.randint(1,100,(100,1))
y = np.array([2*i+(np.random.randint(-9,9))**2+np.random.randint(100) for i in x])
print(x,y)

size1 = 20
# Font settings with a Chinese font (SimHei); defined here but not used below
fontdict = {'weight': 'bold','size':size1,'color':'k','family':'SimHei'}
mpl.rcParams.update(
    {
    'text.usetex': False,
    'mathtext.fontset': 'stix',
    'font.family': 'serif',
    'font.size': size1,
    'font.serif': ['Times New Roman'],
    }
    )
fig,ax = plt.subplots(figsize = (8,6))
# Recent seaborn versions require x and y as keyword arguments; pass them as 1-D arrays
sns.regplot(x=x.ravel(),y=y.ravel(),ax = ax)
ax.set_xlim(0,100)
ax.set_ylim(0,400)
ax.set_xlabel('xlabel')
ax.set_ylabel('ylabel')

# Fit the regression equation
model = LinearRegression()
model.fit(x,y)
a = model.coef_[0][0]    # y is 2-D, so coef_ has shape (1, 1)
b = model.intercept_[0]  # and intercept_ has shape (1,)
ax.text(8,350,'$y$ = {:.2f}$x$ + {:.2f}'.format(a,b))

# R2
r2 = round(r2_score(y,model.predict(x)),2)
ax.text(8,300,f'$R^2$ = {r2}')

# Lines at +/- 1.96 standard deviations of the fitted values around the regression line.
# Note: this is a rough constant-width band, not a formal 95% confidence interval.
std=np.std(model.predict(x))
std_z = 1.96 # from z-table for 95%
confidence_interval = std * std_z
plt.plot(x, model.predict(x) - confidence_interval,label="95%-")
plt.plot(x, model.predict(x) + confidence_interval,label="95%+")
plt.tight_layout()
plt.savefig('out.png',dpi = 600)
plt.show()

Running the script displays the figure and saves it as out.png.
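The two lines drawn by the script have constant width because they scale the standard deviation of the fitted values. For comparison, the textbook confidence interval for the fitted mean of a simple linear regression uses the residual standard error and a t quantile, and it widens away from the mean of x. The sketch below is one way to compute it, not the method used in the script above; the helper name mean_confidence_band and the grid x_grid are made up for illustration, and SciPy is assumed to be installed.

```python
import numpy as np
from scipy import stats


def mean_confidence_band(x, y, x_grid, level=0.95):
    """Return (fit, lower, upper) for the fitted mean of a simple linear regression."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    x_grid = np.asarray(x_grid, dtype=float).ravel()
    n = x.size

    slope, intercept = np.polyfit(x, y, 1)        # least-squares line
    residuals = y - (slope * x + intercept)
    s = np.sqrt(np.sum(residuals**2) / (n - 2))   # residual standard error
    sxx = np.sum((x - x.mean()) ** 2)

    # Standard error of the fitted mean at each grid point
    se_fit = s * np.sqrt(1.0 / n + (x_grid - x.mean()) ** 2 / sxx)
    t_crit = stats.t.ppf(0.5 + level / 2.0, df=n - 2)

    fit = slope * x_grid + intercept
    return fit, fit - t_crit * se_fit, fit + t_crit * se_fit


# Same data recipe as in the article
np.random.seed(1000)
x = np.random.randint(1, 100, (100, 1))
y = np.array([2 * i + np.random.randint(-9, 9) ** 2 + np.random.randint(100) for i in x])

x_grid = np.linspace(0, 100, 200)
fit, lower, upper = mean_confidence_band(x, y, x_grid)
```

On an existing Matplotlib axes, ax.fill_between(x_grid, lower, upper, alpha=0.2) would shade this band. Note that sns.regplot already shades a 95% confidence band of its own by default, estimated by bootstrapping.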

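In practice, the data are usually split into a training set and a test set in some chosen ratio. The train_test_split function handles the split:

from sklearn.model_selection import train_test_split

The code is as follows: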

# split the data into training and test sets in an 80% : 20% ratio
# fix random_state to any value so the split is reproducible
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=5)
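Putting the split together with the model, a minimal end-to-end sketch might look like the following. It regenerates the same random data as the full script, fits only on the training portion, and reports R² on both portions; the variable names r2_train and r2_test are illustrative and do not appear in the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Same data recipe as in the article
np.random.seed(1000)
x = np.random.randint(1, 100, (100, 1))
y = np.array([2 * i + np.random.randint(-9, 9) ** 2 + np.random.randint(100) for i in x])

# 80% of the rows for training, 20% held out for testing
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=5
)

# Fit on the training data only, then score on both sets
model = LinearRegression().fit(x_train, y_train)
r2_train = r2_score(y_train, model.predict(x_train))
r2_test = r2_score(y_test, model.predict(x_test))
print(f"train R2 = {r2_train:.2f}, test R2 = {r2_test:.2f}")
```

Scoring on the held-out rows gives a fairer picture of how the fitted line generalizes than the R² computed on the data it was trained on.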