Python - Supervised Learning


一只菜GISER


Category: Top 10 Data Mining Algorithms - Code Notes. Tags: python, learning, programming languages

Article link: https://blog.csdn.net/weixin_39419220/article/details/124382997


Contents

- `Supervised Learning: Regression`
1. Linear Regression
- 1.1 A simple implementation in Python
- 1.2 Using the sklearn library
2. Logistic Regression

`Supervised Learning: Regression`

Note: the body of the article follows, for reference.

1. Linear Regression

Goal: find the coefficients a and b of the line y = a·x + b that make the loss function J(a, b) as small as possible.
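Assuming the usual least-squares setup (which is what the code in section 1.1 computes), the loss and the coefficients that minimize it can be written as follows. Some texts scale J by 1/n or 1/(2n); that changes the value of the loss but not the minimizing a and b.

```latex
J(a, b) = \sum_{i=1}^{n} \left( y_i - a x_i - b \right)^2

a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - a \bar{x}
```

Here \bar{x} and \bar{y} are the means of the x and y arrays.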


For details, watch the video at https://www.bilibili.com/video/BV13J411g7SG?spm_id_from=333.337.search-card.all.click

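1.1 A simple implementation in Python

Steps:
1. Start with two arrays x and y of equal length
2. Compute the mean of x and the mean of y
3. Compute the linear regression coefficients a and b
4. Plug in the value to be predicted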

import numpy as np
import matplotlib.pyplot as plt

# Two arrays of equal length
x = np.array([1, 6, 7, 10, 15], dtype=float)  # np.float was removed in NumPy 1.24; use float
y = np.array([1, 2.7, 3.5, 7.8, 10.2])

# Means of x and y
x_mean = np.mean(x)
y_mean = np.mean(y)
num = 0.0  # numerator: sum of (x_i - x_mean) * (y_i - y_mean)
den = 0.0  # denominator: sum of (x_i - x_mean) ** 2
# Accumulate both sums
for x_i, y_i in zip(x, y):
    num += (x_i - x_mean) * (y_i - y_mean)
    den += (x_i - x_mean) ** 2
# Slope a and intercept b, computed once after the loop
a = num / den
b = y_mean - a * x_mean
# Fitted line
y_hat = a * x + b
plt.scatter(x, y)
plt.plot(x, y_hat, c='r')
# Predict y for a new x and mark the point on the plot
x_predict = 6.8
y_predict = a * x_predict + b
plt.scatter(x_predict, y_predict, c='b', marker='*', label=f'({x_predict}, {y_predict:.3f})')
plt.legend()
plt.show()
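The same steps can also be written without an explicit loop. The sketch below wraps them in a helper called `fit_line` (a name made up for this example) and cross-checks the result against `np.polyfit` with degree 1, which returns the slope and intercept in that order.

```python
import numpy as np


def fit_line(x, y):
    """Return slope a and intercept b of the least-squares line y = a*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    a = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    b = y.mean() - a * x.mean()
    return a, b


x = [1, 6, 7, 10, 15]
y = [1, 2.7, 3.5, 7.8, 10.2]
a, b = fit_line(x, y)

# Cross-check against NumPy's degree-1 polynomial fit: [slope, intercept]
a_ref, b_ref = np.polyfit(x, y, deg=1)
print(a, b)
print(np.isclose(a, a_ref), np.isclose(b, b_ref))
```

Both approaches should agree to floating-point precision; the loop version is easier to follow, the vectorized one is shorter and faster on large arrays.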


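1.2 Using the sklearn library

For a concrete real-world case, search for the Boston housing price prediction example built with sklearn. Note that `load_boston` was removed in scikit-learn 1.2, so those tutorials need an older version or a different dataset.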

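from sklearn import metrics
from sklearn.linear_model import LinearRegression  # ordinary least squares regression

# train_X, train_Y, test_X, test_Y are assumed to be prepared beforehand
# Create the model
linreg = LinearRegression()
# Fit the training data
linreg.fit(train_X, train_Y)
# Predict on the test data
y_predict = linreg.predict(test_X)
# Compute the mean squared error (argument order: y_true, y_pred)
mse = metrics.mean_squared_error(test_Y, y_predict)
print(mse)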
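The snippet above leaves out how the training and test sets are built. As a minimal end-to-end sketch, the example below generates synthetic data (the true coefficients 3, -2 and intercept 1 are assumptions chosen only for illustration), splits it with `train_test_split`, and then follows the same fit / predict / score sequence.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): y = 3*x0 - 2*x1 + 1 plus small noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1 + rng.normal(scale=0.1, size=200)

# Hold out 25% of the samples for testing
train_X, test_X, train_Y, test_Y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linreg = LinearRegression()
linreg.fit(train_X, train_Y)
y_predict = linreg.predict(test_X)

mse = mean_squared_error(test_Y, y_predict)
print(linreg.coef_, linreg.intercept_, mse)
```

The fitted coefficients should land close to the values used to generate the data, and the test error should be close to the noise variance.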

A simple example:

from sklearn import linear_model
reg = linear_model.LinearRegression()
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])
print(reg.coef_)

Output:
[0.5 0.5]
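The two feature columns in this example are identical, so any pair of weights with w0 + w1 = 1 fits the data exactly. The sketch below reproduces the result with `np.linalg.lstsq`, which returns the minimum-norm solution and therefore splits the weight equally. It is meant to explain why the coefficients come out as 0.5 each, not to describe scikit-learn's internals.

```python
import numpy as np

X = np.array([[0, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([0, 1, 2], dtype=float)

# Center the data so the intercept can be recovered separately
X_mean, y_mean = X.mean(axis=0), y.mean()
w, *_ = np.linalg.lstsq(X - X_mean, y - y_mean, rcond=None)
intercept = y_mean - X_mean @ w

print(w, intercept)
```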

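2. Logistic Regression

Goal: given a regression or classification problem, build a cost function, solve for the optimal model parameters iteratively with an optimization method, then test and validate the model.

Key points:
step 1: build the evaluation metric
step 2: linear regression
step 3: the sigmoid (logistic) function

TIPS:
1. Unlike linear regression, logistic regression outputs the probability (between 0 and 1) that an event occurs
2. Because the sigmoid squashes extreme inputs, logistic regression reduces the influence of a single extreme outlier on the overall result
3. It is only suitable when the classes can be separated by a linear decision boundary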
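The three points above can be sketched for a single feature as follows: a cross-entropy cost as the evaluation metric, a linear term w*x + b, and a sigmoid that turns it into a probability. The helper names (`sigmoid`, `cost`, `fit`), the learning rate, the iteration count, and the toy data are all assumptions made for illustration; plain gradient descent stands in for whatever optimizer one would use in practice.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def cost(w, b, x, y):
    """Mean cross-entropy between labels y and predicted probabilities."""
    p = sigmoid(w * x + b)
    eps = 1e-12  # avoids log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


def fit(x, y, lr=0.1, n_iter=5000):
    """Minimize the cost with plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        p = sigmoid(w * x + b)
        w -= lr * np.mean((p - y) * x)  # dJ/dw
        b -= lr * np.mean(p - y)        # dJ/db
    return w, b


# Toy data (hypothetical): one feature, binary label
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

w, b = fit(x, y)
print(cost(0.0, 0.0, x, y), cost(w, b, x, y))  # cost before and after fitting
print(sigmoid(w * 3.2 + b))                    # estimated P(y = 1) at x = 3.2
```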
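To see the probability output in practice, here is a short sketch with scikit-learn's `LogisticRegression` on hypothetical one-feature data. Note that the default settings apply L2 regularization, so the fitted weights will differ somewhat from an unregularized fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (hypothetical): one feature, binary label
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# Each row holds [P(class 0), P(class 1)] and sums to 1
proba = clf.predict_proba([[1.0], [3.2]])
labels = clf.predict([[1.0], [3.2]])
print(proba)
print(labels)
```

`predict` simply picks the class with the higher probability, so the decision boundary sits where the two probabilities are both 0.5.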
