Regression and Clustering Algorithms 01

Latest recommended article published on 2023-09-07 17:13:21

米卡粒


Article link: https://blog.csdn.net/m0_62329504/article/details/126341813


4.1 Linear Regression

4.1.3 Linear Regression API

sklearn.linear_model.LinearRegression(fit_intercept=True)

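Fits the model with the normal equation (closed-form least squares)
fit_intercept: whether to compute the bias (intercept)
LinearRegression.coef_: regression coefficients
LinearRegression.intercept_: bias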
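As a rough illustration of what "normal equation" means here, the sketch below solves w = (XᵀX)⁻¹Xᵀy directly with NumPy and compares the result with LinearRegression. The synthetic data and the variable names (X, y, Xb, w, model) are assumptions made for this example only, and scikit-learn's internal solver may differ in detail from the explicit formula.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 3*x1 - 2*x2 + 5 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Normal equation: append a column of ones so the last weight acts as the bias,
# then solve (Xb^T Xb) w = Xb^T y
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# The same fit through the scikit-learn estimator
model = LinearRegression(fit_intercept=True).fit(X, y)

print("closed form:", w[:2], w[2])
print("sklearn:    ", model.coef_, model.intercept_)
```

Both lines should print essentially the same coefficients and bias, which is why no learning rate or iteration count appears anywhere in this approach.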

sklearn.linear_model.SGDRegressor(loss="squared_loss", fit_intercept=True, learning_rate='invscaling', eta0=0.01)

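The SGDRegressor class implements stochastic gradient descent learning; it supports different loss functions and regularization penalties for fitting linear regression models
loss: type of loss function. loss="squared_loss": ordinary least squares (renamed "squared_error" in scikit-learn 1.0; the old name was removed in 1.2)
fit_intercept: whether to compute the bias
learning_rate: string, optional. Learning rate schedule. 'constant': eta = eta0  'optimal': eta = 1.0 / (alpha * (t + t0))
'invscaling': eta = eta0 / pow(t, power_t) [default for SGDRegressor]  power_t=0.25: defined in the parent class
For a constant learning rate, use learning_rate='constant' and set the rate with eta0
SGDRegressor.coef_: regression coefficients
SGDRegressor.intercept_: bias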
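The parameters above can be tried out with a minimal sketch such as the following, which uses a constant learning rate set through eta0. The synthetic data, the chosen values (eta0=0.01, max_iter=1000, tol=1e-3) and the names X_std and sgd are assumptions for illustration, not settings taken from the original post.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=500)

# SGD is sensitive to feature scale, so standardize first
X_std = StandardScaler().fit_transform(X)

# Constant learning rate: eta stays equal to eta0 for every update
sgd = SGDRegressor(learning_rate="constant", eta0=0.01,
                   max_iter=1000, tol=1e-3, random_state=0)
sgd.fit(X_std, y)

print("coefficients:", sgd.coef_)
print("bias:", sgd.intercept_)
```

Switching learning_rate to 'invscaling' makes the step shrink as eta0 / pow(t, power_t) over the course of training.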

from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; this code needs an older version
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, SGDRegressor

def linea1():
    """
    Predict Boston house prices using the normal equation.
    :return: None
    """
    # 1) Load the data
    boston = load_boston()

    # 2) Split into training and test sets
    x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, random_state=22)

    # 3) Standardize (fit on the training set only, then apply to the test set)
    transfer = StandardScaler()
    x_train = transfer.fit_transform(x_train)
    x_test = transfer.transform(x_test)

    # 4) Estimator
    estimator = LinearRegression()
    estimator.fit(x_train, y_train)

    # 5) Inspect the fitted model
    print("Normal equation weights:\n", estimator.coef_)
    print("Normal equation bias:\n", estimator.intercept_)

    # 6) Model evaluation

    return None


def linea2():
    """
    Predict Boston house prices using gradient descent.
    :return: None
    """
    # 1) Load the data
    boston = load_boston()

    # 2) Split into training and test sets
    x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, random_state=22)

    # 3) Standardize (fit on the training set only, then apply to the test set)
    transfer = StandardScaler()
    x_train = transfer.fit_transform(x_train)
    x_test = transfer.transform(x_test)

    # 4) Estimator
    estimator = SGDRegressor()
    estimator.fit(x_train, y_train)

    # 5) Inspect the fitted model
    print("Gradient descent weights:\n", estimator.coef_)
    print("Gradient descent bias:\n", estimator.intercept_)

    # 6) Model evaluation

    return None

if __name__ == "__main__":
    # Code 1: predict Boston house prices using the normal equation
    linea1()
    # Code 2: predict Boston house prices using gradient descent
    linea2()
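Step 6 (model evaluation) is left empty in both functions. One possible way to fill it in is to predict on the test set and compute the mean squared error, as in the sketch below. Because load_boston is unavailable in recent scikit-learn releases, this sketch substitutes a synthetic dataset from make_regression with the same shape as the Boston data; that substitution, the noise level, and the results dictionary are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 506 samples and 13 features, like the Boston data
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=22)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=22)

# Same standardization step as in the functions above
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

results = {}
estimators = [
    ("normal equation", LinearRegression()),
    ("gradient descent", SGDRegressor(random_state=22)),
]
for name, estimator in estimators:
    estimator.fit(x_train, y_train)
    # 6) Model evaluation: compare predictions with the held-out targets
    y_predict = estimator.predict(x_test)
    results[name] = mean_squared_error(y_test, y_predict)
    print(name, "MSE:", results[name])
```

A lower mean squared error on the test set indicates a better fit; comparing the two numbers gives a quick check on whether the gradient descent run has converged close to the closed-form solution.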

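Gradient descent	Normal equation
Requires choosing a learning rate	No learning rate needed
Solved iteratively	Computed in a single step
Usable when the number of features is large	Requires solving the equation; high time complexity (roughly O(n^3) in the number of features)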

Small datasets: LinearRegression (cannot address overfitting), ridge regression
Large datasets: SGDRegressor

Extension: the optimization methods GD, SGD, and SAG

1 GD

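Gradient descent. The original (batch) form must evaluate every sample to obtain a single gradient, which is computationally expensive; this is what motivated the later series of improvements.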
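To make the cost of the batch form concrete, here is a minimal NumPy sketch of gradient descent on the mean squared error of a linear model. The synthetic data, the learning rate eta=0.1 and the 500 iterations are illustrative assumptions.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 3*x1 - 2*x2 + 5 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

w = np.zeros(2)   # weights
b = 0.0           # bias
eta = 0.1         # learning rate (illustrative value)

for _ in range(500):
    error = X @ w + b - y                  # residuals over ALL samples
    grad_w = 2.0 * X.T @ error / len(y)    # gradient of the MSE w.r.t. w
    grad_b = 2.0 * error.mean()            # gradient of the MSE w.r.t. b
    w -= eta * grad_w
    b -= eta * grad_b

print(w, b)
```

Every one of the 500 updates touches all 200 samples, so the cost per step grows linearly with the size of the dataset.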

2 SGD

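Stochastic gradient descent is an optimization method that considers only one training sample per iteration.

Advantages: efficient and easy to implement

Disadvantages: requires a number of hyperparameters, such as the regularization parameter and the number of iterations

It is also sensitive to feature scaling, so features should be standardized first.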
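The one-sample-per-update idea can be sketched in a few lines of NumPy. The synthetic data, the learning rate eta=0.01 and the 50 passes over the data are assumptions chosen so that this small example converges.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 3*x1 - 2*x2 + 5 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

w = np.zeros(2)   # weights
b = 0.0           # bias
eta = 0.01        # learning rate (illustrative value)

for epoch in range(50):
    for i in rng.permutation(len(y)):      # visit the samples in random order
        error = X[i] @ w + b - y[i]        # residual of ONE sample
        w -= eta * 2.0 * error * X[i]      # update from that sample alone
        b -= eta * 2.0 * error

print(w, b)
```

Each update costs the same regardless of dataset size, at the price of a noisier path toward the minimum and extra settings (learning rate, number of passes) to tune.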

3 SAG

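Stochastic average gradient. Because SGD converges too slowly, SAG and other gradient-descent-based algorithms were proposed.

In scikit-learn, ridge regression, logistic regression, and other estimators offer SAG as an optimizer.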
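In scikit-learn, SAG is selected through the solver parameter. The sketch below fits Ridge with solver="sag" and compares it with the direct "cholesky" solver; the synthetic data, alpha=1.0 and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=500)

# SAG converges much faster when features share a similar scale
X_std = StandardScaler().fit_transform(X)

# Iterative SAG solver versus a direct closed-form solver
ridge_sag = Ridge(alpha=1.0, solver="sag", random_state=0).fit(X_std, y)
ridge_ref = Ridge(alpha=1.0, solver="cholesky").fit(X_std, y)

print("sag:     ", ridge_sag.coef_, ridge_sag.intercept_)
print("cholesky:", ridge_ref.coef_, ridge_ref.intercept_)
```

The two solvers should agree closely on a problem this small; SAG is mainly of interest when both the number of samples and the number of features are large.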
