Machine Learning Notes 11: Stochastic Gradient Descent
1. The Idea of Stochastic Gradient Descent
1.1 Batch Gradient Descent
In batch gradient descent, every update uses all $m$ samples to compute the gradient of the loss $J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)} - X_b^{(i)}\theta\right)^2$:

$$\frac{\partial J}{\partial \theta_j} = \frac{2}{m}\sum_{i=1}^{m}\left(X_b^{(i)}\theta - y^{(i)}\right)X_j^{(i)}$$

That is, in vector form:

$$\nabla J(\theta) = \frac{2}{m}\,X_b^{T}\left(X_b\theta - y\right)$$
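For contrast with the stochastic version implemented in section 2, a minimal batch gradient descent sketch could look like the following (the names dJ_batch and batch_gd and the fixed learning rate eta are illustrative, not part of the original notes):

import numpy as np

def dJ_batch(theta, X_b, y):
    # Gradient over ALL m samples -- this full pass over the data is
    # what makes batch gradient descent expensive on large data sets.
    return X_b.T.dot(X_b.dot(theta) - y) * 2. / len(X_b)

def batch_gd(X_b, y, initial_theta, eta=0.01, n_iters=1000):
    theta = initial_theta
    for _ in range(n_iters):
        theta = theta - eta * dJ_batch(theta, X_b, y)
    return theta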
1.2 Stochastic Gradient Descent
Each iteration uses only one randomly chosen sample to take a gradient step. Batch gradient descent is expensive because every step has to touch all of the data; stochastic gradient descent does far less work per step, so its time cost is much smaller.
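Concretely, only the term contributed by the sampled index $i$ is used as the gradient estimate (this is exactly what the dJ_sgd function below computes):

$$\nabla J^{(i)}(\theta) = 2\left(X_b^{(i)}\right)^{T}\left(X_b^{(i)}\theta - y^{(i)}\right)$$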
The step size $\eta$ is decreased as the iterations proceed, borrowing the idea of simulated annealing:

$$\eta = \frac{a}{i_{\text{iters}} + b}$$

where $a$ and $b$ are hyperparameters (in the implementation below, $a = t_0 = 5$ and $b = t_1 = 50$).
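A quick illustration of this annealing schedule, using the same $t_0$ and $t_1$ as the sgd() implementation below (the loop is only a sketch for printing a few values):

t0, t1 = 5, 50
for t in (0, 10, 100, 1000):
    print(round(t0 / (t + t1), 3))   # 0.1, 0.083, 0.033, 0.005 -- steps shrink over time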
2. Implementing Stochastic Gradient Descent
import numpy as np
import matplotlib.pyplot as plt
# np.random.seed(665)
m = 10000                                    # number of samples
x = 2 * np.random.normal(size=m)
X = x.reshape(-1, 1)
y = x * 3. + 4. + np.random.normal(size=m)   # true slope 3, true intercept 4
def J(theta, X_b, y):
    '''MSE loss function'''
    try:
        return np.sum((y - X_b.dot(theta)) ** 2) / len(X_b)
    except:
        return float('inf')   # guard against numerical overflow
def dJ_sgd(theta, X_b_i, y_i):
    # Gradient estimate from a single sample (X_b_i, y_i)
    return X_b_i.T.dot(X_b_i.dot(theta) - y_i) * 2.
def sgd(X_b, y, initial_theta, n_iters):
    t0 = 5
    t1 = 50

    def learning_rate(t):
        # Decaying step size: eta = t0 / (t + t1)
        return t0 / (t + t1)

    # The loss is not guaranteed to decrease at every step,
    # so we only cap the number of iterations.
    theta = initial_theta
    for cur_iter in range(n_iters):
        rand_i = np.random.randint(len(X_b))               # pick one sample at random
        gradient = dJ_sgd(theta, X_b[rand_i], y[rand_i])
        theta = theta - learning_rate(cur_iter) * gradient
    return theta
%%time
X_b = np.hstack([np.ones((len(X), 1)), X])    # prepend a column of ones for the intercept
initial_theta = np.zeros(X_b.shape[1])
theta = sgd(X_b, y, initial_theta, n_iters=len(X_b) // 3)   # only m/3 single-sample steps
Wall time: 24 ms
theta
array([3.96962099, 2.91398727])
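As a sanity check, the SGD estimate could be compared against the closed-form least-squares solution on the same X_b and y; np.linalg.lstsq is a standard NumPy routine, and the snippet below is a sketch rather than output from the original run:

theta_exact, *_ = np.linalg.lstsq(X_b, y, rcond=None)   # closed-form fit for comparison
print(theta_exact)   # should be close to [4, 3] (intercept, slope)
print(theta)         # SGD estimate after m/3 single-sample steps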
3. SGD in scikit-learn
from sklearn.linear_model import SGDRegressor
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
boston = datasets.load_boston()
X = boston.data
y = boston.target
X = X[y < 50.0]  # the target is capped at 50.0, so samples sitting at the cap are dropped
y = y[y < 50.0]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=1)
standard = StandardScaler()
standard.fit(X_train)                            # fit the scaler on the training set only
X_train_standard = standard.transform(X_train)
X_test_standard = standard.transform(X_test)     # reuse the training-set statistics for the test set
sgd_reg = SGDRegressor()
%time sgd_reg.fit(X_train_standard, y_train)
sgd_reg.score(X_test_standard, y_test)
Wall time: 3.99 ms
0.7775560898753987
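SGDRegressor exposes knobs for the same ideas implemented by hand above. The parameters below are real scikit-learn arguments (learning_rate='invscaling' decays the step size as eta0 / t^power_t, playing the same role as the t0/(t + t1) schedule); the specific values are only illustrative, not the ones used for the score above:

sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3,
                       eta0=0.01, learning_rate='invscaling', power_t=0.25)
sgd_reg.fit(X_train_standard, y_train)
sgd_reg.score(X_test_standard, y_test)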
4. Summary
- Batch Gradient Descent
- Stochastic Gradient Descent
- Mini-Batch Gradient Descent (see the sketch at the end of these notes)

Randomness:
- helps jump out of local optima
- faster running speed
- the same idea appears elsewhere: random search, random forests
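Mini-batch gradient descent sits between the two extremes: each step averages the gradient over a small random batch instead of one sample or the full data set. A minimal sketch, where batch_size and the helper name dJ_mini_batch are illustrative choices rather than code from the original notes:

def dJ_mini_batch(theta, X_b_batch, y_batch):
    # Average gradient over a small random batch of samples
    return X_b_batch.T.dot(X_b_batch.dot(theta) - y_batch) * 2. / len(X_b_batch)

def mini_batch_gd(X_b, y, initial_theta, n_iters=1000, batch_size=16, t0=5, t1=50):
    theta = initial_theta
    for cur_iter in range(n_iters):
        idx = np.random.randint(0, len(X_b), batch_size)    # sample a batch (with replacement)
        gradient = dJ_mini_batch(theta, X_b[idx], y[idx])
        theta = theta - t0 / (cur_iter + t1) * gradient     # same decaying step size as sgd()
    return theta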