PyTorch 02: Gradient Descent

What is gradient descent?

The main purpose of gradient descent is to find, through iteration, the minimum of an objective function, or at least to converge toward that minimum.

The basic idea of gradient descent can be understood by analogy with walking down a mountain.
Imagine this scenario: a person is stranded on a mountain and needs to get down, that is, to find the lowest point, the valley. Heavy fog makes visibility very poor, so the path down cannot be seen; the person must rely on local information around them and find the way one step at a time. This is exactly the situation where gradient descent helps. How? Starting from the current position, find the steepest direction at that spot and take one step downhill; then, from the new position, again find the steepest direction and take another step, repeating until the lowest point is reached. Climbing to the top works the same way, except it then becomes gradient ascent.
 

Algorithm details

First, we have a differentiable function, which represents the mountain. Our goal is to find its minimum, the foot of the mountain. Following the scenario above, the fastest way down is to find the steepest direction at the current position and walk downhill along it. Translated to the function, this means computing the gradient at the given point and then moving in the direction opposite to the gradient, which decreases the function value the fastest, because the gradient is the direction in which the function changes most rapidly (explained in more detail below).
 

The gradient points in the direction in which the function increases fastest at a given point, so the opposite of the gradient is the direction in which the function decreases fastest, which is exactly what we need.
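The update rule described above can be sketched on a simple one-dimensional function. The function f, its derivative df, the learning rate, and the starting point here are illustrative choices, not part of the lecture code:

```python
# Minimal sketch of gradient descent on f(w) = (w - 3)**2,
# whose minimum is at w = 3. All constants below are
# illustrative assumptions, not from the original example.

def f(w):
    return (w - 3) ** 2

def df(w):  # derivative of f
    return 2 * (w - 3)

w = 0.0     # starting point
lr = 0.1    # learning rate

for _ in range(100):
    w -= lr * df(w)  # step against the gradient

print(w)  # approaches the minimizer w = 3
```

Each step moves opposite to the derivative, so the iterate contracts toward the minimizer by a constant factor (1 - 2·lr) per step.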

import matplotlib.pyplot as plt
 
# training data
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
 
# initial guess of the weight
w = 1.0
 
# linear model: y_hat = x * w
def forward(x):
    return x * w
 
# mean squared error (MSE) over the whole training set
def cost(xs, ys):
    cost = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        cost += (y_pred - y) ** 2
    return cost / len(xs)
 
# gradient of the cost w.r.t. w, averaged over all samples
def gradient(xs, ys):
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)
 
epoch_list = []
cost_list = []
print('predict (before training)', 4, forward(4))
for epoch in range(100):
    # compute the current loss, then take one gradient step
    cost_val = cost(x_data, y_data)
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # 0.01 is the learning rate
    print('epoch:', epoch, 'w=', w, 'loss=', cost_val)
    epoch_list.append(epoch)
    cost_list.append(cost_val)
 
print('predict (after training)', 4, forward(4))
plt.plot(epoch_list, cost_list)
plt.ylabel('cost')
plt.xlabel('epoch')
plt.show()

predict (before training) 4 4.0
epoch: 0 w= 1.0933333333333333 loss= 4.666666666666667
epoch: 1 w= 1.1779555555555554 loss= 3.8362074074074086
epoch: 2 w= 1.2546797037037036 loss= 3.1535329869958857
...
epoch: 98 w= 1.9999387202080534 loss= 2.131797981222471e-08
epoch: 99 w= 1.9999444396553017 loss= 1.752432687141379e-08
predict (after training) 4 7.999777758621207

Stochastic gradient descent (SGD) has proven effective for training neural networks. It is computationally less efficient (higher time complexity), but often gives better learning performance.

The main differences between stochastic gradient descent and (batch) gradient descent are:

1. The loss function changes from cost() to loss(): cost() computes the loss over all training data, while loss() computes the loss of a single sample. In the source code this removes the two for loops.

2. The gradient function gradient() changes from computing the gradient averaged over all training data to computing the gradient of a single sample.

3. "Stochastic" here means each update uses one training sample at a time. With 100 epochs and 3 samples, the weight is updated 100 × 3 = 300 times in total, versus 100 updates in batch gradient descent.
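Based on the three differences above, the stochastic version of the earlier script can be sketched as follows (same data, model, and learning rate; the per-sample loss() and gradient() replace the batch versions, so the weight is updated once per sample):

```python
# SGD sketch derived from the batch script: one update per sample.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = 1.0

def forward(x):
    return x * w

def loss(x, y):
    # loss of a single training sample (no loop over the dataset)
    return (forward(x) - y) ** 2

def gradient(x, y):
    # gradient of the single-sample loss w.r.t. w
    return 2 * x * (x * w - y)

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        grad = gradient(x, y)
        w -= 0.01 * grad          # update after every sample
        loss_val = loss(x, y)     # loss after this update
    print('epoch:', epoch, 'w=', w, 'loss=', loss_val)

print('predict (after training)', 4, forward(4))
```

With 100 epochs over 3 samples, w is updated 300 times and converges to 2.0, so the prediction for x = 4 approaches 8.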
 
