Controlling Gradient Computation (PyTorch)

落雪snowflake

Last modified: 2024-07-16 19:29:20

Category: pytorch. Tags: python, pytorch, deep learning

First published: 2024-07-16 19:22:29

Article link: https://blog.csdn.net/weixin_38858860/article/details/140474651

import torch

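# 1. Controlling gradient computation
'''
When do we need to control gradient computation? A model needs gradients while it is being trained, but not after training has finished (for example at inference time), so we need a way to switch gradient tracking on and off.
'''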
def test01():
    x=torch.tensor(10,requires_grad=True,dtype=torch.float64)
    print(x.requires_grad)

    y=x**2
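    print(y.requires_grad)  # True: y was computed from x, so autograd records this operation and would differentiate through it.
    # Suppose we only want the numerical result of x**2 and do not want this operation to take part in gradient computation. There are three ways to do that.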

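    # 1. First approach: the torch.no_grad() context manager
    with torch.no_grad():
        y=x**2  # Only the value is computed; the operation is not recorded for autograd, so it plays no part in backpropagation.

    print(y.requires_grad)  # False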

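    # 2. Second approach: applied to a function
    def my_func(x):
        return x**2
    # Without the decorator, the computation inside the function is tracked as usual.
    y=my_func(x)
    print(y.requires_grad)  # True

    @torch.no_grad()  # written with parentheses; the bare @torch.no_grad form may fail on older PyTorch releases
    def my_func(x):
        return x**2
    # With torch.no_grad() used as a decorator, nothing computed inside the function is tracked for gradients.
    y=my_func(x)
    print(y.requires_grad)  # False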

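    # 3. Third approach: a global switch
    torch.set_grad_enabled(False)
    y=x**2
    print(y.requires_grad)  # False
    torch.set_grad_enabled(True)  # restore the default, otherwise later backward() calls (e.g. in test02/test03) would fail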
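
A global switch set with `torch.set_grad_enabled(False)` stays in effect until it is switched back, which can quietly break a later `backward()` call. As a minimal sketch (the variable names here are illustrative and not part of the original script, and it assumes gradient tracking is enabled when the snippet starts), `torch.set_grad_enabled` can also be used as a context manager, so the previous mode comes back when the block exits:

```python
import torch

x = torch.tensor(10.0, requires_grad=True)

# Context-manager form: tracking is disabled only inside the block.
with torch.set_grad_enabled(False):
    inside = x ** 2
    tracked_inside = torch.is_grad_enabled()

# After the block, the previous mode (enabled) is active again.
outside = x ** 2
tracked_outside = torch.is_grad_enabled()

print(inside.requires_grad, tracked_inside)    # False False
print(outside.requires_grad, tracked_outside)  # True True
```

Scoping the switch this way avoids having to remember a matching `torch.set_grad_enabled(True)` call afterwards.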

# 2. Gradient accumulation and zeroing gradients
def test02():
    x=torch.tensor([10,20,30,40],requires_grad=True,dtype=torch.float64)
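    # Each call to backward() adds the newly computed gradient to whatever is already stored in x.grad.
    # To avoid accumulating gradients from earlier iterations, zero x.grad before calling backward():
    # at the start of each iteration, reset the previous iteration's gradient to 0, then compute the new one.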
    for _ in range(10):

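        # Forward computation on the input x
        f1=x**2+20
        # Reduce the vector to a scalar, since backward() without arguments needs a scalar output
        f2=f1.mean()
        # Zero the gradient (x.grad is None before the first backward() call)
        if x.grad is not None:
            x.grad.zero_()

        # Automatic differentiation
        f2.backward()
        print(x.grad)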

"""
   Without zeroing the gradient, range(3):
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([10., 20., 30., 40.], dtype=torch.float64)
        tensor([15., 30., 45., 60.], dtype=torch.float64)
   Without zeroing the gradient, range(10):
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([10., 20., 30., 40.], dtype=torch.float64)
        tensor([15., 30., 45., 60.], dtype=torch.float64)
        tensor([20., 40., 60., 80.], dtype=torch.float64)
        tensor([ 25.,  50.,  75., 100.], dtype=torch.float64)
        tensor([ 30.,  60.,  90., 120.], dtype=torch.float64)
        tensor([ 35.,  70., 105., 140.], dtype=torch.float64)
        tensor([ 40.,  80., 120., 160.], dtype=torch.float64)
        tensor([ 45.,  90., 135., 180.], dtype=torch.float64)
        tensor([ 50., 100., 150., 200.], dtype=torch.float64)
   So the gradient values accumulate from one iteration to the next.
   With the gradient zeroed before each backward() call:
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
        tensor([ 5., 10., 15., 20.], dtype=torch.float64)
"""

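The two behaviours recorded above can be compared side by side. The following is only a sketch: the helper `grads_after` and its parameters are hypothetical names introduced for illustration, and it reuses the same input and the same function `x**2+20` followed by `mean()`. Since the gradient of `mean(x**2+20)` over four elements is `2*x/4 = x/2`, one backward pass gives `[5, 10, 15, 20]`, and three passes without zeroing give three times that.

```python
import torch

def grads_after(n_steps, zero_each_step):
    """Run n_steps backward passes and return the final x.grad (hypothetical helper)."""
    x = torch.tensor([10, 20, 30, 40], requires_grad=True, dtype=torch.float64)
    for _ in range(n_steps):
        loss = (x ** 2 + 20).mean()
        # Optionally reset the stored gradient before the next backward pass.
        if zero_each_step and x.grad is not None:
            x.grad.zero_()
        loss.backward()
    return x.grad.clone()

accumulated = grads_after(3, zero_each_step=False)
zeroed = grads_after(3, zero_each_step=True)

print(accumulated)  # tensor([15., 30., 45., 60.], dtype=torch.float64)
print(zeroed)       # tensor([ 5., 10., 15., 20.], dtype=torch.float64)
```
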
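# 3. Example: minimizing a function with gradient descent
def test03():
    # y = x**2
    # Goal: find the value of x that minimizes y, just as training looks for the parameter values that minimize the loss function.
    # Start by initializing x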
    x=torch.tensor(10,requires_grad=True,dtype=torch.float64)
    for _ in range(5000):
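        # Forward computation
        y=x**2
        # Zero the gradient
        if x.grad is not None:
            x.grad.zero_()
        # Automatic differentiation
        y.backward()
        # Update the parameter with learning rate 0.001; the update itself should not be tracked by autograd
        with torch.no_grad():
            x-=0.001*x.grad
        # Print the current value of x
        print('%.10f' % x.item())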


if __name__=='__main__':
    test03()
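
The hand-written update in `test03` can also be expressed with one of PyTorch's built-in optimizers. The sketch below is an alternative formulation and not the author's code: it assumes `torch.optim.SGD` with no momentum and the same learning rate of 0.001, in which case `optimizer.zero_grad()` takes the place of the manual gradient zeroing and `optimizer.step()` takes the place of the manual parameter update.

```python
import torch

x = torch.tensor(10.0, requires_grad=True, dtype=torch.float64)
# Plain SGD (no momentum) over the single parameter x.
optimizer = torch.optim.SGD([x], lr=0.001)

for _ in range(5000):
    y = x ** 2             # forward computation
    optimizer.zero_grad()  # clear the gradient from the previous iteration
    y.backward()           # automatic differentiation
    optimizer.step()       # x <- x - lr * x.grad

print('%.10f' % x.item())
```

Because the gradient of `x**2` is `2*x`, every step multiplies `x` by `1 - 0.001*2 = 0.998`, so `x` shrinks steadily toward the minimum at 0.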