Dive into Deep Learning, PyTorch edition: first check-in

Uncle_Sugar

Category column: Dive into Deep Learning, PyTorch edition

Article link: https://blog.csdn.net/sinat_29278271/article/details/104300062

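A friend talked me into signing up for a free online course from Boyu Education (伯禹教育), mostly as a way to get familiar with PyTorch.

All of the course material comes from

https://github.com/ShusenTang/Dive-into-DL-PyTorch

These are mainly notes on the basics.

The first session covers linear regression, multilayer perceptrons, and softmax.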

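1. torch.ones(n)

This returns a 1-D tensor of length n (shape (n,)), not a matrix. It is basic enough that it hardly needs writing down; the point is to keep it apart from MATLAB, where ones(n) returns an n×n matrix.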
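To make the shape difference concrete, here is a minimal sketch, assuming only that torch is installed. The variable names are made up for illustration.

```python
import torch

v = torch.ones(3)        # 1-D tensor, shape (3,)
row = torch.ones(1, 3)   # an explicit 1 x 3 row vector needs two sizes
sq = torch.ones(3, 3)    # the shape MATLAB's ones(3) would produce

print(v.shape, row.shape, sq.shape)
```

So to get the MATLAB behaviour in PyTorch, both dimensions have to be passed explicitly.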

2. A Timer class for timing

This one looks handy; I can reuse it later to time my own programs.

import time


# Define a timer class to record running times
class Timer:
    """Record multiple running times."""

    def __init__(self):
        self.times = []
        self.start()

    def start(self):
        # start the timer
        self.start_time = time.time()

    def stop(self):
        # stop the timer and record time into a list
        self.times.append(time.time() - self.start_time)
        return self.times[-1]

    def avg(self):
        # calculate the average and return
        return sum(self.times)/len(self.times)

    def sum(self):
        # return the sum of recorded time
        return sum(self.times)
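A possible way to use such a timer on your own code is sketched below. The class is repeated so the snippet runs on its own. Two things here are my assumptions and not part of the course code: the workload is a placeholder, and time.perf_counter() is swapped in for time.time() because it is a monotonic clock better suited to measuring short intervals.

```python
import time


class Timer:
    """Record multiple running times (same interface as the class above)."""

    def __init__(self):
        self.times = []
        self.start()

    def start(self):
        # perf_counter() is an assumption here; the course code uses time.time()
        self.start_time = time.perf_counter()

    def stop(self):
        self.times.append(time.perf_counter() - self.start_time)
        return self.times[-1]

    def avg(self):
        return sum(self.times) / len(self.times)

    def sum(self):
        # inside a method body, the bare name sum still refers to the builtin
        return sum(self.times)


timer = Timer()
for _ in range(3):
    timer.start()
    total = sum(i * i for i in range(100_000))  # placeholder workload
    timer.stop()

print(f"runs: {len(timer.times)}, avg: {timer.avg():.6f}s, total: {timer.sum():.6f}s")
```

Note that avg() divides by the number of recorded runs, so it should only be called after at least one stop().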

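3. index_select and torch.LongTensor

Speaking of which, PyTorch's naming style is puzzling at first: LongTensor looks like Java and index_select looks like C. (It does follow the usual Python convention, though: CamelCase for classes, snake_case for functions and methods.)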

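import random

import torch


def data_iter(batch_size, features, labels):
    """Yield shuffled minibatches of (features, labels)."""
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # visit the samples in random order
    for i in range(0, num_examples, batch_size):
        # the last batch may be smaller than batch_size
        j = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
        yield features.index_select(0, j), labels.index_select(0, j)

The thing to watch is the line that builds j: it has to be LongTensor(), not Tensor(). torch.Tensor(...) creates a float32 tensor, which index_select does not accept as an index. LongTensor is a 64-bit integer type, which suggests 32 bits might not be enough. That is a slightly scary thought: does anyone's training set really overflow a 32-bit int? The documentation describes the parameter as: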

index (LongTensor): the 1-D tensor containing the indices to index
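A small check of this behaviour is sketched below, with made-up data. It assumes a reasonably recent PyTorch, where passing a float index raises an exception; the exact exception type and message may vary by version, so the snippet catches a few likely ones.

```python
import torch

# 4 samples with 3 features each (values are arbitrary)
features = torch.arange(12, dtype=torch.float32).view(4, 3)
idx = torch.LongTensor([2, 0])

# Select rows 2 and 0 along dimension 0, in that order
picked = features.index_select(0, idx)
print(picked.shape)

# torch.Tensor([...]) is float32, so it is not a valid index
try:
    features.index_select(0, torch.Tensor([2, 0]))
    float_index_ok = True
except (RuntimeError, IndexError, TypeError):
    float_index_ok = False
print(float_index_ok)
```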

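4. y.view(y_hat.size())

Nothing deep here: it reshapes y to the same shape as y_hat, as in the snippet below. This matters when y_hat has shape (n, 1) and y has shape (n,), because subtracting them directly would broadcast to an (n, n) result. It is not yet time to fuss over code details, so this is just a quick note.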

def squared_loss(y_hat, y):
    # reshape y to match y_hat so the subtraction is element-wise
    return (y_hat - y.view(y_hat.size())) ** 2 / 2
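The reason the reshape matters can be seen in a short sketch with made-up values, where y_hat has shape (3, 1) as a linear model's output typically does and y is a flat vector of labels:

```python
import torch

y_hat = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), e.g. model output
y = torch.tensor([1.0, 2.0, 4.0])            # shape (3,), labels

wrong = (y_hat - y) ** 2 / 2                     # silently broadcasts to (3, 3)
right = (y_hat - y.view(y_hat.size())) ** 2 / 2  # element-wise, shape (3, 1)

print(wrong.shape, right.shape)
```

Without the view call no error is raised; the loss simply has the wrong shape, which is why the bug is easy to miss.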

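5. zero_grad()

https://www.zhihu.com/question/303070254

On why gradients have to be cleared explicitly with zero_grad(): PyTorch adds each backward() result into .grad instead of overwriting it. The answers at the link above give several reasons for this design. One important one is saving GPU memory, since gradients from several small batches can be accumulated before a single update. Another is that a finer-grained function makes more complex operations easier to implement.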
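The accumulation behaviour can be observed directly in a minimal sketch on a single tensor. It uses w.grad.zero_() as a tensor-level stand-in for what an optimizer's zero_grad() does across all parameters; the tensor and the toy loss are made up for illustration.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# Two backward passes without clearing: the gradients add up
(w * 3).sum().backward()
(w * 3).sum().backward()
accumulated = w.grad.clone()  # 3 + 3 = 6 per element

# Clear the gradient, then run one backward pass
w.grad.zero_()
(w * 3).sum().backward()
fresh = w.grad.clone()        # 3 per element

print(accumulated, fresh)
```

Forgetting to clear gradients in a training loop therefore mixes the current batch's gradient with those of all previous batches.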

Sessions two and three were pretty boring, so I am not taking notes on them; it is not worth the time.
