Dive into Deep Learning Course Notes ch02

Latest recommended article published on 2022-09-15 12:11:08

lazyoneguy

Category: Deep Learning. Tags: python, deep learning

Article link: https://blog.csdn.net/p3116002589/article/details/117520998

ch_02

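Linear Algebra

Instructor Li covers relatively little linear algebra in the course, so it is worth reading more on your own afterwards; some matrix theory will still be needed later on.

Basics

Scalar: represented by a tensor with only one element (typically used for a data label).

# Create scalars and do arithmetic on them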
import torch

x = torch.tensor([3.0])
y = torch.tensor([2.0])

x + y, x * y, x / y, x**y

Output:
(tensor([5.]), tensor([6.]), tensor([1.5000]), tensor([9.]))

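Vector: a one-dimensional tensor made of several elements. A one-dimensional tensor has a single axis and is usually treated as a row vector; an explicit column vector has to be written as a two-dimensional matrix with n rows and 1 column. Vectors are typically used to represent the features of a data sample, such as a second-hand house's size, floor, age, and the crime rate of its neighborhood.

# Create a vector
# The import is omitted from here on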
x = torch.arange(4)
x

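Output: tensor([0, 1, 2, 3])
Any element of the vector can be referenced by its index, for example x[3]
Get the length of the tensor: len(x)
Get the shape of the tensor (for a tensor with a single axis, the shape has a single element): x.shape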
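
The three access patterns above can be tried together. This is a minimal sketch that reuses the vector x from the previous snippet; the variable names last, length and shape are only illustrative.

```python
import torch

x = torch.arange(4)   # tensor([0, 1, 2, 3])

last = x[3]           # indexing returns a 0-dimensional tensor
length = len(x)       # number of entries along the first (and only) axis
shape = x.shape       # torch.Size([4]): one axis of length 4

print(last, length, shape)
```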

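Matrix: just as vectors generalize scalars from order zero to order one, matrices generalize vectors from order one to order two. Matrices are usually written as bold capital letters (for example X, Y and Z) and are represented in code as tensors with two axes.
In torch, a matrix of shape m x n can be created by specifying the two components m and n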

A = torch.arange(20).reshape(5, 4)
A
# Output: (from here on, whatever follows "Output" is the program's result; this note is omitted)
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])

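Any element of the matrix can be referenced by its indices, for example A[3][2] (A has shape 5x4, so the column index must be between 0 and 3)
Transpose the matrix: A.T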
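
As a quick check of indexing and transposition, the sketch below reuses the 5x4 matrix A defined above. The names element and At are illustrative only.

```python
import torch

A = torch.arange(20).reshape(5, 4)

# Valid row indices are 0..4 and valid column indices are 0..3.
element = A[3, 2]      # same element as A[3][2]
At = A.T               # the transpose swaps the two axes

print(element, At.shape)
```

The entry at row 3, column 2 of A ends up at row 2, column 3 of A.T.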

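Basic Properties of Tensor Arithmetic

Just as vectors generalize scalars and matrices generalize vectors, we can build data structures with even more axes.

For any two tensors of the same shape, the result of any elementwise binary operation is a tensor of that same shape.

# Create matrices A and B and add them
A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = A.clone()  # Assign a copy of A to B by allocating new memory
A, A + B
# Output: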
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([[ 0.,  2.,  4.,  6.],
         [ 8., 10., 12., 14.],
         [16., 18., 20., 22.],
         [24., 26., 28., 30.],
         [32., 34., 36., 38.]]))

Hadamard product: the elementwise product of two matrices
Mathematical formula:
$\mathbf{A} \odot \mathbf{B} = \begin{bmatrix} a_{11} b_{11} & a_{12} b_{12} & \dots & a_{1n} b_{1n} \\ a_{21} b_{21} & a_{22} b_{22} & \dots & a_{2n} b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} b_{m1} & a_{m2} b_{m2} & \dots & a_{mn} b_{mn} \end{bmatrix}$

Hadamard product in code: A * B

tensor([[  0.,   1.,   4.,   9.],
        [ 16.,  25.,  36.,  49.],
        [ 64.,  81., 100., 121.],
        [144., 169., 196., 225.],
        [256., 289., 324., 361.]])
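
For reference, the result above can be reproduced with the sketch below. It assumes the same A and B as in the addition example; torch.mul is used only to show that the * operator is the elementwise product.

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = A.clone()

hadamard = A * B            # elementwise (Hadamard) product
same = torch.mul(A, B)      # functional form of the * operator

print(hadamard.shape, torch.equal(hadamard, same))
```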

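Multiplying a tensor by a scalar, or adding a scalar to it, does not change the shape of the tensor; every element of the tensor is added to or multiplied by the scalar.

# Create a tensor and combine it with a scalar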
a = 2
X = torch.arange(24).reshape(2, 3, 4)
a + X, (a * X).shape
# Output:
(tensor([[[ 2,  3,  4,  5],
          [ 6,  7,  8,  9],
          [10, 11, 12, 13]],
 
         [[14, 15, 16, 17],
          [18, 19, 20, 21],
          [22, 23, 24, 25]]]),
 torch.Size([2, 3, 4]))

One operation that works on any tensor is computing the sum of its elements

# Create the vector x and sum it
x = torch.arange(4, dtype=torch.float32)
x, x.sum()
# Output:
(tensor([0., 1., 2., 3.]), tensor(6.))

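Reduction

By default, calling the sum function reduces a tensor along all of its axes, turning it into a scalar. We can also specify the axis along which the tensor is reduced by summation:

# Sum the earlier matrix A with the axis argument specified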
A_sum_axis0 = A.sum(axis=0)
A_sum_axis1 = A.sum(axis=1)
A.shape, A.sum(),A_sum_axis0, A_sum_axis0.shape,A_sum_axis1, A_sum_axis1.shape
# Output:
(torch.Size([5, 4]), tensor(190.))
(tensor([40., 45., 50., 55.]), torch.Size([4]))
(tensor([ 6., 22., 38., 54., 70.]), torch.Size([5]))

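The program shows that matrix A has shape 5x4. With axis=0, the axis of length 5 is summed away (the five rows are added together, giving one sum per column, shape 4). With axis=1, the axis of length 4 is summed away (the four columns are added together, giving one sum per row, shape 5).

Axes are a concept used in the instructor's course; for details, see this blog post:
The concept of axes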
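
One way to check which axis disappears is to compare the result shapes with the input shape. The sketch below reuses the 5x4 float matrix A; the names col_sums, row_sums and total are illustrative, and summing over both axes at once with axis=[0, 1] is an extra step not shown in the notes above.

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)

col_sums = A.sum(axis=0)       # collapses the axis of length 5 -> shape (4,)
row_sums = A.sum(axis=1)       # collapses the axis of length 4 -> shape (5,)
total = A.sum(axis=[0, 1])     # collapsing both axes gives the same value as A.sum()

print(col_sums.shape, row_sums.shape, total)
```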

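Non-Reduction Sum

If we want the cumulative sum of the elements of A along some axis, for example axis=0 (accumulating row by row), we can call the cumsum function. This function does not reduce the dimension of the input tensor along any axis.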

x = torch.arange(4, dtype=torch.float32)
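
The snippet above stops before cumsum is actually called. A minimal sketch of the call, assuming the 5x4 float matrix A from the earlier sections (the name running is illustrative):

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)

running = A.cumsum(axis=0)   # running total down the rows; shape stays (5, 4)

print(running)
```

The first row of the result equals the first row of A, and the last row equals A.sum(axis=0), because by then every row has been accumulated.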

Exercises

Prove that the transpose of the transpose of a matrix $\mathbf{A}$ is $\mathbf{A}$: $(\mathbf{A}^\top)^\top = \mathbf{A}$.

# Exercise 1
A = torch.arange(20).reshape(5, 4)  # a 5x4 matrix, matching the output below
A.T.T == A
# Output:
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
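
The code only checks one example. A short index argument covers the general case, writing $a_{ij}$ for the entry of $\mathbf{A}$ in row $i$ and column $j$ (the same notation as in the Hadamard product formula above). Transposing swaps the two indices, so swapping twice returns the original entry:

$[(\mathbf{A}^\top)^\top]_{ij} = [\mathbf{A}^\top]_{ji} = a_{ij}$ for all $i, j$, hence $(\mathbf{A}^\top)^\top = \mathbf{A}$.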

Given two matrices $\mathbf{A}$ and $\mathbf{B}$, show that the sum of the transposes equals the transpose of the sum: $\mathbf{A}^\top + \mathbf{B}^\top = (\mathbf{A} + \mathbf{B})^\top$.

# 2
A = torch.arange(12).reshape(3, 4)
B = torch.arange(12).reshape(3,4)
A.T+ B.T == (A+B).T
# Output:
tensor([[True, True, True],
        [True, True, True],
        [True, True, True],
        [True, True, True]])

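Given any square matrix $\mathbf{A}$, is $\mathbf{A} + \mathbf{A}^\top$ always symmetric? Why?

# 3
A = torch.randn(3,3) # random values drawn from the standard normal distribution (mean 0, variance 1)
C = A + A.T
C.T == C
# The test is all True, so yes: entry (i, j) of C is a_ij + a_ji, which is the same as entry (j, i), so C is always symmetric
Output: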
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
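
The same conclusion follows from the two identities in exercises 1 and 2, without relying on a random example:

$(\mathbf{A} + \mathbf{A}^\top)^\top = \mathbf{A}^\top + (\mathbf{A}^\top)^\top = \mathbf{A}^\top + \mathbf{A} = \mathbf{A} + \mathbf{A}^\top$

The first step uses the rule for the transpose of a sum, the second uses $(\mathbf{A}^\top)^\top = \mathbf{A}$, and the last is commutativity of matrix addition.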

In this section we defined a tensor X of shape (2, 3, 4). What is the output of len(X)?

# 4
A = torch.arange(24).reshape(2,3,4)
A, len(A)

# The result is 2: len returns the length of the first axis (axis 0), which here holds two 3x4 matrices
(tensor([[[ 0,  1,  2,  3],
          [ 4,  5,  6,  7],
          [ 8,  9, 10, 11]],
 
         [[12, 13, 14, 15],
          [16, 17, 18, 19],
          [20, 21, 22, 23]]]),
 2)

For a tensor X of arbitrary shape, does len(X) always correspond to the length of a particular axis of X? Which axis is it?

# 5
A = torch.arange(48).reshape(4,3,4)
A,len(A)
# It always returns the length of the first axis (axis 0)
(tensor([[[ 0,  1,  2,  3],
          [ 4,  5,  6,  7],
          [ 8,  9, 10, 11]],
 
         [[12, 13, 14, 15],
          [16, 17, 18, 19],
          [20, 21, 22, 23]],
 
         [[24, 25, 26, 27],
          [28, 29, 30, 31],
          [32, 33, 34, 35]],
 
         [[36, 37, 38, 39],
          [40, 41, 42, 43],
          [44, 45, 46, 47]]]),
 4)

Run A / A.sum(axis=1) and see what happens. Can you explain why?

# 6
A = torch.arange(12).reshape(3,4)
A / A.sum(axis=1)
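# The shapes do not match: A has shape (3, 4) but A.sum(axis=1) has shape (3,), and broadcasting aligns that 3 with A's last axis of length 4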
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 1
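
A common way to make this division work is to keep the summed axis with length 1, so the row sums have shape (3, 1) and broadcast across each row. This is a sketch of one possible fix, not part of the original exercise; the names row_sums and normalized are illustrative.

```python
import torch

A = torch.arange(12).reshape(3, 4)

row_sums = A.sum(axis=1, keepdims=True)   # shape (3, 1) instead of (3,)
normalized = A / row_sums                 # (3, 4) / (3, 1) broadcasts row by row

print(normalized.sum(axis=1))
```

After the division each row of normalized sums to 1 (up to floating-point rounding).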

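When you travel between two points in Manhattan, what distance do you need to cover in terms of the coordinates, that is, in terms of avenues and streets? Can you travel diagonally?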

# 7
# No. The shortest total distance along the grid is the Manhattan (L1) distance: d = |x1 - x2| + |y1 - y2|.
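
The formula can be evaluated with tensors as well. In this sketch the two points p and q are made-up grid coordinates, and the Euclidean distance is computed only for comparison with the diagonal path that the street grid does not allow.

```python
import torch

# Two hypothetical street corners given as (avenue, street) grid coordinates.
p = torch.tensor([1.0, 2.0])
q = torch.tensor([4.0, 6.0])

manhattan = torch.abs(p - q).sum()              # |1 - 4| + |2 - 6| = 7
euclidean = torch.sqrt(((p - q) ** 2).sum())    # sqrt(3^2 + 4^2) = 5

print(manhattan, euclidean)
```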

Consider a tensor of shape (2, 3, 4). What are the shapes of the outputs when summing along axes 0, 1 and 2?

# 8
C = torch.arange(24).reshape(2,3,4)
C_axis_0 = C.sum(axis=0)
C_axis_1 = C.sum(axis=1)
C_axis_2 = C.sum(axis=2)
C,C.shape,C_axis_0.shape,C_axis_1.shape,C_axis_2.shape
# The output is as follows
(tensor([[[ 0,  1,  2,  3],
          [ 4,  5,  6,  7],
          [ 8,  9, 10, 11]],
 
         [[12, 13, 14, 15],
          [16, 17, 18, 19],
          [20, 21, 22, 23]]]),
 torch.Size([2, 3, 4]),
 torch.Size([3, 4]),
 torch.Size([2, 4]),
 torch.Size([2, 3]))

Feed a tensor with 3 or more axes to the linalg.norm function and observe its output. What does this function compute for tensors of arbitrary shape?

# 9
import numpy as np
import math
A = torch.arange(12).reshape(2,2,3)
def sum_power(n):
    sum = 0
    for i in range(n):
        sum += pow(i,2)
    return sum
b = math.sqrt(sum_power(12))
a=np.linalg.norm(A)
a, b, a == b
# The result is the L2 norm: the square root of the sum of the squares of all elements
(22.494443758403985, 22.494443758403985, True)

For a detailed introduction to the np.linalg.norm() function, see other blog posts.
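
The exercise can also be checked with PyTorch's own torch.linalg.norm instead of NumPy. This is a sketch under the assumption that the tensor is first converted to a floating-point dtype, which torch.linalg.norm requires; the names norm and manual are illustrative.

```python
import torch

A = torch.arange(12, dtype=torch.float32).reshape(2, 2, 3)

norm = torch.linalg.norm(A)             # no ord or dim given: the tensor is treated as one flat vector
manual = torch.sqrt((A ** 2).sum())     # square root of the sum of squares

print(norm, manual)
```

Both values agree with the 22.49... obtained above, so for a tensor of any shape the default call returns the L2 norm of all its elements.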

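ps

Writing notes takes too much time, and I have recently gotten busy with projects again, so I probably will not update these notes from now on (we will see).
Keep working hard; let's encourage each other.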
