目录
7.9 张量运算
1 Containers
1.1 Module
- 概念
所有神经网络模块的基类,自己定义的模型也成为这个类的子类。
- 用法
import torch.nn as nn
import torch.nn.functional as F
class Model(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(1, 20, 5)
self.conv2 = nn.Conv2d(20, 20, 5)
def forward(self, x):
x = F.relu(self.conv1(x))
return F.relu(self.conv2(x))
在对子类赋值之前,必须对父类进行__init__()调用。
1.2 Sequential
- 概念
顺序容器。模块将按照它们被传递到构造函数的顺序添加到其中。与手动调用一系列模块相比,Sequential提供的价值在于它允许将整个容器视为单个模块。
A ModuleList
is exactly what it sounds like–a list for storing Module
s! On the other hand, the layers in a Sequential
are connected in a cascading way.
- 用法
# Using Sequential to create a small model. When `model` is run,
# input will first be passed to `Conv2d(1,20,5)`. The output of
# `Conv2d(1,20,5)` will be used as the input to the first
# `ReLU`; the output of the first `ReLU` will become the input
# for `Conv2d(20,64,5)`. Finally, the output of
# `Conv2d(20,64,5)` will be used as input to the second `ReLU`
model = nn.Sequential(
nn.Conv2d(1,20,5),
nn.ReLU(),
nn.Conv2d(20,64,5),
nn.ReLU()
)
# Using Sequential with OrderedDict. This is functionally the
# same as the above code
model = nn.Sequential(OrderedDict([
('conv1', nn.Conv2d(1,20,5)),
('relu1', nn.ReLU()),
('conv2', nn.Conv2d(20,64,5)),
('relu2', nn.ReLU())
]))
1.3 Modulelist
- 概念
可以像常规的Python列表一样对ModuleList进行索引,但它包含的模块已正确注册,并且将被所有的模块方法看到。
- 用法
class MyModule(nn.Module):
def __init__(self):
super(MyModule, self).__init__()
self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])
def forward(self, x):
# ModuleList can act as an iterable, or be indexed using ints
for i, l in enumerate(self.linears):
x = self.linears[i // 2](x) + l(x)
return x
2. ConvLayer
- in_channels(
int
) – 输入信号的通道数 - out_channels(
int
) – 卷积产生的通道数。有多少个out_channels,就需要多少个1维卷积 - kernel_size(
int
ortuple
) - 卷积核的尺寸,卷积核的大小为(k,*),第二个维度*是由in_channels来决定的,所以实际上卷积大小为kernel_size*in_channels - stride(
int
ortuple
,optional
) - 卷积步长,可选,默认为1 - padding(
int
ortuple
,optional
)- 输入的每一条边补充0的层数,可选,默认为0 - padding_mode(
string
,optional
) – 进行padding的模式,有'zeros'
,'reflect'
,'replicate'
,'circular'
. 默认的模式为'zeros'
- dilation(
int
ortuple
,optional
)- 卷积核元素之间的间距,可选,默认为1 - groups(
int
,optional
) – 从输入通道到输出通道的阻塞连接数,可选,默认为1 - bias(
bool
,optional
) - 如果bias=True
,添加偏置,可选,默认为True
2.1 Conv1d
- 概念:一维卷积一般在文本中用到的比较多,进行文本的分类
- 用法
torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
2.2 Conv2d
- 概念:进行二维的卷积,一般在图像处理用的十分广泛。
- 用法
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
2.3 Conv3d
- 概念:在三维的卷积运算,例如点云。
- 用法
torch.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
3. Pooling Layers
kernel_size – 池化窗口大小
stride – 步长. Default value is kernel_size
padding – padding的值,默认就是不padding
dilation – 控制扩张的参数
return_indices – if True, will return the max indices along with the outputs. Useful for torch.nn.MaxUnpool1d later
ceil_mode – when True, 会用向上取整而不是向下取整来计算output的shape
3.1 最大池化
1. nn.MaxPool1d
- 概念:一维最大池化,输入size为(N,C,L),在L维进行池化
- 用法
torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
m = nn.MaxPool1d(3, stride=2)
input = torch.randn(20, 16, 50)
output = m(input)
2. nn.MaxPool2d
- 概念:二维最大池化,输入size为(N,C,H,W),在(kH,kW)进行最大池化
- 用法
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.MaxPool2d((3, 2), stride=(2, 1))
>>> input = torch.randn(20, 16, 50, 32)
>>> output = m(input)
3. nn.MaxPool3d
- 概念:三维最大池化,输入size为 (N, C, D, H, W),在 (kD, kH, kW)进行最大池化。
- 用法
torch.nn.MaxPool3d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool3d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.MaxPool3d((3, 2, 2), stride=(2, 1, 2))
>>> input = torch.randn(20, 16, 50,44, 31)
>>> output = m(input)
3.2 平均池化
平均池化就是取平均值,类似最大池化。
- 1d
torch.nn.AvgPool1d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True)
>>> # pool with window of size=3, stride=2
>>> m = nn.AvgPool1d(3, stride=2)
>>> m(torch.tensor([[[1.,2,3,4,5,6,7]]]))
tensor([[[2., 4., 6.]]])
- 2d
torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.AvgPool2d((3, 2), stride=(2, 1))
>>> input = torch.randn(20, 16, 50, 32)
>>> output = m(input)
- 3d
torch.nn.AvgPool3d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool3d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.AvgPool3d((3, 2, 2), stride=(2, 1, 2))
>>> input = torch.randn(20, 16, 50,44, 31)
>>> output = m(input)
3.3 Adaptive
- 概念:这种层和一般的池化层一样,都没有参数,都是对特征进行降采样,自适应的意思是在使用池化层时不需要指定核的大小步长等参数,只需要告诉池化层我们所需要的输出大小即可,池化层会自动计算核的大小以及步长,因此称为自适应。
- 用法
import torch.nn as nn
import torch
x = torch.rand(size=(1, 1, 5)) # 池化层在最后一个维度进行池化
print(x)
>>> tensor([[[0.6633, 0.0397, 0.5412, 0.0132, 0.7847]]])
out = nn.AdaptiveMaxPool1d(output_size=1)(x) # 最后一个维度输出大小为1
print(out)
>>> tensor([[[0.7847]]])
out = nn.AdaptiveMaxPool1d(output_size=2)(x) # 最后一个维度输出大小为2
print(out)
>>> tensor([[[0.6633, 0.7847]]])
4. Padding
4.1 ZEROPAD2D
- 概念:用零填充输入张量边界。
- 用法:
torch.nn.ZeroPad2d(padding)
>>> m = nn.ZeroPad2d(2)
>>> input = torch.randn(1, 1, 3, 3)
>>> input
tensor([[[[-0.1678, -0.4418, 1.9466],
[ 0.9604, -0.4219, -0.5241],
[-0.9162, -0.5436, -0.6446]]]])
>>> m(input)
tensor([[[[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, -0.1678, -0.4418, 1.9466, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.9604, -0.4219, -0.5241, 0.0000, 0.0000],
[ 0.0000, 0.0000, -0.9162, -0.5436, -0.6446, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]]]])
>>> # using different paddings for different sides
>>> m = nn.ZeroPad2d((1, 1, 2, 0))
>>> m(input)
tensor([[[[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, -0.1678, -0.4418, 1.9466, 0.0000],
[ 0.0000, 0.9604, -0.4219, -0.5241, 0.0000],
[ 0.0000, -0.9162, -0.5436, -0.6446, 0.0000]]]])
4.2 ReplicationPad1d
- 概念:使用输入边界的复制来填充输入张量。
- 用法:
torch.nn.functional.pad()
>>> m = nn.ReplicationPad1d(2)
>>> input = torch.arange(8, dtype=torch.float).reshape(1, 2, 4)
>>> input
tensor([[[0., 1., 2., 3.],
[4., 5., 6., 7.]]])
>>> m(input)
tensor([[[0., 0., 0., 1., 2., 3., 3., 3.],
[4., 4., 4., 5., 6., 7., 7., 7.]]])
>>> # using different paddings for different sides
>>> m = nn.ReplicationPad1d((3, 1))
>>> m(input)
tensor([[[0., 0., 0., 0., 1., 2., 3., 3.],
[4., 4., 4., 4., 5., 6., 7., 7.]]])
4.3 ConstantPad1d
- 概念:使用常量值填充输入张量边界。
- 用法:
torch.nn.ConstantPad1d(padding, value)
>>> m = nn.ConstantPad1d(2, 3.5)
>>> input = torch.randn(1, 2, 4)
>>> input
tensor([[[-1.0491, -0.7152, -0.0749, 0.8530],
[-1.3287, 1.8966, 0.1466, -0.2771]]])
>>> m(input)
tensor([[[ 3.5000, 3.5000, -1.0491, -0.7152, -0.0749, 0.8530, 3.5000,
3.5000],
[ 3.5000, 3.5000, -1.3287, 1.8966, 0.1466, -0.2771, 3.5000,
3.5000]]])
>>> m = nn.ConstantPad1d(2, 3.5)
>>> input = torch.randn(1, 2, 3)
>>> input
tensor([[[ 1.6616, 1.4523, -1.1255],
[-3.6372, 0.1182, -1.8652]]])
>>> m(input)
tensor([[[ 3.5000, 3.5000, 1.6616, 1.4523, -1.1255, 3.5000, 3.5000],
[ 3.5000, 3.5000, -3.6372, 0.1182, -1.8652, 3.5000, 3.5000]]])
>>> # using different paddings for different sides
>>> m = nn.ConstantPad1d((3, 1), 3.5)
>>> m(input)
tensor([[[ 3.5000, 3.5000, 3.5000, 1.6616, 1.4523, -1.1255, 3.5000],
[ 3.5000, 3.5000, 3.5000, -3.6372, 0.1182, -1.8652, 3.5000]]])
5. Activations Layer
5.1 ELU
- 概念:
- 用法:
>>> m = nn.ELU()
>>> input = torch.randn(2)
>>> output = m(input)
5.2 Sigmoid
- 概念:
- 用法:
>>> m = nn.Sigmoid()
>>> input = torch.randn(2)
>>> output = m(input)
5.3 Tanh
概念:
用法:
>>> m = nn.Tanh()
>>> input = torch.randn(2)
>>> output = m(input)
6. Other Layer
6.1 Linear
概念:
用法:
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
>>> m = nn.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])
6.2 Dropout Layers
- 概念:一定概率参数随机清0,为防止过拟合
- 用法
torch.nn.Dropout2d(p=0.5, inplace=False)
>>> m = nn.Dropout2d(p=0.2)
>>> input = torch.randn(20, 16, 32, 32)
>>> output = m(input)
6.3 FeatureAlphaDropout
- 概念:
随机屏蔽掉整个通道(一个通道是一个特征图,例如批量输入中第 ii 个样本的第 jj 个通道是一个张量 \text{input}[i, j]input[i,j])输入张量)。不像在常规 Dropout 中那样将激活值设置为零,而是将激活值设置为 SELU 激活函数的负饱和值。
- 用法:
torch.nn.FeatureAlphaDropout(p=0.5, inplace=False)
>>> m = nn.FeatureAlphaDropout(p=0.2)
>>> input = torch.randn(20, 16, 4, 32, 32)
>>> output = m(input)
6.4 CosineSimilarity
- 概念:余弦相似度
- 用法:
torch.nn.CosineSimilarity(dim=1, eps=1e-08)
>>> input1 = torch.randn(100, 128)
>>> input2 = torch.randn(100, 128)
>>> cos = nn.CosineSimilarity(dim=1, eps=1e-6)
>>> output = cos(input1, input2)
6.5 Upsample
- 概念:上采样
- 用法:
torch.nn.Upsample(size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)
>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input
tensor([[[[1., 2.],
[3., 4.]]]])
>>> m = nn.Upsample(scale_factor=2, mode='nearest')
>>> m(input)
tensor([[[[1., 1., 2., 2.],
[1., 1., 2., 2.],
[3., 3., 4., 4.],
[3., 3., 4., 4.]]]])
>>> m = nn.Upsample(scale_factor=2, mode='bilinear') # align_corners=False
>>> m(input)
tensor([[[[1.0000, 1.2500, 1.7500, 2.0000],
[1.5000, 1.7500, 2.2500, 2.5000],
[2.5000, 2.7500, 3.2500, 3.5000],
[3.0000, 3.2500, 3.7500, 4.0000]]]])
>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> m(input)
tensor([[[[1.0000, 1.3333, 1.6667, 2.0000],
[1.6667, 2.0000, 2.3333, 2.6667],
[2.3333, 2.6667, 3.0000, 3.3333],
[3.0000, 3.3333, 3.6667, 4.0000]]]])
7. Torch.Tensor
7.1 Torch.max
- 概念:返回n维的最大值
- 用法:
torch.max(input)
out (tuple, optional) – the result tuple of two output tensors (max, max_indices)
>>> a = torch.randn(4, 4)
>>> a
tensor([[-1.2360, -0.2942, -0.1222, 0.8475],
[ 1.1949, -1.1127, -2.2379, -0.6702],
[ 1.5717, -0.9207, 0.1297, -1.8768],
[-0.6172, 1.0036, -0.6060, -0.2432]])
>>> torch.max(a, 1)
torch.return_types.max(values=tensor([0.8475, 1.1949, 1.5717, 1.0036]), indices=tensor([3, 0, 0, 1]))
7.2 Torch.Maximum
- 概念:计算 input 和 other 的元素最大值。
- 用法:
torch.maximum(input, other, *, out=None)
>>> a = torch.tensor((1, 2, -1))
>>> b = torch.tensor((3, 0, 4))
>>> torch.maximum(a, b)
tensor([3, 2, 4])
7.3 Torch.Maximum
概念:计算 input 和 other 的元素最小值。
用法:
torch.minimum(input, other, *, out=None)
>>> a = torch.tensor((1, 2, -1))
>>> b = torch.tensor((3, 0, 4))
>>> torch.minimum(a, b)
tensor([1, 0, -1])
7.4 Torch.Mean
概念:返回所有量的平均值
用法:
torch.mean(input, *, dtype=None)
>>> a = torch.randn(4, 4)
>>> a
tensor([[-0.3841, 0.6320, 0.4254, -0.7384],
[-0.9644, 1.0131, -0.6549, -1.4279],
[-0.2951, -1.3350, -0.7694, 0.5600],
[ 1.0842, -0.9580, 0.3623, 0.2343]])
>>> torch.mean(a, 1)
tensor([-0.0163, -0.5085, -0.4599, 0.1807])
>>> torch.mean(a, 1, True)
tensor([[-0.0163],
[-0.5085],
[-0.4599],
[ 0.1807]])
7.5 Torch.Median
- 概念:返回中位数
- 用法:
torch.median(input)
>>> a = torch.randn(4, 5)
>>> a
tensor([[ 0.2505, -0.3982, -0.9948, 0.3518, -1.3131],
[ 0.3180, -0.6993, 1.0436, 0.0438, 0.2270],
[-0.2751, 0.7303, 0.2192, 0.3321, 0.2488],
[ 1.0778, -1.9510, 0.7048, 0.4742, -0.7125]])
>>> torch.median(a, 1)
torch.return_types.median(values=tensor([-0.3982, 0.2270, 0.2488, 0.4742]), indices=tensor([1, 4, 4, 3]))
7.6 Torch.min
- 概念:返回最小值
- 用法:
torch.min(input)
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 0.6750, 1.0857, 1.7197]])
>>> torch.min(a)
tensor(0.6750)
7.7 Torch.size
概念:查看尺寸
用法:
Tensor.size(dim=None)
>>> t = torch.empty(3, 4, 5)
>>> t.size()
torch.Size([3, 4, 5])
>>> t.size(dim=1)
4
7.8 创建张量
- tensor.new_empty
Tensor.new_empty(size, dtype=None, device=None, requires_grad=False)
>>> tensor = torch.ones(())
>>> tensor.new_empty((2, 3))
tensor([[ 5.8182e-18, 4.5765e-41, -1.0545e+30],
[ 3.0949e-41, 4.4842e-44, 0.0000e+00]])
- tensor.new_full
Tensor.new_full(size, fill_value, dtype=None, device=None, requires_grad=False)
>>> tensor = torch.ones((2,), dtype=torch.float64)
>>> tensor.new_full((3, 4), 3.141592)
tensor([[ 3.1416, 3.1416, 3.1416, 3.1416],
[ 3.1416, 3.1416, 3.1416, 3.1416],
[ 3.1416, 3.1416, 3.1416, 3.1416]], dtype=torch.float64)
- tensor.new_ones
Tensor.new_ones(size, dtype=None, device=None, requires_grad=False)
>>> tensor = torch.tensor((), dtype=torch.int32)
>>> tensor.new_ones((2, 3))
tensor([[ 1, 1, 1],
[ 1, 1, 1]], dtype=torch.int32)
- tensor.new_zeros
Tensor.new_zeros(size, dtype=None, device=None, requires_grad=False)
>>> tensor = torch.tensor((), dtype=torch.float64)
>>> tensor.new_zeros((2, 3))
tensor([[ 0., 0., 0.],
[ 0., 0., 0.]], dtype=torch.float64)
7.9 张量运算
- 加减
torch.add(input, other, *, alpha=1, out=None)
torch.sub(input=y, alpha=1, other=x)
>>> a = torch.randn(4)
>>> a
tensor([ 0.0202, 1.0985, 1.3506, -0.6056])
>>> torch.add(a, 20)
tensor([ 20.0202, 21.0985, 21.3506, 19.3944])
>>> b = torch.randn(4)
>>> b
tensor([-0.9732, -0.3497, 0.6245, 0.4022])
>>> c = torch.randn(4, 1)
>>> c
tensor([[ 0.3743],
[-1.7724],
[-0.5811],
[-0.8017]])
>>> torch.add(b, c, alpha=10)
tensor([[ 2.7695, 3.3930, 4.3672, 4.1450],
[-18.6971, -18.0736, -17.0994, -17.3216],
[ -6.7845, -6.1610, -5.1868, -5.4090],
[ -8.9902, -8.3667, -7.3925, -7.6147]])
- 乘除
torch.mul(x, y)
torch.div(x, y)
- 绝对值
torch.abs(input, *, out=None)
>>> torch.abs(torch.tensor([-1, -2, 3]))
tensor([ 1, 2, 3])