Resources
summary: reference code
https://shenxiaohai.me/2018/10/23/pytorch-tutorial-TensorBoard/
Reference docs
https://pytorch.org/docs/stable/
summary: source code examples (recommended reading)
https://www.cnblogs.com/kk17/p/10077335.html
model save/load
https://blog.csdn.net/sinat_40624829/article/details/96597078
Read carefully
https://dingguanglei.com/pytorch-mo-xing-bao-cun-he-du-qu/
data loader
https://blog.csdn.net/sinat_42239797/article/details/90641659
Distributed training: this one article is enough!
https://mp.weixin.qq.com/s/hVcgcMYf9AaCHJ_2F-VyZQ
optimizer
Adam
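class torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
- params (iterable) – an iterable of parameters to optimize, or dicts defining parameter groups
- lr (float, optional) – learning rate (default: 1e-3)
- betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
- eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
- weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)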
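A minimal sketch of how these arguments are passed in a single training step. The toy nn.Linear model, the random data, and the weight_decay value of 1e-4 are illustrative assumptions, not part of these notes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model and data, used only to show the optimizer call.
model = nn.Linear(4, 1)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

# Defaults from the signature above, except for an illustrative weight_decay.
optimizer = torch.optim.Adam(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=1e-4
)

before = model.weight.detach().clone()

optimizer.zero_grad()                                   # clear old gradients
loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()                                         # compute gradients
optimizer.step()                                        # apply one Adam update

weights_changed = not torch.equal(before, model.weight.detach())
print("weights changed:", weights_changed)
```

The hyperparameters can be read back from optimizer.param_groups, which is also where per-group settings would go if params were given as a list of dicts.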
Activation
ReLU
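Difference between nn.ReLU() and nn.ReLU(inplace=True):
inplace=True does not change the computed result. In-place computation saves (GPU) memory and avoids the cost of repeatedly allocating and freeing it. It does overwrite the input tensor, though, so use it only where nothing else still needs the original values.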
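The following sketch compares the two variants on a small tensor of arbitrary example values. It checks that the outputs match and that only the in-place version overwrites its input.

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.0, 0.0, 2.0])

# Out-of-place: the input is left untouched and a new tensor is returned.
x1 = x.clone()
y1 = nn.ReLU()(x1)

# In-place: the result is written into the input's own storage.
x2 = x.clone()
y2 = nn.ReLU(inplace=True)(x2)

print(torch.equal(y1, y2))               # both variants give the same values
print(x1)                                # input of the out-of-place call
print(x2)                                # input of the in-place call
print(y2.data_ptr() == x2.data_ptr())    # does the output share memory with the input?
```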
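Using nn
nn.ModuleList
It behaves like Python's built-in list and supports the usual operations such as extend and append. Unlike a plain list, however, modules added to an nn.ModuleList are registered with the enclosing network, and their parameters are automatically added to the network's parameters.
Fully connected layers added through a plain Python list, and their parameters, are not registered with the network. forward can still use them to compute an output, but if a network defined this way (net2) is trained, those layers' parameters are not part of the network and will never be updated.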
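The difference in registration can be checked by counting parameters. In this sketch the two class names and the layer sizes are hypothetical, chosen only for illustration.

```python
import torch.nn as nn

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers held in an nn.ModuleList are registered as submodules.
        self.layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(2)])

class WithPythonList(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers held in a plain list are invisible to parameters().
        self.layers = [nn.Linear(4, 4) for _ in range(2)]

n_registered = len(list(WithModuleList().parameters()))
n_plain = len(list(WithPythonList().parameters()))
print(n_registered, n_plain)
```

An optimizer built from WithPythonList().parameters() would receive an empty parameter list, which is why such layers are never updated.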
Using nn.Module
A basic block can be built with nn.Module as follows:
import torch.nn as nn

#-------------------------------------------------#
#   BasicConv: Conv2d + BatchNorm2d + Activation
#   Mish is assumed to be defined elsewhere
#   (PyTorch >= 1.9 also provides nn.Mish).
#-------------------------------------------------#
class BasicConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, activation, bn=True, bias=False):
        super(BasicConv, self).__init__()
        pad = (kernel_size - 1) // 2  # "same" padding for odd kernel sizes
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad, bias=bias)
        if bn is True:
            self.bn = nn.BatchNorm2d(out_channels)
        else:
            self.bn = None
        act = activation.lower()
        if act == "mish":
            self.activation = Mish()
        elif act == "relu":
            self.activation = nn.ReLU(inplace=True)
        elif act == "leaky":
            self.activation = nn.LeakyReLU(0.1, inplace=True)
        elif act in ("linear", "none"):
            # "linear" means no non-linearity; nn.Linear is a layer, not an activation
            self.activation = None
        else:
            raise ValueError(f'activate error !!! {activation}, only support: mish, relu, leaky, linear, none')

    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.activation is not None:
            x = self.activation(x)
        return x
The same block can also be written with nn.ModuleList:
class BasicConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, activation, bn=True, bias=False):
        # must run before any submodule is assigned to self
        super(BasicConv, self).__init__()
        pad = (kernel_size - 1) // 2
        m = nn.ModuleList()
        m.append(nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad, bias=bias))
        if bn is True:
            m.append(nn.BatchNorm2d(out_channels))
        act = activation.lower()
        if act == "mish":
            m.append(Mish())
        elif act == "relu":
            m.append(nn.ReLU(inplace=True))
        elif act == "leaky":
            m.append(nn.LeakyReLU(0.1, inplace=True))
        elif act in ("linear", "none"):
            pass  # no activation layer
        else:
            raise ValueError(f'activate error !!! {activation}, only support: mish, relu, leaky, linear, none')
        self.m = m

    def forward(self, x):
        # apply the registered layers in order
        for layer in self.m:
            x = layer(x)
        return x
nn.Sequential (recommended)
A basic block can be built with nn.Sequential as follows:
class BasicConv(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel_size, stride, activation, bn=True, bias=False):
        pad = (kernel_size - 1) // 2
        # collect the layers in a plain list first
        layers = [nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad, bias=bias)]
        if bn is True:
            layers.append(nn.BatchNorm2d(out_channels))
        act = activation.lower()
        if act == "mish":
            layers.append(Mish())
        elif act == "relu":
            layers.append(nn.ReLU(inplace=True))
        elif act == "leaky":
            layers.append(nn.LeakyReLU(0.1, inplace=True))
        elif act in ("linear", "none"):
            pass  # no activation layer
        else:
            raise ValueError(f'activate error !!! {activation}, only support: mish, relu, leaky, linear, none')
        # nn.Sequential registers each layer and provides forward()
        super(BasicConv, self).__init__(*layers)
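Mixing nn.Module and nn.Sequential
For blocks with a bypass branch, such as a residual block, it is recommended to combine Module and Sequential when building the basic block:
class Resblock(nn.Module):
    def __init__(self, channels, hidden_channels=None):
        super(Resblock, self).__init__()
        if hidden_channels is None:
            hidden_channels = channels
        # BasicConv above requires stride and activation; stride=1 and "relu"
        # are filled in here as illustrative values (the original call omitted them).
        self.block = nn.Sequential(
            BasicConv(channels, hidden_channels, kernel_size=1, stride=1, activation="relu"),
            BasicConv(hidden_channels, channels, kernel_size=3, stride=1, activation="relu")
        )

    def forward(self, x):
        # skip connection: the block must preserve the input shape
        return x + self.block(x)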
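The skip connection only works if the main path returns a tensor of the same shape as its input. The sketch below checks that with a self-contained stand-in: conv_bn_relu and TinyResblock are hypothetical names, not the BasicConv and Resblock from these notes, and the channel count and input size are arbitrary.

```python
import torch
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, k):
    # Hypothetical stand-in for BasicConv: stride 1 and "same" padding.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride=1, padding=(k - 1) // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class TinyResblock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Main path: a Sequential keeps the straight-line part compact.
        self.block = nn.Sequential(
            conv_bn_relu(channels, channels, 1),
            conv_bn_relu(channels, channels, 3),
        )

    def forward(self, x):
        # Bypass branch: written by hand in forward().
        return x + self.block(x)

x = torch.randn(2, 8, 16, 16)
out = TinyResblock(8)(x)
print(out.shape)
```

Stride 1 and padding of (k - 1) // 2 keep the spatial size unchanged for odd kernel sizes, which is what makes the addition valid.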
Common functions
In-place functions
For example: exp_(), add_(), etc.
a = torch.tensor([2, 2])
print("a =", a)
a.add_(1)
print("a =", a)
b = a.add_(1)
print("a =", a)
print("b =", b)
d = a.add(1)
print("a =", a)
print("d =", d)
>>>
a = tensor([2, 2])
a = tensor([3, 3])
a = tensor([4, 4])
b = tensor([4, 4])
a = tensor([4, 4])
d = tensor([5, 5])
As shown, add_() modifies a, while add() leaves it unchanged.
Functions named functionname_() write their result into the same memory, overwriting the original data.
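One way to confirm that the memory is reused is to compare data pointers before and after each call. This is a small sketch using the same example values as above.

```python
import torch

a = torch.tensor([2, 2])
ptr = a.data_ptr()          # address of a's underlying storage

b = a.add_(1)               # in-place: writes into a's memory
c = a.add(1)                # out-of-place: allocates a new tensor

print(a.data_ptr() == ptr)  # a still lives at the same address
print(b.data_ptr() == ptr)  # the returned tensor shares that memory
print(c.data_ptr() == ptr)  # the out-of-place result does not
print(a.tolist(), c.tolist())
```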
Adding a dimension
Indexing with None is recommended; the reshape and unsqueeze functions also work.
a = torch.tensor([2, 2])
b = a[None,...]
c = a.reshape(1,2)
d = a.unsqueeze(0)
print(a)
print(b)
print(c)
print(d)
>>>
tensor([2, 2])
tensor([[2, 2]])
tensor([[2, 2]])
tensor([[2, 2]])
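A short sketch comparing the resulting shapes. The extra tensor e is added here only to illustrate that None can be placed at other positions too.

```python
import torch

a = torch.tensor([2, 2])

b = a[None, ...]       # indexing with None inserts a new axis at dim 0
c = a.reshape(1, 2)    # the full target shape must be spelled out
d = a.unsqueeze(0)     # inserts a size-1 axis at the given position
e = a[:, None]         # None works at other positions as well

print(b.shape, c.shape, d.shape, e.shape)
```

Indexing with None and unsqueeze do not need to know the other dimension sizes, whereas reshape does.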
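permute, reshape
torch.reshape() and Tensor.view() both change a tensor's shape and return a tensor with the new shape. view() is the older API and never copies data, so it only works when the requested shape is compatible with the tensor's memory layout. After transpose or permute the tensor is usually no longer contiguous in memory, and view() raises an error. reshape() returns a view when it can and a copy otherwise, so it works in both cases. On contiguous tensors the two give the same result, which makes reshape() the safer default.
permute reorders the dimensions of a tensor.
Arguments: a sequence of integers, each naming one of the original dimensions. A 3-D tensor, for example, has dimensions 0, 1 and 2.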
import torch
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6]]])
unpermuted = torch.tensor(a)  # np array -> torch tensor
print(unpermuted.size())  # -> torch.Size([1, 2, 3])
permuted = unpermuted.permute(2, 0, 1)
print(permuted.size())  # -> torch.Size([3, 1, 2])

a = torch.IntTensor([[[1, 2, 3], [4, 5, 6]],
                     [[7, 8, 9], [10, 11, 12]]])
a = torch.reshape(a, [4, 3])
print(a)
>>>
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]], dtype=torch.int32)
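Use reshape to change a tensor's shape outright, regrouping its elements into new dimensions. Use permute to move dimensions (for example the channel dimension) to a different position without changing the values each one holds.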
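The contiguity issue with view can be reproduced on a small tensor. This is a sketch with arbitrary example values.

```python
import torch

x = torch.arange(6).reshape(2, 3)
y = x.permute(1, 0)               # shape (3, 2): same storage, new strides

print(y.is_contiguous())          # False after permute

try:
    y.view(6)                     # view cannot express this memory layout
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)

flat = y.reshape(6)               # reshape copies when it has to
also_flat = y.contiguous().view(6)  # equivalent: make contiguous, then view
print(flat.tolist())
```

Calling contiguous() before view() is the older idiom for the same thing; reshape() folds both steps into one call.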
TensorFlow to PyTorch equivalents:
- tf.math.square -> torch.pow (with exponent 2)
- tf.reduce_sum -> torch.sum
- tf.boolean_mask -> torch.masked_select (the mask must be a bool tensor)
- tf.reduce_mean -> torch.mean
- tf.cast -> x.type() or x.to()
- tf.stack -> torch.stack()
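A brief sketch of the PyTorch side of each mapping, applied to a small example tensor. The values are arbitrary, and each TensorFlow name appears only as a comment.

```python
import torch

x = torch.tensor([[1.0, -2.0], [3.0, -4.0]])
mask = x > 0                                 # bool tensor, as masked_select requires

squared = torch.pow(x, 2)                    # tf.math.square
total = torch.sum(x)                         # tf.reduce_sum
positives = torch.masked_select(x, mask)     # tf.boolean_mask (result is 1-D)
mean = torch.mean(x)                         # tf.reduce_mean
as_int = x.to(torch.int64)                   # tf.cast
stacked = torch.stack([x, x], dim=0)         # tf.stack

print(squared, total, positives, mean, as_int, stacked.shape)
```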
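torch.range()
Includes the end value and returns float32 by default. It is deprecated in favor of torch.arange().
>>> y = torch.range(1, 4)
>>> y
tensor([1., 2., 3., 4.])
>>> y.dtype
torch.float32
torch.arange()
Excludes the end value, like Python's range. Integer arguments give an int64 tensor.
>>> y = torch.arange(1, 4)
>>> y
tensor([1, 2, 3])
>>> y.dtype
torch.int64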
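If the end value is needed, torch.arange() can be given a stop that is one step larger, and a dtype can be requested explicitly. A small sketch with arbitrary bounds:

```python
import torch

half_open = torch.arange(1, 4)                      # stop is excluded
inclusive = torch.arange(1, 4 + 1)                  # one extra step includes the endpoint
as_float = torch.arange(1, 5, dtype=torch.float32)  # explicit dtype

print(half_open, inclusive, as_float)
```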