Simple Multi-Class Classification Implementation
The network structure is as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
w1, b1 = torch.randn(200, 784, requires_grad=True), \
         torch.zeros(200, requires_grad=True)
w2, b2 = torch.randn(200, 200, requires_grad=True), \
         torch.zeros(200, requires_grad=True)
w3, b3 = torch.randn(10, 200, requires_grad=True), \
         torch.zeros(10, requires_grad=True)
When setting the dimensions, the first dimension is usually the output dimension and the second is the input dimension.
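As a quick check of this convention, with a dummy batch of four flattened images (the batch size 4 is an arbitrary assumption), x @ w1.t() maps (4, 784) to (4, 200):

x = torch.randn(4, 784)          # dummy batch of flattened 28x28 images
h = x @ w1.t() + b1              # (4, 784) @ (784, 200) -> (4, 200)
print(h.shape)                   # torch.Size([4, 200])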
def forward(x):
    x = x @ w1.t() + b1
    x = F.relu(x)
    x = x @ w2.t() + b2
    x = F.relu(x)
    x = x @ w3.t() + b3
    x = F.relu(x)  # these are logits, not passed through softmax; this relu could also be omitted
    return x
optimizer = optim.SGD([w1, b1, w2, b2, w3, b3], lr=learning_rate)
criteon = nn.CrossEntropyLoss()
for epoch in range(epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.view(-1, 28*28)

        logits = forward(data)           # no softmax needed here
        loss = criteon(logits, target)

        optimizer.zero_grad()
        loss.backward()
        # print(w1.grad.norm(), w2.grad.norm())
        optimizer.step()
Set up the optimizer with the parameters it should optimize and the learning rate.
Set up a CrossEntropyLoss criterion.
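The loop above assumes learning_rate, epochs and train_loader are defined elsewhere; a minimal sketch of those definitions (the values, batch size and data path are assumptions, not from the original):

batch_size = 200                 # assumed value
learning_rate = 0.01             # assumed value
epochs = 10                      # assumed value

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=batch_size, shuffle=True)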
Note that the default normal-distribution initialization can cause vanishing gradients, so the weights W need a dedicated initialization:
torch.nn.init.kaiming_normal_(w1)
torch.nn.init.kaiming_normal_(w2)
torch.nn.init.kaiming_normal_(w3)
Implementation using the nn.Linear API
nn.Linear(in_features, out_features)
layer1 = nn.Linear(784, 200)
x = layer1(x)
x = F.relu(x, inplace=True)
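A minimal sketch of the same three-layer network built entirely from nn.Linear layers (the layer names here are illustrative):

layer1 = nn.Linear(784, 200)
layer2 = nn.Linear(200, 200)
layer3 = nn.Linear(200, 10)

def forward(x):
    x = F.relu(layer1(x), inplace=True)
    x = F.relu(layer2(x), inplace=True)
    x = layer3(x)                # logits; CrossEntropyLoss applies softmax internally
    return x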
Wrapping the network into a class
Only forward() needs to be implemented; backward is generated automatically by autograd.
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
Inherit from nn.Module.
Implement __init__(self):
    def __init__(self):
        super(MLP, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 10),
            nn.ReLU(inplace=True),
        )
Implement forward(self, x):
    def forward(self, x):
        x = self.model(x)
        return x
Note that in PyTorch:
Class-style APIs, such as nn.Linear(), generally start with an uppercase letter and must be instantiated before use.
Function-style APIs, such as F.relu(), generally start with a lowercase letter.
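For example, the same ReLU can be written in either style (a minimal illustration):

x = torch.randn(4, 200)
relu = nn.ReLU(inplace=True)     # class-style: instantiate first, then call
x = relu(x)
x = F.relu(x, inplace=True)      # function-style: call directly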
Optimizer setup
net = MLP()
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
criteon = nn.CrossEntropyLoss()
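With the module-based network, the training loop from the first version stays the same except that forward(data) becomes net(data); a minimal sketch:

for epoch in range(epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.view(-1, 28*28)
        logits = net(data)               # invokes MLP.forward
        loss = criteon(logits, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()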
GPU acceleration
device = torch.device('cuda:0')
net = MLP().to(device)
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
criteon = nn.CrossEntropyLoss().to(device)
data, target = data.to(device), target.cuda()  # the latter is the older API
.to(device) places a tensor or module on the given device (for modules this happens in place; for tensors a new tensor on that device is returned).
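A minimal sketch of the device handling inside the loop (the torch.cuda.is_available() fallback is an added safeguard, not part of the original):

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
net = MLP().to(device)
criteon = nn.CrossEntropyLoss().to(device)

for data, target in train_loader:
    data = data.view(-1, 28*28).to(device)   # each batch must live on the same device as the model
    target = target.to(device)
    logits = net(data)
    loss = criteon(logits, target)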
数据可视化
TensorboardX
from tensorboardX import SummaryWriter
writer = SummaryWriter()
writer.add_scalar()  # see the usage sketch below
Note that it only accepts NumPy data, so Tensor data must be converted first:
a.clone().cpu().data.numpy()
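A minimal sketch of logging a scalar with add_scalar (the tag name and the dummy loss values are illustrative):

from tensorboardX import SummaryWriter

writer = SummaryWriter()                     # log directory defaults to ./runs
for step in range(100):
    loss_value = 1.0 / (step + 1)            # dummy scalar standing in for loss.item()
    writer.add_scalar('train/loss', loss_value, step)   # add_scalar(tag, value, global_step)
writer.close()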
Visdom
$ pip install visdom
Installing from a local copy of the source is recommended:
Download address
Then, in a terminal, change into the extracted directory and run:
$ pip install -e .
First start a listening server process; make sure it is running before the training program starts:
$ python -m visdom.server
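Once the server is listening, a minimal sketch of streaming a loss curve to it (the window name and the dummy values are illustrative):

from visdom import Visdom

viz = Visdom()                               # connects to the server started above (default http://localhost:8097)
viz.line([0.], [0.], win='train_loss', opts=dict(title='train loss'))
for step in range(100):
    loss_value = 1.0 / (step + 1)            # dummy scalar standing in for loss.item()
    viz.line([loss_value], [step], win='train_loss', update='append')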