Biological Macromolecule Platform (4)
2021SC@SDUSC
0 Preface
This week's main tasks were to finish a code walkthrough of the RNN examples and a walkthrough of the ResNet network code.
1 RNN
1.1 Preparing the data
- Download the data and unzip it. It contains plain-text files data/names/[Language].txt with one name per line. We split each file into an array of lines, convert each name from Unicode to ASCII, and end up with a dictionary of the form {language: [names ...]}.
from __future__ import unicode_literals, print_function, division
from io import open
import glob
import os
import unicodedata
import string
all_letters = string.ascii_letters + " .,;'-"
n_letters = len(all_letters) + 1 # Plus EOS marker
def findFiles(path): return glob.glob(path)
def unicodeToAscii(s):
return ''.join(
c for c in unicodedata.normalize('NFD', s)
if unicodedata.category(c) != 'Mn'
and c in all_letters
)
def readLines(filename):
lines = open(filename, encoding='utf-8').read().strip().split('\n')
return [unicodeToAscii(line) for line in lines]
category_lines = {}
all_categories = []
for filename in findFiles('data/names/*.txt'):
category = os.path.splitext(os.path.basename(filename))[0]
all_categories.append(category)
lines = readLines(filename)
category_lines[category] = lines
n_categories = len(all_categories)
if n_categories == 0:
raise RuntimeError('Data not found. Make sure that you downloaded data '
'from https://download.pytorch.org/tutorial/data.zip and extract it to '
'the current directory.')
print('# categories:', n_categories, all_categories)
print(unicodeToAscii("O'Néàl"))
1.2 Building the network
- This network takes an extra argument, the category tensor, which is concatenated with the other inputs. The category, like the letters, is encoded as a one-hot vector. During sampling, the most likely output letter is fed back in as the input for the next step.
- A second linear layer o2o, applied after combining the hidden state and output, gives the model more capacity. A dropout layer, which randomly zeroes part of its input with a given probability (here 0.1) to help prevent overfitting, is placed near the end of the network to deliberately add some noise and increase sampling variety.
import torch
import torch.nn as nn
class RNN(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(RNN, self).__init__()
self.hidden_size = hidden_size
self.i2h = nn.Linear(n_categories + input_size + hidden_size, hidden_size)
self.i2o = nn.Linear(n_categories + input_size + hidden_size, output_size)
self.o2o = nn.Linear(hidden_size + output_size, output_size)
self.dropout = nn.Dropout(0.1)
self.softmax = nn.LogSoftmax(dim=1)
def forward(self, category, input, hidden):
input_combined = torch.cat((category, input, hidden), 1)
hidden = self.i2h(input_combined)
output = self.i2o(input_combined)
output_combined = torch.cat((hidden, output), 1)
output = self.o2o(output_combined)
output = self.dropout(output)
output = self.softmax(output)
return output, hidden
def initHidden(self):
return torch.zeros(1, self.hidden_size)
1.3 Training
1.3.1 Preparing for training
- First, build a helper that returns a random (category, line) training pair: pick a random category, then a random line from that category.
import random
def randomChoice(l):
return l[random.randint(0, len(l) - 1)]
def randomTrainingPair():
category = randomChoice(all_categories)
line = randomChoice(category_lines[category])
return category, line
- At each timestep (that is, for each letter in a training name) the network's inputs will be (category, current letter, hidden state) and its outputs will be (next letter, next hidden state). Since we use the current letter to predict the next one at every timestep, the training letter pairs come from consecutive letters within a single name. For each training example we therefore need the category, the input letters, and the output/target letters. The category tensor is a one-hot tensor of size <1 x n_categories>; during training we feed it to the network at every timestep.
- This is a design choice; the category could instead have been folded into the initial hidden state, or handled with some other strategy.
def categoryTensor(category):
li = all_categories.index(category)
tensor = torch.zeros(1, n_categories)
tensor[0][li] = 1
return tensor
def inputTensor(line):
tensor = torch.zeros(len(line), 1, n_letters)
for li in range(len(line)):
letter = line[li]
tensor[li][0][all_letters.find(letter)] = 1
return tensor
def targetTensor(line):
letter_indexes = [all_letters.find(line[li]) for li in range(1, len(line))]
letter_indexes.append(n_letters - 1) # EOS
return torch.LongTensor(letter_indexes)
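- The training loop in 1.3.2 calls a randomTrainingExample() convenience function that bundles these pieces together; in the tutorial it looks like this:
def randomTrainingExample():
    # Make category, input, and target tensors from a random (category, line) pair
    category, line = randomTrainingPair()
    category_tensor = categoryTensor(category)
    input_line_tensor = inputTensor(line)
    target_line_tensor = targetTensor(line)
    return category_tensor, input_line_tensor, target_line_tensor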
1.3.2 Training the network
- In contrast to classification, where only the final output is used, here we make a prediction at every timestep, so we compute a loss at every timestep. The magic of autograd is that you can simply sum these losses at each step and call backward once at the end.
criterion = nn.NLLLoss()
learning_rate = 0.0005
def train(category_tensor, input_line_tensor, target_line_tensor):
target_line_tensor.unsqueeze_(-1)
hidden = rnn.initHidden()
rnn.zero_grad()
loss = 0
for i in range(input_line_tensor.size(0)):
output, hidden = rnn(category_tensor, input_line_tensor[i], hidden)
l = criterion(output, target_line_tensor[i])
loss += l
loss.backward()
    # Gradient step: add each parameter's gradient scaled by -learning_rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)
return output, loss.item() / input_line_tensor.size(0)
- To keep track of how long training takes, I add a timeSince(since) helper that returns a human-readable string:
import time
import math
def timeSince(since):
now = time.time()
s = now - since
m = math.floor(s / 60)
s -= m * 60
return '%dm %ds' % (m, s)
- The training process is business as usual: call train many times and wait a few minutes, printing the current time and loss every print_every iterations, and storing an average loss per plot_every iterations in all_losses for plotting later.
rnn = RNN(n_letters, 128, n_letters)
n_iters = 100000
print_every = 5000
plot_every = 500
all_losses = []
total_loss = 0 # Reset every plot_every iters
start = time.time()
for iter in range(1, n_iters + 1):
output, loss = train(*randomTrainingExample())
total_loss += loss
if iter % print_every == 0:
print('%s (%d %d%%) %.4f' % (timeSince(start), iter, iter / n_iters * 100, loss))
if iter % plot_every == 0:
all_losses.append(total_loss / plot_every)
total_loss = 0
1.3.3 Plotting the data, with matplotlib and R
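- The plotting code itself is not reproduced in this post; a minimal matplotlib sketch of what it does:
import matplotlib.pyplot as plt
plt.figure()
plt.plot(all_losses)  # average loss per plot_every iterations
plt.show()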
1.3.4 Sampling from the network
- To sample, we give the network a category and a starting letter, take the most likely output letter as the next input, and repeat until the EOS marker (or max_length) is reached.
max_length = 20
# Sample from a category and a starting letter
def sample(category, start_letter='A'):
with torch.no_grad(): # no need to track history in sampling
category_tensor = categoryTensor(category)
input = inputTensor(start_letter)
hidden = rnn.initHidden()
output_name = start_letter
for i in range(max_length):
output, hidden = rnn(category_tensor, input[0], hidden)
topv, topi = output.topk(1)
topi = topi[0][0]
if topi == n_letters - 1:
break
else:
letter = all_letters[topi]
output_name += letter
input = inputTensor(letter)
return output_name
def samples(category, start_letters='ABC'):
for start_letter in start_letters:
print(sample(category, start_letter))
samples('Russian', 'RUS')
samples('German', 'GER')
samples('Spanish', 'SPA')
samples('Chinese', 'CHI')
2 Building an RNN to classify names
2.1 Building the network
- Before autograd, creating a recurrent network in Torch involved cloning the parameters of a layer over several timesteps; the layers held hidden state and gradients, which are now handled entirely by the computation graph. This RNN module (mostly copied from the PyTorch for Torch users tutorial) is just two linear layers that operate on the input and hidden state, with a LogSoftmax layer after the output.
import torch.nn as nn
class RNN(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(RNN, self).__init__()
self.hidden_size = hidden_size
self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
self.i2o = nn.Linear(input_size + hidden_size, output_size)
self.softmax = nn.LogSoftmax(dim=1)
def forward(self, input, hidden):
combined = torch.cat((input, hidden), 1)
hidden = self.i2h(combined)
output = self.i2o(combined)
output = self.softmax(output)
return output, hidden
def initHidden(self):
return torch.zeros(1, self.hidden_size)
n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)
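- As a quick sanity check (my own addition, reusing inputTensor from section 1.3.1), a single forward step can be run like this:
input = inputTensor('A')   # <1 x 1 x n_letters> one-hot tensor for 'A'
hidden = torch.zeros(1, n_hidden)
output, next_hidden = rnn(input[0], hidden)
print(output.shape)        # torch.Size([1, n_categories]), log-probabilities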
2.2 Training
Each loop of training will:
- build the input and target tensors
- create a zero-initialized hidden state
- read in each letter and
- keep the hidden state for the next letter
- compare the final output to the target
- back-propagate
- return the output and the loss
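- Note that train() below uses a criterion this excerpt never defines; since the network ends in LogSoftmax, the tutorial pairs it with the negative log-likelihood loss:
criterion = nn.NLLLoss()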
learning_rate = 0.005 # If you set this too high, it might explode. If too low, it might not learn
def train(category_tensor, line_tensor):
hidden = rnn.initHidden()
rnn.zero_grad()
for i in range(line_tensor.size()[0]):
output, hidden = rnn(line_tensor[i], hidden)
loss = criterion(output, category_tensor)
loss.backward()
    # Add each parameter's gradient to its value, scaled by -learning_rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)
return output, loss.item()
- Now we just have to run the training with a bunch of examples. Since the train function returns both the output and the loss, we can print its guesses and also keep track of the loss for plotting. Since there are thousands of examples, we print only every print_every iterations and take an average of the loss. (The helpers the loop relies on are sketched right after this paragraph.)
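- The randomTrainingExample() and categoryFromOutput() helpers used below come from the classification tutorial but were omitted from this excerpt; roughly:
def categoryFromOutput(output):
    # The index of the largest log-probability is the predicted category
    top_n, top_i = output.topk(1)
    category_i = top_i[0].item()
    return all_categories[category_i], category_i
def lineToTensor(line):
    # One-hot encode a name as a <line_length x 1 x n_letters> tensor
    tensor = torch.zeros(len(line), 1, n_letters)
    for li, letter in enumerate(line):
        tensor[li][0][all_letters.find(letter)] = 1
    return tensor
def randomTrainingExample():
    category = randomChoice(all_categories)
    line = randomChoice(category_lines[category])
    category_tensor = torch.tensor([all_categories.index(category)], dtype=torch.long)
    line_tensor = lineToTensor(line)
    return category, line, category_tensor, line_tensor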
import time
import math
n_iters = 100000
print_every = 5000
plot_every = 1000
# Keep track of losses for plotting
current_loss = 0
all_losses = []
def timeSince(since):
now = time.time()
s = now - since
m = math.floor(s / 60)
s -= m * 60
return '%dm %ds' % (m, s)
start = time.time()
for iter in range(1, n_iters + 1):
category, line, category_tensor, line_tensor = randomTrainingExample()
output, loss = train(category_tensor, line_tensor)
current_loss += loss
# Print the iteration number, loss, name, and guess
if iter % print_every == 0:
guess, guess_i = categoryFromOutput(output)
correct = '✓' if guess == category else '✗ (%s)' % category
print('%d %d%% (%s) %.4f %s / %s %s' % (iter, iter / n_iters * 100, timeSince(start), loss, line, guess, correct))
# Add the current average loss to the list of losses
if iter % plot_every == 0:
all_losses.append(current_loss / plot_every)
current_loss = 0
3 ResNet: Deep Residual Networks
3.1 Principle
- A shortcut (skip) connection guarantees that, in the worst case, a block can fall back to the identity mapping, so adding more layers never has to make the result worse; this is what lets the network be stacked very deep. (A minimal sketch follows.)
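- In symbols a residual block computes y = F(x) + x: when the best mapping for a layer is close to the identity, the block only has to push the residual F(x) toward zero. A toy sketch of the idea (my own illustration, not the torchvision code shown below):
import torch
import torch.nn as nn
class ToyResidualBlock(nn.Module):
    # y = F(x) + x, where F is a small learnable transform
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, x):
        # The shortcut keeps an unmodified identity path alongside F
        return torch.relu(self.f(x) + x)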
3.2 Results
- With residual connections, accuracy keeps improving as more layers are added, instead of degrading as it does in plain networks.
- The resulting very deep networks performed strongly across the major benchmark competitions.
- In the reported results, the parameter count grows rapidly with depth; the error rate of plain networks rises as depth increases, while that of residual networks keeps falling.
- (Overview figure of the visualized results; image not included in this excerpt.)
3.3 Code walkthrough
- Import the library functions and create a resnet object.
- The overall idea is that each stage is built as one unit: while constructing a stage, every BasicBlock in that stage is appended, in order, to a list `layers`; once all blocks of the stage have been added, the list is wrapped in nn.Sequential(), which places the assembled stage into the computation graph (see the _make_layer sketch after the snippet below).
from torchvision.models import *
m1 = resnet34()
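- The append-then-Sequential pattern described above lives in ResNet._make_layer; in the (older) torchvision source being read here it looks roughly like:
def _make_layer(self, block, planes, blocks, stride=1):
    # Use a 1x1 conv shortcut when the spatial size or channel count changes
    downsample = None
    if stride != 1 or self.inplanes != planes * block.expansion:
        downsample = nn.Sequential(
            nn.Conv2d(self.inplanes, planes * block.expansion,
                      kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(planes * block.expansion),
        )
    layers = []
    layers.append(block(self.inplanes, planes, stride, downsample))
    self.inplanes = planes * block.expansion
    for _ in range(1, blocks):
        layers.append(block(self.inplanes, planes))
    # Wrapping the list in nn.Sequential puts the assembled stage into the graph
    return nn.Sequential(*layers)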
- Build an 18-layer ResNet
def resnet18(pretrained=False, **kwargs):
"""Constructs a ResNet-18 model.
Args:
pretrained (bool): If True, returns a model pre-trained on ImageNet
"""
model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
if pretrained:
model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
return model
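- One caveat: model_zoo and model_urls are not defined in this snippet; they come from the surrounding torchvision source file, where model_urls is a dict mapping architecture names such as 'resnet18' to the download URLs of the pretrained weights, and model_zoo is imported as:
import torch.utils.model_zoo as model_zoo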
- Building the BasicBlock residual block
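- BasicBlock relies on a conv3x3 helper defined earlier in the same torchvision file:
def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3,
                     stride=stride, padding=1, bias=False)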
class BasicBlock(nn.Module):
expansion = 1
def __init__(self, inplanes, planes, stride=1, downsample=None):
super(BasicBlock, self).__init__()
self.conv1 = conv3x3(inplanes, planes, stride)
self.bn1 = nn.BatchNorm2d(planes)
self.relu = nn.ReLU(inplace=True)
self.conv2 = conv3x3(planes, planes)
self.bn2 = nn.BatchNorm2d(planes)
self.downsample = downsample
self.stride = stride
def forward(self, x):
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
if self.downsample is not None:
residual = self.downsample(x)
out += residual
out = self.relu(out)
return out
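- A quick shape check of the block (toy sizes, my own example):
block = BasicBlock(64, 64)   # stride 1, identity shortcut
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)        # torch.Size([1, 64, 56, 56])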