GFNet Code Reading Notes

GitHub: project repository
Paper: https://arxiv.org/abs/2010.05300
These notes were taken while reading the GFNet source code; they cover train.py, utils.py, configs.py, and network.py.

train.py

Imports

import time
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torch.multiprocessing
torch.multiprocessing.set_sharing_strategy('file_system')

The multiprocessing module here handles sharing tensors across processes:
def set_sharing_strategy(new_strategy):
    """Sets the strategy for sharing CPU tensors.

    Arguments:
        new_strategy (str): Name of the selected strategy. Should be one of
            the values returned by :func:`get_all_sharing_strategies()`.
    """
    global _sharing_strategy
    assert new_strategy in _all_sharing_strategies
    _sharing_strategy = new_strategy

The argument must be one of _all_sharing_strategies, which is defined as follows:
if sys.platform == 'darwin' or sys.platform == 'win32':
    _sharing_strategy = 'file_system'
    _all_sharing_strategies = {'file_system'}
else:
    _sharing_strategy = 'file_descriptor'
    _all_sharing_strategies = {'file_descriptor', 'file_system'}
On darwin and win32, only file_system is available;
on other platforms, file_descriptor is available as well.
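The platform logic above can be sketched in a few lines (mirroring, not importing, the torch internals):

```python
import sys

# Mirror the platform check from torch.multiprocessing quoted above
if sys.platform == 'darwin' or sys.platform == 'win32':
    strategies = {'file_system'}
else:
    strategies = {'file_descriptor', 'file_system'}

# 'file_system' is valid on every platform, which is why train.py
# can call set_sharing_strategy('file_system') unconditionally
print('file_system' in strategies)  # True
```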

from utils import *
from network import *
from configs import *

These modules are all defined in this project.

import math
import argparse

The argparse library handles command-line argument parsing.

import models.resnet as resnet
import models.densenet as densenet
from models import create_model

These model classes are all defined under models/.

Argument parsing

parser = argparse.ArgumentParser(description='Training code for GFNet')

Create an instance of argparse.ArgumentParser.

parser.add_argument('--data_url', default='./data', type=str,
                    help='path to the dataset (ImageNet)')

parser.add_argument('--work_dirs', default='./output', type=str,
                    help='path to save log and checkpoints')

parser.add_argument('--train_stage', default=-1, type=int,
                    help='select training stage, see our paper for details: '
                         'stage-1: warm-up, '
                         'stage-2: learn to select patches with RL, '
                         'stage-3: finetune CNNs')

parser.add_argument('--model_arch', default='', type=str,
                    help='architecture of the model to be trained: '
                         'resnet50 / resnet101 / '
                         'densenet121 / densenet169 / densenet201 / '
                         'regnety_600m / regnety_800m / regnety_1.6g / '
                         'mobilenetv3_large_100 / mobilenetv3_large_125 / '
                         'efficientnet_b2 / efficientnet_b3')

parser.add_argument('--patch_size', default=96, type=int,
                    help='size of local patches (we recommend 96 / 128 / 144)')

parser.add_argument('--T', default=4, type=int,
                    help='maximum length of the sequence of Glance + Focus')

parser.add_argument('--print_freq', default=100, type=int,
                    help='the frequency of printing log')

parser.add_argument('--model_prime_path', default='', type=str,
                    help='path to the pre-trained model of Global Encoder (for training stage-1)')

parser.add_argument('--model_path', default='', type=str,
                    help='path to the pre-trained model of Local Encoder (for training stage-1)')

parser.add_argument('--checkpoint_path', default='', type=str,
                    help='path to the stage-2/3 checkpoint (for training stage-2/3)')

parser.add_argument('--resume', default='', type=str,
                    help='path to the checkpoint for resuming')

These lines define and parse the command-line arguments. In add_argument(), the first argument is the flag name, default is the default value, type is the type the value is converted to, and help describes the argument's purpose.
The arguments:

  • --data_url

path to the dataset (ImageNet)

  • --work_dirs

path for saving logs and checkpoints; defaults to ./output

  • --train_stage

the training stage: 1, 2, or 3

  • --model_arch

model architecture: resnet50 / resnet101 /
densenet121 / densenet169 / densenet201 /
regnety_600m / regnety_800m / regnety_1.6g /
mobilenetv3_large_100 / mobilenetv3_large_125 /
efficientnet_b2 / efficientnet_b3; these are the architectures the authors ship with the project

  • --patch_size

size of the local patches

  • --T

maximum length of the Glance + Focus sequence

  • --print_freq

log-printing frequency

  • --model_prime_path

path to the pre-trained Global Encoder

  • --model_path

path to the pre-trained Local Encoder

  • --checkpoint_path

path to the stage-2/3 checkpoint

  • --resume

path to a checkpoint for resuming training

args = parser.parse_args()
Finally, parse_args() returns the argument namespace; each argument is accessed as args.<name>.
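A minimal self-contained sketch of the same pattern, using only two of the flags and made-up command-line values:

```python
import argparse

parser = argparse.ArgumentParser(description='Training code for GFNet')
parser.add_argument('--patch_size', default=96, type=int)
parser.add_argument('--T', default=4, type=int)

# parse_args() also accepts an explicit argv list, which is handy for testing
args = parser.parse_args(['--patch_size', '128'])
print(args.patch_size, args.T)  # 128 4: the flag overrides its default, T keeps its default
```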

def main()

if not os.path.isdir(args.work_dirs):
    mkdir_p(args.work_dirs)
Create the working directory if it does not exist.
record_path = args.work_dirs + '/GF-' + str(args.model_arch) \
              + '_patch-size-' + str(args.patch_size) \
              + '_T' + str(args.T) \
              + '_train-stage' + str(args.train_stage)
Build the record path for this run.
if not os.path.isdir(record_path):
    mkdir_p(record_path)
record_file = record_path + '/record.txt'
Create the record directory if it does not exist.
model_configuration = model_configurations[args.model_arch]

model_configurations is defined in configs.py:
model_configurations = {
    'resnet50': {
        'feature_num': 2048,
        'feature_map_channels': 2048,
        'policy_conv': False,
        'policy_hidden_dim': 1024,
        'fc_rnn': True,
        'fc_hidden_dim': 1024,
        'image_size': 32,
        'crop_pct': 0.875,
        'dataset_interpolation': Image.BILINEAR,
        'prime_interpolation': 'bicubic'
    },
    ...  # entries for the other architectures
}
So if the model_arch argument is resnet50, model_configuration is simply the 'resnet50' dictionary shown above.

if 'resnet' in args.model_arch:
    model_arch = 'resnet'
    model = resnet.resnet50(pretrained=False, num_classes=10)
    model_prime = resnet.resnet50(pretrained=False, num_classes=10)
elif 'densenet' in args.model_arch:
    model_arch = 'densenet'
    model = eval('densenet.' + args.model_arch)(pretrained=False)
    model_prime = eval('densenet.' + args.model_arch)(pretrained=False)
elif 'efficientnet' in args.model_arch:
    model_arch = 'efficientnet'
    model = create_model(args.model_arch, pretrained=False, num_classes=1000,
                         drop_rate=0.3, drop_connect_rate=0.2)
    model_prime = create_model(args.model_arch, pretrained=False, num_classes=1000,
                               drop_rate=0.3, drop_connect_rate=0.2)
elif 'mobilenetv3' in args.model_arch:
    model_arch = 'mobilenetv3'
    model = create_model(args.model_arch, pretrained=False, num_classes=1000,
                         drop_rate=0.2, drop_connect_rate=0.2)
    model_prime = create_model(args.model_arch, pretrained=False, num_classes=1000,
                               drop_rate=0.2, drop_connect_rate=0.2)
elif 'regnet' in args.model_arch:
    model_arch = 'regnet'
    import pycls.core.model_builder as model_builder
    from pycls.core.config import cfg
    cfg.merge_from_file(model_configuration['cfg_file'])
    cfg.freeze()
    model = model_builder.build_model()
    model_prime = model_builder.build_model()

The model name determines which class is instantiated for the two encoders.

fc = Full_layer(model_configuration['feature_num'],
                model_configuration['fc_hidden_dim'],
                model_configuration['fc_rnn'])
Create the fully connected classifier; its parameters come from model_configuration.
if args.train_stage == 1:
    model.load_state_dict(torch.load(args.model_path))
    model_prime.load_state_dict(torch.load(args.model_prime_path))
For stage 1, load the pre-trained weights of the Global Encoder and Local Encoder.
else:
    checkpoint = torch.load(args.checkpoint_path)
    model.load_state_dict(checkpoint['model_state_dict'])
    model_prime.load_state_dict(checkpoint['model_prime_state_dict'])
    fc.load_state_dict(checkpoint['fc'])
For stages 2 and 3, load the checkpoint trained in the previous stage.

train_configuration = train_configurations[model_arch]

train_configurations is likewise defined in configs.py:
'resnet': {
    'backbone_lr': 0.01,
    'fc_stage_1_lr': 0.1,
    'fc_stage_3_lr': 0.01,
    'weight_decay': 1e-4,
    'momentum': 0.9,
    'Nesterov': True,
    # 'batch_size': 256,
    'batch_size': 32,
    'dsn_ratio': 1,
    'epoch_num': 60,
    'train_model_prime': True
},
It defines the hyperparameters used during training.

if args.train_stage != 2:
    if train_configuration['train_model_prime']:
        optimizer = torch.optim.SGD([{'params': model.parameters()},
                                     {'params': model_prime.parameters()},
                                     {'params': fc.parameters()}],
                                    lr=0,  # specified in adjust_learning_rate()
                                    momentum=train_configuration['momentum'],
                                    nesterov=train_configuration['Nesterov'],
                                    weight_decay=train_configuration['weight_decay'])
    else:
        optimizer = torch.optim.SGD([{'params': model.parameters()},
                                     {'params': fc.parameters()}],
                                    lr=0,  # specified in adjust_learning_rate()
                                    momentum=train_configuration['momentum'],
                                    nesterov=train_configuration['Nesterov'],
                                    weight_decay=train_configuration['weight_decay'])
    training_epoch_num = train_configuration['epoch_num']
else:
    optimizer = None
    training_epoch_num = 15

This sets up the optimization. Since stage 2 uses a reinforcement learning algorithm, optimizer = torch.optim.SGD(...) is only defined for stages 1 and 3.
The first argument is a list of dictionaries; each dictionary has a 'params' key whose value is the output of a model's parameters() method. That method is defined on torch.nn.Module and returns an iterator over the model's parameters.
lr is the learning rate; it is set to 0 here because, as the comment says, adjust_learning_rate() sets it later.
nesterov enables Nesterov momentum, a refinement of momentum-based gradient descent.
weight_decay is L2 regularization, which helps prevent overfitting.
If the Global Encoder is also being trained, model_prime's parameters are added as an extra SGD parameter group.
For stage 2, the optimizer is None and training_epoch_num = 15.

criterion = nn.CrossEntropyLoss().cuda()
The loss is PyTorch's cross entropy. CrossEntropyLoss inherits from torch.nn.modules.loss._WeightedLoss, which inherits from _Loss, which inherits from the base class Module; .cuda() is defined on Module, and the subclasses inherit it.
model = nn.DataParallel(model.cuda())
model_prime = nn.DataParallel(model_prime.cuda())
fc = fc.cuda()
The first two lines wrap the encoders for multi-GPU execution; the third moves fc onto the GPU.
traindir = args.data_url + 'train/'
valdir = args.data_url + 'val/'
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
These are the dataset paths; normalize standardizes the pixel values during preprocessing.
transform = transforms.Compose([
    transforms.RandomResizedCrop((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
transforms preprocesses the data: crop to 224x224, convert to a tensor, then normalize.
train_set = datasets.ImageFolder(traindir, transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
]))

train_set is built with torchvision's datasets.ImageFolder, given the training-set path and the transform pipeline.
transforms.RandomHorizontalFlip() flips the image horizontally at random, with probability 0.5 by default.

train_set_index = torch.randperm(len(train_set))
Shuffle the sample indices.

train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, num_workers=32, pin_memory=False,
                                           sampler=torch.utils.data.sampler.SubsetRandomSampler(
                                               train_set_index[:]))

val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize, ])),
    batch_size=train_configuration['batch_size'], shuffle=False, num_workers=32, pin_memory=False)

DataLoader wraps the training and validation sets.
if args.train_stage != 1:
    state_dim = model_configuration['feature_map_channels'] * math.ceil(args.patch_size / 32) * math.ceil(
        args.patch_size / 32)
    ppo = PPO(model_configuration['feature_map_channels'], state_dim,
              model_configuration['policy_hidden_dim'], model_configuration['policy_conv'])

In stages 2 and 3, the dimensionality of the PPO input state is C x H x W:
model_configuration['feature_map_channels'] is the number of channels, and
math.ceil(args.patch_size / 32) is patch_size divided by 32, rounded up.
The PPO model is defined in network.py;
it has two main methods, select_action and update.
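Plugging in the defaults makes the dimensions concrete (a sketch assuming the resnet50 configuration above and a backbone stride of 32):

```python
import math

feature_map_channels = 2048  # model_configurations['resnet50']['feature_map_channels']
patch_size = 96              # the default --patch_size

side = math.ceil(patch_size / 32)               # spatial side of the patch feature map
state_dim = feature_map_channels * side * side  # flattened C x H x W state
print(side, state_dim)  # 3 18432
```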

if args.train_stage == 3:
    ppo.policy.load_state_dict(checkpoint['policy'])
    ppo.policy_old.load_state_dict(checkpoint['policy'])

In stage 3, load the policy weights from the previous stage's checkpoint.

else:
    ppo = None
memory = Memory()
Stage 1 does not need PPO, so ppo is set to None.

The Memory class is defined in network.py as follows:
class Memory:
    def __init__(self):
        self.actions = []
        self.states = []
        self.logprobs = []
        self.rewards = []
        self.is_terminals = []
        self.hidden = []

    def clear_memory(self):
        del self.actions[:]
        del self.states[:]
        del self.logprobs[:]
        del self.rewards[:]
        del self.is_terminals[:]
        del self.hidden[:]

if args.resume:
resume_ckp = torch.load(args.resume)

start_epoch = resume_ckp['epoch']
print('resume from epoch: {}'.format(start_epoch))

model.module.load_state_dict(resume_ckp['model_state_dict'])
model_prime.module.load_state_dict(resume_ckp['model_prime_state_dict'])
fc.load_state_dict(resume_ckp['fc'])

if optimizer:
    optimizer.load_state_dict(resume_ckp['optimizer'])

if ppo:
    ppo.policy.load_state_dict(resume_ckp['policy'])
    ppo.policy_old.load_state_dict(resume_ckp['policy'])
    ppo.optimizer.load_state_dict(resume_ckp['ppo_optimizer'])

best_acc = resume_ckp['best_acc']

else:
start_epoch = 0
best_acc = 0
This block simply restores the training state (epoch, weights, optimizers, best accuracy) when resuming from a checkpoint.

for epoch in range(start_epoch, training_epoch_num):
Now the epoch loop begins.
if args.train_stage != 2:
    print('Training Stage: {}, lr:'.format(args.train_stage))
    adjust_learning_rate(optimizer, train_configuration,
                         epoch, training_epoch_num, args)
If this is not stage 2, print the stage and adjust the learning rate using adjust_learning_rate() from utils.py.

Here is that function's definition:
def adjust_learning_rate(optimizer, train_configuration, epoch, training_epoch_num, args):
    """Sets the learning rate"""

    backbone_lr = 0.5 * train_configuration['backbone_lr'] * \
                  (1 + math.cos(math.pi * epoch / training_epoch_num))

The learning rate follows a half-period cosine decay.
    if args.train_stage == 1:
        fc_lr = 0.5 * train_configuration['fc_stage_1_lr'] * \
                (1 + math.cos(math.pi * epoch / training_epoch_num))
    elif args.train_stage == 3:
        fc_lr = 0.5 * train_configuration['fc_stage_3_lr'] * \
                (1 + math.cos(math.pi * epoch / training_epoch_num))
The fc layer's learning rate is chosen by stage (1 or 3) and follows the same cosine schedule.
    if train_configuration['train_model_prime']:
        optimizer.param_groups[0]['lr'] = backbone_lr
        optimizer.param_groups[1]['lr'] = backbone_lr
        optimizer.param_groups[2]['lr'] = fc_lr
param_groups is declared in torch/optim/optimizer.pyi:
'''
class Optimizer:
    defaults: dict
    state: dict
    param_groups: List[dict]
'''
That is its declaration; param_groups is populated by SGD, a subclass of Optimizer.

    else:
        optimizer.param_groups[0]['lr'] = backbone_lr
        optimizer.param_groups[1]['lr'] = fc_lr

    for param_group in optimizer.param_groups:
        print(param_group['lr'])

How the lr gets in: it is passed when SGD is constructed, and SGD's __init__ builds
defaults = dict(lr=lr, momentum=momentum, dampening=dampening,
                weight_decay=weight_decay, nesterov=nesterov)
then calls super() to hand defaults to the parent Optimizer,
whose add_param_group method copies the defaults into each entry of param_groups.
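The schedule itself is easy to check numerically; a sketch using the resnet backbone_lr from train_configurations:

```python
import math

def cosine_lr(base_lr, epoch, total_epochs):
    # half-period cosine decay, as in adjust_learning_rate()
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))

backbone_lr = 0.01  # train_configurations['resnet']['backbone_lr']
for epoch in (0, 30, 60):
    print(epoch, cosine_lr(backbone_lr, epoch, 60))
# starts at base_lr, is base_lr / 2 half-way through, and decays to ~0 at the end
```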

else:
    print('Training Stage: {}, train ppo only'.format(args.train_stage))
Print the stage-2 message.
train(model_prime, model, fc, memory, ppo, optimizer, train_loader, criterion,
      args.print_freq, epoch, train_configuration['batch_size'], record_file, train_configuration, args)

acc = validate(model_prime, model, fc, memory, ppo, optimizer, val_loader, criterion,
               args.print_freq, epoch, train_configuration['batch_size'], record_file, train_configuration,
               args)
These two calls are examined in the train and validate sections below.
if acc > best_acc:
    best_acc = acc
    is_best = True
else:
    is_best = False
Keep the maximum as the best accuracy so far.

save_checkpoint({
    'epoch': epoch + 1,
    'model_state_dict': model.module.state_dict(),
    'model_prime_state_dict': model_prime.module.state_dict(),
    'fc': fc.state_dict(),
    'acc': acc,
    'best_acc': best_acc,
    'optimizer': optimizer.state_dict() if optimizer else None,
    'ppo_optimizer': ppo.optimizer.state_dict() if ppo else None,
    'policy': ppo.policy.state_dict() if ppo else None,
}, is_best, checkpoint=record_path)

This function is defined in utils.py:
def save_checkpoint(state, is_best, checkpoint='checkpoint', filename='checkpoint.pth.tar'):
    filepath = checkpoint + '/' + filename
    torch.save(state, filepath)
    if is_best:
        shutil.copyfile(filepath, checkpoint + '/model_best.pth.tar')
It checks is_best and, if this is the best accuracy so far, copies the checkpoint to a separate model file.
state is a dictionary holding the run's state, saved with torch.save(state, filepath).

def train()

def train(model_prime, model, fc, memory, ppo, optimizer, train_loader, criterion,
          print_freq, epoch, batch_size, record_file, train_configuration, args):
    batch_time = AverageMeter()

utils.py defines this helper class:
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()
__init__ resets value, ave, sum, and count to 0.
    def reset(self):
        self.value = 0
        self.ave = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.value = val
        self.sum += val * n
        self.count += n
        self.ave = self.sum / self.count

update takes a value and a count n:
sum accumulates value * n,
count accumulates the total n,
and ave is the running average.
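A quick usage sketch (the class is copied from utils.py as quoted above):

```python
class AverageMeter:
    """Computes and stores the average and current value (as in utils.py)."""
    def __init__(self):
        self.reset()

    def reset(self):
        self.value = 0
        self.ave = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.value = val
        self.sum += val * n
        self.count += n
        self.ave = self.sum / self.count

meter = AverageMeter()
meter.update(2.0, n=4)  # a batch of 4 samples with mean loss 2.0
meter.update(1.0, n=4)  # a batch of 4 samples with mean loss 1.0
print(meter.ave)  # 1.5, the sample-weighted running average
```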

losses = [AverageMeter() for _ in range(args.T)]

A list comprehension creates T AverageMeter() instances; T is the maximum number of Glance + Focus steps.

top1 = [AverageMeter() for _ in range(args.T)]

Same as the previous line, but for top-1 accuracy.

reward_list = [AverageMeter() for _ in range(args.T - 1)]

reward_list has size T - 1, since rewards are only produced from the second step onward.

train_batches_num = len(train_loader)

train_loader is the DataLoader instance created in main().
DataLoader defines __len__(), which determines what len() returns for train_loader:
def __len__(self):
    if self._dataset_kind == _DatasetKind.Iterable:

        length = self._IterableDataset_len_called = len(self.dataset)
        if self.batch_size is not None:  # IterableDataset doesn't allow custom sampler or batch_sampler
            from math import ceil
            if self.drop_last:
                length = length // self.batch_size
            else:
                length = ceil(length / self.batch_size)
        return length
    else:
        return len(self._index_sampler)

So it returns the dataset length divided by the batch size, i.e. the number of batches per epoch.
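The arithmetic of the two branches, with made-up sizes:

```python
from math import ceil

dataset_len, batch_size = 1000, 256

print(ceil(dataset_len / batch_size))  # 4: drop_last=False keeps the final short batch
print(dataset_len // batch_size)       # 3: drop_last=True discards it
```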

if args.train_stage == 2:
    model_prime.eval()
    model.eval()
    fc.eval()

model.eval() switches the BatchNorm and Dropout layers to inference mode, so these modules are not being trained in stage 2.

else:
    if train_configuration['train_model_prime']:
        model_prime.train()

If the Global Encoder is to be trained, put it in train mode.

    else:
        model_prime.eval()
    model.train()
    fc.train()

In stages 1 and 3 the other modules are set to train mode.

if 'resnet' in args.model_arch or 'densenet' in args.model_arch or 'regnet' in args.model_arch:
    dsn_fc_prime = model_prime.module.fc
    dsn_fc = model.module.fc
else:
    dsn_fc_prime = model_prime.module.classifier
    dsn_fc = model.module.classifier

This grabs each backbone's own classification head as dsn_fc / dsn_fc_prime (dsn presumably refers to deep supervision: these heads provide an auxiliary loss, weighted by dsn_ratio later).

fd = open(record_file, ‘a+’)

fd是record_file的文件 record_file是train()的其中一个参数 具体定义在main()函数前几行

end = time.time()

记录当前时间戳

for i, (x, target) in enumerate(train_loader):

Start iterating over train_loader.

loss_cla = []
loss_list_dsn = []

target_var = target.cuda()
input_var = x.cuda()

Move the targets and inputs onto the GPU.

input_prime = get_prime(input_var, args.patch_size)

get_prime is defined in utils.py:
def get_prime(images, patch_size, interpolation='bicubic'):
    """Get down-sampled original image"""
    prime = F.interpolate(images, size=[patch_size, patch_size], mode=interpolation, align_corners=True)
    return prime
Here F is torch.nn.functional.
It down-samples the input image to patch_size x patch_size, using bicubic interpolation.

if train_configuration['train_model_prime'] and args.train_stage != 2:
    output, state = model_prime(input_prime)

This appears to be the Global Encoder's forward pass on the down-sampled image, returning both the pooled feature (output) and the feature map (state).

    assert 'resnet' in args.model_arch or 'densenet' in args.model_arch or 'regnet' in args.model_arch
    output_dsn = dsn_fc_prime(output)
    output = fc(output, restart=True)

When the Global Encoder is trained in stages 1 and 3:
assert that the architecture is resnet, densenet, or regnet,
then compute the deep-supervision logits and the recurrent classifier's output.

else:
    with torch.no_grad():

No gradients are computed here.

        output, state = model_prime(input_prime)
        if 'resnet' in args.model_arch or 'densenet' in args.model_arch or 'regnet' in args.model_arch:
            output_dsn = dsn_fc_prime(output)
            output = fc(output, restart=True)

Same as before: for those three architecture families, run the forward pass.

        else:
            _ = fc(output, restart=True)
            output = model_prime.module.classifier(output)
            output_dsn = output

For the other architectures the backbone's own classifier produces the logits directly, which are reused as output_dsn; note that dsn_fc_prime = model_prime.module.classifier was defined earlier but is not used here.

loss_prime = criterion(output, target_var)
loss_cla.append(loss_prime)

Compute the cross entropy (criterion = nn.CrossEntropyLoss().cuda()) and append it to the loss_cla list.

loss_dsn = criterion(output_dsn, target_var)
loss_list_dsn.append(loss_dsn)

Likewise compute the loss on output_dsn (cla presumably means the classification loss from the recurrent fc; dsn is the auxiliary deep-supervision loss from the backbone head).

losses[0].update(loss_prime.data.item(), x.size(0))
acc = accuracy(output, target_var, topk=(1,))
top1[0].update(acc.sum(0).mul_(100.0 / batch_size).data.item(), x.size(0))

losses is the list [AverageMeter() for _ in range(args.T)] from above;
each AverageMeter stores the current value and the running average,
and update() records the new value.
x.size(0) is the batch size of the input.
accuracy() is defined in utils.py
and computes top-k accuracy;
the top-1 accuracy of the glance step is stored in top1[0].

confidence_last = torch.gather(F.softmax(output.detach(), 1), dim=1, index=target_var.view(-1, 1)).view(1, -1)

This extracts, for each sample, the softmax probability assigned to its ground-truth class; at this point the glance stage is complete.
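In plain Python, the gather amounts to indexing each row of the softmax with that sample's label; a sketch with hypothetical logits and targets:

```python
import math

def softmax(row):
    # numerically stable softmax over one row of logits
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

logits = [[2.0, 0.0, 0.0],   # sample 0
          [0.0, 1.0, 1.0]]   # sample 1
targets = [0, 2]             # ground-truth class indices

# what torch.gather(..., index=target_var.view(-1, 1)) extracts
confidence = [softmax(row)[t] for row, t in zip(logits, targets)]
print([round(c, 3) for c in confidence])  # [0.787, 0.422]
```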

for patch_step in range(1, args.T):

Now iterate over the remaining steps: the focus stage begins.

if args.train_stage == 1:
    action = torch.rand(x.size(0), 2).cuda()

In stage 1, sample x.size(0) actions uniformly at random from [0, 1), each of dimension 2.
These serve as random patch coordinates.

else:
    if patch_step == 1:
        action = ppo.select_action(state.to(0), memory, restart_batch=True)
    else:
        action = ppo.select_action(state.to(0), memory)

In stages 2 and 3, the PPO policy produces the action.
state.to(0) moves the state tensor to GPU 0;
state is the feature map returned by the encoder's forward pass.

patches = get_patch(input_var, action, args.patch_size)

Recall input_var = x.cuda();
a patch is cropped from the full-resolution image according to the action.

if args.train_stage != 2:
    output, state = model(patches)
    output_dsn = dsn_fc(output)
    output = fc(output, restart=False)
else:
    with torch.no_grad():
        output, state = model(patches)
        output_dsn = dsn_fc(output)
        output = fc(output, restart=False)

Compute the Local Encoder's feed-forward pass (without gradients in stage 2).

loss = criterion(output, target_var)
loss_cla.append(loss)
losses[patch_step].update(loss.data.item(), x.size(0))
loss_dsn = criterion(output_dsn, target_var)
loss_list_dsn.append(loss_dsn)

Compute the losses, just as before.

acc = accuracy(output, target_var, topk=(1,))
top1[patch_step].update(acc.sum(0).mul_(100.0 / batch_size).data.item(), x.size(0))

Compute the top-1 accuracy of this step.

confidence = torch.gather(F.softmax(output.detach(), 1), dim=1, index=target_var.view(-1, 1)).view(1, -1)
reward = confidence - confidence_last
confidence_last = confidence

The reward is the increase in the ground-truth class probability between consecutive steps.

reward_list[patch_step - 1].update(reward.data.mean(), x.size(0))
memory.rewards.append(reward)

Save the reward.

loss = (sum(loss_cla) + train_configuration['dsn_ratio'] * sum(loss_list_dsn)) / args.T

After the focus loop, compute the total loss.

if args.train_stage != 2:
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
else:
    ppo.update(memory)

In stages 1 and 3, backpropagate and take an optimizer step;
in stage 2, update the policy instead.

memory.clear_memory()

Reset the information stored in memory.

batch_time.update(time.time() - end)
end = time.time()

Measure the time per batch.

if (i + 1) % print_freq == 0 or i == train_batches_num - 1:
    string = ('Epoch: [{0}][{1}/{2}]\t'
              'Time {batch_time.value:.3f} ({batch_time.ave:.3f})\t'
              'Loss {loss.value:.4f} ({loss.ave:.4f})\t'.format(
                  epoch, i + 1, train_batches_num, batch_time=batch_time, loss=losses[-1]))
    print(string)
    fd.write(string + '\n')

    _acc = [acc.ave for acc in top1]
    print('accuracy of each step:')
    print(_acc)
    fd.write('accuracy of each step:\n')
    fd.write(str(_acc) + '\n')

    _reward = [reward.ave for reward in reward_list]
    print('reward of each step:')
    print(_reward)
    fd.write('reward of each step:\n')
    fd.write(str(_reward) + '\n')

Print and log this batch's statistics.

fd.close()

Close record_file.

def validate()

def validate(model_prime, model, fc, memory, ppo, _, val_loader, criterion,
print_freq, epoch, batch_size, record_file, __, args):
Essentially the same as train(); it returns the average validation accuracy.

utils.py

Imports

import os
import errno
import math
import shutil

shutil is used here for file operations (copying the best checkpoint).

import torch
import torch.nn as nn  # unused
import torch.nn.functional as F

def mkdir_p()

def mkdir_p(path):
    '''make dir if not exist'''
    try:
        os.mkdir(path)
    except OSError as exc:  # Python >2.5
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise

Create the directory if it does not exist.

class AverageMeter()

class AverageMeter(object):
    """Computes and stores the average and current value"""

def __init__()

    def __init__(self):
        self.reset()

def reset()

    def reset(self):
        self.value = 0
        self.ave = 0
        self.sum = 0
        self.count = 0

def update()

    def update(self, val, n=1):
        self.value = val
        # store the current value
        self.sum += val * n
        # accumulate the sum
        self.count += n
        # accumulate the count
        self.ave = self.sum / self.count
        # compute the average

def accuracy()

def accuracy(output, target, topk=(1,)):
    """Computes the precision@k for the specified values of k"""

Compute top-k accuracy.

    maxk = max(topk)

Take the largest k requested.

    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    correct_k = correct[:1].view(-1).float()

    return correct_k
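The tensor operations boil down to a membership test; a pure-Python sketch of the same top-k check:

```python
def topk_correct(logits, target, k=1):
    """1.0 if the target class is among the k highest-scoring classes, else 0.0."""
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return 1.0 if target in topk else 0.0

print(topk_correct([0.1, 0.7, 0.2], target=1, k=1))  # 1.0
print(topk_correct([0.1, 0.7, 0.2], target=2, k=1))  # 0.0
print(topk_correct([0.1, 0.7, 0.2], target=2, k=2))  # 1.0
```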

def get_prime()

def get_patch()

def adjust_learning_rate()

def save_checkpoint()

configs.py

model_configurations={}

Holds each model's configuration.
Taking resnet50 as the example:
'resnet50': {
    'feature_num': 2048,
    'feature_map_channels': 2048,
    'policy_conv': False,
    'policy_hidden_dim': 1024,
    'fc_rnn': True,
    'fc_hidden_dim': 1024,
    'image_size': 32,
    'crop_pct': 0.875,
    'dataset_interpolation': Image.BILINEAR,
    'prime_interpolation': 'bicubic'
},

train_configurations = {}

Holds the training configurations:
'resnet': {
    'backbone_lr': 0.01,
    'fc_stage_1_lr': 0.1,
    'fc_stage_3_lr': 0.01,
    'weight_decay': 1e-4,
    'momentum': 0.9,
    'Nesterov': True,
    # 'batch_size': 256,
    'batch_size': 32,
    'dsn_ratio': 1,
    'epoch_num': 60,
    'train_model_prime': True
},

network.py

Imports

import torch
import torchvision  # unused
import torch.nn as nn
import torch.nn.functional as F
import math

class Memory()

class Memory:

Essentially a struct; it stores the rollout information.

def __init__()

def __init__(self):
    self.actions = []
    self.states = []
    self.logprobs = []
    self.rewards = []
    self.is_terminals = []
    self.hidden = []

def clear_memory()

def clear_memory(self):
    del self.actions[:]
    del self.states[:]
    del self.logprobs[:]
    del self.rewards[:]
    del self.is_terminals[:]
    del self.hidden[:]

class ActorCritic()

class ActorCritic(nn.Module):

This defines the actor-critic network;
it inherits from nn.Module.

def __init__()

def __init__(self, feature_dim, state_dim, hidden_state_dim=1024, policy_conv=True, action_std=0.1):
    super(ActorCritic, self).__init__()

Call the superclass's __init__().

# encoder with convolution layer for MobileNetV3, EfficientNet and RegNet

if policy_conv:
    self.state_encoder = nn.Sequential(
        nn.Conv2d(feature_dim, 32, kernel_size=1, stride=1, padding=0, bias=False),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(int(state_dim * 32 / feature_dim), hidden_state_dim),
        nn.ReLU()
    )

This defines a convolutional encoder:
a 1x1 convolution maps the feature map to 32 channels,
ReLU activation,
flatten to a one-dimensional vector,
a linear layer,
ReLU activation.

# encoder with linear layer for ResNet and DenseNet

else:
    self.state_encoder = nn.Sequential(
        nn.Linear(state_dim, 2048),
        nn.ReLU(),
        nn.Linear(2048, hidden_state_dim),
        nn.ReLU()
    )

For resnet and densenet, the encoder is fully connected instead:
linear => relu => linear => relu

self.gru = nn.GRU(hidden_state_dim, hidden_state_dim, batch_first=False)

The GRU is nn.GRU, defined in torch/nn/modules/rnn.py.
Its arguments here are the input dimension and the hidden-state dimension;
hidden_state_dim defaults to 1024.
batch_first controls the layout of the input tensors;
from the documentation:

  • batch_first – If True, then the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Note that this does not apply to hidden or cell states. See the Inputs/Outputs sections below for details. Default: False

self.actor = nn.Sequential(
    nn.Linear(hidden_state_dim, 2),
    nn.Sigmoid())

The actor takes the hidden state as input,
outputs a 2-dimensional vector,
and passes it through a sigmoid.

self.critic = nn.Sequential(
    nn.Linear(hidden_state_dim, 1))

The critic is just a linear map to a scalar value.

self.action_var = torch.full((2,), action_std).cuda()

torch.full fills a tensor:
"Creates a tensor of size size filled with fill_value. The tensor's dtype is inferred from fill_value."
Its two arguments are the size and the fill value.

self.hidden_state_dim = hidden_state_dim
self.policy_conv = policy_conv
self.feature_dim = feature_dim
self.feature_ratio = int(math.sqrt(state_dim/feature_dim))

def forward()

def forward(self):
raise NotImplementedError

forward is deliberately left unimplemented; act() and evaluate() are called instead.

def act()

def act(self, state_ini, memory, restart_batch=False, training=False):
    if restart_batch:
        del memory.hidden[:]
        memory.hidden.append(torch.zeros(1, state_ini.size(0), self.hidden_state_dim).cuda())

If restart_batch is True, memory.hidden is cleared and re-initialized with a zero hidden state.

if not self.policy_conv:
state = state_ini.flatten(1)
else:
state = state_ini

state = self.state_encoder(state)

self.state_encoder was defined above; for resnet/densenet it is a two-layer fc (otherwise convolutional).

state, hidden_output = self.gru(state.view(1, state.size(0), state.size(1)), memory.hidden[-1])
memory.hidden.append(hidden_output)

Call the GRU,
then append hidden_output to memory.hidden.

state = state[0]
action_mean = self.actor(state)

Obtain the mean of the action distribution.

cov_mat = torch.diag(self.action_var).cuda()

torch.diag builds a diagonal matrix from the 1-D action_var vector (given a matrix, it would instead extract the diagonal).

dist = torch.distributions.multivariate_normal.MultivariateNormal(action_mean, scale_tril=cov_mat)
action = dist.sample().cuda()

This builds a multivariate Gaussian with mean action_mean,
then samples an action from it.

if training:
action = F.relu(action)
action = 1 - F.relu(1 - action)
action_logprob = dist.log_prob(action).cuda()
memory.states.append(state_ini)
memory.actions.append(action)
memory.logprobs.append(action_logprob)
else:
action = action_mean

return action.detach()

Sometimes we want to keep part of the network's parameters fixed, or train only one branch without letting its gradients affect the main network; detach() cuts a tensor out of the backward graph for exactly this purpose.

def evaluate()

class PPO

def __init__()

def __init__(self, feature_dim, state_dim, hidden_state_dim, policy_conv,
             action_std=0.1, lr=0.0003, betas=(0.9, 0.999), gamma=0.7, K_epochs=1, eps_clip=0.2):
self.lr = lr
self.betas = betas
self.gamma = gamma
self.eps_clip = eps_clip
self.K_epochs = K_epochs

self.policy = ActorCritic(feature_dim, state_dim, hidden_state_dim, policy_conv, action_std).cuda()

self.optimizer = torch.optim.Adam(self.policy.parameters(), lr=lr, betas=betas)

self.policy_old = ActorCritic(feature_dim, state_dim, hidden_state_dim, policy_conv, action_std).cuda()
self.policy_old.load_state_dict(self.policy.state_dict())

self.MseLoss = nn.MSELoss()

def select_action()

def select_action(self, state, memory, restart_batch=False, training=True):
return self.policy_old.act(state, memory, restart_batch, training)

This just forwards its arguments to policy_old.act().

def update()

def update(self, memory):
rewards = []
discounted_reward = 0

memory is passed in;
rewards starts as an empty list,
and discounted_reward as 0.

for reward in reversed(memory.rewards):
discounted_reward = reward + (self.gamma * discounted_reward)
rewards.insert(0, discounted_reward)

Iterate over memory.rewards back to front, accumulating the discounted return,
and insert each discounted_reward at position 0 of the rewards list.
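This is the standard discounted-return computation; a numeric sketch with gamma = 0.7 (the default in PPO.__init__) and made-up per-step rewards:

```python
gamma = 0.7
step_rewards = [1.0, 0.0, 2.0]  # hypothetical rewards for three focus steps

returns = []
discounted = 0.0
for r in reversed(step_rewards):
    discounted = r + gamma * discounted
    returns.insert(0, discounted)  # prepend, so returns lines up with step_rewards

# returns[t] = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ...
print(returns)
```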

rewards = torch.cat(rewards, 0).cuda()

Concatenate the list of reward tensors into one tensor.

rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-5)

old_states = torch.stack(memory.states, 0).cuda().detach()
old_actions = torch.stack(memory.actions, 0).cuda().detach()
old_logprobs = torch.stack(memory.logprobs, 0).cuda().detach()
for _ in range(self.K_epochs):
logprobs, state_values, dist_entropy = self.policy.evaluate(old_states, old_actions)

ratios = torch.exp(logprobs - old_logprobs.detach())

advantages = rewards - state_values.detach()
surr1 = ratios * advantages
surr2 = torch.clamp(ratios, 1 - self.eps_clip, 1 + self.eps_clip) * advantages

loss = -torch.min(surr1, surr2) + 0.5 * self.MseLoss(state_values, rewards) - 0.01 * dist_entropy

self.optimizer.zero_grad()
loss.mean().backward()
self.optimizer.step()

self.policy_old.load_state_dict(self.policy.state_dict())

I did not fully follow this part;
it needs another look alongside the equations in the paper.
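For orientation: surr1/surr2 implement PPO's clipped surrogate objective, where the probability ratio is clamped to [1 - eps_clip, 1 + eps_clip] and the pessimistic minimum is kept. A scalar sketch with hypothetical ratio and advantage values (eps_clip = 0.2 as in PPO.__init__):

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    surr1 = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps)  # scalar analogue of torch.clamp
    surr2 = clipped * advantage
    return min(surr1, surr2)  # the training loss takes the negative of this

print(clipped_surrogate(1.5, 1.0))   # 1.2: a large ratio is clipped, capping the incentive
print(clipped_surrogate(0.5, -1.0))  # -0.8: the pessimistic (clipped) branch wins
```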
