Before You Start
Overview
- NNI is a toolkit. The user chooses an algorithm or network model, and NNI helps design and tune the model's neural architecture (for NAS, currently only the PPO Tuner is supported) and hyperparameters such as lr and batch_size
- NNI is easy to use, extensible, flexible, and efficient
- The training platform can be the local machine, a group of remote servers, or a training cluster
Key Concepts
- trial: a single run with one particular configuration
- configuration: one combination of hyperparameters drawn from the search space (the search space itself can be defined manually by the user)
- tuner: an AutoML algorithm that generates a new hyperparameter combination for the next trial
- experiment: one task, consisting of trials and a tuner
- assessor: evaluates whether a trial should be terminated early
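The relationship between these concepts can be illustrated with a small NNI-free sketch: a "tuner" samples configurations, each "trial" runs once with one configuration and reports a score, and the "experiment" is the loop driving them. The random tuner, the toy objective, and all names here are illustrative only, not NNI's API:

```python
import random

# a deliberately tiny search space: parameter name -> candidate values
SEARCH_SPACE = {"lr": [0.01, 0.1], "batch_size": [32, 64]}

def tuner_next_configuration():
    # a toy "tuner": randomly samples one configuration from the search space
    return {name: random.choice(values) for name, values in SEARCH_SPACE.items()}

def run_trial(config):
    # a toy "trial": pretend a lower lr and a larger batch give a better score
    return 1.0 - config["lr"] + config["batch_size"] / 1000

# an "experiment": a tuner driving several trials, keeping the best result
best = max(run_trial(tuner_next_configuration()) for _ in range(10))
print(best)
```

A real tuner (TPE, Anneal, ...) uses past trial results to propose the next configuration instead of sampling blindly, and an assessor can kill a trial before it finishes.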
Workflow
Start an experiment
Automatic Hyperparameter Tuning
Define the search space
{
"batch_size": {"_type":"choice", "_value": [16, 32, 64, 128]},
"hidden_size":{"_type":"choice","_value":[128, 256, 512, 1024]},
"lr":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]},
"momentum": {"_type":"uniform","_value":[0, 1]}
}
- momentum: an exponentially weighted average of past gradients used to update the weights; it almost always converges faster than standard gradient descent (note: JSON does not allow comments, so keep explanations like this outside search_space.json)
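As a sketch of what the `_type` fields mean: a `choice` parameter is sampled from the listed values, while a `uniform` parameter is drawn from the given interval. The sampler below is a simplified illustration, not NNI's actual implementation:

```python
import random

search_space = {
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
    "momentum": {"_type": "uniform", "_value": [0, 1]},
}

def sample(space):
    params = {}
    for name, spec in space.items():
        if spec["_type"] == "choice":
            params[name] = random.choice(spec["_value"])  # pick one listed value
        elif spec["_type"] == "uniform":
            low, high = spec["_value"]
            params[name] = random.uniform(low, high)      # draw from [low, high]
    return params

config = sample(search_space)
print(config)
```

NNI supports further `_type`s (e.g. `loguniform`, `quniform`) documented in its search-space reference.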
Modify the model code
- Write or adapt the network as needed; below is the PyTorch version of the architecture
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, hidden_size):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, hidden_size)
        self.fc2 = nn.Linear(hidden_size, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))      # rectified linear activation
        x = F.max_pool2d(x, 2, 2)      # max pooling
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1) # log(softmax(x)); softmax outputs a probability distribution summing to 1
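In the trial script the sampled configuration is typically fetched with `nni.get_next_parameter()` and the final metric reported with `nni.report_final_result()`. The sketch below falls back to defaults when NNI is not driving the run; the default values and the `get_params` helper are assumptions, not part of NNI:

```python
try:
    import nni
except ImportError:  # allow running the script standalone, without NNI installed
    nni = None

def get_params():
    # defaults used when the script is run outside an NNI experiment
    params = {"batch_size": 32, "hidden_size": 512, "lr": 0.01, "momentum": 0.5}
    if nni is not None:
        # inside an experiment this returns the tuner's sampled configuration
        params.update(nni.get_next_parameter() or {})
    return params

params = get_params()
# model = Net(hidden_size=params["hidden_size"])  # Net as defined above
# ... train and evaluate, then report the metric back to the tuner:
# if nni is not None:
#     nni.report_final_result(test_accuracy)
```

The reported metric is what `optimize_mode` in the experiment configuration maximizes or minimizes.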
Define the experiment configuration
searchSpaceFile: search_space.json
trialCommand: python3 mnist.py # "python3" on Ubuntu, "python" on Windows 10
trialGpuNumber: 0
trialConcurrency: 1
tuner:
  name: TPE # other built-in tuners: Random, Anneal, Evolution, Grid Search, ...
  # pros and cons: https://nni.readthedocs.io/zh/stable/Tuner/BuiltinTuner.html
  classArgs:
    optimize_mode: maximize # or minimize
trainingService:
  platform: local
Run
nnictl create --config config.yml
Neural Architecture Search
Modify the model code
from collections import OrderedDict
import torch
import torch.nn as nn
import torch.nn.functional as F
from nni.nas.pytorch.mutables import LayerChoice, InputChoice

class Net(nn.Module):
    def __init__(self, hidden_size):
        super(Net, self).__init__()
        # give conv1 two candidates; an OrderedDict names each choice,
        # which makes the results easier to read later
        self.conv1 = LayerChoice(OrderedDict([
            ("conv5x5", nn.Conv2d(1, 20, 5, 1)),
            ("conv3x3", nn.Conv2d(1, 20, 3, 1))
        ]), key='first_conv')
        # give the middle conv two candidates; a plain list keeps the code short
        self.mid_conv = LayerChoice([
            nn.Conv2d(20, 20, 3, 1, padding=1),
            nn.Conv2d(20, 20, 5, 1, padding=2)
        ], key='mid_conv')
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, hidden_size)
        self.fc2 = nn.Linear(hidden_size, 10)
        # skip connection over mid_conv
        self.input_switch = InputChoice(n_candidates=2,
                                        n_chosen=1,
                                        key='skip')

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        old_x = x
        x = F.relu(self.mid_conv(x))
        zero_x = torch.zeros_like(old_x)
        skip_x = self.input_switch([zero_x, old_x])
        x = torch.add(x, skip_x)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
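Conceptually, a LayerChoice holds several candidate layers and the search algorithm activates one of them per sampled architecture. A framework-free sketch of that idea (`MockLayerChoice` is a made-up illustration, not NNI's implementation):

```python
class MockLayerChoice:
    """Holds named candidate operations; exactly one is active per architecture."""

    def __init__(self, candidates, key):
        self.candidates = candidates          # name -> callable
        self.key = key
        self.chosen = next(iter(candidates))  # default to the first candidate

    def choose(self, name):
        self.chosen = name                    # the search algorithm fixes the choice

    def __call__(self, x):
        return self.candidates[self.chosen](x)

# two toy "layers" standing in for conv5x5 / conv3x3
first_conv = MockLayerChoice(
    {"conv5x5": lambda x: x * 5, "conv3x3": lambda x: x * 3},
    key="first_conv",
)
first_conv.choose("conv3x3")  # e.g. the tuner sampled conv3x3 for this trial
print(first_conv(2))          # → 6
```

InputChoice works the same way, except the choice is over which input tensors to pass through (here: the skip path or a zero tensor).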
Generate the search space
nnictl ss_gen --trial_command="python mnist.py" --trial_dir=./ --file=search.json
- The content below is generated fully automatically; it lists every choice created for the network layers
{
"first_conv": {
"_type": "layer_choice",
"_value": [
"conv5x5",
"conv3x3"
]
},
"mid_conv": {
"_type": "layer_choice",
"_value": [
"0",
"1"
]
},
"skip": {
"_type": "input_choice",
"_value": {
"candidates": [
"",
""
],
"n_chosen": 1
}
}
}
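Since each `layer_choice`/`input_choice` entry above is sampled independently, the generated space can be enumerated. A quick sketch counting how many distinct architectures it contains (2 × 2 × 2 = 8):

```python
import json
from math import prod

# the search space generated by nnictl ss_gen above
search_space = json.loads("""
{
  "first_conv": {"_type": "layer_choice", "_value": ["conv5x5", "conv3x3"]},
  "mid_conv":   {"_type": "layer_choice", "_value": ["0", "1"]},
  "skip":       {"_type": "input_choice",
                 "_value": {"candidates": ["", ""], "n_chosen": 1}}
}
""")

def n_options(spec):
    if spec["_type"] == "layer_choice":
        return len(spec["_value"])                # one candidate layer is picked
    if spec["_type"] == "input_choice":
        return len(spec["_value"]["candidates"])  # n_chosen = 1: pick one input
    return 1

total = prod(n_options(s) for s in search_space.values())
print(total)  # → 8
```

With such a small space an exhaustive tuner would suffice; random search, as configured below, is useful when the space grows combinatorially.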
Define the experiment configuration
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
# choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search.json
useAnnotation: False
tuner:
  codeDir: ../../../tuners/random_nas_tuner
  classFileName: random_nas_tuner.py
  className: RandomNASTuner
trial:
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
Run
nnictl create --config config_random_search.yml