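The MXNet API
The model API in mxnet is not really a separate API. It is just a wrapper around ndarray that makes it easier to use.
Training a model
Training a model takes two steps: first build the network with symbol, then call model.FeedForward.create to create the model. The code below builds a two-layer neural network.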
import mxnet as mx

# configure a two layer neural network
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(fc1, name='relu1', act_type='relu')
fc2 = mx.symbol.FullyConnected(act1, name='fc2', num_hidden=64)
softmax = mx.symbol.SoftmaxOutput(fc2, name='sm')

# create a model (data_set and num_epoch are assumed to be defined elsewhere)
model = mx.model.FeedForward.create(
    softmax,
    X=data_set,
    num_epoch=num_epoch,
    learning_rate=0.01)
You can also construct and fit a model in the scikit-learn style:
# create a model using sklearn-style two step way
model = mx.model.FeedForward(
    softmax,
    num_epoch=num_epoch,
    learning_rate=0.01)
model.fit(X=data_set)
For more features, see the Model API Reference.
Saving a model
# save a model to mymodel-symbol.json and mymodel-0100.params
prefix = 'mymodel'
iteration = 100
model.save(prefix, iteration)

# load model back
model_loaded = mx.model.FeedForward.load(prefix, iteration)
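We usually train on the data in one script and save the result under a prefix plus a sequence number, for example mymodel-0100.params. A second script then loads the model and runs prediction.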
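The naming convention is easy to reproduce. The sketch below is plain Python and does not call MXNet. The helper name checkpoint_paths is hypothetical, and the zero-padded four-digit sequence number is inferred from the mymodel-0100.params example above.

```python
def checkpoint_paths(prefix, epoch):
    """Return the (symbol_file, params_file) names for a checkpoint.

    Hypothetical helper: it only mirrors the file naming shown above
    and does not read or write any model data.
    """
    symbol_file = "%s-symbol.json" % prefix
    params_file = "%s-%04d.params" % (prefix, epoch)
    return symbol_file, params_file


symbol_file, params_file = checkpoint_paths("mymodel", 100)
print(symbol_file)  # mymodel-symbol.json
print(params_file)  # mymodel-0100.params
```

A prediction script can use the same helper to check that both files exist before trying to load the model.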
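Periodic checkpointing
Saving a checkpoint periodically during training is well worth doing. To do so, simply pass the callback do_checkpoint(path) to the training function. Training will then automatically save a checkpoint to the given location at the end of every epoch.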
prefix = 'models/chkpt'
model = mx.model.FeedForward.create(
    softmax,
    X=data_set,
    epoch_end_callback=mx.callback.do_checkpoint(prefix),
    ...)
You can load a saved checkpoint later with FeedForward.load.
Using multiple devices
Simply set ctx to the list of devices (CPU, GPU) you want to train on.
devices = [mx.gpu(i) for i in range(num_device)]
model = mx.model.FeedForward.create(
    softmax,
    X=data_set,
    ctx=devices,
    ...)
Training will then run in parallel on the GPUs you specified.
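To give a feel for what training in parallel involves, one common scheme is data parallelism: each batch is divided among the devices, and every device works on its own share. The sketch below shows only that slicing step in plain Python. The helper split_batch is hypothetical and uses a simple even split; it is not MXNet's own implementation.

```python
def split_batch(batch_size, num_devices):
    """Split range(batch_size) into one contiguous slice per device.

    Hypothetical helper: an even split in which the first devices take
    the remainder. It illustrates the idea only.
    """
    base, extra = divmod(batch_size, num_devices)
    slices = []
    start = 0
    for i in range(num_devices):
        size = base + (1 if i < extra else 0)
        slices.append(slice(start, start + size))
        start += size
    return slices


batch = list(range(10))
for device_id, part in enumerate(split_batch(len(batch), 4)):
    print("device %d gets samples %s" % (device_id, batch[part]))
```

With 10 samples and 4 devices the shares are 3, 3, 2 and 2 samples.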
Model API
The MXNet model module
mxnet.model.BatchEndParam
Alias of BatchEndParams.
mxnet.model.save_checkpoint(prefix, epoch, symbol, arg_params, aux_params)
Checkpoint the model data into file.
Parameters:
prefix (str) - Prefix of the model name (it may include a directory).
epoch (int) - The epoch number of the model. (In machine learning, an epoch is one complete pass over all training samples, including forward and backward propagation, so it is a much coarser unit than a single iteration.)
symbol (Symbol) - The input symbol.
arg_params (dict of str to NDArray) - Model parameters: the dictionary of network weights.
aux_params (dict of str to NDArray) - Model parameters: the dictionary of auxiliary states.
Notes:
prefix-symbol.json will be saved for the symbol.
prefix-epoch.params will be saved for the parameters.
A model normally has a single symbol file but can have many params files, typically one per epoch. You can delete the params files you no longer need. Later checkpoints are usually, though not always, the better ones, so in many cases it is enough to keep only the last params file.
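Cleaning up old params files can be scripted. The sketch below works on a list of file names only and deletes nothing. The helper latest_epoch is hypothetical, and the four-digit file naming is assumed from the mymodel-0100.params example earlier in this post.

```python
import re


def latest_epoch(filenames, prefix):
    """Return the highest epoch among 'prefix-NNNN.params' names, or None."""
    pattern = re.compile(re.escape(prefix) + r"-(\d+)\.params$")
    epochs = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            epochs.append(int(match.group(1)))
    return max(epochs) if epochs else None


files = ["mymodel-symbol.json", "mymodel-0001.params",
         "mymodel-0002.params", "mymodel-0010.params"]
newest = latest_epoch(files, "mymodel")
# Every params file except the newest one is a candidate for deletion.
stale = [f for f in files
         if f.endswith(".params") and f != "mymodel-%04d.params" % newest]
print(newest)  # 10
print(stale)   # ['mymodel-0001.params', 'mymodel-0002.params']
```

Before deleting anything, it is worth checking the validation score of each epoch, since the last checkpoint is not guaranteed to be the best.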
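mxnet.model.load_checkpoint(prefix, epoch)
Load model checkpoint from file.
Parameters:
prefix (str) - Prefix of the model name.
epoch (int) - The epoch number of the checkpoint you want to load, usually the largest one.
Returns:
symbol (Symbol) - The symbol configuration of the computation network.
arg_params (dict of str to NDArray) - Model parameters: the dictionary of network weights.
aux_params (dict of str to NDArray) - Model parameters: the dictionary of auxiliary states.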
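class mxnet.model.FeedForward(symbol, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, numpy_batch_size=128, arg_params=None, aux_params=None, allow_extra_params=False, begin_epoch=0, **kwargs)
Model class of MXNet for training and predicting feedforward nets. This class is designed for a single-data, single-output supervised network.
Parameters:
symbol (Symbol) - The symbol configuration of the computation network.
ctx (Context or list of Context, optional) - The device(s) used for training and prediction. To use multiple GPUs, pass a list of GPU contexts.
num_epoch (int, optional) - The number of training epochs.
epoch_size (int, optional) - The number of batches in one epoch. Defaults to ceil(num_train_examples / batch_size), that is, the number of training samples divided by the batch size, rounded up.
optimizer (str or Optimizer, optional) - Training parameter: the name of the optimizer, or the optimizer object, used for training.
initializer (initializer function, optional) - Training parameter: the initialization scheme to use.
numpy_batch_size (int, optional) - The batch size of the training data. Only needed when the input arrays are numpy arrays.
arg_params (dict of str to NDArray) - Model parameters: the dictionary of network weights.
aux_params (dict of str to NDArray) - Model parameters: the dictionary of auxiliary states.
allow_extra_params (boolean, optional) - Whether to allow extra parameters in arg_params and aux_params that the symbol does not need. If true, no error is raised when more parameters are supplied than are required.
begin_epoch (int, optional) - The epoch at which training begins; the epochs after it are trained again.
kwargs (dict) - Extra keyword arguments, which are passed on to the optimizer.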
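The default for epoch_size is simple arithmetic. A minimal sketch of the formula quoted above, in plain Python (the function name default_epoch_size is hypothetical):

```python
import math


def default_epoch_size(num_train_examples, batch_size):
    """Number of batches needed to cover every training sample once."""
    return int(math.ceil(num_train_examples / float(batch_size)))


print(default_epoch_size(60000, 128))  # 469
print(default_epoch_size(1000, 100))   # 10
```

In the first case the last batch is only partly filled, which is why the result is rounded up rather than down.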
predict(X, num_batch=None, return_data=False, reset=True)
Run the prediction. Only one device is ever used.
Parameters:
X (mxnet.DataIter) - The input data.
num_batch (int or None) - The number of batches to run. Goes through all batches if None.
Returns: y - The predicted value of the output.
Return type: numpy.ndarray, or a list of numpy.ndarray if the network has multiple outputs.
score(X, eval_metric='acc', num_batch=None, batch_end_callback=None, reset=True)
Run the model on X and calculate the score with eval_metric.
Parameters:
X (mxnet.DataIter) - The input data.
eval_metric (metric.metric) - The metric for calculating the score.
num_batch (int or None) - The number of batches to run. Goes through all batches if None.
Returns: s - The final score.
Return type: float
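fit(X, y=None, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, monitor=None, eval_batch_end_callback=None)
Fit the model.
Parameters:
X - The training data.
y - The training labels. They may be two-dimensional as long as the second dimension is 1. The number of labels must match the number of input samples.
eval_data - The evaluation (validation) data, given as a pair (valid_data, valid_label).
eval_metric - The evaluation metric.
epoch_end_callback - Called at the end of every epoch. Typically used for checkpointing.
batch_end_callback - Called at the end of every batch. Typically used only to print progress.
kvstore - Usually left unchanged; it is almost always 'local'.
logger - When not specified, the default logger is used.
work_load_list - The list of workloads for the different devices, in the same order as ctx.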
save(prefix, epoch=None)
Checkpoint the model into file. You can also use pickle to do the job if you only work in Python. The advantage of load/save is that the file is language agnostic: a file saved using save can be loaded by the other language bindings of mxnet. You also get the benefit of being able to load and save directly from cloud storage (S3, HDFS).
Parameters: prefix (str) - Prefix of the model name.
Notes:
prefix-symbol.json will be saved for the symbol.
prefix-epoch.params will be saved for the parameters.
static load(prefix, epoch, ctx=None, **kwargs)
Load model checkpoint from file.
Returns: model - The loaded model that can be used for prediction.
Saving and loading are straightforward, so I will not go into them further.
static create(symbol, X, y=None, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, eval_batch_end_callback=None, **kwargs)
Functional style to create a model. This function will be more consistent with functional languages such as R, where mutation is not allowed.
The parameters of create are much the same as those described above.
The APIs that follow are used less often.
Initialization API Reference
class mxnet.initializer.Initializer
Base class for Initializer.
__call__(name, arr)
Override the () function to do initialization.
class mxnet.initializer.Load(param, default_init=None, verbose=False)
Initialize by loading pretrained params from a file or dict.
class mxnet.initializer.Mixed(patterns, initializers)
Initialize with mixed Initializer.
class mxnet.initializer.Uniform(scale=0.07)
Initialize the weight with uniform [-scale, scale].
Parameters: scale (float, optional) - The scale of the uniform distribution.
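As an illustration of what uniform [-scale, scale] initialization means, here is a plain-Python sketch that fills a small matrix with such values. The function uniform_init and its seed argument are hypothetical; this is not the mxnet.initializer.Uniform implementation.

```python
import random


def uniform_init(shape, scale=0.07, seed=None):
    """Return a rows x cols list of floats drawn uniformly from [-scale, scale]."""
    rows, cols = shape
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


weights = uniform_init((2, 3), scale=0.07, seed=0)
for row in weights:
    print(row)
```

Every value lies within the range given by scale, so a smaller scale starts the network with smaller weights.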
class mxnet.initializer.Normal(sigma=0.01)
Initialize the weight with normal(0, sigma).
Parameters: sigma (float, optional) - Standard deviation for the gaussian distribution.
class mxnet.initializer.Orthogonal(scale=1.414, rand_type='uniform')
Initialize the weight as an orthogonal matrix.
class mxnet.initializer.Xavier(rnd_type='uniform', factor_type='avg', magnitude=3)
Initialize the weight with Xavier or a similar initialization scheme.
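The parameter descriptions for Xavier did not survive in the listing above, so the following is only a sketch of one common form of the scheme. It assumes that factor_type selects the fan-in, the fan-out, or their average, and that the sampling range is sqrt(magnitude / factor). The values 'in' and 'out' and the function xavier_scale are assumptions, not taken from the listing.

```python
import math


def xavier_scale(fan_in, fan_out, factor_type="avg", magnitude=3.0):
    """Return the sampling scale under the assumptions stated above."""
    if factor_type == "avg":
        factor = (fan_in + fan_out) / 2.0
    elif factor_type == "in":
        factor = float(fan_in)
    elif factor_type == "out":
        factor = float(fan_out)
    else:
        raise ValueError("unknown factor_type: %r" % factor_type)
    return math.sqrt(magnitude / factor)


# A layer with 128 inputs and 64 outputs, like fc2 in the first example.
scale = xavier_scale(fan_in=128, fan_out=64)
print("%.4f" % scale)  # 0.1768
```

With rnd_type='uniform' the weights would then be drawn from [-scale, scale], so wider layers start with smaller weights.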
Evaluation Metric API
Online evaluation metric module.
mxnet.metric.check_label_shapes(labels, preds, shape=0)
Check to see if the two arrays are the same size.
class mxnet.metric.EvalMetric(name, num=None)
Base class of all evaluation metrics.
update(label, pred)
Update the internal evaluation.
reset()
Clear the internal statistics to the initial state.
get()
Get the current evaluation result.
get_name_value()
Get zipped name and value pairs.
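The update / reset / get cycle is easiest to see on a small example. The class below is a plain-Python sketch of an accuracy metric with the same three methods. SimpleAccuracy is a hypothetical name, and it works on Python lists rather than NDArray, so it is not the mxnet.metric code.

```python
class SimpleAccuracy(object):
    """Running accuracy with an update/reset/get interface."""

    def __init__(self, name="accuracy"):
        self.name = name
        self.reset()

    def reset(self):
        # Clear the internal statistics to the initial state.
        self.num_correct = 0
        self.num_seen = 0

    def update(self, labels, preds):
        # labels: list of int class ids; preds: list of per-class score lists.
        for label, scores in zip(labels, preds):
            predicted = max(range(len(scores)), key=scores.__getitem__)
            self.num_correct += int(predicted == label)
            self.num_seen += 1

    def get(self):
        if self.num_seen == 0:
            return self.name, float("nan")
        return self.name, self.num_correct / float(self.num_seen)


metric = SimpleAccuracy()
metric.update([0, 1], [[0.9, 0.1], [0.8, 0.2]])  # 1 of 2 correct
metric.update([1], [[0.3, 0.7]])                 # 1 of 1 correct
print(metric.get())  # ('accuracy', 0.6666666666666666)
metric.reset()       # start again, for example at the next epoch
```

The metric accumulates over batches, which is why reset is called between epochs.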
class mxnet.metric.CompositeEvalMetric(**kwargs)
Manage multiple evaluation metrics.
add(metric)
Add a child metric.
get_metric(index)
Get a child metric.
class mxnet.metric.Accuracy
Calculate accuracy.
class mxnet.metric.TopKAccuracy(**kwargs)
Calculate top k predictions accuracy.
class mxnet.metric.F1
Calculate the F1 score of a binary classification problem.
class mxnet.metric.MAE
Calculate Mean Absolute Error loss.
class mxnet.metric.MSE
Calculate Mean Squared Error loss.
class mxnet.metric.RMSE
Calculate Root Mean Squared Error loss.
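For reference, the three error measures can be written in a few lines of plain Python using their standard textbook definitions. The function names are hypothetical and operate on flat lists of numbers, not on NDArray.

```python
import math


def mae(labels, preds):
    """Mean absolute error."""
    return sum(abs(l - p) for l, p in zip(labels, preds)) / float(len(labels))


def mse(labels, preds):
    """Mean squared error."""
    return sum((l - p) ** 2 for l, p in zip(labels, preds)) / float(len(labels))


def rmse(labels, preds):
    """Root mean squared error."""
    return math.sqrt(mse(labels, preds))


labels = [1.0, 2.0, 3.0, 4.0]
preds = [1.0, 4.0, 3.0, 4.0]
print(mae(labels, preds))   # 0.5
print(mse(labels, preds))   # 1.0
print(rmse(labels, preds))  # 1.0
```

The single error of 2.0 counts for more under MSE than under MAE because it is squared before averaging.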
class mxnet.metric.CrossEntropy
Calculate Cross Entropy loss.
class mxnet.metric.Torch
Dummy metric for torch criterions.
class mxnet.metric.CustomMetric(feval, name=None, allow_extra_outputs=False)
Custom evaluation metric that takes an NDArray function.
mxnet.metric.np(numpy_feval, name=None, allow_extra_outputs=False)
Create a customized metric from a numpy function.
mxnet.metric.create(metric, **kwargs)
Create an evaluation metric.
Parameters: metric (str or callable) - The name of the metric, or a function providing statistics given pred, label NDArray.
Optimization API
Common optimization algorithms with regularizations.
class mxnet.optimizer.Optimizer(rescale_grad=1.0, param_idx2name=None, wd=0.0, clip_gradient=None, learning_rate=0.01, lr_scheduler=None, sym=None)
Base class of all optimizers.
static register(klass)
Register optimizers to the optimizer factory.
static create_optimizer(name, rescale_grad=1, **kwargs)
Create an optimizer with the specified name.
Returns: opt - The result optimizer.
create_state(index, weight)
Create additional optimizer state such as momentum. Override in implementations.
update(index, weight, grad, state)
Update the parameters. Override in implementations.
set_lr_scale(args_lrscale)
set_lr_scale is deprecated. Use set_lr_mult instead.
set_lr_mult(args_lr_mult)
Set an individual learning rate multiplier for parameters.
Parameters: args_lr_mult (dict of string/int to float) - Sets the lr multiplier for a name/index to a float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.
set_wd_mult(args_wd_mult)
Set an individual weight decay multiplier for parameters. By default the wd multiplier is 0 for all params whose name doesn't end with _weight, if param_idx2name is provided.
Parameters: args_wd_mult (dict of string/int to float) - Sets the wd multiplier for a name/index to a float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.
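The effect of the two multipliers can be sketched in plain Python. The helpers effective_lr and default_wd_mult are hypothetical and only mirror the behaviour described above: a per-name multiplier scales the base learning rate, and weight decay is switched off for names that do not end with _weight. The default multiplier of 1.0 for unlisted names is an assumption.

```python
def effective_lr(name, base_lr, lr_mult):
    """Learning rate for one parameter; unlisted names use a multiplier of 1.0."""
    return base_lr * lr_mult.get(name, 1.0)


def default_wd_mult(param_names):
    """Weight decay multipliers: 0.0 unless the name ends with '_weight'."""
    return dict((name, 1.0 if name.endswith("_weight") else 0.0)
                for name in param_names)


names = ["fc1_weight", "fc1_bias", "fc2_weight", "fc2_bias"]
lr_mult = {"fc1_weight": 0.5}  # train fc1_weight at half the base rate

print(effective_lr("fc1_weight", 0.01, lr_mult))  # 0.005
print(effective_lr("fc2_weight", 0.01, lr_mult))  # 0.01
print(default_wd_mult(names)["fc1_bias"])         # 0.0
```

Exempting biases from weight decay is a common choice, since regularizing them rarely helps.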
mxnet.optimizer.register(klass)
Register optimizers to the optimizer factory.
class mxnet.optimizer.SGD(momentum=0.0, **kwargs)
A very simple SGD optimizer with momentum and weight regularization.
create_state(index, weight)
Create additional optimizer state such as momentum.
Parameters: weight (NDArray) - The weight data.
update(index, weight, grad, state)
Update the parameters.
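To make "SGD with momentum and weight regularization" concrete, here is the textbook update rule on a single scalar weight. This is a sketch under stated assumptions: the function sgd_momentum_step is hypothetical, weight decay is added to the gradient as wd * weight, and gradient rescaling and clipping are left out.

```python
def sgd_momentum_step(weight, grad, velocity, lr=0.01, momentum=0.9, wd=0.0):
    """One SGD step; returns the new weight and the new momentum state."""
    new_velocity = momentum * velocity - lr * (grad + wd * weight)
    return weight + new_velocity, new_velocity


w, v = 1.0, 0.0  # the state starts at zero, as create_state would provide
w, v = sgd_momentum_step(w, grad=0.5, velocity=v, lr=0.1)
print("%.3f" % w)  # 0.950
w, v = sgd_momentum_step(w, grad=0.5, velocity=v, lr=0.1)
print("%.3f" % w)  # 0.855
```

The second step is larger than the first even though the gradient is the same, because the momentum term carries over part of the previous update.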
class mxnet.optimizer.NAG(**kwargs)
SGD with Nesterov momentum. It is implemented according to https://github.com/torch/optim/blob/master/sgd.lua
update(index, weight, grad, state)
Update the parameters.
class mxnet.optimizer.SGLD(**kwargs)
Stochastic Langevin Dynamics updater to sample from a distribution.
create_state(index, weight)
Create additional optimizer state such as momentum.
Parameters: weight (NDArray) - The weight data.
update(index, weight, grad, state)
Update the parameters.
class mxnet.optimizer.ccSGD(momentum=0.0, **kwargs)
A very simple SGD optimizer with momentum and weight regularization. Implemented in C++.
update(index, weight, grad, state)
Update the parameters.
class mxnet.optimizer.Adam(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, decay_factor=0.99999999, **kwargs)
Adam optimizer as described in [King2014].
[King2014] Diederik Kingma, Jimmy Ba, Adam: A Method for Stochastic Optimization, http://arxiv.org/abs/1412.6980
The code in this class was adapted from https://github.com/mila-udem/blocks/blob/master/blocks/algorithms/__init__.py#L765
create_state(index, weight)
Create additional optimizer state: mean, variance.
Parameters: weight (NDArray) - The weight data.
update(index, weight, grad, state)
Update the parameters.
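The mean and variance states are the two running averages of the Adam rule. Below is a scalar sketch of the update as given in the Kingma and Ba paper, using the default hyperparameters from the signature above. The function adam_step is hypothetical, and the decay_factor argument is left out, so this is not a copy of the MXNet class.

```python
import math


def adam_step(weight, grad, mean, variance, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam step at time step t (starting from 1)."""
    mean = beta1 * mean + (1.0 - beta1) * grad
    variance = beta2 * variance + (1.0 - beta2) * grad * grad
    # Bias correction compensates for the zero-initialized averages.
    mean_hat = mean / (1.0 - beta1 ** t)
    variance_hat = variance / (1.0 - beta2 ** t)
    weight = weight - learning_rate * mean_hat / (math.sqrt(variance_hat) + epsilon)
    return weight, mean, variance


w, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2.0 * w  # gradient of f(w) = w * w
    w, m, v = adam_step(w, grad, m, v, t)
    print(t, w)
```

Each step moves the weight by roughly learning_rate regardless of the gradient's magnitude, which is the characteristic behaviour of Adam early in training.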
class mxnet.optimizer.AdaGrad(eps=1e-07, **kwargs)
AdaGrad optimizer of Duchi et al., 2011.
This code follows the version in http://arxiv.org/pdf/1212.5701v1.pdf Eq(5) by Matthew D. Zeiler, 2012. AdaGrad will help the network to converge faster in some cases.
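A scalar sketch of the AdaGrad idea: the squared gradients are accumulated, and each step is divided by the square root of that history. The function adagrad_step is hypothetical, and placing eps inside the square root is an assumption about the exact form.

```python
import math


def adagrad_step(weight, grad, history, learning_rate=0.01, eps=1e-7):
    """One AdaGrad step; returns the new weight and the updated history."""
    history = history + grad * grad
    weight = weight - learning_rate * grad / math.sqrt(history + eps)
    return weight, history


w, h = 1.0, 0.0
for step in range(3):
    previous = w
    w, h = adagrad_step(w, grad=2.0, history=h)
    print("step %d moved the weight by %.5f" % (step + 1, previous - w))
```

With a constant gradient the steps shrink over time, because the accumulated history keeps growing.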
class mxnet.optimizer.RMSProp(gamma1=0.95, gamma2=0.9, **kwargs)
RMSProp optimizer of Tieleman & Hinton, 2012.
This code follows the version in http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.
create_state(index, weight)
Create additional optimizer state: mean, variance.
Parameters: weight (NDArray) - The weight data.
update(index, weight, grad, state)
Update the parameters.
Parameters: index - A unique integer key used to index the parameters.
class mxnet.optimizer.AdaDelta(rho=0.9, epsilon=1e-05, **kwargs)
AdaDelta optimizer as described in Zeiler, M. D. (2012). ADADELTA: An adaptive learning rate method.
http://arxiv.org/abs/1212.5701
class mxnet.optimizer.Test(**kwargs)
For test use.
create_state(index, weight)
Create a state to duplicate the weight.
update(index, weight, grad, state)
Performs w += rescale_grad * grad.
mxnet.optimizer.create(name, rescale_grad=1, **kwargs)
Create an optimizer with the specified name.
Returns: opt - The result optimizer.
mxnet.optimizer.get_updater(optimizer)
Return a closure of the updater needed for kvstore.
Parameters: optimizer (Optimizer) - The optimizer.
Returns: updater - The closure of the updater.
Return type: function
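The closure pattern here is worth spelling out: the returned function remembers one optimizer state per parameter index and creates it on first use. The sketch below is plain Python. ToyOptimizer and make_updater are hypothetical stand-ins, and the (index, grad, weight) argument order of the updater is an assumption.

```python
class ToyOptimizer(object):
    """Hypothetical optimizer with a create_state/update interface."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate

    def create_state(self, index, weight):
        return {"num_updates": 0}

    def update(self, index, weight, grad, state):
        # Plain gradient descent, applied in place on a list of floats.
        state["num_updates"] += 1
        for i in range(len(weight)):
            weight[i] -= self.learning_rate * grad[i]


def make_updater(optimizer):
    """Return a closure that keeps one optimizer state per index."""
    states = {}

    def updater(index, grad, weight):
        if index not in states:
            states[index] = optimizer.create_state(index, weight)
        optimizer.update(index, weight, grad, states[index])

    updater.states = states  # exposed only so the example can inspect it
    return updater


updater = make_updater(ToyOptimizer(learning_rate=0.5))
weight = [1.0, 2.0]
updater(0, [2.0, 2.0], weight)
updater(0, [2.0, 2.0], weight)
print(weight)                             # [-1.0, 0.0]
print(updater.states[0]["num_updates"])   # 2
```

Because the state lives inside the closure, the caller only has to pass the index, the gradient and the weight on every call.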
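Author: Spongelady. Source: CSDN. Original article: https://blog.csdn.net/qq_25491201/article/details/51386435?utm_source=copy Copyright notice: this is an original article by the blogger; please include a link to the original post when reposting.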