MXNet optimizer: SGD_GC


MXNet SGD_GC optimizer code:

Source: https://github.com/mnikitin/Gradient-Centralization
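Gradient Centralization (GC) centralizes each weight's gradient to zero mean before the optimizer step, i.e. ∇W ← ∇W − mean(∇W), where the mean is taken over all axes except the output dimension (axis 0). In this implementation, `gc_type='gc'` applies centralization to every parameter with more than one dimension, while `gc_type='gcc'` restricts it to parameters with more than three dimensions, in practice convolution kernels.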

Usage:

```python
import optimizer

opt_params = {'learning_rate': 0.001}
sgd_gc = optimizer.SGDGC(gc_type='gc', **opt_params)
sgd_gcc = optimizer.SGDGC(gc_type='gcc', **opt_params)
adam_gc = optimizer.AdamGC(gc_type='gc', **opt_params)
adam_gcc = optimizer.AdamGC(gc_type='gcc', **opt_params)
```
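Because `_register_gc_opt` also registers each generated class with MXNet's global optimizer registry, the new optimizers can be referenced by their lower-cased names as well, for example from a `gluon.Trainer`. A minimal sketch (my own example, not from the original post, assuming `optimizer.py` sits next to the script):

```python
import mxnet as mx
from mxnet import gluon
import optimizer  # importing the module registers SGDGC, AdamGC, ...

net = gluon.nn.Dense(10, in_units=20)  # toy network, just for illustration
net.initialize()

# registered names are lower-cased, so 'sgdgc' resolves to SGDGC;
# extra keys such as gc_type are forwarded to the optimizer constructor
trainer = gluon.Trainer(net.collect_params(), 'sgdgc',
                        {'learning_rate': 0.1, 'gc_type': 'gc'})
```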

 

The repository's MNIST example can then be run with either variant:

```bash
python3 mnist.py --optimizer sgdgc --gc-type gc --lr 0.1 --seed 42
python3 mnist.py --optimizer adamgc --gc-type gcc --lr 0.001 --seed 42
```

 

The full `optimizer.py`, which dynamically derives a GC variant of every built-in MXNet optimizer:

```python
import mxnet as mx

__all__ = []


def _register_gc_opt():
    # collect every direct subclass of mx.optimizer.Optimizer (SGD, Adam, ...)
    optimizers = dict()
    for name in dir(mx.optimizer):
        obj = getattr(mx.optimizer, name)
        if hasattr(obj, '__base__') and obj.__base__ == mx.optimizer.Optimizer:
            optimizers[name] = obj
    suffix = 'GC'

    def __init__(self, gc_type='gc', **kwargs):
        assert gc_type.lower() in ['gc', 'gcc']
        # 'gc' centralizes all params with ndim > 1, 'gcc' only those with ndim > 3
        self.gc_ndim_thr = 1 if gc_type.lower() == 'gc' else 3
        super(self.__class__, self).__init__(**kwargs)

    def update(self, index, weight, grad, state):
        self._gc_update_impl(
            index, weight, grad, state,
            super(self.__class__, self).update)

    def update_multi_precision(self, index, weight, grad, state):
        self._gc_update_impl(
            index, weight, grad, state,
            super(self.__class__, self).update_multi_precision)

    def _gc_update_impl(self, indexes, weights, grads, states, update_func):
        # centralize gradients: subtract the mean over all axes except axis 0
        if isinstance(indexes, (list, tuple)):
            # multi-index case: SGD optimizer
            for grad in grads:
                if len(grad.shape) > self.gc_ndim_thr:
                    grad -= grad.mean(axis=tuple(range(1, len(grad.shape))), keepdims=True)
        else:
            # single-index case: all other optimizers
            if len(grads.shape) > self.gc_ndim_thr:
                grads -= grads.mean(axis=tuple(range(1, len(grads.shape))), keepdims=True)
        # update weights using the centralized gradients
        update_func(indexes, weights, grads, states)

    inst_dict = dict(
        __init__=__init__,
        update=update,
        update_multi_precision=update_multi_precision,
        _gc_update_impl=_gc_update_impl,
    )

    # create e.g. SGDGC from SGD, AdamGC from Adam, and register each with MXNet
    for k, v in optimizers.items():
        name = k + suffix
        inst = type(name, (v, ), inst_dict)
        mx.optimizer.Optimizer.register(inst)
        globals()[name] = inst
        __all__.append(name)


_register_gc_opt()

if __name__ == '__main__':
    # smoke test: the generated classes are available as module-level names
    opt_params = {'learning_rate': 0.001}
    sgd_gc = SGDGC(gc_type='gc', **opt_params)
    sgd_gcc = SGDGC(gc_type='gcc', **opt_params)
    adam_gc = AdamGC(gc_type='gc', **opt_params)
    adam_gcc = AdamGC(gc_type='gcc', **opt_params)
    print([type(o).__name__ for o in (sgd_gc, sgd_gcc, adam_gc, adam_gcc)])
```
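As a quick sanity check on what `_gc_update_impl` computes, the snippet below (my own illustration, not part of the repo) centralizes a conv-style gradient by hand and verifies that each output-channel slice ends up with zero mean:

```python
import mxnet as mx

# toy gradient for a conv kernel: (out_channels, in_channels, kH, kW)
grad = mx.nd.random.normal(shape=(8, 3, 3, 3))

# subtract the mean over all axes except axis 0, as _gc_update_impl does
grad -= grad.mean(axis=tuple(range(1, len(grad.shape))), keepdims=True)

# every per-output-channel mean is now numerically zero
print(grad.mean(axis=tuple(range(1, len(grad.shape)))))
```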

 

