Training your own neural network on two GPUs with TensorFlow, and fixing the errors along the way

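Preface

I haven't written a blog post in a long time and I'm almost out of points, so here is a quick write-up with a demo to share. Hopefully it earns a few points.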

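Model

The model is deliberately simple, because the focus here is parallel training. The network is a trimmed-down VGG of my own, and it also works for small classification and recognition tasks. My use case is industrial recognition, so both the input and the model are fairly simple: the input is 128*128*1 and the output is num_class.

Model code, implemented with slim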

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu):
    net = preprocessed_inputs
    net = slim.repeat(net, 1, slim.conv2d, 32, [5, 5], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1', stride=2)
    net = slim.repeat(net, 1, slim.conv2d, 64, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2',stride = 2)
    net = slim.repeat(net, 1, slim.conv2d, 128, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3',stride = 2)
    net = slim.repeat(net, 1, slim.conv2d, 256, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4',stride = 2)
    net = slim.repeat(net, 1, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5',stride = 2)
    net = slim.flatten(net, scope='flatten')
    net = slim.dropout(net, keep_prob=0.7,
                       is_training=self._is_training)
    net = slim.fully_connected(net, 1024, scope='fc1')
    net = slim.fully_connected(net, 64, scope='fc2')
    net = slim.fully_connected(net, self.num_classes,
                               activation_fn=None, scope='fc3')
prediction_dict = {'logits': net}
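
To see how large the tensor entering fc1 is, the spatial sizes can be traced by hand. The following is a minimal pure-Python sketch, not part of the original code. It assumes slim's defaults of stride 1 with SAME padding for slim.conv2d and VALID padding for slim.max_pool2d; the helper names conv_same, pool_valid and trace_shapes are made up for this illustration.

```python
def conv_same(size, stride=1):
    """Spatial size after a SAME-padded convolution: ceil(size / stride)."""
    return -(-size // stride)


def pool_valid(size, kernel=2, stride=2):
    """Spatial size after a VALID-padded pooling layer."""
    return (size - kernel) // stride + 1


def trace_shapes(input_size=128, channels=(32, 64, 128, 256, 512)):
    """Return [(layer_name, height, width, depth), ...] for the conv/pool stack."""
    shapes = []
    size = input_size
    for i, depth in enumerate(channels, start=1):
        size = conv_same(size)          # conv keeps the spatial size
        shapes.append(('conv%d' % i, size, size, depth))
        size = pool_valid(size)         # each 2x2 / stride-2 pool halves it
        shapes.append(('pool%d' % i, size, size, depth))
    return shapes


shapes = trace_shapes()
for name, h, w, d in shapes:
    print('%-6s %3d x %3d x %3d' % (name, h, w, d))

# The flatten layer turns the last feature map into one vector per image.
_, h, w, d = shapes[-1]
flattened = h * w * d
print('flatten -> %d features into fc1' % flattened)
```

Under these assumptions the 128*128 input shrinks to 4*4 after the five pooling layers, so fc1 receives 4*4*512 features per image.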

Training

For training, we first need a function that averages the gradients computed on each GPU

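def average_gradients(tower_grads):
    # tower_grads holds one list of (gradient, variable) pairs per GPU.
    average_grads = []
    # zip(*...) groups the pairs that belong to the same variable across towers.
    for grad_and_vars in zip(*tower_grads):
        grads = []
        for g, _ in grad_and_vars:
            # Add a leading "tower" axis so the gradients can be stacked.
            expanded_g = tf.expand_dims(g, 0)
            grads.append(expanded_g)
        # Stack along the tower axis, then average over it.
        grad = tf.concat(grads, 0)
        grad = tf.reduce_mean(grad, 0)
        # Variables are shared across towers, so use the first tower's reference.
        v = grad_and_vars[0][1]
        grad_and_var = (grad, v)
        average_grads.append(grad_and_var)
    return average_grads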
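
The regrouping done by zip(*tower_grads) is easier to follow on plain numbers. Here is a small sketch in pure Python, where each gradient is a float instead of a tensor. The function name average_gradients_py and the variable names 'w' and 'b' are placeholders for this example, not names from the original code.

```python
def average_gradients_py(tower_grads):
    """Pure-Python analogue of average_gradients, using floats as gradients."""
    averaged = []
    # zip(*tower_grads) turns per-tower lists into per-variable tuples:
    # ((grad_gpu0, var), (grad_gpu1, var)) for each variable in turn.
    for grad_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grad_and_vars]
        mean_grad = sum(grads) / len(grads)
        # Every tower refers to the same variable, so take the first one.
        var = grad_and_vars[0][1]
        averaged.append((mean_grad, var))
    return averaged


tower_grads = [
    [(0.25, 'w'), (1.0, 'b')],  # gradients computed on GPU 0
    [(0.75, 'w'), (3.0, 'b')],  # gradients computed on GPU 1
]
result = average_gradients_py(tower_grads)
print(result)
```

This only works if every tower lists its variables in the same order, which holds here because each tower calls compute_gradients on the same shared variables.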

Then build the training graph across both GPUs

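# `optimizer`, `inputs` and `labels` are assumed to be defined earlier in the script.
tower_grads = []  # one list of (gradient, variable) pairs per GPU
with tf.variable_scope(tf.get_variable_scope()):
    for i in range(2):
        with tf.device('/gpu:%d' % i):
            # Build one copy of the model (a "tower") on GPU i.
            cls_model = model_1_128.Model(is_training=True, num_classes=7)
            preprocessed_inputs = cls_model.preprocess(inputs)
            prediction_dict = cls_model.predict_2(preprocessed_inputs)
            postprocessed_dict = cls_model.postprocess(prediction_dict)
            classes = postprocessed_dict['classes']
            classes_ = tf.identity(classes, name='classes')
            # The first tower has created the variables; later towers reuse them.
            tf.get_variable_scope().reuse_variables()
            loss_dict = cls_model.loss(prediction_dict, labels)
            loss = loss_dict['loss']
            # Per-tower gradients; they are averaged after the loop.
            grads = optimizer.compute_gradients(loss)
            tower_grads.append(grads)
# Average across towers and apply a single update to the shared variables.
grads = average_gradients(tower_grads)
apply_gradient_op = optimizer.apply_gradients(grads)
train_op = apply_gradient_op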

When creating the session, make sure to pass the following config

config=tf.ConfigProto(allow_soft_placement=True)

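Otherwise it is very easy to hit a "device not found" error. With allow_soft_placement enabled, TensorFlow falls back to an available device when an op cannot be placed on the one requested. The version used here is TensorFlow 1.6.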

The full code can be downloaded from https://download.csdn.net/download/bomingzi/11564725. Dataset files are named image_%d_%d {count,num_class}.
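
As an illustration of that naming scheme, the sketch below pulls the two numbers out of a file name. It is my reading of the convention rather than code from the download: it assumes the first number is a running count, the second is the class label, and that a file extension may or may not follow. The function name parse_image_name is hypothetical.

```python
import os
import re

# Assumed pattern: "image_<count>_<class>", optionally followed by an extension.
_NAME_RE = re.compile(r'^image_(\d+)_(\d+)(?:\.\w+)?$')


def parse_image_name(path):
    """Return (count, class_label) parsed from a file name, or raise ValueError."""
    name = os.path.basename(path)
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError('unexpected file name: %r' % name)
    return int(match.group(1)), int(match.group(2))


print(parse_image_name('image_12_3.png'))
```

Raising on names that do not match keeps stray files in the dataset directory from being silently mislabeled.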
