How do I use Keras for distributed / multi-GPU computation?


mishidemudong


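Keras can run distributed and multi-GPU computation when it uses TensorFlow as its backend; this support is provided by TensorFlow itself rather than by Keras. The snippets below use the TensorFlow 1.x graph API (tf.placeholder, tf.Session). To run different parts of a graph on different GPUs, wrap the corresponding Keras layers in TensorFlow device scopes: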

import tensorflow as tf
from keras.layers import LSTM

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/gpu:1'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:1

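Note that the variables created by the LSTM layers in the example above do not live on the GPU: all TensorFlow variables always live on the CPU, regardless of the device scope in which they were created. TensorFlow handles the transfer of variables between devices automatically.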
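With more than two GPUs, the per-device blocks above are usually generated in a loop rather than written out by hand. The following is a minimal sketch of building the device strings; the helper name `gpu_device_names` is hypothetical and not part of Keras or TensorFlow.

```python
def gpu_device_names(num_gpus):
    """Return TensorFlow-style device strings: '/gpu:0', '/gpu:1', ..."""
    return ['/gpu:%d' % i for i in range(num_gpus)]


devices = gpu_device_names(2)
print(devices)  # ['/gpu:0', '/gpu:1']

# With TensorFlow 1.x and Keras installed, each name would be used as:
#   for name in devices:
#       with tf.device(name):
#           y = LSTM(32)(x)
```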

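If you want to train several replicas of the same model on different GPUs while sharing the same weights across the replicas, first instantiate the model under one device scope, then call that same model instance several times under different device scopes, for example: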

import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))

    # shared model living on CPU:0
    # it won't actually be run during training; it acts as an op template
    # and as a repository for shared variables
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

# replica 0
with tf.device('/gpu:0'):
    output_0 = model(x)  # all ops in the replica will live on GPU:0

# replica 1
with tf.device('/gpu:1'):
    output_1 = model(x)  # all ops in the replica will live on GPU:1

# merge outputs on CPU
with tf.device('/cpu:0'):
    preds = 0.5 * (output_0 + output_1)

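# assumed to be defined beforehand: `sess` is a tf.Session and
# `data` is an array of shape (batch_size, 784)
# we only run the `preds` tensor, so that only the two
# replicas on GPU get run (plus the merge op on CPU)
output_value = sess.run([preds], feed_dict={x: data})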
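The example above feeds the same `x` to both replicas and averages their outputs. A common variant for data parallelism gives each replica a different slice of the batch and concatenates the results; because the weights are shared, the merged output should match a single pass over the whole batch. The following is a minimal pure-Python sketch of that idea with no TensorFlow involved. The names `dense`, `run_replica`, `shared_weights`, and `shared_bias` are hypothetical, and the "devices" exist only in the comments.

```python
def dense(batch, weights, bias):
    """Apply y = xW + b to every row of `batch` (plain lists of floats)."""
    columns = list(zip(*weights))  # one tuple of weights per output unit
    return [
        [sum(x_i * w_i for x_i, w_i in zip(row, col)) + b
         for col, b in zip(columns, bias)]
        for row in batch
    ]


# One set of weights shared by every replica (3 inputs -> 2 outputs).
shared_weights = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0]]
shared_bias = [0.5, -0.5]


def run_replica(shard):
    """Run the shared model on one slice of the batch."""
    return dense(shard, shared_weights, shared_bias)


batch = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
half = len(batch) // 2

out_0 = run_replica(batch[:half])  # would be placed on GPU:0
out_1 = run_replica(batch[half:])  # would be placed on GPU:1
merged = out_0 + out_1             # concatenation, would run on the CPU

# Sharing weights means the split run matches one pass over the full batch.
assert merged == run_replica(batch)
```

In a TensorFlow 1.x graph the same split and merge would be expressed with tensor slicing and concatenation ops inside the corresponding device scopes.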

To run distributed training, register Keras with a TensorFlow session that is connected to a cluster:

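import tensorflow as tf
from keras import backend as K

# start a single-process server on the local host and connect a session to it
server = tf.train.Server.create_local_server()
sess = tf.Session(server.target)

# make Keras use this session
K.set_session(sess)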
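`create_local_server()` only starts a single-process cluster on the local machine. For a cluster spanning several processes, TensorFlow 1.x describes the cluster as a mapping from job names to lists of `host:port` addresses. The sketch below builds such a mapping in plain Python; the helper `make_cluster_dict` and the `localhost` addresses are placeholders chosen for illustration, not values from the original article.

```python
def make_cluster_dict(ps_hosts, worker_hosts):
    """Build the job-name -> address-list mapping accepted by tf.train.ClusterSpec."""
    return {"ps": list(ps_hosts), "worker": list(worker_hosts)}


# Placeholder addresses: one parameter server and two workers.
cluster = make_cluster_dict(
    ps_hosts=["localhost:2222"],
    worker_hosts=["localhost:2223", "localhost:2224"],
)
print(cluster["worker"])

# With TensorFlow 1.x and Keras installed, each process would then run
# something like the following (shown here for the first worker):
#   server = tf.train.Server(tf.train.ClusterSpec(cluster),
#                            job_name="worker", task_index=0)
#   sess = tf.Session(server.target)
#   K.set_session(sess)
```

Each process in the cluster would use the same mapping but its own `job_name` and `task_index`.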

For more information on distributed training, see here.
