Running TensorFlow on GPUs

This article shows how to use GPUs for computation in TensorFlow: observing where operations are placed by setting log_device_placement, pinning operations to a specific GPU with tf.device, and running work on several GPUs in parallel. The examples demonstrate both manual and automatic placement of computations on different GPUs.


 

First, you can set log_device_placement=True in the session config to see which device each part of the computation runs on:

 

import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))

 

The output looks like this:

 

. . .

Device mapping:

/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:08:00.0

. . .

b: /job:localhost/replica:0/task:0/gpu:0

a: /job:localhost/replica:0/task:0/gpu:0

MatMul: /job:localhost/replica:0/task:0/gpu:0

[[ 22.  28.]
 [ 49.  64.]]

 

 

If you don't want the system to choose the device for you, you can set it yourself by creating a device context with tf.device. When a system has several GPUs, the GPU with the lowest ID is selected automatically; if we want the computation to run on a different GPU instead, we use tf.device:

 

import tensorflow as tf

with tf.device('/gpu:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))
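Note that if the machine does not actually have a /gpu:2, the session above fails with an InvalidArgumentError. A minimal workaround, assuming TensorFlow 1.x, is to enable soft placement so TensorFlow falls back to an available device (allow_soft_placement is a standard ConfigProto field, though the original article does not use it):

sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True,   # fall back to an existing device if /gpu:2 is absent
    log_device_placement=True))
print(sess.run(c))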

 

 

If we have several GPUs and want them to work on a problem in parallel, it's actually quite simple: just prefix each piece of work with with tf.device():. For example, you can configure the GPUs as follows (the author is being lazy here and gives both GPUs the same job):

 

import tensorflow as tf

c = []
for d in ['/gpu:2', '/gpu:3']:
    with tf.device(d):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))

with tf.device('/cpu:0'):
    sum = tf.add_n(c)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(sum))

 

The resulting output:

 

. . .

Device mapping:

/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c

/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K40c

/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K40c

/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K40c

. . .

. . .

Const_3: /job:localhost/replica:0/task:0/gpu:3

I tensorflow/core/common_runtime/simple_placer.cc:289] Const_3: /job:localhost/replica:0/task:0/gpu:3

Const_2: /job:localhost/replica:0/task:0/gpu:3

I tensorflow/core/common_runtime/simple_placer.cc:289] Const_2: /job:localhost/replica:0/task:0/gpu:3

MatMul_1: /job:localhost/replica:0/task:0/gpu:3

I tensorflow/core/common_runtime/simple_placer.cc:289] MatMul_1: /job:localhost/replica:0/task:0/gpu:3

Const_1: /job:localhost/replica:0/task:0/gpu:2

I tensorflow/core/common_runtime/simple_placer.cc:289] Const_1: /job:localhost/replica:0/task:0/gpu:2

Const: /job:localhost/replica:0/task:0/gpu:2

I tensorflow/core/common_runtime/simple_placer.cc:289] Const: /job:localhost/replica:0/task:0/gpu:2

MatMul: /job:localhost/replica:0/task:0/gpu:2

I tensorflow/core/common_runtime/simple_placer.cc:289] MatMul: /job:localhost/replica:0/task:0/gpu:2

AddN: /job:localhost/replica:0/task:0/cpu:0

I tensorflow/core/common_runtime/simple_placer.cc:289] AddN: /job:localhost/replica:0/task:0/cpu:0

[[  44.   56.]
 [  98.  128.]]

. . .

Of course, we can also try a slightly more complex example:


import numpy as np
import tensorflow as tf
import datetime

Use the numpy package to create two random matrices:

A = np.random.rand(10000, 10000).astype('float32')
B = np.random.rand(10000, 10000).astype('float32')

n = 10

These two lists will store the results:

c1 = []
c2 = []

Define a matpow() function:

def matpow(M, n):
    # Computes M^n as a chain of (n - 1) matmul ops in the graph (n >= 1).
    if n <= 1:
        return M
    return tf.matmul(M, matpow(M, n - 1))
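Because TensorFlow 1.x builds a static graph, this recursion simply unrolls into a chain of matmul nodes. A minimal iterative sketch that builds the same chain (matpow_iter is an illustrative name, not part of the original article):

def matpow_iter(M, n):
    # Chains (n - 1) tf.matmul ops, just like the recursive version.
    result = M
    for _ in range(n - 1):
        result = tf.matmul(result, M)
    return result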

Now for the main part. First, compute everything on a single GPU:

with tf.device('/gpu:0'):
    a = tf.constant(A)
    b = tf.constant(B)
    c1.append(matpow(a, n))
    c1.append(matpow(b, n))

with tf.device('/cpu:0'):
    sum = tf.add_n(c1)

t1_1 = datetime.datetime.now()

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    sess.run(sum)
t2_1 = datetime.datetime.now()

Then compute on two GPUs: store A on gpu:0 and compute A^n there, store B on gpu:1 and compute B^n there, and finally add the two results on the CPU:

with tf.device('/gpu:0'):
    #compute A^n and store result in c2
    a = tf.constant(A)
    c2.append(matpow(a, n))
 
with tf.device('/gpu:1'):
    #compute B^n and store result in c2
    b = tf.constant(B)
    c2.append(matpow(b, n))

with tf.device('/cpu:0'):
    sum = tf.add_n(c2) #Addition of all elements in c2, i.e. A^n + B^n
    t1_2 = datetime.datetime.now()

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    # Runs the op.
    sess.run(sum)
t2_2 = datetime.datetime.now()

Finally, compare the running times:

print "Single GPU computation time: " + str(t2_1-t1_1)
print "Multi GPU computation time: " + str(t2_2-t1_2)





 

Link to the original English article:

http://www.jorditorres.org/first-contact-with-tensorflow/
