TensorFlow 2.0: Enabling GPU Mode

cofisher

Article link: https://blog.csdn.net/qq_36758914/article/details/107152997

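Contents

1. Check whether a suitable GPU is available
2. Logging device placement
3. Restricting the program to a specific GPU
4. Memory allocation
- (1) On-demand allocation
- (2) Fixing the amount of GPU memory
5. Explicitly specifying a GPU
6. Using multiple GPUs
7. GPU vs CPU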

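1. Check whether a suitable GPU is available

First, confirm that the machine has a GPU that TensorFlow can use:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available:  1

This output means that one GPU is available to TensorFlow on the current machine.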
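For scripts that should not assume a GPU is present, the same query can be wrapped in a small check. The following is only a minimal sketch: the helper name `describe_gpus` and the fallback message are hypothetical and not part of the original article.

```python
import tensorflow as tf


def describe_gpus():
    """Return the names of the physical GPUs TensorFlow can see."""
    gpus = tf.config.experimental.list_physical_devices('GPU')
    # Each entry is a PhysicalDevice with `name` and `device_type` fields.
    return [gpu.name for gpu in gpus]


gpu_names = describe_gpus()
if gpu_names:
    print("GPUs found:", gpu_names)
else:
    print("No GPU found; TensorFlow will run on the CPU.")
```

An empty list means TensorFlow sees no usable GPU, so later code that indexes `gpus[0]` should be guarded accordingly.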

2. Logging device placement

To find out which GPU or CPU each operation and tensor is assigned to, add the following at the start of the program:

tf.debugging.set_log_device_placement(True)
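A tiny matrix product is enough to see the effect. This is a sketch with arbitrary values: with logging enabled, TensorFlow reports the device chosen for each op as it executes, and the `device` attribute of the result shows where the tensor ended up.

```python
import tensorflow as tf

# Enable logging before running the ops whose placement is of interest.
tf.debugging.set_log_device_placement(True)

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.matmul(a, a)  # the placement of MatMul is logged when this runs

print(b.numpy())
print("result lives on:", b.device)
```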

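3. Restricting the program to a specific GPU

To make only the first of the available GPUs visible to the program, add the following statements.

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_visible_devices(gpus[0], 'GPU')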
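The visible devices can only be changed before TensorFlow has initialized the GPUs, and indexing the first GPU fails on a machine without one. A guarded version might look like the sketch below; the printed messages are illustrative only.

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Expose only the first GPU to this process.
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    except RuntimeError as err:
        # Raised if the GPUs were already initialized.
        print("Could not change visible devices:", err)

# Logical devices are what TensorFlow will actually use from now on.
logical_gpus = tf.config.experimental.list_logical_devices('GPU')
print(len(gpus), "physical GPU(s),", len(logical_gpus), "logical GPU(s)")
```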

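4. Memory allocation

(1) On-demand allocation

The first option is to turn on memory growth by calling tf.config.experimental.set_memory_growth. TensorFlow then tries to allocate only as much GPU memory as the runtime needs: it starts with a very small allocation and, as the program runs and needs more GPU memory, extends the memory region given to the TensorFlow process. Memory growth must be set before the GPUs are initialized.

tf.config.experimental.set_memory_growth(gpus[0], True)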
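On machines with several GPUs, memory growth is usually enabled for all of them in one loop. The sketch below assumes it runs at program start, before any op has touched a GPU, and reads the setting back with `get_memory_growth` to confirm it.

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
try:
    for gpu in gpus:
        # Apply the same setting to every GPU.
        tf.config.experimental.set_memory_growth(gpu, True)
except RuntimeError as err:
    # Raised if the GPUs were already initialized.
    print("Memory growth could not be set:", err)

growth_flags = [tf.config.experimental.get_memory_growth(gpu) for gpu in gpus]
print("memory growth enabled:", growth_flags)
```

On a CPU-only machine the list is empty and the loop does nothing.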

(2) Fixing the amount of GPU memory

Limit TensorFlow to 1 GB of memory on the first GPU (memory_limit is given in MB).

tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
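The same call can be wrapped so that it degrades gracefully when no GPU exists or when the GPU has already been initialized. This is a sketch only: the helper name `limit_first_gpu_memory` and its boolean return value are hypothetical conventions, not part of TensorFlow.

```python
import tensorflow as tf


def limit_first_gpu_memory(limit_mb):
    """Cap TensorFlow's memory use on the first GPU; return True if applied."""
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if not gpus:
        return False  # nothing to configure on a CPU-only machine
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(
                memory_limit=limit_mb)])
    except RuntimeError as err:
        # Virtual devices cannot be changed after GPU initialization.
        print("Memory limit not applied:", err)
        return False
    return True


applied = limit_first_gpu_memory(1024)  # 1024 MB, i.e. 1 GB
print("memory limit applied:", applied)
```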

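5. Explicitly specifying a GPU

If the system has more than one GPU, the GPU with the lowest ID is selected by default. To run on a different GPU, you need to state the preference explicitly.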

with tf.device("/gpu:0"):
    tf.random.set_seed(0)
    a = tf.random.uniform((10000, 10000), minval=0, maxval=3.0)
    c = tf.matmul(a, tf.transpose(a))
    d = tf.reduce_sum(c)

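Here GPU 0 is specified explicitly. If the specified GPU does not exist, the program raises an error.

If you would rather have TensorFlow automatically pick an existing, supported device so that a missing device does not cause a failure, add the following at the start of the program: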

tf.config.set_soft_device_placement(True)

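Of course, you can also explicitly specify the CPU by changing tf.device("/gpu:0") to tf.device("/cpu:0"). In addition, if a TensorFlow operation has both a CPU and a GPU implementation, the GPU device is given priority by default when the operation is assigned to a device.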
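An alternative to soft placement is to choose the device string in Python, so the script never requests a device that is missing. The sketch below uses a small fixed matrix instead of the 10000 x 10000 random one so the result is easy to check by hand; the variable name `device_name` is an illustrative choice.

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
# Pick the device string ourselves instead of relying on soft placement.
device_name = "/gpu:0" if gpus else "/cpu:0"

with tf.device(device_name):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    c = tf.matmul(a, tf.transpose(a))  # [[5, 11], [11, 25]]
    d = tf.reduce_sum(c)

print("requested:", device_name, "| ran on:", d.device)
print("sum:", float(d))
```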

6. Using multiple GPUs

The following simple example shows how to use several GPUs at the same time:

tf.debugging.set_log_device_placement(True)

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
  inputs = tf.keras.layers.Input(shape=(1,))
  predictions = tf.keras.layers.Dense(1)(inputs)
  model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
  model.compile(loss='mse',
                optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))
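With MirroredStrategy, each replica processes a share of every batch, so a common convention is to scale the global batch size with the number of replicas. The sketch below only inspects the strategy and does not train anything; the per-replica batch size of 64 is a hypothetical value chosen for illustration.

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# One replica per visible GPU; on a CPU-only machine there is one replica.
num_replicas = strategy.num_replicas_in_sync

per_replica_batch_size = 64  # hypothetical value chosen for illustration
global_batch_size = per_replica_batch_size * num_replicas

print("replicas:", num_replicas)
print("global batch size:", global_batch_size)
```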

7. GPU vs CPU

import time

import tensorflow as tf

tf.config.set_soft_device_placement(True)
tf.debugging.set_log_device_placement(True)

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)

# Time the computation on the GPU.
t = time.time()
with tf.device("/gpu:0"):
    tf.random.set_seed(0)
    a = tf.random.uniform((10000, 10000), minval=0, maxval=3.0)
    c = tf.matmul(a, tf.transpose(a))
    d = tf.reduce_sum(c)
print('gpu: ', time.time() - t)

# Time the same computation on the CPU.
t = time.time()
with tf.device("/cpu:0"):
    tf.random.set_seed(0)
    a = tf.random.uniform((10000, 10000), minval=0, maxval=3.0)
    c = tf.matmul(a, tf.transpose(a))
    d = tf.reduce_sum(c)
print('cpu: ', time.time() - t)

Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Add in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Transpose in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sum in device /job:localhost/replica:0/task:0/device:GPU:0
gpu:  0.9708232879638672
Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Add in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Transpose in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Sum in device /job:localhost/replica:0/task:0/device:CPU:0
cpu:  4.51805853843689
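In this run the GPU is roughly 4.65 times faster than the CPU (4.518 s against 0.971 s). A single timed run can be skewed by one-time start-up costs, and the clock may be read before the device has finished its work. The sketch below is one possible refinement, not the author's method: it adds a warm-up run, averages several repeats, and calls `.numpy()` so the result is available before the clock is read. The helper `time_matmul` and the smaller matrix size are assumptions made to keep the example quick.

```python
import time

import tensorflow as tf


def time_matmul(device_name, size=1000, repeats=3):
    """Average seconds for one size x size matmul-and-sum on device_name."""
    with tf.device(device_name):
        a = tf.random.uniform((size, size), minval=0, maxval=3.0)
        # Warm-up run: absorbs one-time start-up cost.
        tf.reduce_sum(tf.matmul(a, tf.transpose(a))).numpy()
        start = time.perf_counter()
        for _ in range(repeats):
            # .numpy() waits for the result before the clock is read.
            tf.reduce_sum(tf.matmul(a, tf.transpose(a))).numpy()
        return (time.perf_counter() - start) / repeats


cpu_seconds = time_matmul("/cpu:0")
print("cpu:", cpu_seconds)

# Only time the GPU when one is actually present.
if tf.config.experimental.list_physical_devices('GPU'):
    gpu_seconds = time_matmul("/gpu:0")
    print("gpu:", gpu_seconds)
    print("speedup:", cpu_seconds / gpu_seconds)
```

The measured speedup depends heavily on the matrix size and the hardware, so the numbers will differ from the run shown above.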