Installing TensorFlow 2.7 on a Mac with AMD GPU support

韦胖漫谈IT

Article link: https://blog.csdn.net/weixin_45919616/article/details/122369508

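The author runs the TensorFlow MNIST example on a machine with an Intel i9 CPU and an AMD Radeon Pro 5500M GPU and finds that training on the GPU is slower than on the CPU. The post raises the question of why and records the installation steps.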

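First, an open question, in the hope that someone more experienced can shed some light:

When running the MNIST demo, the GPU is much slower than the CPU. Why?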

Local environment:

CPU: Intel i9, 8 cores / 16 threads

Memory: 64 GB

GPU: AMD Radeon Pro 5500M

Sample code

import tensorflow as tf

def run():
    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    print(len(x_train), len(y_train), len(x_test), len(y_test))
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    model.fit(x_train, y_train, epochs=5)

    model.evaluate(x_test, y_test, verbose=2)

if __name__ == '__main__':
    devices = tf.config.list_physical_devices()
    print(devices)
    with tf.device("cpu:0"):
        print('start with cpu')
        run()
    with tf.device("gpu:0"):
        print('start with gpu')
        run()

Sample output

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2022-01-07 17:49:09.013880: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5500M

systemMemory: 64.00 GB
maxCacheSize: 3.99 GB

2022-01-07 17:49:09.014762: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-01-07 17:49:09.015263: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
start with cpu
60000 60000 10000 10000
Epoch 1/5
1875/1875 [==============================] - 2s 836us/step - loss: 0.2982 - accuracy: 0.9135
Epoch 2/5
1875/1875 [==============================] - 1s 795us/step - loss: 0.1431 - accuracy: 0.9569
Epoch 3/5
1875/1875 [==============================] - 2s 806us/step - loss: 0.1059 - accuracy: 0.9679
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.0884 - accuracy: 0.9732
Epoch 5/5
1875/1875 [==============================] - 2s 848us/step - loss: 0.0744 - accuracy: 0.9764
313/313 - 0s - loss: 0.0780 - accuracy: 0.9770 - 254ms/epoch - 810us/step
start with gpu
60000 60000 10000 10000
Epoch 1/5
2022-01-07 17:49:18.964441: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
1875/1875 [==============================] - 13s 7ms/step - loss: 0.2918 - accuracy: 0.9146
Epoch 2/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.1402 - accuracy: 0.9591
Epoch 3/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.1055 - accuracy: 0.9683
Epoch 4/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0851 - accuracy: 0.9741
Epoch 5/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0703 - accuracy: 0.9781
2022-01-07 17:50:19.926187: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
313/313 - 1s - loss: 0.0784 - accuracy: 0.9759 - 1s/epoch - 4ms/step
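
To put a number on the gap, the per-step times printed above can be averaged for each device. The sketch below is only arithmetic on the logged values: the two lists are copied by hand from the progress lines, and the helper names `to_ms` and `mean_ms` are made up for this illustration. It assumes the run used the Keras default batch size of 32, which is what gives 1875 steps for 60000 samples.

```python
import re

# Per-step times copied from the five training epochs in the log above.
CPU_STEP_TIMES = ["836us", "795us", "806us", "1ms", "848us"]
GPU_STEP_TIMES = ["7ms", "6ms", "6ms", "6ms", "6ms"]
STEPS_PER_EPOCH = 1875  # 60000 samples at the default batch size of 32

# Conversion factors to milliseconds for the units Keras prints.
UNIT_TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0}


def to_ms(text):
    """Convert a Keras step-time string such as '836us' to milliseconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(us|ms|s)", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return float(match.group(1)) * UNIT_TO_MS[match.group(2)]


def mean_ms(step_times):
    """Average a list of step-time strings, in milliseconds."""
    values = [to_ms(t) for t in step_times]
    return sum(values) / len(values)


cpu_ms = mean_ms(CPU_STEP_TIMES)
gpu_ms = mean_ms(GPU_STEP_TIMES)
slowdown = gpu_ms / cpu_ms
gpu_epoch_s = gpu_ms * STEPS_PER_EPOCH / 1000.0

print(f"CPU: {cpu_ms:.3f} ms/step, GPU: {gpu_ms:.3f} ms/step")
print(f"GPU is about {slowdown:.1f}x slower per step")
print(f"Estimated GPU epoch time: {gpu_epoch_s:.1f} s")
```

On these numbers the GPU path is roughly 7 times slower per step, and the estimated epoch time is consistent with the 12 s per epoch reported in the log. A commonly suggested explanation, not verified here, is that a fixed per-step overhead on the GPU path dominates for a model and batch size this small.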

Environment setup

The original reference is Tensorflow Plugin - Metal - Apple Developer.

Make sure Python is version 3.8. If it is not, install it with brew.

# Check the Python version
python3 -V

# If it is not 3.8, install it
brew install python@3.8

# Create a virtual environment
python3 -m venv ~/tensorflow-metal
source ~/tensorflow-metal/bin/activate
python -m pip install -U pip

# Install tensorflow-macos and the tensorflow-metal plugin
SYSTEM_VERSION_COMPAT=0 python -m pip install tensorflow-macos
python -m pip install tensorflow-metal

# The demo above can now be run
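
Before running the demo, it can help to confirm that the virtual environment really has the expected interpreter and packages. The following is a minimal sketch using only the standard library; the names `installed_version` and `environment_report` are invented for this example, and the package names are the ones shown in the pip list at the end of this post.

```python
import sys
from importlib import metadata

REQUIRED_PYTHON = (3, 8)  # the version this post asks for
PACKAGES = ("tensorflow-macos", "tensorflow-metal")


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the interpreter version and the versions of the two packages."""
    current = sys.version_info[:2]
    report = {
        "python": "%d.%d" % current,
        "python_ok": current == REQUIRED_PYTHON,
    }
    for name in PACKAGES:
        report[name] = installed_version(name)
    return report


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

A value of `None` for either package means pip did not install it into the active environment, which is worth checking before looking for other causes.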

Pitfall 1: I originally tried to install with Anaconda, but it kept failing with the error below, which cost me a lot of time.

PSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:maximumVelocityTensor:gradientTensor:name:]: unrecognized selector sent to instance

Pitfall 2: If the dataset cannot be downloaded online, see keras - How can I import the MNIST dataset that has been manually downloaded? - Stack Overflow
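
One way to handle Pitfall 2 is to copy a manually downloaded `mnist.npz` into the directory where Keras caches datasets, so that `mnist.load_data()` finds it without going online. The sketch below assumes the default cache location `~/.keras/datasets` (it differs if `KERAS_HOME` is set); `place_mnist` is a hypothetical helper written for this post, not part of Keras.

```python
import shutil
from pathlib import Path

# Assumption: the default Keras dataset cache; it differs if KERAS_HOME is set.
DEFAULT_CACHE = Path.home() / ".keras" / "datasets"


def place_mnist(downloaded, cache_dir=DEFAULT_CACHE):
    """Copy a manually downloaded mnist.npz into the Keras dataset cache."""
    source = Path(downloaded).expanduser()
    if not source.is_file():
        raise FileNotFoundError(source)
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / "mnist.npz"
    shutil.copyfile(source, target)
    return target
```

For example, `place_mnist("~/Downloads/mnist.npz")` would copy the file from a hypothetical download location. Keras checks the file's hash, so a corrupted or different file may still trigger a new download attempt.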

Python environment details

❯ pip list
Package                      Version
---------------------------- ---------
absl-py                      1.0.0
astunparse                   1.6.3
cachetools                   4.2.4
certifi                      2021.10.8
charset-normalizer           2.0.10
flatbuffers                  2.0
gast                         0.4.0
google-auth                  2.3.3
google-auth-oauthlib         0.4.6
google-pasta                 0.2.0
grpcio                       1.43.0
h5py                         3.6.0
idna                         3.3
importlib-metadata           4.10.0
keras                        2.7.0
Keras-Preprocessing          1.1.2
libclang                     12.0.0
Markdown                     3.3.6
numpy                        1.22.0
oauthlib                     3.1.1
opt-einsum                   3.3.0
pip                          21.3.1
protobuf                     3.19.1
pyasn1                       0.4.8
pyasn1-modules               0.2.8
requests                     2.27.1
requests-oauthlib            1.3.0
rsa                          4.8
setuptools                   56.0.0
six                          1.15.0
tensorboard                  2.7.0
tensorboard-data-server      0.6.1
tensorboard-plugin-wit       1.8.1
tensorflow-estimator         2.7.0
tensorflow-io-gcs-filesystem 0.23.1
tensorflow-macos             2.7.0
tensorflow-metal             0.3.0
termcolor                    1.1.0
typing_extensions            4.0.1
urllib3                      1.26.7
Werkzeug                     2.0.2
wheel                        0.37.1
wrapt                        1.13.3
zipp                         3.7.0