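Today, while running a model with Keras's built-in VGG16, I hit the error below. After ruling out a version mismatch in CUDA or the rest of the environment, the likely culprit turned out to be GPU memory allocation. (My machine has a single low-end card with only 4 GB of memory.)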
2020-05-08 00:59:24.206906: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-08 00:59:24.207493: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-08 00:59:24.207802: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm.
This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Solution:
Add this code at the top of the program:
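import tensorflow as tf
# Fall back to another device when an op cannot run on the requested one.
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
# Cap this process at 30% of each visible GPU's memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.3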
tf.compat.v1.keras.backen
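The snippet above uses the TF1-style session config to cap the memory fraction. As a related sketch, and not the fix used above, TensorFlow 2.x also lets you turn on memory growth, so the process claims GPU memory as it needs it rather than reserving a large block up front. This assumes TensorFlow 2.x is installed; the helper name `enable_gpu_memory_growth` is hypothetical and exists only for illustration.

```python
import tensorflow as tf


def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Hypothetical helper for illustration. Returns the number of GPUs
    that were configured (0 on a CPU-only machine).
    """
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # This must run before the GPU is initialized, i.e. before any
        # model or tensor is created, otherwise TensorFlow raises an error.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


num_gpus = enable_gpu_memory_growth()
print("GPUs configured for memory growth:", num_gpus)
```

Like the session config, this has to run at the very top of the program, before VGG16 or any other model is built. Whether it is enough on a 4 GB card depends on the model and batch size, so treat it as something to try rather than a guaranteed fix.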