Keras (tf) error explained: "BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm"

Latest recommended article published on 2022-07-13 15:49:30

CxsGhost

Category: Deep Learning. Tags: python, deep learning, big data, keras

Article link: https://blog.csdn.net/cxsghost/article/details/105985955

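Today I hit this error while running a model with Keras's built-in VGG16. After ruling out a version mismatch in CUDA and the rest of the environment, the evidence pointed to GPU memory allocation as the cause. (My machine has a single low-end card with 4 GB of GPU memory.)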

2020-05-08 00:59:24.206906: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-08 00:59:24.207493: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-08 00:59:24.207802: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. 
This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Solution:

Add the following code at the very beginning of the program:

import tensorflow as tf
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.3
tf.compat.v1.keras.backen
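
The snippet above is cut off in the source, so its last line is left as it appears. As a minimal sketch of how this kind of configuration is commonly completed, the example below assumes the truncated line was heading toward `tf.compat.v1.keras.backend.set_session`; that is an assumption, not the author's confirmed code. The helper names `build_gpu_config` and `install_keras_session` are hypothetical and exist only for illustration.

```python
import tensorflow as tf


def build_gpu_config(memory_fraction=0.3):
    """Return a TF1-style session config that caps this process's GPU memory."""
    config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    # Fraction of each visible GPU's memory that this process may allocate.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


def install_keras_session(config):
    """Create a session from `config` and register it with the Keras backend.

    Assumption: the TensorFlow build still exposes
    tf.compat.v1.keras.backend.set_session; if it does not, the session is
    still created and a note is printed instead.
    """
    session = tf.compat.v1.Session(config=config)
    try:
        tf.compat.v1.keras.backend.set_session(session)
    except (AttributeError, ImportError):
        print("set_session is not available in this TensorFlow/Keras build")
    return session


# Run this before building or loading any model (for example VGG16).
config = build_gpu_config(0.3)
session = install_keras_session(config)
```

On a 4 GB card, a fraction of 0.3 corresponds to roughly 1.2 GB for this process. The right value depends on the model and batch size, so treat 0.3 as a starting point and adjust it if the error persists or if training runs out of memory.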
