2020-10-19 06:04:36.775861: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-10-19 06:04:36.793778: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-10-19 06:04:36.793825: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: 08008f9324d6
2020-10-19 06:04:36.793840: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: 08008f9324d6
2020-10-19 06:04:36.793895: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 430.64.0
2020-10-19 06:04:36.793934: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 430.64.0
2020-10-19 06:04:36.793948: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 430.64.0
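Running nvidia-smi showed that the GPU was not in use. It turned out that the code hard-coded GPU 1 as the device to use, but my machine has only one GPU, whose index is 0. With CUDA_VISIBLE_DEVICES set to '1', no device is visible to CUDA, which is why cuInit fails with CUDA_ERROR_NO_DEVICE. Changing the index to 0 fixes it.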
Original code
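import os
import tensorflow as tf
# Index 1 does not exist on a single-GPU machine, so no device is visible
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
session = tf.Session(config=config)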
After the change
# Index 0 is the only GPU on this machine
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
session = tf.Session(config=config)
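To avoid hard-coding an index that may not exist on another machine, the requested index can be checked against the number of GPUs before it is written to CUDA_VISIBLE_DEVICES. The following is only a minimal sketch: the helper names (count_gpus, choose_device, set_visible_device) are hypothetical and not part of the original code, and it assumes nvidia-smi is on the PATH and that `nvidia-smi -L` prints one line starting with "GPU " per device. Falling back to index 0 is an assumed policy, not something the original post prescribes.

```python
import os
import subprocess


def count_gpus():
    """Count GPUs by listing them with `nvidia-smi -L`.

    Returns 0 if nvidia-smi is missing or exits with an error.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return 0
    # Each physical GPU is assumed to be reported on a line starting with "GPU ".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def choose_device(requested, gpu_count):
    """Return `requested` if it is a valid index, otherwise fall back to 0.

    Returns None when there are no GPUs at all.
    """
    if gpu_count <= 0:
        return None
    if 0 <= requested < gpu_count:
        return requested
    return 0  # assumed fallback policy: use the first GPU


def set_visible_device(requested):
    """Set CUDA_VISIBLE_DEVICES to a valid index and return the index used."""
    index = choose_device(requested, count_gpus())
    if index is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return index
```

A call such as set_visible_device(1) would need to run before TensorFlow creates its session, because the environment variable is read when CUDA is initialized. On a single-GPU machine like the one here, choose_device(1, 1) falls back to 0.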
2020-10-19 06:11:14.069832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0001:00:00.0
totalMemory: 15.78GiB freeMemory: 14.54GiB
2020-10-19 06:11:14.069891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2020-10-19 06:11:14.472237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-19 06:11:14.472291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2020-10-19 06:11:14.472307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2020-10-19 06:11:14.472418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14060 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0001:00:00.0, compute capability: 7.0)