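If you change the CUDA version inside the Docker container created by Apollo's docker/scripts/dev_start.sh script, for example by replacing the container's default CUDA 11.1 with the newer CUDA 11.8 (either by uninstalling the old version and installing the new one, or by copying the new version's files into the container and setting the CUDA-related environment variables), building Apollo fails with the following error: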
(07:43:07) INFO: Repository local_config_cuda instantiated at:
/apollo/WORKSPACE:20:20: in <toplevel>
/apollo/tools/workspace.bzl:94:19: in apollo_repositories
Repository rule cuda_configure defined at:
/apollo/third_party/gpus/cuda_configure.bzl:1276:33: in <toplevel>
(07:43:07) ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):
File "/apollo/third_party/gpus/cuda_configure.bzl", line 1247, column 38, in _cuda_autoconf_impl
_create_local_cuda_repository(repository_ctx)
File "/apollo/third_party/gpus/cuda_configure.bzl", line 844, column 35, in _create_local_cuda_repository
cuda_config = _get_cuda_config(repository_ctx, find_cuda_config_script)
File "/apollo/third_party/gpus/cuda_configure.bzl", line 537, column 30, in _get_cuda_config
config = find_cuda_config(repository_ctx, find_cuda_config_script, ["cuda", "cudnn"])
File "/apollo/third_party/gpus/cuda_configure.bzl", line 514, column 41, in find_cuda_config
exec_result = _exec_find_cuda_config(repository_ctx, script_path, cuda_libraries)
File "/apollo/third_party/gpus/cuda_configure.bzl", line 508, column 19, in _exec_find_cuda_config
return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
File "/apollo/tools/platform/common.bzl", line 166, column 13, in execute
fail(
Error in fail: Repository command failed
Could not find any cuda.h matching version '' in any subdirectory:
''
'include'
'include/cuda'
'include/*-linux-gnu'
'extras/CUPTI/include'
'include/cuda/CUPTI'
'local/cuda/extras/CUPTI/include'
of:
'/usr/local/cuda-11.1'
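The last lines of the log show that the configure step looked for cuda.h only under /usr/local/cuda-11.1. Before changing any settings, it can help to confirm which CUDA installations in the container actually ship the header. The following is a minimal Python sketch, not part of Apollo: the function name find_cuda_headers and the /usr/local/cuda* glob are assumptions for illustration, and it only checks the include subdirectory rather than every subdirectory listed in the log.

```python
import glob
import os


def find_cuda_headers(pattern="/usr/local/cuda*"):
    """Return the CUDA install directories matching `pattern` that contain include/cuda.h.

    The default pattern is an assumption about where CUDA is installed;
    adjust it if the toolkit was copied to another location.
    """
    hits = set()
    for root in glob.glob(pattern):
        header = os.path.join(root, "include", "cuda.h")
        if os.path.isfile(header):
            # Resolve symlinks such as /usr/local/cuda -> /usr/local/cuda-11.8
            hits.add(os.path.realpath(root))
    return sorted(hits)


if __name__ == "__main__":
    for path in find_cuda_headers():
        print(path)
```

If /usr/local/cuda-11.1 is missing from the output while the new installation is listed, the toolkit path used by the build is stale.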
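The log shows that the build is still searching the old path /usr/local/cuda-11.1. To fix this, update the CUDA-related settings in .apollo.bazelrc so that CUDA_TOOLKIT_PATH points at the new installation: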
#build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda-11.1"
build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda-11.8"
build --action_env TF_CUDA_COMPUTE_CAPABILITIES="3.7,5.2,6.0,6.1,7.0,7.2,7.5,8.6,8.7"
After this change, the build succeeds.
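If the same change has to be repeated across several containers, the edit can be scripted. The following is a minimal Python sketch under stated assumptions: set_cuda_toolkit_path is a hypothetical helper, not part of Apollo or Bazel, and it operates on the text of the file, so reading and writing .apollo.bazelrc is left to the caller. Commented-out lines (starting with #) are left untouched.

```python
import re

# Matches an active (uncommented) line that sets CUDA_TOOLKIT_PATH.
_TOOLKIT_LINE = re.compile(
    r'^(\s*build\s+--action_env\s+CUDA_TOOLKIT_PATH=)"?[^"\n]*"?\s*$'
)


def set_cuda_toolkit_path(bazelrc_text, new_path):
    """Return `bazelrc_text` with CUDA_TOOLKIT_PATH set to `new_path`.

    Existing active settings are rewritten in place; if none is found,
    a new line is appended at the end.
    """
    out = []
    replaced = False
    for line in bazelrc_text.splitlines():
        match = _TOOLKIT_LINE.match(line)
        if match:
            out.append(f'{match.group(1)}"{new_path}"')
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f'build --action_env CUDA_TOOLKIT_PATH="{new_path}"')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda-11.1"\n'
    print(set_cuda_toolkit_path(sample, "/usr/local/cuda-11.8"), end="")
```

Working on a string rather than the file itself keeps the helper easy to test and leaves the decision to overwrite .apollo.bazelrc with the caller.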