This article was first published on my personal blog at https://kezunlin.me/post/6580691f/. Welcome to read!
compile opencv with CUDA support on windows 10
Series
- Part 1: compile opencv on ubuntu 16.04
- Part 2: compile opencv with CUDA support on windows 10
- Part 3: opencv mat for loop
- Part 4: speed up opencv image processing with openmp
Guide
requirements:
- windows: 10
- opencv: 3.1.0
- nvidia gpu and driver: gtx 1060 with driver 382.05 (gtx 970m)
- GPU arch(s): sm_61 (sm_52)
- cuda: 8.0
- cudnn: 5.0.5
- cmake: 3.10.0
- vs: vs2015 64
nvidia cuda CC
The compute capability of a laptop GPU differs from that of a desktop GPU.
cpu vs gpu
for opencv functions
get source
Get opencv 3.1.0 from git and cherry-pick some bug fixes
git clone https://github.com/opencv/opencv.git
cd opencv
git checkout -b v3.1.0 3.1.0
# fix bugs for 3.1.0
git cherry-pick 10896
git cherry-pick cdb9c
git cherry-pick 24dbb
git branch
master
* v3.1.0
compile
mkdir build && cd build && cmake-gui ..
config
configure with the "Visual Studio 14 2015 Win64" generator
with options
BUILD_SHARED_LIBS ON
CMAKE_CONFIGURATION_TYPES Release # build the Release configuration only
CMAKE_CXX_FLAGS_RELEASE /MD /O2 /Ob2 /DNDEBUG /MP # /MP compiles with multiple processes
WITH_VTK OFF
BUILD_PERF_TESTS OFF # if ON, build errors occur
WITH_CUDA ON
CUDA_TOOLKIT_ROOT_DIR C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0
#CUDA_ARCH_BIN 3.0 3.5 5.0 5.2 6.0 6.1 # very time-consuming
CUDA_ARCH_PTX 3.0
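The same options can also be passed on the command line instead of through cmake-gui. The following is a minimal sketch, not a tested build recipe: it assumes a POSIX-style shell on Windows (for example Git Bash) and that it is run from the build directory. The `DRY_RUN` switch is hypothetical and exists only in this example; by default the script just prints the arguments it would pass to cmake.

```shell
#!/bin/sh
# Sketch: command-line equivalent of the cmake-gui options listed above.
# Assumes a POSIX-style shell and that the current directory is opencv/build.
# DRY_RUN is a hypothetical switch used only in this example.

set --                                        # start with an empty argument list
set -- "$@" -G "Visual Studio 14 2015 Win64"  # VS 2015, 64-bit generator
set -- "$@" -DBUILD_SHARED_LIBS=ON
set -- "$@" -DCMAKE_CONFIGURATION_TYPES=Release
# Some MSYS-based shells may rewrite values that look like Unix paths
# (such as /MD); check the printed arguments before relying on this line.
set -- "$@" "-DCMAKE_CXX_FLAGS_RELEASE=/MD /O2 /Ob2 /DNDEBUG /MP"
set -- "$@" -DWITH_VTK=OFF
set -- "$@" -DBUILD_PERF_TESTS=OFF
set -- "$@" -DWITH_CUDA=ON
set -- "$@" "-DCUDA_TOOLKIT_ROOT_DIR=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0"
set -- "$@" -DCUDA_ARCH_PTX=3.0
# set -- "$@" "-DCUDA_ARCH_BIN=3.0 3.5 5.0 5.2 6.0 6.1"   # very time-consuming

if [ "${DRY_RUN:-1}" = "0" ]; then
  cmake "$@" ..                 # actually run the configure step
else
  printf '%s\n' cmake "$@" ..   # default: only print the arguments, one per line
fi
```

Run it with `DRY_RUN=0` to perform the configure step. Building the argument list with `set --` keeps values that contain spaces, such as the toolkit path, as single arguments.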
for opencv
CUDA_ARCH_BIN 3.0 3.5 5.0 5.2 6.0 6.1
corresponds to
-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;
CUDA_ARCH_PTX 3.0
corresponds to
-gencode;arch=compute_30,code=compute_30;
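The mapping above from compute capabilities to nvcc flags can be sketched as a small shell function. This only reproduces the pattern shown in this post and is not OpenCV's actual CMake logic; the function name `gencode_flags` and its `bin`/`ptx` argument are hypothetical, introduced for illustration.

```shell
#!/bin/sh
# Sketch: reproduce the -gencode flag lists shown above.
# gencode_flags is a hypothetical helper, not part of OpenCV or CUDA.
#   $1      "bin" -> real architecture   (code=sm_XY)
#           "ptx" -> virtual architecture (code=compute_XY)
#   $2...   compute capabilities such as 3.0 5.2 6.1
gencode_flags() {
  kind=$1
  shift
  out=""
  for cc in "$@"; do
    num="${cc%.*}${cc#*.}"          # drop the dot: 6.1 -> 61
    if [ "$kind" = "ptx" ]; then
      code="compute_${num}"
    else
      code="sm_${num}"
    fi
    out="${out}-gencode;arch=compute_${num},code=${code};"
  done
  printf '%s\n' "$out"
}

gencode_flags bin 3.0 3.5 5.0 5.2 6.0 6.1   # same list as CUDA_ARCH_BIN above
gencode_flags ptx 3.0                       # same list as CUDA_ARCH_PTX above
```

Each entry in CUDA_ARCH_BIN adds one more `-gencode` target that nvcc has to compile for, which is why a long list makes the build slow.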
for caffe
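The CUDA_ARCH_BIN parameter specifies multiple architectures so that the build supports a variety of GPU boards; otherwise, CUDA programs will not run on other types of GPU boards. To run the executables on GPUs with different compute capabilities, opencv/caffe has to be compiled for several architectures, e.g. CUDA_ARCH_BIN 3.0 3.5 5.0 5.2 6.0 6.1, which makes compilation very time-consuming. When compiling, configure only the GPU architectures that the release build actually needs to support, as far as possible.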
configure and output:
Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.17134.
found IPP (ICV version): 9.0.1 [9.0.1]
at: C:/compile/opencv/3rdparty/ippicv/unpack/ippicv_win
CUDA detected: 8.0
CUDA NVCC target flags: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_30,code=compute_30
Could NOT find Doxy