System : ubuntu 16.04 LTS
linux kernel : 4.4.0-36-generic
VGA controller : Intel Corporation Sky Lake Integrated Graphics & NVIDIA Corporation GM107M [GeForce GTX 950M]
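First, download the official 16.04 release from the Ubuntu website and install the system.
Once logged in, switch the package sources to a fast mirror; in Beijing the Peking University and Tsinghua mirrors are quick.
Then run apt-get update only. Do NOT run upgrade. Do NOT run upgrade. Do NOT run upgrade.
upgrade updates system packages, including the kernel, and a new kernel will almost certainly leave the third-party driver incompatible!
In System Settings > Software & Updates > Additional Drivers, switch from the open-source nouveau driver to the NVIDIA 361 driver (listed as "proprietary, tested").
The system now installs the NVIDIA driver automatically. Keep the network connected, leave the machine alone, and wait for the installation to finish.
Add the line blacklist nouveau to /etc/modprobe.d/blacklist.conf, then reboot.
If you get no black screen and are not stuck in a login loop, congratulations, the driver is installed! If either problem shows up, the kernel version and the driver version are almost certainly incompatible; try other combinations of system and driver versions.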
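Before rebooting it can be worth confirming that the blacklist entry really is in the file. The following is a minimal sketch of such a check; is_blacklisted is a hypothetical helper name, not a standard tool, and the pattern assumes the usual one-entry-per-line modprobe.d syntax.

```shell
# Sketch: report whether a kernel module is blacklisted in a modprobe config file.
# is_blacklisted is a hypothetical helper name, not a standard tool.
is_blacklisted() {
    # $1 = module name, $2 = config file to inspect
    # Matches "blacklist <module>" at the start of a line; commented lines do not match.
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+$1([[:space:]]|\$)" "$2"
}

conf=/etc/modprobe.d/blacklist.conf
if [ -r "$conf" ] && is_blacklisted nouveau "$conf"; then
    echo "nouveau is blacklisted in $conf"
else
    echo "nouveau is NOT blacklisted in $conf"
fi
```

After the reboot, lsmod | grep nouveau should print nothing if the blacklist took effect.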
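apt-get install nvidia-settings nvidia-prime (both may already have been installed automatically by the step above; if not, install them with apt yourself)
With nvidia-prime installed, prime-select switches between the Intel and NVIDIA GPUs, for example prime-select intel or prime-select nvidia.
prime-select query shows which GPU is currently in use.
Run glxgears in a terminal and read the FPS figure it prints every 5 seconds: several thousand means the NVIDIA card is rendering; only a few dozen means the Intel integrated GPU is.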
root@jinmengze-HP-ENVY-Notebook:/etc/modprobe.d# glxgears
Running synchronized to the vertical refresh. The framerate should be
approximately the same as the monitor refresh rate.
43691 frames in 5.0 seconds = 8738.167 FPS
43681 frames in 5.0 seconds = 8736.123 FPS
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
after 52 requests (52 known processed) with 0 events remaining.
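The FPS rule of thumb can be applied mechanically. Here is a rough sketch that extracts the last FPS figure from glxgears output and guesses which GPU produced it. The helper names and the 1000 FPS cutoff are my own assumptions, chosen only to separate "thousands" from "dozens"; the sample lines are the ones shown above.

```shell
# Sketch: pull the FPS figure out of glxgears output and apply the rule of thumb.
# Helper names and the 1000 FPS cutoff are assumptions, not part of glxgears.
glxgears_fps() {
    # Reads glxgears output on stdin, prints the FPS value of the last report line.
    awk '/frames in .* seconds = .* FPS/ { fps = $(NF-1) } END { if (fps != "") print fps }'
}

guess_gpu() {
    # $1 = FPS value; thousands suggest the NVIDIA card, dozens the Intel GPU.
    awk -v fps="$1" 'BEGIN { if (fps + 0 >= 1000) print "nvidia"; else print "intel" }'
}

sample='43691 frames in 5.0 seconds = 8738.167 FPS
43681 frames in 5.0 seconds = 8736.123 FPS'
fps=$(printf '%s\n' "$sample" | glxgears_fps)
echo "$fps FPS -> probably $(guess_gpu "$fps")"
```

On a real machine the function would read a few seconds of output from glxgears itself rather than from a sample string. Treat the result as a hint only; prime-select query remains the direct answer.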
You can also check the state of the NVIDIA card with the nvidia-smi command:
root@jinmengze-HP-ENVY-Notebook:/etc/modprobe.d# nvidia-smi
Fri Sep 2 15:54:52 2016
+------------------------------------------------------+
| NVIDIA-SMI 361.42     Driver Version: 361.42         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   44C    P8    N/A /  N/A |    356MiB /  4095MiB |     14%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       900    G   /usr/lib/xorg/Xorg                             187MiB |
|    0      1530    G   compiz                                         154MiB |
|    0      1946    G   /usr/lib/firefox/firefox                         1MiB |
|    0     25033    G   /usr/lib/firefox/plugin-container                1MiB |
+-----------------------------------------------------------------------------+
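nvidia-smi is a utility that ships with the NVIDIA driver, so on 16.04 it is already present once the driver has been installed automatically. If the command is missing, the driver package is probably not installed completely.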
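Download the CUDA 7.5 run file from the NVIDIA website. The deb package recommended there does not work well: CUDA 7.5 officially supports Ubuntu only up to 15.04, and installing from the deb on 16.04 causes many problems, such as an unrecognized public key and the toolkit failing to install (apt runs into trouble because of the GCC version).
Use the run file instead.
cd to the folder containing the downloaded .run file and run:
sudo ./cuda_7.5.18_linux.run --override
When choosing the install options, note that the NVIDIA display driver is already installed (see the first step), so answer no to installing "NVIDIA Accelerated Graphics Driver for Linux-x86_64 352.39". Answering yes produces errors and makes the CUDA installation fail.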
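To double-check that skipping the bundled driver is reasonable, you can compare the installed driver version with the bundled 352.39. The sketch below reads the version from an nvidia-smi banner line and compares major.minor numerically. Both helper names are hypothetical, and the idea that a driver at least as new as the bundled one is sufficient is my assumption, consistent with the 361.42 setup used here.

```shell
# Sketch: read the driver version from an nvidia-smi banner and compare it with
# the 352.39 driver bundled in the CUDA 7.5 run file. Helper names are hypothetical.
driver_version() {
    # Reads nvidia-smi output on stdin, prints e.g. 361.42
    sed -n 's/.*Driver Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

version_ge() {
    # Succeeds when version $1 >= version $2 (numeric major.minor comparison).
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        if (x[1] + 0 != y[1] + 0) exit !(x[1] + 0 > y[1] + 0)
        exit !(x[2] + 0 >= y[2] + 0)
    }'
}

banner='| NVIDIA-SMI 361.42 Driver Version: 361.42 |'
installed=$(printf '%s\n' "$banner" | driver_version)
if version_ge "$installed" 352.39; then
    echo "driver $installed >= 352.39: the bundled driver can be skipped"
fi
```

On a live system the banner would come from piping nvidia-smi into driver_version instead of the sample string.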
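Add CUDA to the environment variables.
sudo nano /etc/profile.d/cuda.sh
and put this line in it:
export PATH=$PATH:/usr/local/cuda/bin
sudo nano /etc/ld.so.conf.d/cuda.conf
and put this line in it:
/usr/local/cuda/lib64
Then load the new PATH into the current shell and refresh the linker cache. cuda.conf is a linker configuration file, not a shell script, so it takes effect through ldconfig rather than source:
source /etc/profile.d/cuda.sh
sudo ldconfig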
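A quick way to confirm that the PATH change is visible in the current shell is sketched below. path_contains is a hypothetical helper; it only inspects the colon-separated list and does not touch the filesystem.

```shell
# Sketch: check that a directory is listed in a PATH-style variable.
# path_contains is a hypothetical helper name.
path_contains() {
    # $1 = directory to look for, $2 = colon-separated list (defaults to $PATH)
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains /usr/local/cuda/bin; then
    echo "CUDA bin directory is on PATH"
else
    echo "CUDA bin directory is missing from PATH; re-source /etc/profile.d/cuda.sh"
fi
```

For the library side, ldconfig -p | grep libcudart should list the CUDA runtime once the linker cache has been refreshed, and nvcc --version should report release 7.5.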
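CUDA 7.5 does not support gcc versions later than 4.9, while Ubuntu 16.04 ships gcc 5 by default, which makes the installation fail. There are two workarounds. I have tested the second one and it works; the first is too risky and not recommended.
1. Downgrade gcc so that the system uses an older version. This has its own problem: many system files on Ubuntu 16.04 were built with gcc 5, so after installing CUDA this way you may hit errors mentioning "undefined" later when building Caffe, because the older gcc cannot find what it needs in those system files.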
sudo apt-get install g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
sudo update-alternatives --set cc /usr/bin/gcc
sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
sudo update-alternatives --set c++ /usr/bin/g++
2. This approach is something of a black-box fix: it simply forces the error to go away. Edit /usr/local/cuda/include/host_config.h and comment out line 115, changing
#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
to
//#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
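The edit in option 2 can also be scripted, which avoids depending on the exact line number. This is a minimal sketch, assuming GNU sed and the #error text quoted above. disable_gcc_check is a hypothetical helper name, and the three-line demo file only mimics the relevant part of host_config.h.

```shell
# Sketch: comment out the gcc version guard in a copy of host_config.h.
# disable_gcc_check is a hypothetical helper; a .bak backup is kept next to the file.
disable_gcc_check() {
    # $1 = path to host_config.h
    sed -i.bak 's|^\([[:space:]]*\)#error -- unsupported GNU version|\1//#error -- unsupported GNU version|' "$1"
}

# Demonstration on a throwaway file that mimics the relevant lines.
demo=$(mktemp)
printf '%s\n' \
    '#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)' \
    '#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!' \
    '#endif' > "$demo"
disable_gcc_check "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On the real system the function would be called on /usr/local/cuda/include/host_config.h with root privileges. Running it twice is harmless, because an already commented line no longer matches the pattern.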
Test the CUDA samples:
cd /usr/local/cuda-7.5/samples/1_Utilities/deviceQuery
make
sudo ./deviceQuery
If the output ends with Result = PASS, the installation succeeded.
root@jinmengze-HP-ENVY-Notebook:/home/jinmengze/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 950M"
CUDA Driver Version / Runtime Version 8.0 / 7.5
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4096 MBytes (4294836224 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1124 MHz (1.12 GHz)
Memory Clock rate: 900 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = GeForce GTX 950M
Result = PASS
Reference: http://blog.csdn.net/autocyz/article/details/51841157
Installing OpenCV with the GPU module
apt-get install build-essential libgtk2.0-dev libavcodec-dev libavformat-dev libjpeg62-dev libtiff4-dev cmake libswscale-dev libjasper-dev
After unpacking the source, work from the source directory:
cd ~/opencv
mkdir release
cd release
cmake -D CMAKE_BUILD_TYPE=Release -D BUILD_EXAMPLES=true -D WITH_CUDA=true -D CMAKE_INSTALL_PREFIX=/usr/local -D CUDA_ARCH_BIN=5.0 -D CUDA_GENERATION=Auto ..
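To build the GPU module, WITH_CUDA must be enabled. With CUDA enabled you also need a matching CUDA_ARCH_BIN: the deviceQuery output above reports "CUDA Capability Major/Minor version number: 5.0", which is where -D CUDA_ARCH_BIN=5.0 comes from.
This option must be right. Otherwise the build can succeed and calls to the gpu API still fail with errors such as "no available device",
simply because the compute capability does not match.
For 3.0 and 3.5 choose CUDA_GENERATION=Kepler; for everything else choose CUDA_GENERATION=Auto.
CUDA_GENERATION controls how the GPU module binaries are generated; with the wrong choice the build fails.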
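Rather than copying the capability by hand, the two flags can be derived from the deviceQuery output. The sketch below does that. Both helper names are hypothetical, and the Kepler/Auto mapping is just the rule stated in the text, not something taken from the OpenCV build scripts.

```shell
# Sketch: derive the cmake flags from deviceQuery output.
# Helper names are hypothetical; the Kepler/Auto rule is the one stated in the text.
cuda_capability() {
    # Reads deviceQuery output on stdin, prints e.g. 5.0
    awk -F: '/CUDA Capability Major\/Minor version number/ { gsub(/[ \t\r]/, "", $2); print $2; exit }'
}

cuda_generation_for() {
    # $1 = compute capability
    case "$1" in
        3.0|3.5) echo Kepler ;;
        *) echo Auto ;;
    esac
}

sample='  CUDA Capability Major/Minor version number:    5.0'
cap=$(printf '%s\n' "$sample" | cuda_capability)
echo "-D CUDA_ARCH_BIN=$cap -D CUDA_GENERATION=$(cuda_generation_for "$cap")"
```

For the GTX 950M this prints -D CUDA_ARCH_BIN=5.0 -D CUDA_GENERATION=Auto, matching the cmake line used here.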
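Then build and install:
make
make install
By default OpenCV is installed into lib and include under /usr/local.
OpenCV with GPU acceleration is now ready to use!
Set CUDA_GENERATION to Auto, not Kepler. My card's compute capability is 5.0, while Kepler defaults to 3.0 (3.5); with Kepler selected, the compiled OpenCV fails when it calls CUDA because CUDA_ARCH_BIN does not match the card. Make sure CUDA_ARCH_BIN equals your card's compute capability; if it does not, set CUDA_GENERATION to Auto or leave it blank, then fill in the card's compute capability.
Reference: http://blog.csdn.net/Swearos/article/details/51307304
Whether the GPU module actually works can be checked with the following code: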
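#include <iostream>
#include <string>
#include <opencv2/gpu/gpu.hpp>  // OpenCV 2.4 gpu module

using namespace std;
using namespace cv;

int main(){
    // Number of CUDA devices OpenCV can see
    int tmp = gpu::getCudaEnabledDeviceCount();
    cout<<tmp<<endl;
    if (tmp <= 0) return 1;  // nothing to query without a device
    // Select the current device explicitly
    int id = gpu::getDevice();
    cout<<id<<endl;
    gpu::setDevice(id);
    // A default-constructed DeviceInfo describes the current device
    gpu::DeviceInfo dev;
    string str1 = dev.name();
    cout<<str1<<endl;
    cout<<dev.majorVersion()<<endl;
    cout<<dev.minorVersion()<<endl;
    cout<<dev.isCompatible()<<endl;  // 1 if the gpu module was built for this device
    cout<<dev.deviceID()<<endl;
    cout<<dev.supports(gpu::FEATURE_SET_COMPUTE_20)<<endl;  // same value as (gpu::FeatureSet)20
    return 0;
}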
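If getCudaEnabledDeviceCount is non-zero and isCompatible returns 1, the gpu module is ready to use.
If getCudaEnabledDeviceCount is 0, no CUDA device was detected: the driver is not installed correctly (an OpenCV build without CUDA support also returns 0).
If getCudaEnabledDeviceCount is non-zero but isCompatible is 0, the driver is installed correctly and the OpenCV gpu module was built incorrectly.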
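The three cases can be folded into a small helper that turns the two printed values into a diagnosis. This is only a sketch of the decision table above; diagnose_gpu_module is a hypothetical name, and treating negative counts the same as 0 is my assumption.

```shell
# Sketch: turn the two values printed by the check program into a diagnosis,
# following the three cases in the text. diagnose_gpu_module is a hypothetical helper.
diagnose_gpu_module() {
    # $1 = getCudaEnabledDeviceCount() result, $2 = isCompatible() result (0 or 1)
    if [ "$1" -le 0 ]; then
        echo "no usable CUDA device detected: check the driver installation"
    elif [ "$2" -eq 1 ]; then
        echo "gpu module is usable"
    else
        echo "driver is fine, but the OpenCV gpu module was built incorrectly"
    fi
}

diagnose_gpu_module 1 1
```

The check program itself can probably be built with something like g++ check.cpp -o check $(pkg-config --cflags --libs opencv), where check.cpp is a hypothetical file name and the opencv pkg-config file is assumed to have been installed under /usr/local.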