Learning Caffe (1): Installing Caffe + CUDA + cuDNN + pycaffe + matcaffe on Ubuntu 14.04

Latest recommended article published on 2019-09-05 11:30:50

goodluckcwl

Category: DL-Frameworks-Caffe    Tags: Caffe, cuda, cudnn

Article link: https://blog.csdn.net/u014230646/article/details/51821551

Caffe is a deep learning framework. This article describes how to install GPU-accelerated Caffe on Linux.
The system configuration is:

OS: Ubuntu 14.04
CPU: i5-4690
GPU: GTX 960
RAM: 8 GB

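For the installation procedure, see the official Caffe documentation: http://caffe.berkeleyvision.org/installation.html#compilation
Dependencies:

CUDA: version 7.0 or later is recommended, together with the latest graphics driver.
BLAS: ATLAS, MKL, or OpenBLAS. A C++ matrix computation library.
Boost >= 1.55: provides math functions and other utilities.
protobuf: a lightweight, efficient format for storing structured data. It serializes structured data and is well suited to data storage or as an RPC data-exchange format.
glog and gflags: Google's logging library and command-line flag parsing library. Handy for debugging.
hdf5: library for the HDF5 data file format.
lmdb, leveldb: database I/O. Used when preparing data.

Optional dependencies: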

OpenCV >= 2.4 including 3.0
IO libraries: lmdb, leveldb (note: leveldb requires snappy)
cuDNN for GPU acceleration (v5)

Pycaffe:
Python 2.7 or Python 3.3+, numpy (>= 1.7), boost-provided boost.python

Matcaffe:
MATLAB with the mex compiler

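Installing CUDA 7.5

CUDA on Wikipedia: https://zh.wikipedia.org/wiki/CUDA
CUDA (Compute Unified Device Architecture) is an integrated technology introduced by NVIDIA and is the company's official name for its GPGPU platform. With it, users can run computations on NVIDIA GPUs from the GeForce 8 series onward and on newer Quadro GPUs. It was also the first development environment to let the GPU be programmed through a C compiler.

Installation steps

1. Download CUDA

Download CUDA from https://developer.nvidia.com/cuda-downloads and choose the deb package (or the runfile). After the download finishes, check the file's integrity with md5sum, then install CUDA following the official CUDA documentation.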
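The integrity check can be wrapped in a small helper. This is only a sketch: the function name `verify_md5` is made up for illustration, and the file name and digest in the usage comment are placeholders, not real values.

```shell
# verify_md5 FILE EXPECTED: succeed only when FILE's MD5 digest equals EXPECTED.
# (Hypothetical helper name, not part of CUDA or Caffe.)
verify_md5() {
    # md5sum prints "<digest>  <file>"; keep only the digest field.
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Example call (placeholder file name and digest, substitute your own):
# verify_md5 cuda-repo.deb "<md5 listed on the download page>" && echo "checksum OK"
```

If the function returns non-zero, download the package again before installing.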

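2. Install

First stop the desktop display manager lightdm, switch to a text console, and install CUDA from there. (The CUDA package bundles the graphics driver, and the display manager must be stopped before a driver is installed.)
(You can also install the graphics driver and the CUDA libraries separately.)

sudo service lightdm stop

Change to the directory containing the deb package and run: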

sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb  
sudo apt-get update  
sudo apt-get install cuda

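Then reboot the machine: sudo reboot
Note that the CUDA package already includes a fairly recent graphics driver.

3. Configure environment variables

Add the bin directory under the CUDA installation directory to the system search path PATH, and make the change take effect.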
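The original screenshot for this step is missing, so the following is only a sketch of the usual way to do it. It assumes CUDA is installed in /usr/local/cuda, the same directory that CUDA_DIR points to in the Makefile.config later in this article. The variable name CUDA_HOME is my own choice.

```shell
# Assumption: the toolkit lives in /usr/local/cuda (adjust if yours differs).
CUDA_HOME=/usr/local/cuda

# Prepend the CUDA bin directory so that nvcc is found on the search path.
export PATH="$CUDA_HOME/bin:$PATH"

# To make this permanent, append the same export line to ~/.bashrc
# and reload it with: source ~/.bashrc
```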

Add the dynamic library search path: create a file named cuda.conf in /etc/ld.so.conf.d/ with the following content:

/usr/local/cuda/lib64

After saving, run the following command so that it takes effect immediately:

sudo ldconfig

4. Verify

Check the version of NVCC, the CUDA C compiler:

nvcc -V

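Build and run the samples: go to the samples directory under the CUDA directory and run make there, which takes about ten minutes. When the build finishes, change to the bin/x86_64/linux/release/ directory under samples.
Run the deviceQuery program and look at its output. The last line is the important one: PASS means the test passed.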
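The check on the last line can be scripted. A minimal sketch, assuming the samples live under /usr/local/cuda/samples (my assumption, override SAMPLES_DIR otherwise). The helper name `device_query_passed` is hypothetical.

```shell
# device_query_passed: read deviceQuery output on stdin and succeed
# only if it contains the final "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Assumed layout; adjust SAMPLES_DIR to where your samples were built.
SAMPLES_DIR=${SAMPLES_DIR:-/usr/local/cuda/samples}
DEVICE_QUERY="$SAMPLES_DIR/bin/x86_64/linux/release/deviceQuery"

# Only run the binary when it has actually been built.
if [ -x "$DEVICE_QUERY" ]; then
    if "$DEVICE_QUERY" | device_query_passed; then
        echo "deviceQuery passed"
    else
        echo "deviceQuery did not report PASS"
    fi
fi
```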

5. GCC version

This version of CUDA does not support the GCC 5.0 compiler.
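Before building, it may help to check which GCC is installed. The sketch below assumes the restriction applies to every GCC release with major version 5 or higher, which is my reading of the note above. Both function names are made up for illustration.

```shell
# gcc_major VERSION: print the major component of a version string such as "4.8.4".
gcc_major() {
    echo "${1%%.*}"
}

# gcc_ok_for_cuda75 VERSION: succeed when the major version is below 5.
gcc_ok_for_cuda75() {
    [ "$(gcc_major "$1")" -lt 5 ]
}

# Report on the installed compiler, if there is one.
if command -v gcc >/dev/null 2>&1; then
    installed=$(gcc -dumpversion)
    if gcc_ok_for_cuda75 "$installed"; then
        echo "gcc $installed should be usable with this CUDA release"
    else
        echo "gcc $installed is too new for this CUDA release"
    fi
fi
```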

References:
[1] Installing NVIDIA CUDA Toolkit 7.5 on Ubuntu 16.04. https://gist.github.com/dangbiao1991/2c895917ea888ce33af8c1c72444b7bf
[2] Installing and configuring Ubuntu 14.04 + CUDA 7.5 + Caffe. http://blog.csdn.net/ubunfans/article/details/47724341

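Installing cuDNN

Download cuDNN from https://developer.nvidia.com/rdp/cudnn-download, extract it, and copy the contents of its lib and include directories into the matching directories of the CUDA installation.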
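A rough sketch of the copy step. It assumes the archive extracts to a directory containing include/ and lib64/ (on my understanding the Linux archive names the library folder lib64, adjust if yours is lib). The function name `install_cudnn` is hypothetical.

```shell
# install_cudnn SRC DEST: copy cuDNN headers and libraries from the extracted
# archive directory SRC into the CUDA installation directory DEST.
install_cudnn() {
    src=$1
    dest=$2
    mkdir -p "$dest/include" "$dest/lib64" &&
    cp "$src"/include/cudnn*.h "$dest/include/" &&
    # -P keeps the libcudnn.so symlinks as symlinks instead of copying the file twice.
    cp -P "$src"/lib64/libcudnn* "$dest/lib64/"
}

# Example (run as root, since /usr/local/cuda is system-owned):
# install_cudnn ./cuda /usr/local/cuda
```

Run sudo ldconfig afterwards so the linker picks up the new libraries.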

Installing BLAS

Install ATLAS with sudo apt-get install libatlas-base-dev, or install OpenBLAS or MKL for better CPU performance.

Downloading Caffe

Installing the Caffe dependencies

General dependencies:

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev

Dependencies on Ubuntu 14.04:

sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

PyCaffe dependencies

Go to the caffe/python directory and install the requirements:

for req in $(cat requirements.txt); do pip install $req; done

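The Caffe website recommends Anaconda (http://continuum.io/downloads#all). Anaconda is a scientific computing environment similar to Canopy but more convenient to use, and its bundled package manager conda is powerful.

MatCaffe

Install MATLAB R2014a.

Compiling Caffe

Copy Makefile.config.example to Makefile.config and edit it: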

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
  USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#	You should not set this flag if you will be reading LMDBs with any
#	possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
		-gencode arch=compute_20,code=sm_21 \
		-gencode arch=compute_30,code=sm_30 \
		-gencode arch=compute_35,code=sm_35 \
		-gencode arch=compute_50,code=sm_50 \
		-gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
MATLAB_DIR := /usr/local/MATLAB/R2014a
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
 PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/local/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
#		 $(ANACONDA_HOME)/include/python2.7 \
#		 $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#                 /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
 PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
 WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @

Go to the caffe directory and run:

make all
make test
make runtest

If there are no errors, the build is complete.

Compiling pycaffe and matcaffe

Go to the caffe directory and run:

make pycaffe
make matcaffe

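Caffe Python interface

Copy caffe/python/caffe to /usr/local/lib/python2.7/dist-packages/.
Copy the library files under caffe/build/lib/ to /usr/local/lib, then refresh the linker cache:

sudo ldconfig

Open Python and run import caffe; it should succeed without errors.
Alternatively, add the path at runtime: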

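import sys
# caffe_dir must point at the python/ folder of your Caffe checkout (placeholder path)
caffe_dir = '/path/to/caffe/python'
sys.path.insert(0, caffe_dir)
import caffe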
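A third option is to set PYTHONPATH in the shell, which avoids both the copy and the sys.path line. A small sketch, assuming Caffe was checked out to $HOME/caffe (an assumption, set CAFFE_ROOT yourself if it lives elsewhere):

```shell
# Assumption: Caffe lives in $HOME/caffe unless CAFFE_ROOT is already set.
CAFFE_ROOT=${CAFFE_ROOT:-$HOME/caffe}

# Prepend caffe/python so that "import caffe" resolves to the freshly built module.
# The ${VAR:+...} form avoids a trailing colon when PYTHONPATH was empty.
export PYTHONPATH="$CAFFE_ROOT/python${PYTHONPATH:+:$PYTHONPATH}"
```

Putting these lines in ~/.bashrc makes the setting persistent across sessions.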

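Caffe C++ interface

Copy the include and lib directories, respectively.

Debugging Caffe

Build Caffe with CMake so that it can be debugged in CLion. Set the build options in CMakeLists.txt.

Problems encountered

Library error

When building Caffe with CMake, the following error appeared: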

Linking CXX shared library ../../lib/libcaffe-d.so
/usr/bin/ld: /usr/local/lib/libcblas.a(cblas_sgemv.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libcblas.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make[2]: *** [lib/libcaffe-d.so.1.0.0-rc3] Error 1
make[1]: *** [src/caffe/CMakeFiles/caffe.dir/all] Error 2
make: *** [all] Error 2

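Fix: edit CMakeCache.txt in the cbuild folder and change

//Path to a library.
Atlas_CBLAS_LIBRARY:FILEPATH=path to libcblas.a

to

//Path to a library.
Atlas_CBLAS_LIBRARY:FILEPATH=path to libcblas.so on your machine

The likely cause is that the library had been installed several times in different ways, leaving the files in a mess so that the correct library was not found.
Reference: "Building and installing Caffe on Ubuntu 14.04 with make + cmake".
Then go to the CMake build directory and run make.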
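The same edit can be scripted with sed. This is a sketch only: the function name `use_shared_cblas` and the paths in the usage comment are hypothetical, so locate your own libcblas.so first.

```shell
# use_shared_cblas CACHE SO_PATH: rewrite the Atlas_CBLAS_LIBRARY entry in the
# CMake cache file CACHE so that it points at the shared library SO_PATH.
use_shared_cblas() {
    cache=$1
    so_path=$2
    # Write to a temporary file first so a failed sed never truncates the cache.
    sed "s|^Atlas_CBLAS_LIBRARY:FILEPATH=.*|Atlas_CBLAS_LIBRARY:FILEPATH=$so_path|" \
        "$cache" > "$cache.tmp" && mv "$cache.tmp" "$cache"
}

# Example (placeholder paths):
# use_shared_cblas cbuild/CMakeCache.txt /usr/lib/libcblas.so
```

Only the matching cache entry is rewritten; every other line is left as it was.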

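Library conflict

The system protobuf library is version 2.6, while the Python protobuf package is version 3.3.
Fix: upgrade the system protobuf library by downloading the protobuf source and compiling and installing it manually. Remember to run sudo ldconfig at the end.
Note that if you use Anaconda, it ships its own protobuf as well, so take care that the two do not conflict.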
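To spot this kind of mismatch early, the two versions can be compared from the shell. A sketch under the assumption that comparing major versions is a good enough signal. The helper names are made up.

```shell
# major_of VERSION: print the part of VERSION before the first dot.
major_of() {
    echo "${1%%.*}"
}

# same_major A B: succeed when both version strings share a major version.
same_major() {
    [ "$(major_of "$1")" = "$(major_of "$2")" ]
}

# Compare the system compiler with the Python package when both are available.
if command -v protoc >/dev/null 2>&1 && command -v python >/dev/null 2>&1; then
    # "protoc --version" prints e.g. "libprotoc 2.6.1"; keep the second field.
    sys_ver=$(protoc --version | awk '{print $2}')
    py_ver=$(python -c 'import google.protobuf as p; print(p.__version__)' 2>/dev/null)
    if [ -n "$py_ver" ] && ! same_major "$sys_ver" "$py_ver"; then
        echo "protobuf mismatch: system $sys_ver vs python $py_ver"
    fi
fi
```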

Testing

Test with MNIST: http://caffe.berkeleyvision.org/gathered/examples/mnist.html

Prepare the data

cd $CAFFE_ROOT
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh

LeNet: the MNIST Classification Model

…

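Upgrading to CUDA 8.0

Install CUDA 8.0 and reboot.
The compiled CUDA samples would not run and reported an error.
The driver version had probably not been updated.
Look at the files under /etc/modprobe.d/ and open nvidia-graphics-drivers.conf:
Change alias nvidia nvidia_352 to alias nvidia nvidia_367 (the exact name depends on what the module generated by the NVIDIA driver is called).
Change alias nvidia-uvm nvidia_352-uvm to alias nvidia-uvm nvidia_367-uvm.
Problem solved.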
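The two alias edits can be previewed with sed before touching the real file. A sketch, with a hypothetical function name, that prints the rewritten file to stdout instead of editing in place:

```shell
# retarget_nvidia_aliases FILE OLD NEW: print FILE with the driver number in
# every "alias" line changed from OLD to NEW (for example 352 -> 367).
retarget_nvidia_aliases() {
    sed "/^alias /s/nvidia_$2/nvidia_$3/" "$1"
}

# Example: review the result, then copy it over the original as root.
# retarget_nvidia_aliases /etc/modprobe.d/nvidia-graphics-drivers.conf 352 367
```

Lines that do not start with alias are passed through unchanged.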

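The system upgraded the kernel

After a reboot the system had automatically upgraded the kernel, so the NVIDIA driver had to be reinstalled, but the installation kept failing. It turned out that the new kernel version has a bug that makes the NVIDIA driver installation error out.
I chose to roll back to the previous kernel version. This requires editing the GRUB configuration; see this article: http://blog.csdn.net/dl_chenbo/article/details/52400044

After downgrading the kernel, I reinstalled driver 384 and the matching CUDA 9. Running deviceQuery still failed with: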

cudaGetDeviceCount returned 30

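This error usually means that the NVIDIA driver was not installed correctly and was not loaded into the kernel. So I checked which NVIDIA kernel modules were currently loaded: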

lsmod | grep nvidia

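The nvidia_uvm module was missing.
So I tried to load it manually with:

sudo modprobe nvidia-uvm

This failed with an error saying that nvidia_384_uvm could not be inserted. I then ran:

sudo updatedb  # refresh the locate database
locate --regex 'nvidia.*uvm\.ko'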

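This showed that the kernel module is nvidia-uvm.ko.
The problem was probably that somewhere I had aliased nvidia-uvm to nvidia_384_uvm, so modprobe tried to load nvidia_384_uvm while the actual module is nvidia-uvm.ko. Checking /etc/modprobe.d/nvidia-graphics-drivers.conf confirmed that such an alias was present.

After commenting it out, I loaded the kernel module again.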
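For reference, the fix and the follow-up check could look roughly like this. Both helper names are hypothetical, and the sed pattern assumes the offending line starts with "alias nvidia-uvm ", which is my guess at the content of the missing screenshot.

```shell
# comment_out_uvm_alias FILE: print FILE with every "alias nvidia-uvm ..." line
# commented out; all other lines are passed through unchanged.
comment_out_uvm_alias() {
    sed 's/^alias nvidia-uvm /#&/' "$1"
}

# uvm_loaded: read lsmod output on stdin; succeed if a uvm module is listed
# under either naming scheme (nvidia_uvm or nvidia_<version>_uvm).
uvm_loaded() {
    grep -Eq '^nvidia(_[0-9]+)?_uvm[[:space:]]'
}

# Example session (as root):
# comment_out_uvm_alias /etc/modprobe.d/nvidia-graphics-drivers.conf   # review, then apply
# modprobe nvidia-uvm
# lsmod | uvm_loaded && echo "nvidia uvm module is loaded"
```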