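[Abstract] This article describes in detail how to set up a Caffe development environment on Ubuntu. Lacking experience, I ran into many problems along the way, and it took about a week to reach a first working result. I am writing this up in the hope that it helps others.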
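I. Hardware and software
Hardware: as the figure below shows, my computer is fairly low-end. If you want to study Caffe in depth and train large networks such as GoogLeNet or VGGNet, a higher-end configuration is recommended (search online for specifics).
Software: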
Ubuntu16.04-LTS
cuda_8.0
cudnn-8.0-linux-x64-v6.0
NVIDIA-Linux-x86_64-384.98.run
anaconda2
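II. Detailed configuration procedure
1. Installing Ubuntu 16
Ubuntu 16.04 LTS can be downloaded from http://cn.ubuntu.com/download/. The 64-bit version is recommended, because some of Caffe's code and data structures assume a 64-bit platform.
In addition, because a physical graphics card is required, a virtual machine is not recommended (it cannot detect the physical card, so the driver installation fails). For a detailed system installation tutorial, please refer to guides shared online. On my first dual-boot installation I crashed the existing Windows 7 system, so here are a few points to watch:
1) Use UltraISO to turn the ubuntu.iso file into a boot disk, and in Windows set aside a block of free disk space for Ubuntu. Then restart the computer and enter the boot menu (reached the same way as the BIOS; the key differs from machine to machine).
2) Follow the installer's prompts to complete the Ubuntu installation. I recommend installing the Chinese version of Ubuntu; after I first installed the English version, I could not use a Chinese input method.
3) Leave everything else at the defaults; the installation finishes in a little over twenty minutes.
Next comes the Caffe environment configuration. First, here is the official Caffe tutorial for anyone interested: http://caffe.berkeleyvision.org/install_apt.html
In my opinion the tutorial is accurate and concise, and it lists the dependencies and packages clearly. But because many concrete steps are left out, people configuring the environment for the first time easily run into all kinds of problems (a painful story...).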
2. Installing the general dependencies
Following the tutorial, run these commands:
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
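sudo apt-get install --no-install-recommends libboost-all-dev # everything up to here is the general dependency set
sudo apt-get install libatlas-base-dev # install BLAS; ATLAS is the usual choice, and OpenBLAS or MKL makes the configuration a bit more complicated
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev # install gflags, glog, LMDB and related libraries; you will appreciate this command when the related errors show up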
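3. Installing CUDA
Installing CUDA was the roughest part of the whole configuration. I consulted many guides shared online, but they varied in quality and contradicted each other, so I had to reinstall three times before it worked. I use CUDA 8.0 for better compatibility with Caffe. Download link: https://developer.nvidia.com/accelerated-computing-toolkit
1) Download the NVIDIA driver
First go to the official site (http://www.nvidia.com/Download/index.aspx?lang=en-us) and find the driver that matches your graphics card (download the runfile). I chose NVIDIA-Linux-x86_64-384.98.run.
2) Install the driver
Ubuntu usually ships with the nouveau driver. The first command below prints the physical graphics card information; if it prints nothing, the card cannot be used and the installation will fail. The second installs the kernel headers.
lspci | grep -i nvidia
sudo apt-get install linux-headers-$(uname -r) # install the headers and development packages for the running kernel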
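Press Ctrl+Alt+F1 to switch to a console (a black text screen, similar to a cmd terminal) and run the following commands:
sudo service lightdm stop # stop the graphical interface
sudo apt-get remove --purge 'nvidia*' # remove any NVIDIA drivers that may already be installed
sudo apt-get update
sudo apt-get install dkms build-essential linux-headers-generic # dependencies the driver installation may need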
Blacklist the existing nouveau driver and disable the nouveau kernel module:
sudo gedit /etc/modprobe.d/blacklist-nouveau.conf
Add the following lines to the newly created file:
blacklist nouveau
options nouveau modeset=0
Save and exit, then run:
sudo update-initramfs -u
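Because the graphical session was stopped in the previous step, a GUI editor such as gedit may not be usable from the console. As a minimal sketch, assuming you would rather write the file non-interactively, the two lines can be emitted with printf. The helper name write_nouveau_blacklist and the temporary file are illustrative and not part of the original procedure; the commented-out lines show one way this might be applied to the real path.

```shell
# Hypothetical helper: write the two nouveau blacklist lines to the file given as $1.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Dry run against a temporary file, so nothing under /etc is touched.
conf="$(mktemp)"
write_nouveau_blacklist "$conf"
cat "$conf"

# On the real system (requires root), one possible equivalent:
#   printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u
```

Rehearsing against a temporary file first makes it easy to confirm the content before anything is written with root privileges.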
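Next, install the driver downloaded earlier. Change to the download directory and run:
sudo chmod u+x NVIDIA-Linux-x86_64-384.98.run # make the driver installer executable
sudo ./NVIDIA-Linux-x86_64-384.98.run # run the installer
sudo service lightdm start # return to the graphical interface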
Either of the following commands can be used to confirm that the driver is installed correctly:
cat /proc/driver/nvidia/version
lspci | grep -i nvidia
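The two checks can be wrapped in a small script. This is only a sketch: the function names driver_status and nouveau_status are made up for illustration, and it assumes the standard Linux /proc/modules listing, where each line starts with a module name followed by a space. Both functions accept an alternative file path so they can be tried on a machine without an NVIDIA card.

```shell
# Hypothetical helper: report whether the NVIDIA kernel driver exposes its version file.
driver_status() {
    version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        echo "loaded"
    else
        echo "missing"
    fi
}

# Hypothetical helper: report whether the nouveau module is still loaded.
nouveau_status() {
    modules_file="${1:-/proc/modules}"
    if [ -r "$modules_file" ] && grep -q '^nouveau ' "$modules_file"; then
        echo "active"
    else
        echo "inactive"
    fi
}

echo "NVIDIA driver: $(driver_status)"
echo "nouveau module: $(nouveau_status)"
```

After a successful installation one would expect the driver to be reported as loaded and nouveau as inactive; any other combination suggests repeating the blacklist step.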
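3) Install CUDA
sudo chmod u+x cuda_8.0.61_375.26_linux.run # make the CUDA installer executable; use your own file name
sudo ./cuda_8.0.61_375.26_linux.run # run the installer
During the installation, decline the bundled driver and do not install OpenGL either; the samples can be installed for verification. I once ended up unable to log in to the desktop (the password was correct, but the login screen kept looping). I am not sure of the exact cause, but strictly following the steps above avoids it (learned from experience...). If it does happen, the only option is to uninstall CUDA and the driver and install them again. The reinstallation steps are: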
sudo /usr/local/cuda-8.0/bin/uninstall_cuda_8.0.pl
sudo /usr/bin/nvidia-uninstall
reboot
Return to step 1)
4) Configure the environment variables
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
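The `${VAR:+:${VAR}}` form in these exports appends a colon and the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. A minimal sketch of that expansion follows; the function name prepend_dir and the /opt/lib path are illustrative only.

```shell
# Hypothetical helper: prepend directory $1 to the colon-separated list in $2.
# ${2:+:$2} expands to ":$2" only when $2 is non-empty.
prepend_dir() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_dir /usr/local/cuda-8.0/lib64 ""
# prints: /usr/local/cuda-8.0/lib64

prepend_dir /usr/local/cuda-8.0/lib64 "/opt/lib"
# prints: /usr/local/cuda-8.0/lib64:/opt/lib
```

The exports only affect the current shell session; adding them to a startup file such as ~/.bashrc is one common way to make them persistent.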
To avoid build failures caused by missing dependencies, install the following packages:
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
Go to the samples directory (its path is chosen by the user during the CUDA installation; here it is ~/NVIDIA_CUDA-8.0_Samples) and run:
sudo make all -j8 # build with 8 parallel jobs
Then, in ~/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release, run:
./deviceQuery # print GPU information
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 650"
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 975 MBytes (1022820352 bytes)
( 2) Multiprocessors, (192) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 1110 MHz (1.11 GHz)
Memory Clock rate: 2500 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 262144 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 650
Result = PASS
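The compute capability reported here is what the CUDA_ARCH setting in Caffe's Makefile.config has to cover. As a rough sketch, the value can be pulled out of the deviceQuery output and turned into a -gencode flag of the same shape. The function name capability_to_gencode is hypothetical, and the sample input is just two lines copied from the output above.

```shell
# Hypothetical helper: read deviceQuery output on stdin and print a matching -gencode flag.
capability_to_gencode() {
    awk -F': *' '/CUDA Capability Major\/Minor version number/ {
        cc = $2
        gsub(/[^0-9]/, "", cc)   # "3.0" becomes "30"
        printf "-gencode arch=compute_%s,code=sm_%s\n", cc, cc
    }'
}

# Two lines taken from the output above, used as sample input.
sample='  CUDA Capability Major/Minor version number:    3.0
Result = PASS'

printf '%s\n' "$sample" | capability_to_gencode
# prints: -gencode arch=compute_30,code=sm_30

# Check the final verdict line as well.
if printf '%s\n' "$sample" | grep -q '^Result = PASS$'; then
    echo "deviceQuery passed"
fi
```

On a real machine the sample variable would be replaced by piping ./deviceQuery directly into the function.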
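4. Installing cuDNN
cuDNN is NVIDIA's GPU acceleration library for deep learning algorithms. Official download page: https://developer.nvidia.com/rdp/cudnn-download (registration is required to download). In the directory containing the downloaded file, run:
sudo tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz # extract the archive
cd cuda/include # enter the extracted cuDNN directory
sudo cp *.h /usr/local/cuda/include/ # copy the cuDNN headers into the CUDA directory
cd ../lib64
sudo cp lib* /usr/local/cuda/lib64/ # copy the library files into the CUDA directory as well
cd /usr/local/cuda/lib64 # enter the CUDA directory to create the symbolic links
sudo chmod +r libcudnn.so.6.0.1 # use your own file name
sudo ln -sf libcudnn.so.6.0.1 libcudnn.so.6
sudo ln -sf libcudnn.so.6 libcudnn.so
sudo ldconfig # refresh the linker cache so the links take effect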
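The two ln -sf calls build a chain: libcudnn.so points to libcudnn.so.6, which points to the versioned file. A small rehearsal in a scratch directory, using an empty stand-in file instead of the real library, shows how the chain resolves. The scratch directory and the stand-in file are assumptions made for illustration; no root access is needed.

```shell
# Rehearse the link chain in a scratch directory.
scratch="$(mktemp -d)"
: > "$scratch/libcudnn.so.6.0.1"     # empty stand-in for the real library file

# Same link targets as in the commands above, created inside the scratch directory.
ln -sf libcudnn.so.6.0.1 "$scratch/libcudnn.so.6"
ln -sf libcudnn.so.6 "$scratch/libcudnn.so"

readlink "$scratch/libcudnn.so"      # prints: libcudnn.so.6
readlink "$scratch/libcudnn.so.6"    # prints: libcudnn.so.6.0.1

# The chain resolves only if every link along the way exists.
[ -e "$scratch/libcudnn.so" ] && echo "link chain resolves"
```

Running readlink the same way inside /usr/local/cuda/lib64 is a quick check that the real links point at the file name you actually have.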
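5. Installing anaconda2
python --version # check whether Python is installed and which version
Download page: https://www.anaconda.com/download/#all. Choose the Python 2.7 version.
sudo chmod u+x Anaconda2-5.0.1-Linux-x86_64.sh # make the installer script executable; use your own file name
sudo bash Anaconda2-5.0.1-Linux-x86_64.sh # run the installer
At this point the basic environment for Caffe development is configured. As for the OpenCV and MATLAB interfaces, Caffe does not involve them in its default state, so this article does not cover installing them for now (in truth they are too troublesome to install; I failed many times and have set them aside for the time being...).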
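III. Downloading and running Caffe
1. Download the Caffe source
cp Makefile.config.example Makefile.config # copy the make configuration file
gedit Makefile.config # edit it to suit your setup
Below is the content of my Makefile.config: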
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility.
#CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
# -gencode arch=compute_20,code=sm_21
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# Recommend using anaconda2 to avoid some possible troubles.
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
#PYTHON_INCLUDE := /usr/include/python2.7 \
/usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := /home/zengsh/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python2.7 \
$(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
#PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
# WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
The main changes I made to the original example file are as follows:
cd python
for req in $(cat requirements.txt); do pip install $req; done
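Now, in the caffe-master directory, run:
make all -j16 # 16 parallel jobs to speed up the build
make test -j4
make runtest -j8
These three commands look simple, but they can expose a lot of problems, as I learned the hard way. The related notes are shared below.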
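The -j values above are counts of parallel make jobs, so they can be matched to the machine instead of being hard-coded. A minimal sketch, assuming the nproc utility from GNU coreutils is available; the variable name njobs is illustrative, and the make commands are only printed here rather than executed.

```shell
# Number of processing units available; fall back to 1 if nproc is missing.
njobs="$(nproc 2>/dev/null || echo 1)"

# Print the build commands instead of running them.
for target in all test runtest; do
    echo "make $target -j$njobs"
done
```

Using more jobs than the machine has cores rarely helps and can exhaust memory on a low-end system, so a lower value may be the safer choice there.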
3. MNIST example
cd $CAFFE_ROOT # these must be run from the root of the Caffe directory
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
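For details, see the official site: http://caffe.berkeleyvision.org/gathered/examples/mnist.html
IV. Problems and solutions
1. Graphics driver problem (login loop)
For this problem, see the CUDA installation section above; following the correct procedure should avoid it.
2. Problems that may occur when running make
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/ubuntu/anaconda/lib" # adjust the path to your own setup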