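Preface: Before installing Caffe I had already heard that installing it and configuring CUDA and cuDNN can take a week or even half a month. When I started, I went from Linux in a virtual machine, to bare-metal Linux with a conda environment, to bare-metal Linux with the environment configured directly, and finally managed to compile the whole package, including Python and CUDA support. Looking back, it really was not worth it; just start with PyTorch instead. I only did this because I was working on VanillaCNN and wanted to reproduce it exactly. I was too stubborn about it, so do not follow my example.
Main text: I tried Ubuntu 14.04, 16.04, 18.04, and 20.04. After all kinds of attempts, and judging by the bugs in the compiled code and the overall experience, I settled on Ubuntu 18.04. For Python I suggest Python 3.6; for the toolchain, protobuf 3.0 with gcc 5.5 and g++ 5.5 (recommended). If possible, install on an SSD, which saves time and effort.
For installing CUDA and cuDNN, see the official guides, CUDA Toolkit Documentation and
NVIDIA Deep Learning cuDNN Documentation, or the link below.
Reference: Ubuntu16.04+caffe+CUDA8.0+cuDNN v5+python build.
I. Installing the environment
For the configuration file, which differs from machine to machine, see: Install Caffe on Ubuntu 18.04 with OpenCV 4.4 - Q-engineering
1. Download and install the packages: Caffe | Installation: Ubuntu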
$ sudo apt-get install cmake git unzip
$ sudo apt install python3-opencv
# BLAS
$ sudo apt-get install libatlas-base-dev # Atlas
or
$ sudo apt-get install libopenblas-dev # OpenBLAS
# Other dependencies
$ sudo apt-get install libgflags-dev libgoogle-glog-dev
$ sudo apt-get install libprotobuf-dev libleveldb-dev liblmdb-dev
$ sudo apt-get install libsnappy-dev libhdf5-serial-dev protobuf-compiler
$ sudo apt-get install --no-install-recommends libboost-all-dev
$ sudo apt-get install python3-dev python3-skimage
$ sudo pip3 install pydot protobuf
$ sudo apt-get install graphviz
2. Before finishing the Caffe installation, remember to also run:
sudo apt-get install python3-dev python3-skimage
pip3 install protobuf==3.0.0-alpha-3
3. When building Caffe, be sure to prefix the command at every step with sudo.
$ sudo make clean
$ sudo make all -j$(nproc)
$ sudo make test -j$(nproc)
$ sudo ldconfig /usr/local/cuda/lib64 # run this line if the runtest step below fails
$ sudo make runtest -j$(nproc)
$ sudo ldconfig
4. Install pycaffe
First change into caffe/python and run one of the following loops (for Python 2.7, make sure the right pip is used). The argument after -i is the mirror (index) URL.
for req in $(cat requirements.txt); do sudo pip2 install -i https://pypi.tuna.tsinghua.edu.cn/simple $req; done
for req in $(cat requirements.txt); do pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --use-feature=2020-resolver $req; done
Then run:
Add the path of the caffe/python directory to ~/.bashrc
$ export PYTHONPATH="$PYTHONPATH:$HOME/caffe/python"
$ source ~/.bashrc
$ sudo make pycaffe
$ sudo make pytest
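Running the export line only affects the current shell, and appending it to ~/.bashrc by hand more than once leaves duplicate entries. The following is a minimal sketch of an idempotent append; add_pythonpath is a hypothetical helper name, and the caffe/python location is an assumption that depends on where Caffe was cloned.

```shell
# add_pythonpath DIR RCFILE: append an export line for DIR to RCFILE,
# but only if that exact line is not already present (hypothetical helper).
add_pythonpath() {
    dir="$1"
    rc="$2"
    line="export PYTHONPATH=\"\$PYTHONPATH:$dir\""
    touch "$rc"
    if ! grep -qxF -- "$line" "$rc"; then
        printf '%s\n' "$line" >> "$rc"
    fi
}

# Usage (assumes Caffe was cloned into $HOME/caffe):
#   add_pythonpath "$HOME/caffe/python" "$HOME/.bashrc"
#   source ~/.bashrc
```

Calling it repeatedly leaves a single export line in the file, so it is safe to rerun after a failed build.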
II. Problems encountered during installation
Problem 1
CXX src/caffe/util/im2col.cpp
In file included from src/caffe/util/im2col.cpp:4:0:
./include/caffe/util/math_functions.hpp:7:10: fatal error: glog/logging.h: No such file or directory
#include "glog/logging.h"
^~~~~~~~~~~~~~~~
compilation terminated.
Makefile:591: recipe for target '.build_release/src/caffe
Solution:
sudo apt-get install libgoogle-glog-dev libgflags-dev
Problem 2
CXX src/caffe/util/im2col.cpp
In file included from ./include/caffe/util/math_functions.hpp:9:0,
from src/caffe/util/im2col.cpp:4:
./include/caffe/common.hpp:4:10: fatal error: boost/shared_ptr.hpp: No such file or directory
#include <boost/shared_ptr.hpp>
^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
Makefile:591: recipe for target '.build_release/src/caffe/util/im2col.o' failed
make: *** [.build_release/src/caffe/util/im2col.o] Error 1
Solution:
sudo apt-get install libboost-all-dev
sudo yum install boost-devel # only on yum-based distributions, where the package is called boost-devel
Problem 3
CXX src/caffe/util/im2col.cpp
In file included from ./include/caffe/util/math_functions.hpp:11:0,
from src/caffe/util/im2col.cpp:4:
./include/caffe/util/mkl_alternate.hpp:14:10: fatal error: cblas.h: No such file or directory
#include <cblas.h>
^~~~~~~~~
compilation terminated.
Makefile:591: recipe for target '.build_release/src/caffe/util/im2col.o' failed
make: *** [.build_release/src/caffe/util/im2col.o] Error 1
Solution:
sudo apt-get install libatlas-base-dev
Problem 4:
E: Unable To Locate Package Error On Ubuntu
Solution:
Solve E: Unable To Locate Package Error On Ubuntu
Fix E: "Unable to Locate Package" Error in Kali Linux • WebMatLog
Problem 5
Makefile:591: recipe for target '.build_release/src/caffe/util/blocking_queue.o' failed
make: *** [.build_release/src/caffe/util/blocking_queue.o] Error 1
Solution:
Ensure that g++ is at least version 4.8.
I suggest installing g++-4.8 and selecting it in Makefile.config (CUSTOM_CXX := g++-4.8), then compiling again.
Also ensure that you are using a version of the Boost libraries appropriate for the g++ version in use.
sudo apt-get install g++-4.8
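Before reinstalling a compiler it can help to confirm what is actually installed. The following is a minimal sketch of a version comparison against the 4.8 minimum mentioned above; version_ge and check_tool are hypothetical helper names, and the comparison assumes GNU sort with the -V (version sort) option.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort -V to order version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND MINIMUM: print a one-line verdict (hypothetical helper).
check_tool() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 $2: too old (< $3)"
    fi
}

# Only query g++ when it is actually installed.
if command -v g++ >/dev/null 2>&1; then
    check_tool g++ "$(g++ -dumpversion)" 4.8
fi
```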
Problem 6:
CXX src/caffe/util/hdf5.cpp
In file included from src/caffe/util/hdf5.cpp:2:0:
./include/caffe/util/hdf5.hpp:7:18: fatal error: hdf5.h: No such file or directory
#include "hdf5.h"
^
compilation terminated.
Solution:
sudo apt-get install libhdf5-dev
Add the path to libhdf5 in Makefile.config.
example: INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
sudo apt-get install libopencv-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install libleveldb-dev
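Which directory has to go into INCLUDE_DIRS depends on where the package put hdf5.h. The following is a minimal sketch for locating it before editing Makefile.config; find_header is a hypothetical helper name, and the candidate directories are assumptions taken from the example line above.

```shell
# find_header NAME DIR...: print the first DIR that contains the file NAME
# (hypothetical helper). Returns non-zero when no directory matches.
find_header() {
    name="$1"
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "$name not found in: $*" >&2
    return 1
}

# Candidate directories are assumptions; adjust them to your system.
find_header hdf5.h /usr/include /usr/local/include /usr/include/hdf5/serial || true
```

Whatever directory it prints is the one to append to INCLUDE_DIRS.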
Problem 7
nvcc fatal : Unsupported gpu architecture 'compute_20'
Solution:
nvcc fatal : Unsupported gpu architecture 'compute_20' while cuda 9.1+caffe+openCV 3.4.0 is installed
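CUDA 9 and later no longer support the compute_20 architecture, so the usual fix is to delete the compute_20 entries from CUDA_ARCH in Makefile.config. The following is a minimal sketch, assuming GNU sed and the stock CUDA_ARCH layout with one -gencode entry per line; strip_compute_20 is a hypothetical helper name.

```shell
# strip_compute_20 FILE: remove the "-gencode arch=compute_20,code=sm_20" and
# "...code=sm_21" entries from FILE in place, keeping a backup as FILE.bak.
# Hypothetical helper; assumes GNU sed (-i and -E).
strip_compute_20() {
    cp "$1" "$1.bak"
    sed -i -E 's/-gencode[[:space:]]+arch=compute_20,code=sm_2[01][[:space:]]*//g' "$1"
}

# Usage, from the Caffe source directory:
#   strip_compute_20 Makefile.config
#   make clean && make all -j"$(nproc)"
```

Only the two entries are removed; the line-continuation backslashes and the remaining architectures stay in place.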
Problem 8
collect2: error: ld returned 1 exit status
Makefile:582: recipe for target '.build_release/lib/libcaffe.so.1.0.0' failed
Solution: run "make clean" and then "make all" after modifying your Makefile.config.
make clean
make all
mkdir build
cd build
cmake ..
make all
make install
make runtest
Problem:
could not find snappy missing :SNAPPY_LIBRARIES
Solution:
sudo apt-get install libsnappy-dev
Problem:
CMake Error at cmake/Cuda.cmake:227 (message): cuDNN version >3 is required.
Solution:
In OpenPose, change the cmake/Cuda.cmake file and the cmake/Modules/FindCuDNN.cmake file.
Find the line that reads:
file(READ ${CUDNN_INCLUDE}/cudnn.h CUDNN_VERSION_FILE_CONTENTS)
and change it to:
file(READ ${CUDNN_INCLUDE}/cudnn_version.h CUDNN_VERSION_FILE_CONTENTS)
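From cuDNN 8 on, the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros are defined in cudnn_version.h rather than cudnn.h, which is why a check that reads only cudnn.h fails. The following is a minimal sketch for confirming which header carries the version on a given machine; cudnn_macro and cudnn_version are hypothetical helper names, and /usr/local/cuda/include is an assumed install location.

```shell
# cudnn_macro HEADER KEY: print the value of "#define KEY value" from HEADER.
cudnn_macro() {
    awk -v key="$2" '$1 == "#define" && $2 == key { print $3; exit }' "$1"
}

# cudnn_version INCLUDE_DIR: print MAJOR.MINOR.PATCH, preferring cudnn_version.h
# (cuDNN 8+) and falling back to cudnn.h (older releases). Hypothetical helper.
cudnn_version() {
    inc="$1"
    hdr="$inc/cudnn_version.h"
    [ -f "$hdr" ] || hdr="$inc/cudnn.h"
    if [ ! -f "$hdr" ]; then
        echo "no cuDNN header found in $inc" >&2
        return 1
    fi
    printf '%s.%s.%s\n' \
        "$(cudnn_macro "$hdr" CUDNN_MAJOR)" \
        "$(cudnn_macro "$hdr" CUDNN_MINOR)" \
        "$(cudnn_macro "$hdr" CUDNN_PATCHLEVEL)"
}

# The include directory below is an assumption; adjust it to your install.
cudnn_version /usr/local/cuda/include || true
```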
Installing CMake
How to install CMake on Ubuntu | FOSS Linux
Problem:
CMake Error at CMakeLists.txt:107 (add_dependencies):
The dependency target "pycaffe" of target "pytest" does not exist.
Solution:
sudo apt-get install python-dev
sudo apt-get install python-matplotlib
sudo apt-get install python-scipy
sudo apt-get install python-numpy
Problem:
When compiling with CUDA 10, there is a warning like this:
In file included from src/caffe/util/math_functions.cu:1:0:
/usr/local/cuda/include/math_functions.h:54:2: warning: #warning "math_functions.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release. Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
#warning "math_functions.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release. Please use cuda_runtime_api.h or cuda_runtime.h instead."
This PR aims to remove the deprecated header file.
Solution:
Replace math_functions.h with cuda_runtime.h to remove CUDA compile w… by chenzeyuczy · Pull Request #6681 · BVLC/caffe
Problem:
Why do I get “/sbin/ldconfig.real: /usr/local/cuda/lib64/libcudnn.so.7 is not a symbolic link”?
Solution:
Timeline for answer to Why do I get "/sbin/ldconfig.real: /usr/local/cuda/lib64/libcudnn.so.7 is not a symbolic link"? by Rika
Error:
libcaffe.so.1.0.0: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameB5cxx11Ev
This happens because protobuf was not installed with its default configuration, so it does not match Caffe.
Solution:
https://www.twblogs.net/a/5cc07f0fbd9eee397113d7eb?lang=zh-cn
Install the packages Python needs:
for req in $(cat requirements.txt); do pip install --use-feature=2020-resolver $req; done
Problem:
Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Python
Solution:
The cause is that Python layer support was not enabled at build time. Turn on the following switch in Makefile.config:
WITH_PYTHON_LAYER := 1
Then:
cd caffe-master/python
for req in $(cat requirements.txt); do pip install $req; done
make -j16 && make pycaffe
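Configure the path:
export PYTHONPATH
cd $PYTHONPATH (after changing into this path you may find that the new environment variable setting has not taken effect)
To make it take effect immediately, run: source ~/.bashrc
Problem: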
Makefile:635: recipe for target '.build_release/tools/compute_image_mean.bin' failed
Solution:
Step 1: uncomment the line #USE_PKG_CONFIG := 1 in Makefile.config.
Step 2: on the line below it, add:
LIBRARIES += glog gflags protobuf leveldb snappy \
lmdb boost_system hdf5_hl hdf5 m \
opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
Save the file, run make clean, and then rebuild.
Problem:
error while loading shared libraries: libgfortran.so.4: cannot open shared object file: No such file or directory
Solution:
You can already install gcc-7 and g++-7 from this package.
sudo add-apt-repository ppa:jonathonf/gcc-7.1
sudo apt-get update
sudo apt-get install gcc-7 g++-7
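Problem:
/usr/bin/ld: warning: libcudart.so.9.0, needed by /usr/local/lib/libopencv_core.so, may
Solution: Caffe-GPU error: /usr/bin/ld: warning: libcudart.so.9.0, needed by /usr/local/lib/libopencv_core.so, may
Cause: both libcudart.so.9.0 and libcudart.so.9.1 exist on the system. Use the find command to locate the one you do not need and take it out of the way.
Fix:
In a terminal, run:
find / -name 'libcudart.so.9.*'
This shows that both libcudart.so.9.0 and libcudart.so.9.1 are present.
You can check your CUDA version with nvcc -V.
I had installed CUDA 9.0, so libcudart.so.9.1 was not needed.
So change into the directory containing libcudart.so.9.1 and move the files aside: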
mv libcudart.so.9.1 libcudart.so.9.1.bak
mv libcudart.so.9.1.85 libcudart.so.9.1.85.bak
Again:
sudo cp libcudart.so.10.0 /usr/local/lib/libcudart.so.10.0
sudo cp libcublas.so.10.0 /usr/local/lib/libcublas.so.10.0
sudo cp libcurand.so.10.0 /usr/local/lib/libcurand.so.10.0
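The search for duplicate runtime libraries can be wrapped in a small function so that the pattern is always quoted (an unquoted pattern may be expanded by the shell before find sees it) and the search is limited to one directory tree. The following is a minimal sketch; list_cudart is a hypothetical helper name and /usr/local is only an assumed search root.

```shell
# list_cudart ROOT: list every libcudart.so.* found under ROOT, sorted
# (hypothetical helper). Permission errors from find are discarded.
list_cudart() {
    find "$1" -name 'libcudart.so.*' 2>/dev/null | sort
}

# /usr/local is an assumed search root; pass / to search the whole system.
list_cudart /usr/local
```

If more than one CUDA version shows up, compare the list against the output of nvcc -V before moving anything aside.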
Problem:
error while loading shared libraries: libhdf5_hl.so.10: cannot open shared object
Solution: Error loading shared library libhdf5_hl.so. · Issue #1463 · BVLC/caffe
I had this same problem. I found that libhdf5-dev created the files
/usr/lib/x86_64-linux-gnu/libhdf5_hl.so.7
/usr/lib/x86_64-linux-gnu/libhdf5.so.7
After copying them to the new names
/usr/lib/x86_64-linux-gnu/libhdf5_hl.so.8
/usr/lib/x86_64-linux-gnu/libhdf5.so.8
make runtest worked fine. I'm a Linux neophyte though so couldn't tell you why it was looking for the .8 files rather than the .7 files (or why my version of libhdf5-dev installed the .7 files rather than the .8 ones)
Problem:
/sbin/ldconfig.real: /usr/lib/x86_64-linux-gnu/libhdf5_hl.so.100 is not a symbolic link
Solution:
Note that the library path is not /home/user/anaconda3/lib but /home/user/anaconda3/envs/caffe2.7/lib/
Then see: Perfectly solving the error: libhdf5_hl.so.100(XXX): cannot open shared object file: No such file or directory, Error 127
Problem:
error: #error This file was generated by an older version of protoc which is
#error This file was generated by an older version of protoc which is
Solution:
conda uninstall libprotobuf
conda uninstall protobuf
Installing Caffe for Python 2.7: Installing Caffe with CUDA in Conda
cmake -DBLAS=open -DCUDNN_INCLUDE=/usr/local/cuda/include/ -DCUDNN_LIBRARY=/usr/local/cuda/lib64/libcudnn.so -DCMAKE_PREFIX_PATH=$CONDA_PREFIX -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX -DCMAKE_CXX_FLAGS="-std=c++11" ..
Problem:
Requirement already satisfied protobuf in ....python3/dist-packages
Solution:
sudo apt update
sudo pip3 install protobuf==3.5.1
Problem:
.build_release/tools/caffe: error while loading shared libraries: libcudart.so.10.0: cannot open shared object file: No such file or directory
Solution:
sudo ldconfig /usr/local/cuda/lib64