Installation environment: cuDNN 7.5 + CUDA 8.0 + Ubuntu 16.04
General dependencies (apt):
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libatlas-base-dev
sudo apt-get install python-dev
cd /usr/lib/x86_64-linux-gnu
ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
ln -s libhdf5_serial.so.10.1.0 libhdf5.so
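The version suffixes in the two ln -s commands above (10.0.2 and 10.1.0) depend on which HDF5 package is installed. A minimal sketch that links whatever versioned file is present instead of hardcoding the suffix; link_hdf5 is a hypothetical helper name, and sort -V assumes GNU coreutils:

```shell
# Sketch: create the unversioned HDF5 symlinks without hardcoding versions.
# link_hdf5 is a hypothetical helper, not part of Caffe or HDF5.
link_hdf5() {
    dir=${1:-/usr/lib/x86_64-linux-gnu}
    for pair in libhdf5_serial_hl:libhdf5_hl libhdf5_serial:libhdf5; do
        src=${pair%%:*}   # versioned library name prefix
        dst=${pair##*:}   # unversioned name the linker looks for
        # Pick the highest versioned file, e.g. libhdf5_serial.so.10.1.0
        target=$(ls "$dir" | grep "^$src\.so\.[0-9]" | sort -V | tail -n 1)
        if [ -z "$target" ]; then
            echo "no $src.so.* found in $dir" >&2
            return 1
        fi
        # Relative symlink inside the same directory
        ln -sf "$target" "$dir/$dst.so"
    done
}
```

Calling link_hdf5 with no argument targets /usr/lib/x86_64-linux-gnu and needs root privileges; passing another directory is useful for a dry run.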
sudo apt-get install liblmdb-dev libgflags-dev libgoogle-glog-dev
sudo apt-get install --reinstall python-pkg-resources
cuda8.0
compilation with Make
cp Makefile.config.example Makefile.config
# Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
Edit Makefile.config to match the environment set up above:
To write layers in Python,
change # WITH_PYTHON_LAYER := 1
to WITH_PYTHON_LAYER := 1
An important item:
change the lines below "# Whatever else you find you need goes here." from
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
to:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib
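The Makefile.config edits above could also be applied non-interactively. A rough sketch using GNU sed (sed -i is assumed to be the GNU variant shipped with Ubuntu); patch_makefile_config is a hypothetical helper name, and it appends only /usr/lib/x86_64-linux-gnu to LIBRARY_DIRS rather than repeating /usr/lib:

```shell
# Sketch: apply the Makefile.config edits described above in place.
# patch_makefile_config is a hypothetical helper, not part of Caffe.
patch_makefile_config() {
    cfg=${1:-Makefile.config}
    # Enable Python layers by removing the leading comment marker.
    sed -i 's|^# *WITH_PYTHON_LAYER := 1|WITH_PYTHON_LAYER := 1|' "$cfg"
    # Append the HDF5 serial header directory unless it is already listed.
    grep -q '^INCLUDE_DIRS.*/usr/include/hdf5/serial' "$cfg" ||
        sed -i 's|^INCLUDE_DIRS := .*|& /usr/include/hdf5/serial|' "$cfg"
    # Append the multiarch library directory unless it is already listed.
    grep -q '^LIBRARY_DIRS.*/usr/lib/x86_64-linux-gnu' "$cfg" ||
        sed -i 's|^LIBRARY_DIRS := .*|& /usr/lib/x86_64-linux-gnu|' "$cfg"
}
```

Running it twice leaves the file unchanged the second time, so it is safe to re-run after copying a fresh Makefile.config.example.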
Uncomment in Makefile.config:
USE_NCCL := 1
NCCL (optimized primitives for collective multi-GPU communication) must be installed separately: download and unpack it from http://github.com/NVIDIA/nccl, then:
cd nccl
make test
export LD_LIBRARY_PATH=./build/lib:$LD_LIBRARY_PATH
./build/test/single/all_reduce_test 10000000
make install
problem: No module named pkg_resources
solution: sudo apt-get install --reinstall python-pkg-resources
problem: error when installing Caffe on Ubuntu:
Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0) CURAND_STATUS_LAUNCH_FAILURE
solution: update to CUDA 8.0
compilation here:
make all
make test
make runtest
After make all, the output ends with:
CXX/LD -o .build_release/tools/caffe.bin
/usr/bin/ld: warning: libcudart.so.9.0, needed by /usr/local/lib/libnccl.so, may conflict with libcudart.so.7.5
CXX tools/compute_image_mean.cpp
CXX/LD -o .build_release/tools/compute_image_mean.bin
CXX tools/upgrade_solver_proto_text.cpp
CXX/LD -o .build_release/tools/upgrade_solver_proto_text.bin
CXX examples/mnist/convert_mnist_data.cpp
CXX/LD -o .build_release/examples/mnist/convert_mnist_data.bin
CXX examples/siamese/convert_mnist_siamese_data.cpp
CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin
CXX examples/cifar10/convert_cifar_data.cpp
CXX/LD -o .build_release/examples/cifar10/convert_cifar_data.bin
CXX examples/cpp_classification/classification.cpp
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
After make test, the output ends with:
LD .build_release/src/caffe/test/test_bias_layer.o
LD .build_release/src/caffe/test/test_threshold_layer.o
LD .build_release/src/caffe/test/test_spp_layer.o
LD .build_release/src/caffe/test/test_benchmark.o
LD .build_release/src/caffe/test/test_hdf5data_layer.o
LD .build_release/src/caffe/test/test_euclidean_loss_layer.o
LD .build_release/src/caffe/test/test_deconvolution_layer.o
LD .build_release/src/caffe/test/test_data_layer.o
LD .build_release/src/caffe/test/test_softmax_with_loss_layer.o
LD .build_release/src/caffe/test/test_memory_data_layer.o
LD .build_release/src/caffe/test/test_lrn_layer.o
LD .build_release/src/caffe/test/test_image_data_layer.o
LD .build_release/src/caffe/test/test_convolution_layer.o
LD .build_release/cuda/src/caffe/test/test_im2col_kernel.o
after make runtest
problem: Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
Headers are 1.10.1, library is 1.8.16
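solution: the HDF5 bundled with Anaconda does not match the HDF5 installed on the system later. Downloading the matching version, HDF5 1.8.16, and reinstalling it in Anaconda resolves the problem.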
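To see which header is responsible for the mismatch, it can help to print the version each H5public.h declares. A small sketch, assuming the header defines H5_VERS_MAJOR, H5_VERS_MINOR and H5_VERS_RELEASE as HDF5's H5public.h does; hdf5_header_version is a hypothetical helper name:

```shell
# Sketch: print the HDF5 version declared in an H5public.h header.
# hdf5_header_version is a hypothetical helper, not an HDF5 tool.
hdf5_header_version() {
    header=$1
    # Read the three version macros and join them as major.minor.release.
    awk '$1 == "#define" && $2 == "H5_VERS_MAJOR"   { major = $3 }
         $1 == "#define" && $2 == "H5_VERS_MINOR"   { minor = $3 }
         $1 == "#define" && $2 == "H5_VERS_RELEASE" { rel = $3 }
         END { print major "." minor "." rel }' "$header"
}
```

For example, comparing hdf5_header_version /usr/include/hdf5/serial/H5public.h with the result for the copy under the Anaconda include directory (its exact location is an assumption that depends on the install) shows which one reports 1.10.1.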
When make runtest ends with output like this, the installation succeeded:
[----------] Global test environment tear-down
[==========] 2175 tests from 285 test cases ran. (751535 ms total)
[ PASSED ] 2175 tests.
If you want to use pycaffe, also run make pycaffe.
Set the environment variable (add the export line to ~/.bashrc):
vi ~/.bashrc
export PYTHONPATH=/usr/caffe/python:$PYTHONPATH
source ~/.bashrc
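Editing ~/.bashrc by hand works, but repeated runs of a setup script tend to duplicate the export line. A minimal sketch that appends it only once; add_caffe_pythonpath is a hypothetical helper name, and /usr/caffe/python is the path used above:

```shell
# Sketch: append the PYTHONPATH export to a shell rc file only if missing.
# add_caffe_pythonpath is a hypothetical helper, not part of Caffe.
add_caffe_pythonpath() {
    rcfile=${1:-$HOME/.bashrc}
    caffe_python=${2:-/usr/caffe/python}
    # \$PYTHONPATH stays literal so it is expanded when the rc file is sourced.
    line="export PYTHONPATH=$caffe_python:\$PYTHONPATH"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}
```

After calling add_caffe_pythonpath, source ~/.bashrc as shown above to pick up the change in the current shell.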
Configure pycaffe:
First install the Python packages listed in requirements.txt.
cd caffe
make pycaffe
problem: ImportError: /home/chkusr/gbx/caffe-master/python/caffe/_caffe.so: undefined symbol: _ZN5boost6python6detail11init_moduleER11PyModuleDefPFvvE
solution: in Makefile.config, uncomment: PYTHON_LIBRARIES := boost_python3 python3.6m
cd /usr/lib/x86_64-linux-gnu
ln -s libboost_python-py35.so libboost_python3.so
problem:
cannot find -lpython3.6m
solution:
cp /root/anaconda3/lib/libpython3.6m.so /usr/lib/libpython3.6m.so
cp -r /root/anaconda3/lib/python3.6 /usr/lib
Then rebuild from scratch:
make clean
make all
Run the Caffe examples:
1. MNIST
cd $CAFFE_ROOT
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
# train with the LeNet network
./examples/mnist/train_lenet.sh
2. CIFAR-10
cd $CAFFE_ROOT
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh
./examples/cifar10/train_quick.sh
For multi-GPU training, append the option --gpu all or --gpu 0,1,2,3 to the training command:
./build/tools/caffe train --solver=...
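As an illustration of the option above, the following sketch only prints the training command so it can be reviewed before running. caffe_train_cmd is a hypothetical helper name; the solver path is left as an argument because the note above does not specify one:

```shell
# Sketch: print a multi-GPU training command for review before running it.
# caffe_train_cmd is a hypothetical helper, not part of Caffe.
caffe_train_cmd() {
    solver=$1
    gpus=${2:-all}   # "all" or a comma-separated list such as 0,1,2,3
    printf './build/tools/caffe train --solver=%s --gpu %s\n' "$solver" "$gpus"
}
```

For instance, caffe_train_cmd path/to/solver.prototxt 0,1 prints the command for GPUs 0 and 1; the solver path here is a placeholder.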
problem: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0)
libpython3.6-dev