A Guide to Setting Up a Caffe2 Deep Learning Environment

爱吃绿豆沙的王叔

Tags: caffe2, deep learning, artificial intelligence

Article link: https://blog.csdn.net/yourkenny/article/details/80484493

  What needs to be installed:

  1. NVIDIA driver

  2. NVIDIA CUDA and cuDNN

  3. Python

  4. Caffe2

  5. Detectron

  6. COCO API

 
 1. First confirm that the machine's NVIDIA GPU supports CUDA


 https://developer.nvidia.com/cuda-gpus

 
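 2. Check whether the GPU driver is already installed

  Run nvidia-smi. If it prints output, the driver is installed. If it does not, install the driver and disable the driver for the original integrated graphics.

  The NVIDIA driver can be downloaded from the official site according to your GPU model.

  Another approach is recommended here: the CUDA package ships with its own driver, so you can simply install CUDA.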
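  The check in step 2 can also be scripted. Below is a minimal sketch for Python 3, assuming that nvidia-smi exiting with status 0 is a good enough sign of a working driver. The function name driver_installed is made up for this example.

```python
import shutil
import subprocess


def driver_installed(command="nvidia-smi"):
    """Return True if `command` is on PATH and exits with status 0."""
    if shutil.which(command) is None:
        return False
    try:
        result = subprocess.run(
            [command],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if driver_installed():
        print("NVIDIA driver detected")
    else:
        print("NVIDIA driver not detected")
```

  If it reports that the driver is not detected, continue with step 3.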

 
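 3. Install CUDA. wget is likely to be blocked by the firewall, so it is best to download the packages beforehand, upload them to the server, and then run the commands below.

  For the NVIDIA card to take effect, first disable nouveau, the third-party GPU driver that ships with the system.


 Remove the existing driver (optional):

  sudo apt-get remove --purge 'nvidia*'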

 
 Disable the nouveau driver:


 Edit /etc/modprobe.d/blacklist-nouveau.conf and add the following:

  blacklist nouveau   

  blacklist lbm-nouveau   

  options nouveau modeset=0 

  alias nouveau off   

  alias lbm-nouveau off  
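  On Ubuntu the blacklist normally takes effect only after the initramfs is rebuilt (sudo update-initramfs -u) and the machine is rebooted. That step is not in the original notes, so treat it as an assumption about your system. The sketch below checks whether nouveau is still loaded by reading /proc/modules. The helper names are made up for this example.

```python
def is_module_loaded(name, modules_text):
    """Return True if kernel module `name` appears in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def read_loaded_modules(path="/proc/modules"):
    """Read the kernel's module list; return '' if the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except (IOError, OSError):
        return ""


if __name__ == "__main__":
    if is_module_loaded("nouveau", read_loaded_modules()):
        print("nouveau is still loaded; rebuild the initramfs and reboot")
    else:
        print("nouveau is not loaded")
```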
   

  Install CUDA:

  sudo apt-get update && sudo apt-get install wget -y --no-install-recommends
  wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
  sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  sudo apt-get update
  sudo apt-get install cuda

  Reboot the server.

 
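 4. Install cuDNN, preferably version 6.0 or later. As with CUDA, it is best to download the offline package first and upload it to the server. The commands below fetch v5.1 for CUDA 8.0; substitute the file name of the version you actually downloaded.

  CUDNN_URL="http://developer.download.nvidia.com/compute/redist/cudnn/v5.1/cudnn-8.0-linux-x64-v5.1.tgz"
  wget ${CUDNN_URL}
  sudo tar -xzf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local
  rm cudnn-8.0-linux-x64-v5.1.tgz && sudo ldconfig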
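  To confirm which cuDNN version ended up on disk, you can read the version macros from cudnn.h. This is a sketch under two assumptions: the archive was unpacked under /usr/local, so the header sits at /usr/local/cuda/include/cudnn.h, and the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN releases of this era do. The function name is made up for this example.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % key
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Assumed location: the tar command unpacks into /usr/local/cuda.
    header_path = "/usr/local/cuda/include/cudnn.h"
    try:
        with open(header_path) as handle:
            print(parse_cudnn_version(handle.read()))
    except (IOError, OSError):
        print("cudnn.h not found at " + header_path)
```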

 
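 5. Configure environment variables

  Edit /etc/profile (root privileges are required) and add the lines below:

  sudo vim /etc/profile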

  export PATH=/usr/local/cuda-8.0/bin:$PATH 

  export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH 

  export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH 

  export PYTHONPATH=/usr/local:$PYTHONPATH 
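  After opening a new session, a short script can confirm that the variables are actually set. This is a minimal sketch: the directory list mirrors the exports above, and the names missing_entries and REQUIRED are made up for this example.

```python
import os


def missing_entries(env, required):
    """Return the (variable, directory) pairs from `required` absent from `env`.

    `env` maps variable names to os.pathsep-separated strings, like os.environ.
    """
    missing = []
    for variable, directory in required:
        raw = env.get(variable, "")
        entries = [e.rstrip("/") for e in raw.split(os.pathsep) if e]
        if directory.rstrip("/") not in entries:
            missing.append((variable, directory))
    return missing


# The directories exported in step 5.
REQUIRED = [
    ("PATH", "/usr/local/cuda-8.0/bin"),
    ("LD_LIBRARY_PATH", "/usr/local/lib"),
    ("LD_LIBRARY_PATH", "/usr/local/cuda-8.0/lib64"),
    ("PYTHONPATH", "/usr/local"),
]

if __name__ == "__main__":
    for variable, directory in missing_entries(os.environ, REQUIRED):
        print("%s is missing %s" % (variable, directory))
```

  No output means every directory was found.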

 
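 6. Check the Python version

  Type python in the console.

  If this does not start the Python interactive prompt, the likely causes are:

  1. The Python interpreter is not on your PATH.

  2. Several Python versions are installed locally, and one of them has to be selected for the current user's environment.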
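  If you can start any Python 3 interpreter at all (for example by its full path), the sketch below lists which interpreter names resolve on PATH, which helps tell the two cases apart. The function name find_interpreters and the list of candidate names are assumptions made for this example.

```python
import shutil
import sys


def find_interpreters(names=("python", "python2", "python3")):
    """Map each interpreter name to its full path on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Running under: %s (%s)" % (sys.executable, sys.version.split()[0]))
    for name, path in sorted(find_interpreters().items()):
        print("%-8s -> %s" % (name, path or "not found on PATH"))
```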

 
 7. Install Caffe2 following the official documentation

  # Clone Caffe2's source code from our Github repository
  git clone --recursive https://github.com/pytorch/pytorch.git && cd pytorch
  git submodule update --init
  # Create a directory to put Caffe2's build files in
  mkdir build && cd build
  # Configure Caffe2's build
  # This looks for packages on your machine and figures out which functionality
  # to include in the Caffe2 installation. The output of this command is very
  # useful in debugging.


 Before running the configure step, make sure CUDA and the driver were installed successfully:


 nvcc -V


 nvidia-smi

  If either command fails or prints the wrong output, check the environment configuration. If both print normally, continue:

  cmake ..
  # Compile, link, and install Caffe2
  sudo make install

  Official link:
 https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile#install-with-gpu-support 

 
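 Notes:


 Following the official documentation step by step is fairly simple, but installing CUDA and cuDNN runs into the firewall. It is therefore better to download the versions the documentation recommends ahead of time and install them first.


 1. This Caffe2 setup needs a GPU.


 2. Compiling Caffe2 is slow; it takes roughly half an hour.


 3. Before running make install, make sure the local environment is ready:


  Run cmake .. in the build directory.


  In the output, confirm that CUDA is enabled and cuDNN is enabled, and check the reported CUDA, cuDNN, and Python versions.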
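  Reading the CMake summary by eye is easy to get wrong, so here is a sketch that parses it. It assumes the summary prints lines shaped like "--   USE_CUDA : ON". Check the key names USE_CUDA and USE_CUDNN against your own cmake output, since they are taken as assumptions here. The function names and the log file name are made up for this example.

```python
import sys


def parse_summary(cmake_output):
    """Collect `-- key : value` lines from CMake output into a dict."""
    settings = {}
    for line in cmake_output.splitlines():
        if not line.startswith("--") or " : " not in line:
            continue
        key, _, value = line[2:].partition(" : ")
        settings[key.strip()] = value.strip()
    return settings


def gpu_build_enabled(settings):
    """Return True only if both CUDA and cuDNN are reported as ON."""
    return settings.get("USE_CUDA") == "ON" and settings.get("USE_CUDNN") == "ON"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): save the output with
    #   cmake .. > cmake.log 2>&1
    # and pass cmake.log as the first argument.
    with open(sys.argv[1]) as handle:
        summary = parse_summary(handle.read())
    print("GPU build enabled" if gpu_build_enabled(summary) else "GPU build NOT enabled")
```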

 
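 8. Configure environment variables

  Edit /etc/profile again and add the build directory to PYTHONPATH (adjust the path to wherever you cloned pytorch):

  sudo vim /etc/profile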

  export PYTHONPATH=$PYTHONPATH:/home/pillow/pytorch/build 

 
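 9. Verify that Caffe2 installed successfully

  cd ~ && python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

  If it prints Success, the CPU side of the installation works.

  Then, from the pytorch source directory, run one of the operator tests:

  python2 caffe2/python/operator_test/relu_op_test.py

  It should print OK.

  python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'

  The printed number is the number of GPUs and should be at least 1.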
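  The two one-liners above can be combined into one script that reports both results. This is a minimal sketch with a made-up function name. It uses only the import and the workspace.NumCudaDevices() call shown above, and it should be run with the same interpreter Caffe2 was built for.

```python
def caffe2_status():
    """Return (importable, gpu_count); gpu_count is None if the import fails."""
    try:
        from caffe2.python import core, workspace  # noqa: F401
    except ImportError:
        return False, None
    return True, workspace.NumCudaDevices()


if __name__ == "__main__":
    importable, gpu_count = caffe2_status()
    if not importable:
        print("Failure: caffe2 could not be imported; check PYTHONPATH")
    elif gpu_count < 1:
        print("caffe2 imports, but no CUDA device is visible")
    else:
        print("Success: caffe2 sees %d GPU(s)" % gpu_count)
```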

 
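 10. Install the Python dependencies

  The version specifiers must be quoted; otherwise the shell treats >= as an output redirection.

  pip install 'numpy>=1.13' 'pyyaml>=3.12' matplotlib 'opencv-python>=3.2' setuptools Cython mock scipy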
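  To confirm the packages landed in the interpreter you intend to use, the sketch below tries to import each one. Note that the import names differ from the pip names for two packages: pyyaml imports as yaml and opencv-python as cv2. The function and list names are made up for this example.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names for the packages installed in step 10.
DETECTRON_MODULES = [
    "numpy", "yaml", "matplotlib", "cv2",
    "setuptools", "Cython", "mock", "scipy",
]

if __name__ == "__main__":
    absent = missing_modules(DETECTRON_MODULES)
    if absent:
        print("Missing: " + ", ".join(absent))
    else:
        print("All modules import cleanly")
```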

 
 11. Install the COCO API

  # Set COCOAPI to the directory to clone into, e.g. COCOAPI=/path/to/clone/cocoapi
  git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
  cd $COCOAPI/PythonAPI
  # Install into global site-packages
  make install
  # Alternatively, if you do not have permissions or prefer
  # not to install the COCO API into global site-packages
  python2 setup.py install --user

 
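 12. Install Detectron

  Clone the Detectron repository:

  # Set DETECTRON to the directory to clone into, e.g. DETECTRON=/path/to/clone/detectron
  git clone https://github.com/facebookresearch/detectron $DETECTRON

  Set up Python modules:

  cd $DETECTRON/lib && make


 Check that it works:

  python2 $DETECTRON/tests/test_spatial_narrow_as_op.py

  It should print OK.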

 
 13. After installing Detectron, run the following from the repository root. This step downloads the model .pkl file.

  python2 tools/infer_simple.py \
      --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
      --output-dir /tmp/detectron-visualizations \
      --image-ext jpg \
      --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
      demo

  +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 

  Problems encountered


 1. At around 90% of the build, the following error appears:

  [ 90%] Linking CXX executable ../bin/mpi_gpu_test 

  /usr/bin/ld: CMakeFiles/mpi_gpu_test.dir/mpi/mpi_gpu_test.cc.o: undefined reference to symbol '_ZN3MPI8Datatype4FreeEv'

  //usr/lib/libmpi_cxx.so.1: error adding symbols: DSO missing from command line

  collect2: error: ld returned 1 exit status

  caffe2/CMakeFiles/mpi_gpu_test.dir/build.make:110: recipe for target 'bin/mpi_gpu_test' failed

  make[2]: *** [bin/mpi_gpu_test] Error 1

  CMakeFiles/Makefile2:3544: recipe for target 'caffe2/CMakeFiles/mpi_gpu_test.dir/all' failed

  make[1]: *** [caffe2/CMakeFiles/mpi_gpu_test.dir/all] Error 2 

  Makefile:138: recipe for target 'all' failed 

  make: *** [all] Error 2 

 
 Solution:


 cmake .. -DUSE_MPI=OFF


 USE_MPI is only needed for multi-machine parallel computing, so it can be turned off.

 
 2. Recompiling had no effect. Before recompiling, delete the entire build directory so that the cached configuration does not interfere.
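  A sketch of that clean rebuild follows. The helper names are made up for this example, the build path is whatever directory you created in step 7, and the only CMake flag used is the -DUSE_MPI=OFF from problem 1. Be careful: reset_build_dir deletes the directory it is given.

```python
import os
import shutil
import subprocess


def reset_build_dir(build_dir):
    """Delete `build_dir` with everything in it, then recreate it empty."""
    if os.path.isdir(build_dir):
        shutil.rmtree(build_dir)
    os.makedirs(build_dir)
    return build_dir


def configure_without_mpi(build_dir):
    """Run `cmake .. -DUSE_MPI=OFF` inside `build_dir`."""
    subprocess.check_call(["cmake", "..", "-DUSE_MPI=OFF"], cwd=build_dir)


# Example (not executed here; the path is a placeholder):
#   build = reset_build_dir("/path/to/pytorch/build")
#   configure_without_mpi(build)
```

  Starting from an empty directory ensures that the new -DUSE_MPI=OFF setting is applied rather than a stale cached value.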

 
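 3. Before running the test code, make sure the environment variables have taken effect. To be safe, start a new session. The full set of variables used here: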

  export JAVA_HOME=/home/pillow/java_env/jdk8 

  export JRE_HOME=${JAVA_HOME}/jre 

  export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib 

  export PATH=${JAVA_HOME}/bin:$PATH 

  export PATH=/usr/local/cuda-8.0/bin:$PATH 

  export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH 

  export PYTHONPATH=/usr/local:$PYTHONPATH 

  export PYTHONPATH=/home/pillow/pytorch/build:$PYTHONPATH 

  export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH 
