Installing MXNet on Linux


hyperminer


Article link: https://blog.csdn.net/zhangweijiqn/article/details/53199955


Installing MXNet:

http://mxnet.io/get_started/setup.html

When something goes wrong, check the GitHub issues first.

 
If you are running Python on Amazon Linux or Ubuntu, you can use Git Bash scripts to quickly install the MXNet libraries and all dependencies (see Quick Installation). If you are using other languages or operating systems, skip to Standard Installation.

 
Quick Installation on Ubuntu:

  git clone https://github.com/dmlc/mxnet.git ~/MXNet/mxnet --recursive 

  cd ~/MXNet/mxnet/setup-utils 

  bash install-mxnet-ubuntu.sh 

 
Standard Installation:

 
 Minimum Requirements 

 
 You must have the following: 

A C++ compiler that supports C++11. The C++ compiler compiles and builds the MXNet source code. Supported compilers include G++ (4.8 or later) and Clang.

A BLAS (Basic Linear Algebra Subprograms) library. BLAS libraries contain routines that provide the standard building blocks for basic vector and matrix operations, and MXNet needs one to perform basic linear algebra. Supported BLAS libraries include libblas, ATLAS, OpenBLAS, and Intel MKL.

 
Build MXNet on Ubuntu/Debian

On Ubuntu versions 13.10 or later, you need the following dependencies:

* Git (to pull code from GitHub)
* libatlas-base-dev (for linear algebraic operations)
* libopencv-dev (for computer vision operations)

Install these dependencies using the following commands:

sudo apt-get update
sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev

 
 After you have downloaded and installed the dependencies, use the following commands to pull the MXNet source code from Git and build MXNet: 

 
git clone --recursive https://github.com/dmlc/mxnet
cd mxnet; make -j $(nproc)

 
As the commands show, the libraries that get installed are:

libatlas-base-dev
libopencv-dev

On Ubuntu, if the installation above fails, go through it step by step:

 
 sudo apt-get update 

  sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev 

 
 git clone --recursive https://github.com/dmlc/mxnet 

  cd mxnet 

  make -j4 

 
sudo apt-get install python-numpy   # for debian

sudo apt-get install python-setuptools   # for debian

cd python; sudo python setup.py install

Using sudo for this last step is not recommended. With sudo, the package is installed into the root user's Python environment (/usr/bin/python). The pitfall: if python for the current non-root user resolves to an Anaconda installation, import mxnet fails there with "No module named mxnet", while import mxnet works after switching to the root user.

After installation, the Python lib directory contains ./python2.7/site-packages/mxnet-0.7.0-py2.7.egg/mxnet.

Test whether the installation succeeded:

 
import mxnet as mx
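When the import fails with "No module named mxnet", it helps to check which interpreter is running and where it would load the module from. The following is a minimal sketch using only the standard library; the helper name locate is made up for this illustration.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a module would be imported from, or None if it is not importable."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Run this once as the normal user and once as root to compare the two environments.
print("interpreter:", sys.executable)
print("mxnet found at:", locate("mxnet"))
```

If the interpreter path points into an Anaconda directory and the second line prints None, the package was most likely installed into a different Python than the one being run.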

MXNet depends on OpenCV, and installing OpenCV may pull in many other libraries.

OpenCV dependency problems during installation

  sudo apt-get install -y build-essential git libblas-dev libopencv-dev 

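Reading package lists... Done

Building dependency tree

Reading state information... Done

build-essential is already the newest version.

build-essential set to manually installed.

Some packages could not be installed. This may mean that you have

requested an impossible situation or if you are using the unstable

distribution that some required packages have not yet been created

or been moved out of Incoming.

The following information may help to resolve the situation:

The following packages have unmet dependencies:

libopencv-dev : Depends: libopencv-objdetect-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-highgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-legacy-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-contrib-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-videostab-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-superres-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-ocl-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libcv-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libhighgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libcvaux-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

E: Unable to correct problems, you have held broken packages.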

 
If you hit the error above, OpenCV has to be installed manually; see below.
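To see at a glance which packages and versions apt refuses to install, the report can be reduced to a list of pinned dependencies. This is a small sketch, assuming the "package (= version)" notation shown in the output above; the function name and the two sample lines are only for illustration.

```python
import re

# Matches "<package> (= <version>)" as it appears in apt's unmet-dependency lines.
PINNED = re.compile(r"(\S+) \(= ([^)]+)\)")


def unmet_dependencies(apt_output):
    """Return (package, version) pairs named in apt's unmet-dependency report."""
    return PINNED.findall(apt_output)


# Two lines in the shape of the report above, used as sample input.
sample = """\
libopencv-dev : Depends: libopencv-objdetect-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
 Depends: libopencv-highgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
"""
for name, version in unmet_dependencies(sample):
    print(name, version)
```

The pattern only looks at the package name and the pinned version, so it does not depend on the language apt prints its messages in.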

Installing MXNet on CentOS

(This mainly covers CentOS 6.5; CentOS 7 is easier to install on.)

  Issues: 
  https://github.com/dmlc/mxnet/issues/3324 

 
 https://github.com/dmlc/mxnet/issues/1303 

 
 https://github.com/dmlc/mxnet/issues/1125 

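The official installation examples use apt-get, which works on Debian-family distributions but not on CentOS; install-mxnet-ubuntu.sh consists of apt-get commands.

Installing on CentOS is harder than on Ubuntu and documentation is sparse; see this issue: https://github.com/dmlc/mxnet/issues/1303

Install the rz command (if it is not already installed):
yum -y install lrzsz

Problem: the first installation attempt runs into dependency errors. After running bash install-mxnet-ubuntu.sh: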

  Setting up Install Process 

No package build-essential available.

No package libatlas-base-dev available.

No package libopencv-dev available.

  Error: Nothing to do 

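Cause:

This is the difference between Ubuntu and CentOS (apt-get vs. yum): yum does not know the Debian package names. CentOS >= 6.5 is recommended.

First try installing OpenCV with yum: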

  sudo yum install atlas-devel opencv 

  sudo yum install opencv-devel 
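The package names differ between the two families, so it can help to keep the correspondence in one place. The table below is a sketch assembled only from the yum commands that appear in this post; it is an illustration, not an official equivalence list, and the names DEB_TO_RPM and yum_command are made up here.

```python
# Debian/Ubuntu package -> CentOS packages, taken from the commands in this post.
# Illustrative only; not an official equivalence table.
DEB_TO_RPM = {
    "build-essential": ["gcc", "gcc-c++"],
    "git": ["git"],
    "libatlas-base-dev": ["atlas-devel"],
    "libopencv-dev": ["opencv", "opencv-devel"],
}


def yum_command(deb_packages):
    """Build a yum install command for the CentOS counterparts of Debian packages."""
    rpm_packages = []
    for deb in deb_packages:
        for rpm in DEB_TO_RPM[deb]:
            if rpm not in rpm_packages:  # keep order, drop duplicates
                rpm_packages.append(rpm)
    return "yum install -y " + " ".join(rpm_packages)


print(yum_command(["build-essential", "git", "libatlas-base-dev", "libopencv-dev"]))
```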

Alternatively, try the following procedure:

 
yum update

yum install -y build-essential git libatlas-base-dev libopencv-dev   # yum does not find the Debian-style names (see the output above)

yum install -y opencv opencv-devel atlas-devel

yum install gcc gcc-c++

ldconfig /etc/ld.so.cache

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet

./prepare_mkl.sh

cp make/config.mk .

vim config.mk

# at line 31, add:
+31 ADD_LDFLAGS = -L/usr/lib64/atlas

vim mshadow/make/mshadow.mk

# at line 68, replace the first line with the second:
-68 MSHADOW_LDFLAGS += -lcblas

+68 MSHADOW_LDFLAGS += -lsatlas

yum info glib2

yum upgrade glib2

make -j4

 
 [root@xdataimg2 mxnet]# ll lib  

 
total 38920

-rw-r--r-- 1 root root 28637318 Nov 12 12:32 libmxnet.a

-rwxr-xr-x 1 root root 11214217 Nov 12 12:32 libmxnet.so

 
wget https://bootstrap.pypa.io/get-pip.py

python get-pip.py

pip install numpy -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn

pip install scipy -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn

cd mxnet/python   # or: cd python, if you are already inside the mxnet directory

python setup.py install
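The manual edit to mshadow/make/mshadow.mk in the procedure above swaps the BLAS library on the MSHADOW_LDFLAGS line. As a sketch of what that edit does, the function below performs the same substitution on makefile text; the function name is made up, and the first line of the sample input is a hypothetical neighbouring line, not copied from the real file.

```python
def use_satlas(makefile_text):
    """Replace -lcblas with -lsatlas on MSHADOW_LDFLAGS lines only."""
    out = []
    for line in makefile_text.splitlines():
        if line.lstrip().startswith("MSHADOW_LDFLAGS") and "-lcblas" in line:
            line = line.replace("-lcblas", "-lsatlas")
        out.append(line)
    return "\n".join(out)


# Hypothetical two-line excerpt; only the second line comes from this post.
sample = "MSHADOW_CFLAGS += -DMSHADOW_USE_CBLAS=1\nMSHADOW_LDFLAGS += -lcblas"
print(use_satlas(sample))
```

Restricting the replacement to MSHADOW_LDFLAGS lines leaves any other mention of cblas in the file untouched.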

Building OpenCV from source

Reference: http://blog.csdn.net/kuaile123/article/details/20870731

First install the OpenCV dependencies:

yum install cmake gcc gcc-c++ gtk+-devel gimp-devel gimp-devel-tools gimp-help-browser zlib-devel libtiff-devel libjpeg-devel libpng-devel gstreamer-devel libavc1394-devel libraw1394-devel libdc1394-devel jasper-devel jasper-utils swig python libtool nasm

sudo yum install opencv-devel

sudo yum install atlas-devel

# or: sudo yum install atlas-devel opencv

yum install cmake

Download the required version from the OpenCV download page, http://sourceforge.net/projects/opencvlibrary/files/ , and unpack it.

 
cd OpenCV-2.4.10

cmake CMakeLists.txt

make && make install

make may fail with:

Linking CXX executable ../../bin/opencv_perf_core

../../lib/libopencv_highgui.so.2.4.10: undefined reference to `png_set_longjmp_fn'

collect2: error: ld returned 1 exit status

G++ version:

Check the g++ version:

g++ --version   # or: g++ -v

gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

This g++ is older than the required version, so it needs to be upgraded.
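The version comparison can be made explicit by parsing the banner line. This is a minimal sketch; the minimum of 4.8 is an assumption based on the GCC 4.8 series that this post upgrades to, and the function name is made up for the illustration.

```python
import re


def gcc_version(banner):
    """Extract (major, minor, patch) from a 'gcc version X.Y.Z ...' banner line."""
    match = re.search(r"gcc version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no gcc version found in: %r" % banner)
    return tuple(int(part) for part in match.groups())


MINIMUM = (4, 8, 0)  # assumption: the 4.8 series this post upgrades to

banner = "gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)"
print(gcc_version(banner), gcc_version(banner) >= MINIMUM)
```

Comparing tuples of integers avoids the trap of comparing version strings, where "4.10" would sort before "4.8".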

Upgrading GCC/G++ (the two are built together):

Download location: http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-4.8.5/

wget http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-4.8.5/gcc-4.8.5.tar.gz

tar -zxvf gcc-4.8.5.tar.gz

cd gcc-4.8.5

./contrib/download_prerequisites

mkdir build

cd build

  ../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib 

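This fails with: configure: error: Building GCC requires GMP 4.2+, MPFR 2.4.0+ and MPC 0.8.0+.

Reference: http://blog.csdn.net/ivanlxf/article/details/19080681

The ./contrib/download_prerequisites script automatically downloads the three dependencies, namely gmp-4.3.2, mpfr-2.4.2 and mpc-0.8.1. They can also be downloaded and installed offline from the following addresses: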

 
(1) Install gmp:

 
 wget ftp://ftp.gnu.org/gnu/gmp/gmp-4.3.2.tar.bz2 

 
 tar -jxf gmp-4.3.2.tar.bz2 

 
 cd gmp-4.3.2 

 
 mkdir build 

 
 cd build 

 
 ../configure --prefix=/usr/local/gcc/gmp-4.3.2 

 
 make && make install 

 
(2) Install mpfr:

wget http://www.mpfr.org/mpfr-2.4.2/mpfr-2.4.2.tar.bz2

tar -jxf mpfr-2.4.2.tar.bz2

cd mpfr-2.4.2

mkdir build

cd build

../configure --prefix=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2

make && make install

 
(3) Install mpc:

wget http://www.multiprecision.org/mpc/download/mpc-0.8.1.tar.gz

tar zxvf mpc-0.8.1.tar.gz

cd mpc-0.8.1

mkdir build

cd build

../configure --prefix=/usr/local/gcc/mpc-0.8.1 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2

make && make install

 
(4) Add the shared library paths. As root, edit the ld.so.conf file (/etc/ld.so.conf) and add the following lines:

/usr/local/gcc/gmp-4.3.2/lib

/usr/local/gcc/mpfr-2.4.2/lib

/usr/local/gcc/mpc-0.8.1/lib

Save and exit, then run ldconfig.

 
Re-running GCC's configure still reports the same error, so pass the paths of the three libraries explicitly:

../configure --prefix=/usr/local/gcc --enable-threads=posix --disable-checking --enable-languages=c,c++ --disable-multilib --with-gmp=/usr/local/gcc/gmp-4.3.2 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-mpc=/usr/local/gcc/mpc-0.8.1

Once configure passes, run make && make install (this takes a long time).
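The three libraries have to be built in order because each configure step points at the install prefix of the ones before it. The sketch below regenerates the configure lines for gmp, mpfr and mpc from a small table, using the versions and the /usr/local/gcc prefix from this post; the names CHAIN, prefix and configure_line are made up for the illustration.

```python
ROOT = "/usr/local/gcc"

# Library, version, and the earlier libraries its configure must be pointed at,
# listed in build order.
CHAIN = [
    ("gmp", "4.3.2", []),
    ("mpfr", "2.4.2", ["gmp"]),
    ("mpc", "0.8.1", ["mpfr", "gmp"]),
]


def prefix(name):
    """Install prefix of a library, e.g. /usr/local/gcc/gmp-4.3.2."""
    version = {n: v for n, v, _ in CHAIN}[name]
    return "%s/%s-%s" % (ROOT, name, version)


def configure_line(name, deps):
    """Build the ../configure command for one library in the chain."""
    flags = ["--prefix=" + prefix(name)]
    flags += ["--with-%s=%s" % (dep, prefix(dep)) for dep in deps]
    return "../configure " + " ".join(flags)


for name, _, deps in CHAIN:
    print(configure_line(name, deps))
```

GCC's own configure then takes all three prefixes through --with-gmp, --with-mpfr and --with-mpc, as in the command above.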

(5) Remove the old compiler and set up the new one:

yum remove gcc

yum remove gcc-c++

updatedb

cd /usr/bin   # where gcc and g++ live; check with: which g++

ln -s /usr/local/gcc/bin/gcc gcc

ln -s /usr/local/gcc/bin/g++ g++

Installing Clang

sudo yum install clang

Building MXNet from source:

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet

cp make/config.mk .

make -j4

sudo yum install python-numpy   # for redhat

cd python; sudo python setup.py install

Then verify in Python:

import mxnet as mx

make error:

/usr/local/include/c++/4.8.0/condition_variable:83:5: note: no known conversion for implicit ‘this’ parameter from ‘const std::condition_variable*’ to ‘std::condition_variable*’

According to https://github.com/dmlc/mxnet/issues/530 , this is caused by a GCC version that is too old; upgrade GCC as described above.

Error:

/usr/bin/ld: cannot find -lcblas

collect2: error: ld returned 1 exit status

make: *** [lib/libmxnet.so] Error 1

Make sure cblas and atlas are installed.

Related issue: https://github.com/dmlc/mxnet/issues/1442
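A quick way to see whether the system knows about these libraries at all is ctypes.util.find_library from the standard library. This is only a rough check under the assumption that the libraries are installed as shared objects: it looks at runtime lookup, not at the -L paths used at link time, so a None result is a hint rather than proof. The function name find_blas is made up here.

```python
from ctypes.util import find_library


def find_blas(names=("cblas", "satlas", "atlas")):
    """Map each library name to what the system resolves it to (None if not found)."""
    return {name: find_library(name) for name in names}


for name, path in find_blas().items():
    print(name, "->", path)
```

If cblas shows None while satlas is found, that matches the situation where MSHADOW_LDFLAGS has to be switched from -lcblas to -lsatlas.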

 
Error:

  checking whether the C compiler works... no 

  configure: error: in `/root/App/MXNet/mxnet/ps-lite/protobuf-2.5.0': 

  configure: error: C compiler cannot create executables 

  See `config.log' for more details 

  make[1]: *** [/root/App/MXNet/mxnet/deps/include/google/protobuf/message.h] Error 77 

  make[1]: Leaving directory `/root/App/MXNet/mxnet/ps-lite' 

  make: *** [PSLITE] Error 2 

Running MXNet on YARN:

http://mxnet.io/how_to/cloud.html

From the official documentation:

Use YARN, MPI, SGE

While ssh can be simple for cases when we do not have a cluster scheduling framework, MXNet is designed to be able to port to various platforms. We also provide other scripts in tracker to run on other cluster frameworks, including Hadoop (YARN) and SGE. Your contribution is more than welcomed to provide examples to run MXNet on your favourite distributed platform.

MXNet on YARN:

dmlc-submit --cluster <cluster-mode> [arguments] [command]

Still to be tested: dmlc-submit -h

--cluster string, {'mpi', 'yarn', 'local', 'sge'}, default to ${DMLC_SUBMIT_CLUSTER}

Job submission mode.

--num-workers integer, required

Number of workers in the job.

--num-servers integer, default=0

Number of servers in the job.

--worker-cores integer, default=1

Number of cores needed to be allocated for worker job.

--server-cores integer, default=1

Number of cores needed to be allocated for server job.

--worker-memory string, default='1g'

Memory needed for worker job.

--server-memory string, default='1g'

Memory needed for server job.

--jobname string, default=auto specify

Name of the job.

--queue string, default='default'

The submission queue we should submit the job to.

--log-level string, {INFO, DEBUG}

The logging level.

--log-file string, default='None'

Output log to the specific log file, the log is still printed on stderr.

From the tracker directory:

./dmlc-submit --cluster=yarn --num-workers=2 --worker-cores=1 --num-servers=2 ../../example/image-classification/train_mnist.py

source activate ml2

hdfs dfs -put train-* /tmp/mnist

hdfs dfs -chmod -R 777 /tmp/mnist

tools/launch.py -n 2 --launcher yarn python train_mnist.py --data-dir hdfs:///tmp/mnist/
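To make the defaults in the option list concrete, the sketch below models some of the options as a small dataclass and spells out every flag explicitly. It only assembles a command line and does not run anything; the class and method names are made up, and the --flag=value spelling follows the dmlc-submit example above.

```python
import shlex
from dataclasses import dataclass, fields


@dataclass
class SubmitOptions:
    """A subset of the dmlc-submit options listed above, with their stated defaults."""
    cluster: str
    num_workers: int
    num_servers: int = 0
    worker_cores: int = 1
    server_cores: int = 1
    worker_memory: str = "1g"
    server_memory: str = "1g"
    queue: str = "default"

    def command_line(self, command):
        """Return one shell-quoted command line with every option written out."""
        args = ["./dmlc-submit"]
        for field in fields(self):
            flag = "--" + field.name.replace("_", "-")
            args.append("%s=%s" % (flag, getattr(self, field.name)))
        return shlex.join(args + list(command))


options = SubmitOptions(cluster="yarn", num_workers=2, num_servers=2)
print(options.command_line(["../../example/image-classification/train_mnist.py"]))
```

Writing the defaults out makes it visible that the example above leaves --server-cores, both memory settings and --queue at their default values.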

MXNet on multiple CPUs:

http://mxnet.io/how_to/multi_devices.html

On a single machine you can run directly:
python train_mnist.py --network lenet

Prerequisite for multiple machines: MXNet is built and installed on every machine, and the machines can reach each other over ssh.

cd mxnet/example/image-classification

echo "192.168.177.77" >> hosts   # the current machine is 192.168.177.78

../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync

 
Note that:

use launch.py to submit the job

provide a launcher: ssh if all machines are ssh-able, mpi if mpirun is available, sge for Sun Grid Engine, and yarn for Apache Yarn
-n number of worker nodes to run
-H the host file, which is required by ssh and mpi
--kv-store use either dist_sync or dist_async
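The notes above amount to a few rules about which arguments go together. The sketch below encodes them in a helper that builds the launch.py command line and rejects combinations the notes rule out; it does not execute anything, and the function name and error messages are made up for the illustration.

```python
import shlex

LAUNCHERS = ("ssh", "mpi", "sge", "yarn")
KV_STORES = ("dist_sync", "dist_async")


def launch_command(launcher, num_workers, kv_store, train_cmd, hosts_file=None):
    """Build a launch.py command line following the notes above."""
    if launcher not in LAUNCHERS:
        raise ValueError("unknown launcher: %s" % launcher)
    if kv_store not in KV_STORES:
        raise ValueError("kv_store must be dist_sync or dist_async")
    if launcher in ("ssh", "mpi") and hosts_file is None:
        raise ValueError("-H is required by the ssh and mpi launchers")
    args = ["../../tools/launch.py", "-n", str(num_workers), "--launcher", launcher]
    if hosts_file is not None:
        args += ["-H", hosts_file]
    args += list(train_cmd) + ["--kv-store", kv_store]
    return shlex.join(args)


print(launch_command("ssh", 2, "dist_sync",
                     ["python", "train_mnist.py", "--network", "lenet"],
                     hosts_file="hosts"))
```

With these arguments the helper reproduces the two-machine command used in this post.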

 
Performance comparison:

Runs on the two machines 77 and 78:

Single machine:

python train_mnist.py --network lenet

19:26:27-20:44:57, about 78 minutes

Two machines:

../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync

18:43:51-19:22:53, about 39 minutes
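The speedup can be worked out directly from the two pairs of timestamps. This is a small sketch that assumes both runs start and end on the same day; the function name is made up for the illustration.

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%H:%M:%S"):
    """Seconds between two same-day wall-clock times."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


single = elapsed_seconds("19:26:27", "20:44:57")  # one machine
double = elapsed_seconds("18:43:51", "19:22:53")  # two machines
print(single, double, round(single / double, 2))
```

This gives 4710 s against 2342 s, a speedup of about 2.01x with two machines.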
