Installing MXNet on Linux


Installing MXNet:
When troubleshooting, check the GitHub issues first.

If you are running Python on Amazon Linux or Ubuntu, you can use Git Bash scripts to quickly install the MXNet libraries and all dependencies. If you are using other languages or operating systems, skip to Standard Installation.

Quick Installation on ubuntu:
git clone https://github.com/dmlc/mxnet.git ~/MXNet/mxnet --recursive
cd ~/MXNet/mxnet/setup-utils
bash install-mxnet-ubuntu.sh

Standard Installation
Minimum Requirements
You must have the following:
  • A C++ compiler that supports C++ 11. The C++ compiler compiles and builds the MXNet source code.
  • A BLAS (Basic Linear Algebra Subprograms) library. BLAS libraries contain routines that provide the standard building blocks for basic vector and matrix operations. You need a BLAS library to perform basic linear algebraic operations.
Build MXNet on Ubuntu/Debian
On Ubuntu versions 13.10 or later, you need the following dependencies:
  • Git (to pull code from GitHub)
  • libatlas-base-dev (for linear algebraic operations)
  • libopencv-dev (for computer vision operations)
Install these dependencies using the following commands:
sudo apt-get update
sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev
After you have downloaded and installed the dependencies, use the following commands to pull the MXNet source code from Git and build MXNet:
git clone --recursive https://github.com/dmlc/mxnet
cd mxnet
make -j $(nproc)
As the install command shows, the library packages to install are:
libatlas-base-dev libopencv-dev

On Ubuntu, if the commands above run into problems, install step by step:
sudo apt-get update
sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev
git clone --recursive https://github.com/dmlc/mxnet
cd mxnet
make -j4
sudo apt-get install python-numpy # for debian
sudo apt-get install python-setuptools # for debian
cd python; sudo python setup.py install
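Note: using sudo here is not recommended.
sudo installs the package into the root user's Python environment. The pitfall: if the current non-root user's python is the Anaconda one, running python as that user and doing import mxnet fails with "No module named mxnet". After switching to root (python path: /usr/bin/python), import mxnet works.
After installation, the Python lib directory contains the directory ./python2.7/site-packages/mxnet-0.7.0-py2.7.egg/mxnet.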

Test whether the installation succeeded:
import mxnet as mx
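
When the import fails even though the install appeared to succeed, it helps to see which interpreter is running and whether that interpreter can locate the package. The following is a minimal sketch, assuming a Python 3 interpreter; `describe_module` is a hypothetical helper written for this note and is not part of MXNet.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and whether it can find `name`.

    Hypothetical helper for illustration; not part of MXNet.
    """
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        # origin is the file the module would be loaded from, when known
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    info = describe_module("mxnet")
    print("interpreter:", info["interpreter"])
    print("mxnet found:", info["found"], "at", info["location"])
```

Running it once as the normal user and once as root shows whether the two users resolve to different Python installations.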

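MXNet depends on OpenCV, and installing OpenCV may require many other libraries.
OpenCV dependency problems during installation:
sudo apt-get install -y build-essential git libblas-dev libopencv-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
build-essential is already the newest version.
build-essential set to manually installed.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
libopencv-dev : Depends: libopencv-objdetect-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-highgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-legacy-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-contrib-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-videostab-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-superres-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libopencv-ocl-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libcv-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libhighgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
Depends: libcvaux-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
If you hit the error above, install OpenCV manually as described below.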

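Installing MXNet on CentOS
(This mainly covers CentOS 6.5; CentOS 7 is somewhat easier to install on.)
The official examples use apt-get, which works on Debian-family Linux but not on CentOS; install-mxnet-ubuntu.sh consists of apt-get commands.
Installing on CentOS is harder than on Ubuntu and documentation is scarce; see the issue https://github.com/dmlc/mxnet/issues/1303

Install the rz command (if it is not already installed): yum -y install lrzsz

Problem: the first installation attempt hits dependency problems. After running bash install-mxnet-ubuntu.sh: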
Setting up Install Process
No package build-essential available.
No package libatlas-base-dev available.
No package libopencv-dev available.
Error: Nothing to do

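Problem:
Ubuntu and CentOS use different package managers (apt-get vs. yum). CentOS 6.5 or later is recommended.
First try installing OpenCV with yum:
sudo yum install atlas-devel opencv
sudo yum install opencv-devel

You can also try the following procedure: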
yum update 
yum install -y git   # build-essential, libatlas-base-dev and libopencv-dev are Debian package names that yum does not provide
yum install -y opencv opencv-devel atlas-devel
yum install gcc gcc-c++
ldconfig /etc/ld.so.cache 
git clone --recursive  https://github.com/dmlc/mxnet  
cd mxnet 
./prepare_mkl.sh 
cp make/config.mk . 
vim config.mk 
+31 ADD_LDFLAGS = -L/usr/lib64/atlas 
vim mshadow/make/mshadow.mk 
-68 MSHADOW_LDFLAGS += -lcblas 
+68 MSHADOW_LDFLAGS += -lsatlas 
yum info glib2 
yum upgrade glib2 
make -j4 
[root@xdataimg2 mxnet]# ll lib
total 38920
-rw-r--r-- 1 root root 28637318 Nov 12 12:32 libmxnet.a
-rwxr-xr-x 1 root root 11214217 Nov 12 12:32 libmxnet.so
python get-pip.py 
pip install numpy -i  http://pypi.mirrors.ustc.edu.cn/simple  --trusted-host pypi.mirrors.ustc.edu.cn 
pip install scipy -i  http://pypi.mirrors.ustc.edu.cn/simple  --trusted-host pypi.mirrors.ustc.edu.cn 
cd mxnet/python 
python setup.py install

Installing OpenCV from source

First install the OpenCV dependencies:
yum install cmake gcc gcc-c++ gtk+-devel gimp-devel gimp-devel-tools gimp-help-browser zlib-devel libtiff-devel libjpeg-devel libpng-devel gstreamer-devel libavc1394-devel libraw1394-devel libdc1394-devel jasper-devel jasper-utils swig python libtool nasm
sudo yum install opencv-devel
sudo yum install atlas-devel
# or: sudo yum install atlas-devel opencv
yum install cmake
Download the required version from the OpenCV download page http://sourceforge.net/projects/opencvlibrary/files/ and unpack it.
cd OpenCV-2.4.10
cmake CMakeLists.txt
make && make install

make may fail with the following error:
Linking CXX executable ../../bin/opencv_perf_core
../../lib/libopencv_highgui.so.2.4.10: undefined reference to `png_set_longjmp_fn'
collect2: error: ld returned 1 exit status

G++ version:
Check the g++ version:
g++ --version   # or: g++ -v
gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
This g++ is clearly older than the required version and needs to be upgraded.
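
The version check can also be scripted. The sketch below parses a version line such as the one printed by g++ -v and compares it with a minimum. The 4.8 threshold is an assumption taken from the GCC 4.8.2 upgrade performed in this guide, and the function names are hypothetical.

```python
import re

# Assumed threshold: this guide upgrades to GCC 4.8.2, so 4.8 is used here.
MIN_GCC = (4, 8)


def parse_gcc_version(text):
    """Extract (major, minor, patch) from gcc/g++ version output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_new_enough(text, minimum=MIN_GCC):
    """True if the parsed version is at least `minimum` (tuple comparison)."""
    return parse_gcc_version(text)[: len(minimum)] >= minimum


if __name__ == "__main__":
    line = "gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)"
    print(parse_gcc_version(line), is_new_enough(line))
```

Comparing tuples rather than strings avoids the mistake of ranking "4.10" below "4.8".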

Upgrading GCC/G++ (the two are built together):
tar -jxvf gcc-4.8.2.tar.bz2
cd gcc-4.8.2
./contrib/download_prerequisites
mkdir build
cd build
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
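Error: configure: error: Building GCC requires GMP 4.2+, MPFR 2.4.0+ and MPC 0.8.0+.
Running the ./contrib/download_prerequisites script downloads the three dependencies automatically: gmp-4.3.2, mpfr-2.4.2 and mpc-0.8.1. They can also be downloaded from the addresses below and installed offline:
(1) Install gmp: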
wget ftp://ftp.gnu.org/gnu/gmp/gmp-4.3.2.tar.bz2
tar -jxf gmp-4.3.2.tar.bz2
cd gmp-4.3.2
mkdir build
cd build
../configure --prefix=/usr/local/gcc/gmp-4.3.2
make && make install

(2) Install mpfr:
wget http://www.mpfr.org/mpfr-2.4.2/mpfr-2.4.2.tar.bz2
tar -jxf mpfr-2.4.2.tar.bz2
cd mpfr-2.4.2
mkdir build
cd build
../configure --prefix=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2
make && make install

(3) Install mpc:
tar zxvf mpc-0.8.1.tar.gz
cd mpc-0.8.1
mkdir build
cd build
../configure --prefix=/usr/local/gcc/mpc-0.8.1 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2
make && make install
(4) Add the shared library paths: as root, edit the ld.so.conf file (/etc/ld.so.conf) and add the following lines:
/usr/local/gcc/gmp-4.3.2/lib
/usr/local/gcc/mpfr-2.4.2/lib
/usr/local/gcc/mpc-0.8.1/lib
Save and exit, then run the ldconfig command.
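
If this step is repeated across several machines, the edit can be made idempotent so that rerunning it does not duplicate entries. The following is a minimal sketch, not a tested deployment script: `add_library_paths` is a hypothetical helper, and the demo writes to a scratch file rather than the real /etc/ld.so.conf.

```python
import os
import tempfile


def add_library_paths(conf_path, lib_dirs):
    """Append each directory to conf_path unless it is already listed.

    Returns the directories that were actually added.
    """
    content = ""
    if os.path.exists(conf_path):
        with open(conf_path) as handle:
            content = handle.read()
    existing = {line.strip() for line in content.splitlines()}
    added = []
    for directory in lib_dirs:
        if directory not in existing:
            added.append(directory)
            existing.add(directory)
    if added:
        with open(conf_path, "a") as handle:
            # Make sure the new entries start on their own line.
            if content and not content.endswith("\n"):
                handle.write("\n")
            for directory in added:
                handle.write(directory + "\n")
    return added


if __name__ == "__main__":
    dirs = [
        "/usr/local/gcc/gmp-4.3.2/lib",
        "/usr/local/gcc/mpfr-2.4.2/lib",
        "/usr/local/gcc/mpc-0.8.1/lib",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        conf = os.path.join(tmp, "ld.so.conf")
        print(add_library_paths(conf, dirs))  # first run adds all entries
        print(add_library_paths(conf, dirs))  # second run adds nothing
```

ldconfig still has to be run afterwards for the new paths to take effect.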
Run GCC's configure again. If it still reports the error above, specify the paths of the three libraries by hand:
../configure --prefix=/usr/local/gcc --enable-threads=posix --disable-checking --enable-languages=c,c++ --disable-multilib --with-gmp=/usr/local/gcc/gmp-4.3.2 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-mpc=/usr/local/gcc/mpc-0.8.1
Once configure passes, run make && make install (this takes a long time).
(5) Remove the old compiler and set up the new one:
yum remove gcc
yum remove gcc-c++
updatedb
cd /usr/bin   # directory containing gcc and g++; check with: which g++
ln -s /usr/local/gcc/bin/gcc gcc
ln -s /usr/local/gcc/bin/g++ g++
Installing Clang:
sudo yum install clang

Installing MXNet from source:
git clone --recursive https://github.com/dmlc/mxnet
cd mxnet
cp make/config.mk .
make -j4

sudo yum install python-numpy # for redhat
cd python; sudo python setup.py install

import mxnet as mx

make error:
/usr/local/include/c++/4.8.0/condition_variable:83:5: note: no known conversion for implicit ‘this’ parameter from ‘const std::condition_variable*’ to ‘std::condition_variable*’
According to https://github.com/dmlc/mxnet/issues/530, this is caused by a GCC version that is too old; upgrade GCC as described above.

Error:
/usr/bin/ld: cannot find -lcblas
collect2: error: ld returned 1 exit status
make: *** [lib/libmxnet.so] Error 1
Make sure that cblas and atlas are installed.
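
A quick way to see whether the system can locate these libraries at all is `ctypes.util.find_library` from the Python standard library. This is only a rough check: it uses ldconfig and compiler lookups rather than the exact search that ld performs for -lcblas, so a library it finds may still be missing the development symlink the linker needs. `missing_libraries` is a hypothetical helper name.

```python
from ctypes.util import find_library


def missing_libraries(names):
    """Return the names for which no shared library could be located."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    # cblas and atlas are the two libraries named in the note above.
    print("not found:", missing_libraries(["cblas", "atlas"]))
```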

Error:
checking whether the C compiler works... no
configure: error: in `/root/App/MXNet/mxnet/ps-lite/protobuf-2.5.0':
configure: error: C compiler cannot create executables
See `config.log' for more details
make[1]: *** [/root/App/MXNet/mxnet/deps/include/google/protobuf/message.h] Error 77
make[1]: Leaving directory `/root/App/MXNet/mxnet/ps-lite'
make: *** [PSLITE] Error 2

Run MxNet on yarn:
Description from the official documentation:
Use YARN, MPI, SGE
While ssh can be simple for cases when we do not have a cluster scheduling framework. MXNet is designed to be able to port to various platforms. We also provide other scripts in tracker to run on other cluster frameworks, including Hadoop(YARN) and SGE. Your contribution is more than welcomed to provide examples to run MXNet on your favourite distributed platform.

mxnet on yarn:
dmlc-submit --cluster <cluster-mode> [arguments] [command]
Not yet tested: dmlc-submit -h
--cluster string, {'mpi', 'yarn', 'local', 'sge'}, default to ${DMLC_SUBMIT_CLUSTER}
Job submission mode.
--num-workers integer, required
Number of workers in the job.
--num-servers integer, default=0
Number of servers in the job.
--worker-cores integer, default=1
Number of cores needed to be allocated for worker job.
--server-cores integer, default=1
Number of cores needed to be allocated for server job.
--worker-memory string, default='1g'
Memory needed for worker job.
--server-memory string, default='1g'
Memory needed for server job.
--jobname string, default=auto specify
Name of the job.
--queue string, default='default'
The submission queue we should submit the job to.
--log-level string, {INFO, DEBUG}
The logging level.
--log-file string, default='None'
Output log to the specific log file, the log is still printed on stderr.

tracker]# ./dmlc-submit --cluster=yarn --num-workers=2 --worker-cores=1 --num-servers=2 ../../example/image-classification/train_mnist.py
source activate ml2
hdfs dfs -put train-* /tmp/mnist
hdfs dfs -chmod -R 777 /tmp/mnist
tools/launch.py -n 2 --launcher yarn python train_mnist.py --data-dir hdfs:///tmp/mnist/
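
For repeated submissions it can be convenient to assemble the dmlc-submit argument list in a script. The sketch below uses only options from the list above, in the --flag=value form of the example command. `build_submit_command` is a hypothetical helper, and it has not been run against a real YARN cluster.

```python
import shlex


def build_submit_command(cluster, num_workers, command, num_servers=0,
                         worker_cores=1, server_cores=1,
                         submit="./dmlc-submit"):
    """Assemble a dmlc-submit argument list from the documented options."""
    if num_workers < 1:
        raise ValueError("num_workers is required and must be at least 1")
    argv = [
        submit,
        "--cluster=%s" % cluster,
        "--num-workers=%d" % num_workers,
        "--num-servers=%d" % num_servers,
        "--worker-cores=%d" % worker_cores,
        "--server-cores=%d" % server_cores,
    ]
    # The job command (script plus its own arguments) goes last.
    return argv + list(command)


if __name__ == "__main__":
    argv = build_submit_command(
        "yarn", 2,
        ["../../example/image-classification/train_mnist.py"],
        num_servers=2,
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The resulting list can be passed to `subprocess.run` directly, which avoids shell quoting problems.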

mxnet on multiple cpus:
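On a single machine you can run directly: python train_mnist.py --network lenet
Prerequisites: MXNet is built and installed on every machine, and the machines can reach each other over ssh.
cd mxnet/example/image-classification
echo "192.168.177.77" >> hosts   # the current machine is 192.168.177.78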
../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync
Note that:
  • use launch.py to submit the job
  • provide the launcher: ssh if all machines are ssh-able, mpi if mpirun is available, sge for Sun Grid Engine, and yarn for Apache Yarn
  • -n: number of worker nodes to run
  • -H: the host file, which is required by ssh and mpi
  • --kv-store: use either dist_sync or dist_async
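
A stray comment or a typo in the hosts file is easy to miss, so it can be worth validating the entries before launching. The sketch below assumes the hosts file lists bare IP addresses, as in the example (hostnames would be rejected), and builds the launch.py argument list from the options described above. Both function names are hypothetical.

```python
import ipaddress


def render_hosts(addresses):
    """Validate each address and return hosts-file text, one per line."""
    lines = []
    for address in addresses:
        ipaddress.ip_address(address)  # raises ValueError on a malformed IP
        lines.append(address)
    return "\n".join(lines) + "\n"


def build_launch_command(num_workers, hosts_file, command,
                         kv_store="dist_sync", launcher="ssh",
                         launch="../../tools/launch.py"):
    """Assemble the launch.py argument list for a distributed run."""
    if kv_store not in ("dist_sync", "dist_async"):
        raise ValueError("kv_store must be dist_sync or dist_async")
    return ([launch, "-n", str(num_workers), "--launcher", launcher,
             "-H", hosts_file]
            + list(command) + ["--kv-store", kv_store])


if __name__ == "__main__":
    print(render_hosts(["192.168.177.77"]), end="")
    print(build_launch_command(
        2, "hosts", ["python", "train_mnist.py", "--network", "lenet"]))
```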
Performance comparison:
Running on the two machines 77 and 78:
Single machine:
python train_mnist.py --network lenet
19:26:27-20:44:57, 78 minutes
Two machines:
../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync
18:43:51-19:22:53, 39 minutes
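
The speedup follows directly from the two pairs of timestamps. The sketch below recomputes the elapsed times and their ratio, assuming each run started and finished on the same day; it comes out at roughly 2x for two machines.

```python
from datetime import datetime


def elapsed_minutes(start, end, fmt="%H:%M:%S"):
    """Minutes between two same-day wall-clock timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60.0


if __name__ == "__main__":
    single = elapsed_minutes("19:26:27", "20:44:57")
    double = elapsed_minutes("18:43:51", "19:22:53")
    print("single machine: %.1f min" % single)
    print("two machines:   %.1f min" % double)
    print("speedup:        %.2fx" % (single / double))
```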

