A more complete rewrite of this post: http://blog.csdn.net/s_sunnyy/article/details/56479077
1 git clone --recursive https://github.com/dmlc/mxnet
cd mxnet
make -j4
-> make fails: opencv and blas cannot be found
2 Install OpenCV
git clone https://github.com/opencv/opencv
cd opencv
mkdir -p build
cd build
cmake -DBUILD_opencv_gpu=OFF -D WITH_EIGEN=ON -D WITH_TBB=ON -D WITH_CUDA=OFF -DWITH_1394=OFF -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make PREFIX=/usr/local install
export PKG_CONFIG_PATH=/home/pub/mxnetlib/lib/pkgconfig:$PKG_CONFIG_PATH
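PKG_CONFIG_PATH behaves like any colon-separated search path: directories listed first are searched first, so prepending makes pkg-config prefer the freshly installed .pc files. A tiny Python sketch of that prepend semantics (the `prepend_path` helper is mine; the directory is the one from the export line above):

```python
import os

def prepend_path(var_value, new_dir, sep=os.pathsep):
    """Prepend new_dir to a separator-joined search path, as the
    `export PKG_CONFIG_PATH=...:$PKG_CONFIG_PATH` line does."""
    return new_dir if not var_value else new_dir + sep + var_value

old = os.environ.get("PKG_CONFIG_PATH", "")
new = prepend_path(old, "/home/pub/mxnetlib/lib/pkgconfig")
print(new.split(os.pathsep)[0])  # the new entry is searched first
```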
Problem 1: CMake is not installed
Install CMake:
Get the CMake source package from https://cmake.org/download/
Extract it: tar -xvzf cmake-3.7.1.tar.gz
cd cmake-3.7.1
./bootstrap
gmake
gmake install
Verify the installation: cmake --version
Problem 2: the build hangs while downloading ippicv
Download ippicv_linux_20151201.tgz yourself (link: http://www.linuxfromscratch.org/blfs/view/7.9/general/opencv.html), then copy the downloaded ippicv archive into this directory under the OpenCV source tree: opencv/3rdparty/ippicv/downloads/linux-808b791a6eac9ed78d32a7666804320e/
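In OpenCV 3.x that long hex directory name is the MD5 checksum of the ippicv archive, so you can verify that the manually downloaded file matches before copying it in. A minimal stdlib-only sketch; the helper name `file_md5` is mine, and treat the directory-naming convention as an assumption if your OpenCV version differs:

```python
import hashlib

def file_md5(path, chunk=1 << 20):
    """MD5 of a file, read in chunks so large archives don't fill RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

# expected = "808b791a6eac9ed78d32a7666804320e"  # the directory name above
# print(file_md5("ippicv_linux_20151201.tgz") == expected)
```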
Try make -j again
Still fails, this time with the blas problem
3 blas
Edit config.mk:
# the additional link flags you want to add
ADD_LDFLAGS = -L/home/pub/caffelib/lib
# the additional compile flags you want to add
ADD_CFLAGS = -I/home/pub/caffelib/include
Compile again: make -j4
Still fails
4 Edit config.mk
export CC = gcc
export CXX = g++
export NVCC = nvcc
# whether compile with debug
DEBUG = 0
# the additional link flags you want to add
ADD_LDFLAGS = -L/home/pub/caffelib/lib
#ADD_LDFLAGS =
# the additional compile flags you want to add
ADD_CFLAGS = -I/home/pub/caffelib/include
#ADD_CFLAGS =
#---------------------------------------------
# matrix computation libraries for CPU/GPU
#---------------------------------------------
# whether use CUDA during compile
USE_CUDA = 1
#USE_CUDA = 0
# add the path to CUDA library to link and compile flag
# if you have already add them to environment variable, leave it as NONE
USE_CUDA_PATH = /usr/local/cuda
#USE_CUDA_PATH = NONE
# whether use CuDNN R3 library
USE_CUDNN = 0
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50
# whether use cuda runtime compiling for writing kernels in native language (i.e. Python)
USE_NVRTC = 0
# whether use opencv during compilation
# you can disable it, however, you will not able to use
# imbin iterator
USE_OPENCV = 1
# use openmp for parallelization
USE_OPENMP = 1
#USE_OPENMP = 0
# MKL ML Library for Intel CPU/Xeon Phi
# Please refer to MKL_README.md for details
# MKL ML Library folder, need to be root for /usr/local
# Change to User Home directory for standard user
# For USE_BLAS!=mkl only
MKLML_ROOT=/usr/local
#MKLML_ROOT=/home/pub/caffelib
# whether use MKL2017 library
USE_MKL2017 = 0
# whether use MKL2017 experimental feature for high performance
# Prerequisite USE_MKL2017=1
USE_MKL2017_EXPERIMENTAL = 0
# whether use NNPACK library
USE_NNPACK = 0
USE_NNPACK_NUM_THREADS = 4
# choose the version of blas you want to use
# can be: mkl, blas, atlas, openblas
# in default use atlas for linux while apple for osx
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
USE_BLAS = apple
else
USE_BLAS = openblas
#USE_BLAS = atlas
endif
# add path to intel library, you may need it for MKL, if you did not add the path
# to environment variable
USE_INTEL_PATH = NONE
#USE_INTEL_PATH = /opt/intel
# If use MKL, choose static link automatically to allow python wrapper
ifeq ($(USE_BLAS), mkl)
USE_STATIC_MKL = 1
else
USE_STATIC_MKL = NONE
endif
#----------------------------
# Settings for power and arm arch
#----------------------------
ARCH := $(shell uname -a)
ifneq (,$(filter $(ARCH), armv6l armv7l powerpc64le ppc64le aarch64))
USE_SSE=0
else
USE_SSE=1
endif
#----------------------------
# distributed computing
#----------------------------
# whether or not to enable multi-machine supporting
USE_DIST_KVSTORE = 0
# whether or not allow to read and write HDFS directly. If yes, then hadoop is
# required
USE_HDFS = 0
# path to libjvm.so. required if USE_HDFS=1
LIBJVM=$(JAVA_HOME)/jre/lib/amd64/server
# whether or not allow to read and write AWS S3 directly. If yes, then
# libcurl4-openssl-dev is required, it can be installed on Ubuntu by
# sudo apt-get install -y libcurl4-openssl-dev
USE_S3 = 0
#----------------------------
# additional operators
#----------------------------
# path to folders containing projects specific operators that you don't want to put in src/operators
EXTRA_OPERATORS =
#----------------------------
# plugins
#----------------------------
# whether to use caffe integration. This requires installing caffe.
# You also need to add CAFFE_PATH/build/lib to your LD_LIBRARY_PATH
# CAFFE_PATH = $(HOME)/caffe
# MXNET_PLUGINS += plugin/caffe/caffe.mk
# whether to use torch integration. This requires installing torch.
# You also need to add TORCH_PATH/install/lib to your LD_LIBRARY_PATH
# TORCH_PATH = $(HOME)/torch
# MXNET_PLUGINS += plugin/torch/torch.mk
# WARPCTC_PATH = $(HOME)/warp-ctc
# MXNET_PLUGINS += plugin/warpctc/warpctc.mk
# whether to use sframe integration. This requires build sframe
# git@github.com:dato-code/SFrame.git
# SFRAME_PATH = $(HOME)/SFrame
# MXNET_PLUGINS += plugin/sframe/plugin.mk
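The BLAS selection in config.mk picks apple's BLAS on macOS and openblas everywhere else. A quick Python mirror of that logic (just illustrative; the function name is mine) lets you sanity-check what the Makefile will choose on your machine:

```python
import platform

def default_blas(system=None):
    """Mirror config.mk's choice: 'apple' BLAS on Darwin, openblas elsewhere."""
    system = system or platform.system()
    return "apple" if system == "Darwin" else "openblas"

print(default_blas())
```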
Switch to the ordinary user hx and edit ~/.bashrc,
adding the cuda and mxnetlib paths.
make -j4 now completes without errors
5 cd mxnet/python
python setup.py install
6 cd mxnet/example/image-classification
python train_mnist.py
numpy is missing
Installing numpy fails at first; after installing setuptools, pip, and python-devel-2.7.5-16.el7.x86_64, it installs successfully
cd setuptools-12.0.3/
# python setup.py install
cd pip-1.4/
python setup.py build
python setup.py install
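Before rerunning train_mnist.py you can check which of its dependencies are resolvable without actually importing them. A small stdlib-only sketch (the `missing` helper is mine; numpy was the module missing here):

```python
import importlib.util

def missing(*modules):
    """Return the subset of module names that cannot be found on sys.path."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

# Anything still listed here needs installing before train_mnist.py will run.
print(missing("numpy", "mxnet"))
```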
7 Run train_mnist.py on the GPU
python train_mnist.py --gpus 0
INFO:root:start with arguments Namespace(batch_size=64, disp_batches=100, gpus='0', kv_store='device', load_epoch=None, lr=0.1, lr_factor=0.1, lr_step_epochs='10', model_prefix=None, mom=0.9, network='mlp', num_classes=10, num_epochs=20, num_examples=60000, num_layers=None, optimizer='sgd', test_io=0, top_k=0, wd=0.0001)
INFO:root:Start training with [gpu(0)]
INFO:root:Epoch[0] Batch [100] Speed: 54859.33 samples/sec Train-accuracy=0.101875
INFO:root:Epoch[0] Batch [200] Speed: 58738.74 samples/sec Train-accuracy=0.103750
INFO:root:Epoch[0] Batch [300] Speed: 55169.61 samples/sec Train-accuracy=0.095156
INFO:root:Epoch[0] Batch [400] Speed: 55889.36 samples/sec Train-accuracy=0.103281
INFO:root:Epoch[0] Batch [500] Speed: 55987.16 samples/sec Train-accuracy=0.096250
INFO:root:Epoch[0] Batch [600] Speed: 55792.36 samples/sec Train-accuracy=0.100312
INFO:root:Epoch[0] Batch [700] Speed: 55821.60 samples/sec Train-accuracy=0.098437
INFO:root:Epoch[0] Batch [800] Speed: 55965.09 samples/sec Train-accuracy=0.097031
INFO:root:Epoch[0] Batch [900] Speed: 55960.66 samples/sec Train-accuracy=0.091406
INFO:root:Epoch[0] Resetting Data Iterator
INFO:root:Epoch[0] Time cost=1.473
INFO:root:Epoch[0] Validation-accuracy=0.098029
INFO:root:Epoch[1] Batch [100] Speed: 55439.99 samples/sec Train-accuracy=0.101875
INFO:root:Epoch[1] Batch [200] Speed: 56097.96 samples/sec Train-accuracy=0.103750
INFO:root:Epoch[1] Batch [300] Speed: 55872.72 samples/sec Train-accuracy=0.095156
INFO:root:Epoch[1] Batch [400] Speed: 55794.91 samples/sec Train-accuracy=0.103281
INFO:root:Epoch[1] Batch [500] Speed: 55954.83 samples/sec Train-accuracy=0.096250
INFO:root:Epoch[1] Batch [600] Speed: 56369.37 samples/sec Train-accuracy=0.100312
INFO:root:Epoch[1] Batch [700] Speed: 56096.08 samples/sec Train-accuracy=0.098437
INFO:root:Epoch[1] Batch [800] Speed: 56049.93 samples/sec Train-accuracy=0.097031
INFO:root:Epoch[1] Batch [900] Speed: 55324.47 samples/sec Train-accuracy=0.091406
INFO:root:Epoch[1] Resetting Data Iterator
INFO:root:Epoch[1] Time cost=1.077
INFO:root:Epoch[1] Validation-accuracy=0.098029
INFO:root:Epoch[2] Batch [100] Speed: 56719.54 samples/sec Train-accuracy=0.101875
INFO:root:Epoch[2] Batch [200] Speed: 55992.99 samples/sec Train-accuracy=0.103750
INFO:root:Epoch[2] Batch [300] Speed: 56049.35 samples/sec Train-accuracy=0.095156
INFO:root:Epoch[2] Batch [400] Speed: 55940.60 samples/sec Train-accuracy=0.103281
.....
Success
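The per-batch throughput numbers in a log like the one above can be pulled out with a short regex, e.g. to average samples/sec per epoch. A sketch assuming MXNet's `Epoch[N] Batch [...] Speed: X samples/sec` line format (the function name is mine):

```python
import re
from collections import defaultdict

SPEED = re.compile(r"Epoch\[(\d+)\].*?Speed:\s*([\d.]+)\s*samples/sec")

def epoch_speeds(lines):
    """Collect samples/sec readings per epoch and return the per-epoch mean."""
    speeds = defaultdict(list)
    for line in lines:
        m = SPEED.search(line)
        if m:
            speeds[int(m.group(1))].append(float(m.group(2)))
    return {epoch: sum(v) / len(v) for epoch, v in speeds.items()}

log = [
    "INFO:root:Epoch[0] Batch [100] Speed: 54859.33 samples/sec Train-accuracy=0.101875",
    "INFO:root:Epoch[0] Batch [200] Speed: 58738.74 samples/sec Train-accuracy=0.103750",
]
print(epoch_speeds(log))  # mean speed for epoch 0
```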