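Our project recently added machine-learning features that use Keras, XGBoost, and SVM together. The project runs on a very old OS, CentOS 6.3, and there is no plan to upgrade it any time soon, so I had to work everything out on this OS, which took several days.
The notes below record the installation steps and the problems I ran into. They apply to CentOS release 6.3 x86_64.
I started with Keras on the Theano (0.9.0) backend and tested on AWS EC2 m3.xlarge/m3.2xlarge instances running CentOS 6.3/7.1/7.2/7.3. Theano showed a severe memory leak: with the same model and training set, memory filled up and the process died before finishing a single epoch, whether the machine had 16 GB or 32 GB of RAM. Theano is easy to install and has few dependencies, but because of the leak I had to switch to the TensorFlow backend, which showed no memory leak.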
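Switching Keras from Theano to TensorFlow is a configuration change rather than a code change. As a minimal sketch, assuming Keras reads its backend from the "backend" field of ~/.keras/keras.json (the KERAS_BACKEND environment variable is the other documented route), a helper like the one below rewrites that field and leaves the other settings alone. The name set_keras_backend is hypothetical and not part of Keras.

```python
import json
import os


def set_keras_backend(config_path, backend="tensorflow"):
    """Set the "backend" field of a Keras JSON config, keeping other keys.

    Hypothetical helper for illustration; not part of Keras itself.
    """
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as fh:
            config = json.load(fh)
    config["backend"] = backend
    # Create the containing directory (e.g. ~/.keras) if it does not exist yet.
    parent = os.path.dirname(config_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(config_path, "w") as fh:
        json.dump(config, fh, indent=4)
    return config
```

It would be called as set_keras_backend(os.path.expanduser("~/.keras/keras.json")) before the first import keras in a process, since the backend is chosen at import time.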
(1) Upgrade g++/gcc to 4.8.2
wget http://people.centos.org/tru/devtools-2/devtools-2.repo -O /etc/yum.repos.d/devtools-2.repo
yum install devtoolset-2-gcc devtoolset-2-gcc-c++ devtoolset-2-binutils
/opt/rh/devtoolset-2/root/usr/bin/g++ --version
ln -s /opt/rh/devtoolset-2/root/usr/bin/* /usr/local/bin/
g++ --version
Note:
xgboost needs C++11 features, so the compiler has to be upgraded.
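To confirm that the g++ found on PATH is the upgraded one, its version banner can be checked from a script. The sketch below parses the first version number in the output of g++ --version and compares it with 4.8.1, which is assumed here as the threshold because it is the first GCC release with complete C++11 language support. The function names are illustrative.

```python
import re


def parse_gcc_version(version_output):
    """Extract (major, minor, patch) from the text printed by `g++ --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    return tuple(int(part) for part in match.groups())


def supports_cxx11(version, minimum=(4, 8, 1)):
    """True if the version tuple is at least the assumed C++11 threshold."""
    return version >= minimum


# Banner of the form printed by the devtoolset-2 compiler in these notes.
banner = "g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)"
print(parse_gcc_version(banner), supports_cxx11(parse_gcc_version(banner)))
```

On a live system the banner would come from subprocess.check_output(["g++", "--version"]) instead of a hard-coded string.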
(2) Install python 2.7.5
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel
wget https://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
tar xf Python-2.7.5.tgz
cd Python-2.7.5
./configure --prefix=/usr/local/ --enable-shared --enable-unicode=ucs4
make && make altinstall
ls -ltr /usr/bin/python*
ls -ltr /usr/local/bin/python*
vim /etc/ld.so.conf    # add a line containing /usr/local/lib
/sbin/ldconfig
/sbin/ldconfig -v
Notes:
--enable-unicode=ucs4 is a build option that tensorflow requires;
make altinstall leaves the old system python untouched;
the latest Python 2.7.13 can be installed the same way.
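Whether the freshly built interpreter really is a UCS4 build can be checked from Python itself. A minimal sketch: on Python 2, sys.maxunicode is 0x10FFFF for a wide (UCS4) build and 0xFFFF for a narrow (UCS2) build. The helper name unicode_build is illustrative.

```python
import sys


def unicode_build(maxunicode=None):
    """Return "ucs4" or "ucs2" for a given sys.maxunicode value."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow builds cap code points at 0xFFFF; wide builds go up to 0x10FFFF.
    return "ucs4" if maxunicode > 0xFFFF else "ucs2"


print(unicode_build())
```

Running it with /usr/local/bin/python2.7 should print ucs4 if the --enable-unicode=ucs4 option took effect.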
(3) upgrade glibc libc.so.6 from libc-2.12.so to libc-2.17.so
tensorflow needs libc-2.17.so.
Before the upgrade: strings /lib64/libc-2.12.so | grep GLIBC
GLIBC_2.2.5
GLIBC_2.2.6
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.4
GLIBC_2.5
GLIBC_2.6
GLIBC_2.7
GLIBC_2.8
GLIBC_2.9
GLIBC_2.10
GLIBC_2.11
GLIBC_2.12
GLIBC_PRIVATE
Save the following as the shell script glibc-2.17_centos6.sh and run it:
##################
#! /bin/sh
# update glibc to 2.17 for CentOS 6
wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-i386/glibc-2.17-55.fc20/glibc-2.17-55.el6.i686.rpm
wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-2.17-55.el6.x86_64.rpm
wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-common-2.17-55.el6.x86_64.rpm
wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-devel-2.17-55.el6.x86_64.rpm
wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-headers-2.17-55.el6.x86_64.rpm
sudo rpm -Uvh glibc-2.17-55.el6.x86_64.rpm \
glibc-2.17-55.el6.i686.rpm \
glibc-common-2.17-55.el6.x86_64.rpm \
glibc-devel-2.17-55.el6.x86_64.rpm \
glibc-headers-2.17-55.el6.x86_64.rpm
##################
After the upgrade: strings /lib64/libc.so.6 | grep GLIBC_
GLIBC_2.2.5
GLIBC_2.2.6
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.4
GLIBC_2.5
GLIBC_2.6
GLIBC_2.7
GLIBC_2.8
GLIBC_2.9
GLIBC_2.10
GLIBC_2.11
GLIBC_2.12
GLIBC_2.13
GLIBC_2.14
GLIBC_2.15
GLIBC_2.16
GLIBC_2.17
GLIBC_PRIVATE
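Comparing the two listings by eye works, but the same check can be scripted. The sketch below takes the text printed by strings ... | grep GLIBC, collects the numeric symbol versions, and tests whether a required one (2.17 here) is present. The function names are illustrative, and the sample inputs are shortened versions of the listings above.

```python
import re


def symbol_versions(strings_output, prefix="GLIBC_"):
    """Collect version tuples such as (2, 17) from lines like "GLIBC_2.17"."""
    pattern = re.compile(r"^" + re.escape(prefix) + r"(\d+(?:\.\d+)*)$")
    versions = set()
    for line in strings_output.splitlines():
        match = pattern.match(line.strip())
        if match:  # non-numeric entries such as GLIBC_PRIVATE are skipped
            versions.add(tuple(int(p) for p in match.group(1).split(".")))
    return versions


def provides(strings_output, required, prefix="GLIBC_"):
    """True if the required version tuple appears in the listing."""
    return required in symbol_versions(strings_output, prefix)


before = "GLIBC_2.2.5\nGLIBC_2.12\nGLIBC_PRIVATE\n"
after = "GLIBC_2.2.5\nGLIBC_2.12\nGLIBC_2.17\nGLIBC_PRIVATE\n"
print(provides(before, (2, 17)), provides(after, (2, 17)))
```

With prefix="GLIBCXX_" and required=(3, 4, 19), the same functions apply to the libstdc++ listing in step (4).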
(4) upgrade libstdc++.so.6 from libstdc++.so.6.0.13 to libstdc++.so.6.0.19
tensorflow needs libstdc++.so.6.0.19, which can be copied from a CentOS 7 system or found via Google.
Before the upgrade: strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_FORCE_NEW
GLIBCXX_DEBUG_MESSAGE_LENGTH
rm -rf /usr/lib64/libstdc++.so.6
cp ./libstdc++.so.6.0.19 /usr/lib64/
ln -s /usr/lib64/libstdc++.so.6.0.19 /usr/lib64/libstdc++.so.6
After the upgrade: strings /usr/lib64/libstdc++.so.6 | grep GLIBC
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBC_2.3
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.4
GLIBC_2.3.2
GLIBCXX_DEBUG_MESSAGE_LENGTH
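The rm / cp / ln sequence in step (4) leaves a short window in which /usr/lib64/libstdc++.so.6 does not exist, and any C++ program started in that window fails to load. One possible alternative, sketched below and not what these notes actually ran, is to create the new symlink under a temporary name and rename it over the old one, which POSIX performs atomically within one filesystem. The helper name replace_symlink is hypothetical.

```python
import os


def replace_symlink(target, link_path):
    """Point link_path at target without link_path ever being absent."""
    tmp_path = link_path + ".tmp"
    # Clear a leftover temporary link from an earlier interrupted run.
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target, tmp_path)
    # rename() replaces the old link in one step on POSIX filesystems.
    os.rename(tmp_path, link_path)
```

After copying libstdc++.so.6.0.19 into /usr/lib64/, the call would be replace_symlink("/usr/lib64/libstdc++.so.6.0.19", "/usr/lib64/libstdc++.so.6"), run as root.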
(5) Install pip
wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
/usr/local/bin/python2.7 ez_setup.py
/usr/local/bin/easy_install-2.7 pip
(6) Install keras, tensorflow, xgboost
/usr/local/bin/pip2.7 install keras==2.0.1
/usr/local/bin/pip2.7 install xgboost
/usr/local/bin/pip2.7 install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp27-none-linux_x86_64.whl
Note:
Use tensorflow 1.0.1; versions 0.8, 0.9, and 1.1 all failed in my tests.
(7) Install libsvm 3.22
wget -O libsvm-3.22.tar.gz 'http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz'
tar xf libsvm-3.22.tar.gz
cd libsvm-3.22
make
cd python
make
vim svm.py    # required edit: libsvm = CDLL(path.join(dirname, '/usr/local/lib/libsvm.so.2'))
cp *.py /usr/local/lib/python2.7/site-packages/
cp ../libsvm.so.2 /usr/local/lib/
cd /usr/local/lib/
ln -s libsvm.so.2 libsvm.so
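The svm.py edit in step (7) works because joining a directory with an absolute path discards the directory, so the CDLL call ends up loading /usr/local/lib/libsvm.so.2 no matter where svm.py itself is installed. The sketch below shows that behaviour with posixpath.join and wraps the load in a helper that returns None instead of raising. The names resolve_lib and try_load are illustrative.

```python
import ctypes
import posixpath


def resolve_lib(dirname, lib):
    """Mirror the path.join(dirname, ...) expression used in svm.py."""
    return posixpath.join(dirname, lib)


def try_load(path):
    """Load a shared library, returning None if it cannot be loaded."""
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


site = "/usr/local/lib/python2.7/site-packages"
# A relative name stays under the directory that holds svm.py ...
print(resolve_lib(site, "../libsvm.so.2"))
# ... while an absolute name replaces it entirely.
print(resolve_lib(site, "/usr/local/lib/libsvm.so.2"))
```

Calling try_load("/usr/local/lib/libsvm.so.2") after the copy in step (7) is a quick way to confirm the library can be loaded before importing svm.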
(8) Run python2.7 to test import
pip list shows:
funcsigs (1.0.2)
Keras (2.0.1)
mock (2.0.0)
numpy (1.12.1)
pbr (3.0.0)
pip (9.0.1)
protobuf (3.3.0)
PyYAML (3.12)
scikit-learn (0.18.1)
scipy (0.19.0)
setuptools (33.1.1)
six (1.10.0)
tensorflow (1.0.1)
Theano (0.9.0)
wheel (0.29.0)
xgboost (0.6a2)
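Because only specific versions worked together, it can help to compare a pip list dump against the known-good pins. The sketch below assumes the "name (version)" line format shown above (newer pip releases print columns by default) and uses illustrative function names.

```python
import re


def parse_pip_list(text):
    """Parse `pip list` lines of the form "name (version)" into a dict."""
    packages = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+) \(([^)]+)\)$", line.strip())
        if match:
            packages[match.group(1).lower()] = match.group(2)
    return packages


def mismatches(installed, pins):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    return {name: (want, installed.get(name))
            for name, want in pins.items()
            if installed.get(name) != want}


listing = "Keras (2.0.1)\ntensorflow (1.0.1)\nxgboost (0.6a2)\n"
pins = {"keras": "2.0.1", "tensorflow": "1.0.1"}
print(mismatches(parse_pip_list(listing), pins))
```

An empty result means every pinned package is installed at the expected version.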
Test in python2.7:
Python 2.7.5 (default, May 11 2017, 09:45:52)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xgboost
import svm
import svmutil
import keras
/usr/local/lib/python2.7/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
>>> >>> >>>
Using TensorFlow backend.
My algorithm runs as well.
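The interactive import test in step (8) can also be run as a script, which is handy when repeating the installation on another machine. A minimal sketch, with the illustrative name check_imports: it tries each module in turn and records the error text for any that fail, so one broken library does not hide the state of the others.

```python
import importlib


def check_imports(names):
    """Try to import each module; map name -> None on success or error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or loader errors from native libs
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results
```

For this setup the call would be check_imports(["xgboost", "svm", "svmutil", "keras"]), run with /usr/local/bin/python2.7.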