Setting up a Keras + TensorFlow + XGBoost + LIBSVM machine learning environment on CentOS 6.3

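A recent project of mine needed machine learning features and used Keras, XGBoost, and SVM at the same time. The project runs on a very old OS, CentOS 6.3, and there is no plan to upgrade it in the near term, so I had to work everything out on this OS. It took several days.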

The notes below record the installation steps and the problems I ran into. They apply to CentOS release 6.3 x86_64.

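I started with Keras on the Theano (0.9.0) backend and tested on AWS EC2 m3.xlarge/m3.2xlarge instances running CentOS 6.3/7.1/7.2/7.3. Theano showed a severe memory leak: with the same model and training set, whether the machine had 16 GB or 32 GB of RAM, memory filled up and the process crashed before a single epoch finished. Theano is easy to install and has few dependencies, but because of the leak I had to switch to the TensorFlow backend, which showed no such leak.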
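Switching backends does not require reinstalling Keras. The following is a minimal sketch, assuming the usual Keras 2.x behavior of reading the `backend` key from `~/.keras/keras.json`. The helper name `set_keras_backend` and its `config_path` parameter are hypothetical and not part of Keras.

```python
import json
import os


def set_keras_backend(config_path, backend="tensorflow"):
    """Write `backend` into a keras.json file, keeping any other keys.

    `config_path` is supplied by the caller; Keras normally looks at
    ~/.keras/keras.json (an assumption about the default location).
    """
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config["backend"] = backend

    # Create the parent directory on first use.
    parent = os.path.dirname(config_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)

    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config
```

Calling `set_keras_backend(os.path.expanduser("~/.keras/keras.json"))` before the first `import keras` should be enough. For a single run, setting the `KERAS_BACKEND=tensorflow` environment variable is an alternative that leaves the file untouched.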

(1) Upgrade g++/gcc to 4.8.2
        wget http://people.centos.org/tru/devtools-2/devtools-2.repo -O /etc/yum.repos.d/devtools-2.repo
        yum install devtoolset-2-gcc devtoolset-2-gcc-c++ devtoolset-2-binutils
        /opt/rh/devtoolset-2/root/usr/bin/g++ --version
        ln -s /opt/rh/devtoolset-2/root/usr/bin/* /usr/local/bin/
        g++ --version
        
        Note:
        XGBoost relies on C++11 features, so the compiler has to be upgraded.
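To confirm that the `g++` found on the PATH is the new one, the version banner can be parsed and compared. This is a sketch only: the function names are hypothetical, and the 4.8.1 threshold is an assumption taken from GCC's own statement that 4.8.1 was its first feature-complete C++11 release, not a requirement quoted from XGBoost.

```python
import re
import subprocess


def parse_gcc_version(version_output):
    """Extract (major, minor, patch) from the first line of `g++ --version`."""
    lines = version_output.splitlines()
    if not lines:
        raise ValueError("empty version output")
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", lines[0])
    if match is None:
        raise ValueError("no version number found in: %r" % lines[0])
    return tuple(int(part) for part in match.groups())


def supports_cxx11(version, minimum=(4, 8, 1)):
    """True if `version` is at least `minimum` (assumed C++11-complete GCC)."""
    return tuple(version) >= tuple(minimum)


def installed_gcc_version(compiler="g++"):
    """Run the compiler and parse its banner; needs the compiler on PATH."""
    output = subprocess.check_output([compiler, "--version"]).decode()
    return parse_gcc_version(output)
```

For example, `supports_cxx11(installed_gcc_version())` should be true once the devtoolset-2 symlinks in `/usr/local/bin` take precedence over the system compiler.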

(2) Install python 2.7.5
        yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel
        wget https://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
        tar xf Python-2.7.5.tgz
        cd Python-2.7.5
        ./configure --prefix=/usr/local/ --enable-shared --enable-unicode=ucs4
        make && make altinstall
        ls -ltr /usr/bin/python*
        ls -ltr /usr/local/bin/python*

        echo "/usr/local/lib" >> /etc/ld.so.conf    # or add this line with vim
        /sbin/ldconfig
        /sbin/ldconfig -v

    Note:
    --enable-unicode=ucs4 is a build option that TensorFlow requires;
    make altinstall leaves the old system Python untouched;
    the latest Python 2.7.13 can be installed instead.
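Since the UCS4 option matters for TensorFlow, it is worth confirming that the freshly built interpreter really is a wide build. A minimal sketch, run with `/usr/local/bin/python2.7`; the function name `unicode_build` is my own. On Python 3.3 and later the result is always `ucs4`, so the check is only informative on Python 2.

```python
from __future__ import print_function

import sys


def unicode_build():
    """Return 'ucs4' for a wide Unicode build and 'ucs2' for a narrow one."""
    # Narrow builds cap sys.maxunicode at 0xFFFF; wide builds report 0x10FFFF.
    return "ucs4" if sys.maxunicode > 0xFFFF else "ucs2"


print(unicode_build())
```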

(3) Upgrade glibc (libc.so.6) from libc-2.12.so to libc-2.17.so
    TensorFlow requires libc-2.17.so.
    Before the upgrade, strings /lib64/libc-2.12.so | grep GLIBC shows:
        GLIBC_2.2.5
        GLIBC_2.2.6
        GLIBC_2.3
        GLIBC_2.3.2
        GLIBC_2.3.3
        GLIBC_2.3.4
        GLIBC_2.4
        GLIBC_2.5
        GLIBC_2.6
        GLIBC_2.7
        GLIBC_2.8
        GLIBC_2.9
        GLIBC_2.10
        GLIBC_2.11
        GLIBC_2.12
        GLIBC_PRIVATE

    Save the following as the shell script glibc-2.17_centos6.sh and run it:
        ##################
        #! /bin/sh

        # update glibc to 2.17 for CentOS 6

        wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-i386/glibc-2.17-55.fc20/glibc-2.17-55.el6.i686.rpm
        wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-2.17-55.el6.x86_64.rpm
        wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-common-2.17-55.el6.x86_64.rpm
        wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-devel-2.17-55.el6.x86_64.rpm
        wget --no-proxy http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-headers-2.17-55.el6.x86_64.rpm

        sudo rpm -Uvh glibc-2.17-55.el6.x86_64.rpm \
        glibc-2.17-55.el6.i686.rpm \
        glibc-common-2.17-55.el6.x86_64.rpm \
        glibc-devel-2.17-55.el6.x86_64.rpm \
        glibc-headers-2.17-55.el6.x86_64.rpm
        ##################


    After the upgrade, strings /lib64/libc.so.6 | grep GLIBC_ shows:
            GLIBC_2.2.5
            GLIBC_2.2.6
            GLIBC_2.3
            GLIBC_2.3.2
            GLIBC_2.3.3
            GLIBC_2.3.4
            GLIBC_2.4
            GLIBC_2.5
            GLIBC_2.6
            GLIBC_2.7
            GLIBC_2.8
            GLIBC_2.9
            GLIBC_2.10
            GLIBC_2.11
            GLIBC_2.12
            GLIBC_2.13
            GLIBC_2.14
            GLIBC_2.15
            GLIBC_2.16
            GLIBC_2.17
            GLIBC_PRIVATE
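Reading the highest tag off such a listing by eye is error-prone, because a plain text sort would place 2.9 after 2.17. The sketch below parses `strings` output and compares versions numerically; the function name `max_symbol_version` is hypothetical.

```python
import re


def max_symbol_version(strings_output, prefix="GLIBC_"):
    """Return the highest version tag, e.g. (2, 17), found in `strings` output.

    Lines such as GLIBC_PRIVATE carry no number and are ignored.
    Returns None when no versioned tag is present.
    """
    pattern = re.compile(r"^\s*" + re.escape(prefix) + r"(\d+(?:\.\d+)*)\s*$")
    versions = []
    for line in strings_output.splitlines():
        match = pattern.match(line)
        if match:
            # Compare as integer tuples so that 2.17 ranks above 2.9.
            versions.append(tuple(int(p) for p in match.group(1).split(".")))
    return max(versions) if versions else None
```

Feeding it the output of `strings /lib64/libc.so.6` should give `(2, 12)` before the upgrade and `(2, 17)` after it. On a glibc system, `os.confstr("CS_GNU_LIBC_VERSION")` reports the version the running interpreter is linked against, which is a useful cross-check.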


(4) upgrade libstdc++.so.6 from libstdc++.so.6.0.13 to libstdc++.so.6.0.19
    TensorFlow requires libstdc++.so.6.0.19, which can be copied from a CentOS 7 system or found with a web search.

    Before the upgrade, strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX shows:
        GLIBCXX_3.4
        GLIBCXX_3.4.1
        GLIBCXX_3.4.2
        GLIBCXX_3.4.3
        GLIBCXX_3.4.4
        GLIBCXX_3.4.5
        GLIBCXX_3.4.6
        GLIBCXX_3.4.7
        GLIBCXX_3.4.8
        GLIBCXX_3.4.9
        GLIBCXX_3.4.10
        GLIBCXX_3.4.11
        GLIBCXX_3.4.12
        GLIBCXX_3.4.13
        GLIBCXX_FORCE_NEW
        GLIBCXX_DEBUG_MESSAGE_LENGTH

    rm -rf /usr/lib64/libstdc++.so.6  
    cp ./libstdc++.so.6.0.19 /usr/lib64/
    ln -s /usr/lib64/libstdc++.so.6.0.19 /usr/lib64/libstdc++.so.6  

    After the upgrade, strings /usr/lib64/libstdc++.so.6 | grep GLIBC shows:
        GLIBCXX_3.4
        GLIBCXX_3.4.1
        GLIBCXX_3.4.2
        GLIBCXX_3.4.3
        GLIBCXX_3.4.4
        GLIBCXX_3.4.5
        GLIBCXX_3.4.6
        GLIBCXX_3.4.7
        GLIBCXX_3.4.8
        GLIBCXX_3.4.9
        GLIBCXX_3.4.10
        GLIBCXX_3.4.11
        GLIBCXX_3.4.12
        GLIBCXX_3.4.13
        GLIBCXX_3.4.14
        GLIBCXX_3.4.15
        GLIBCXX_3.4.16
        GLIBCXX_3.4.17
        GLIBCXX_3.4.18
        GLIBCXX_3.4.19
        GLIBC_2.3
        GLIBC_2.2.5
        GLIBC_2.14
        GLIBC_2.4
        GLIBC_2.3.2
        GLIBCXX_DEBUG_MESSAGE_LENGTH
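The same check can be done without the `strings` tool by scanning the library file directly. This is a rough sketch with hypothetical function names. The default `required=(3, 4, 19)` is simply the highest tag in the listing above, assumed to be what TensorFlow 1.0.1 needs.

```python
import re


def glibcxx_versions(library_path):
    """Return the sorted GLIBCXX_x.y.z tags embedded in a shared library file."""
    with open(library_path, "rb") as f:
        data = f.read()
    # Tags are stored as plain ASCII strings inside the binary.
    tags = set(re.findall(br"GLIBCXX_(\d+(?:\.\d+)*)", data))
    return sorted(
        tuple(int(part) for part in tag.decode("ascii").split("."))
        for tag in tags
    )


def has_glibcxx(library_path, required=(3, 4, 19)):
    """True if the library advertises the `required` GLIBCXX tag."""
    return tuple(required) in glibcxx_versions(library_path)
```

For example, `has_glibcxx("/usr/lib64/libstdc++.so.6")` is expected to be false with the stock 6.0.13 library and true once the symlink points at libstdc++.so.6.0.19.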


(5) Install pip
    wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
    /usr/local/bin/python2.7 ez_setup.py
    /usr/local/bin/easy_install-2.7 pip

(6) Install keras, tensorflow, xgboost
    /usr/local/bin/pip2.7 install keras==2.0.1
    /usr/local/bin/pip2.7 install xgboost
    /usr/local/bin/pip2.7 install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp27-none-linux_x86_64.whl
    
    Note:
    Use TensorFlow 1.0.1; versions 0.8, 0.9, and 1.1 did not work in my tests.

(7) Install libsvm 3.22
    wget -O libsvm-3.22.tar.gz "http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz"
    tar xf libsvm-3.22.tar.gz   
    cd libsvm-3.22
    make
    cd python
    make
    vim svm.py    # required edit: change the load line to libsvm = CDLL(path.join(dirname, '/usr/local/lib/libsvm.so.2'))
    cp *.py /usr/local/lib/python2.7/site-packages/
    cp ../libsvm.so.2 /usr/local/lib/
    cd /usr/local/lib/
    ln -s libsvm.so.2 libsvm.so
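The edit to svm.py works because `os.path.join` discards every earlier component once it meets an absolute path, so the `dirname` prefix is ignored and the library is always loaded from `/usr/local/lib`. The sketch below illustrates this and adds a defensive loader. Both function names are hypothetical, and the relative default `../libsvm.so.2` is my assumption about the unmodified svm.py.

```python
import ctypes
import os
import posixpath


def resolve_library_path(dirname, library):
    """Mimic the path svm.py hands to CDLL: join dirname with the library name."""
    # posixpath gives Linux semantics regardless of where the sketch runs.
    return posixpath.join(dirname, library)


def load_first_available(candidates):
    """Load the first shared library that exists on disk, or return None."""
    for candidate in candidates:
        if os.path.exists(candidate):
            return ctypes.CDLL(candidate)
    return None


site = "/usr/local/lib/python2.7/site-packages"
# Relative name: resolved against the directory that holds svm.py.
print(resolve_library_path(site, "../libsvm.so.2"))
# Absolute name: dirname is dropped entirely.
print(resolve_library_path(site, "/usr/local/lib/libsvm.so.2"))
```

Something like `load_first_available(["/usr/local/lib/libsvm.so.2"])` returning `None` would suggest that the `cp ../libsvm.so.2 /usr/local/lib/` step was skipped.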

(8) Run python2.7 to test import

    pip list shows:
        funcsigs (1.0.2)
        Keras (2.0.1)
        mock (2.0.0)
        numpy (1.12.1)
        pbr (3.0.0)
        pip (9.0.1)
        protobuf (3.3.0)
        PyYAML (3.12)
        scikit-learn (0.18.1)
        scipy (0.19.0)
        setuptools (33.1.1)
        six (1.10.0)
        tensorflow (1.0.1)
        Theano (0.9.0)
        wheel (0.29.0)
        xgboost (0.6a2)

    Testing in python2.7:
        Python 2.7.5 (default, May 11 2017, 09:45:52)
        [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux2
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import xgboost
        /usr/local/lib/python2.7/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
          "This module will be removed in 0.20.", DeprecationWarning)
        >>> import svm
        >>> import svmutil
        >>> import keras
        Using TensorFlow backend.
        
    My algorithm now runs as well.
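The interactive import test can also be scripted so that it is easy to repeat after any later change to the system libraries. A minimal sketch; `check_imports` is a hypothetical helper, and the module names in the usage note are the ones imported in the session above.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; map name -> None on success or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or OSError from a missing .so
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results
```

Running `check_imports(["xgboost", "svm", "svmutil", "keras"])` under `/usr/local/bin/python2.7` should return `None` for every entry when the environment is complete. An entry with an error string points at the step to revisit, for example a missing GLIBC or GLIBCXX version tag.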
