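Compiling and installing xgboost 0.7

Background

    Colleagues at our research institute recently needed xgboost. My first plan was to install it for Python, because the Spark we currently offer the institute is used mainly through pyspark.
    Installing it on a test server with pip install xgboost failed with:
     #pragma message: Will need g++-4.6 or higher to compile all the features in dmlc-core, compile without c++0x, some features may be disabled
    The server runs CentOS 6.7. Checking its compiler with gcc -v reports version 4.4.7:
     gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)
    Installing xgboost this way would mean upgrading gcc on every server, which is a lot of work. So I took another route: build the jar and use xgboost from Scala.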
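
The compiler check described above can be scripted before attempting the install. This is a minimal sketch, assuming GNU sort with the -V (version sort) option is available; the helper name version_ge is made up for illustration, and 4.6 is the minimum named in the warning message.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum g++ version named in the dmlc-core warning.
required=4.6

# Only probe the compiler when one is installed.
if command -v gcc >/dev/null 2>&1; then
    current=$(gcc -dumpversion)
    if version_ge "$current" "$required"; then
        echo "gcc $current meets the $required minimum"
    else
        echo "gcc $current is older than $required"
    fi
fi
```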

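Building the jvm-package

Official git repository: https://github.com/dmlc/xgboost
Official documentation : http://xgboost.readthedocs.io/en/latest/

The official build instructions are at https://xgboost.readthedocs.io/en/latest/build.html
They include the following description: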

It consists of two steps:

  1. First build the shared library from the C++ codes (libxgboost.so for linux/osx and libxgboost.dll for windows).
    • Exception: for R-package installation please directly refer to the R package section.
  2. Then install the language packages (e.g. Python Package)
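Important: the newest version of xgboost uses submodule to maintain packages. So when you clone the repo, remember to use the recursive option as follows.
In short, installation takes two steps: first build the shared library, then build and install the language package.
Note that the latest xgboost maintains its packages as git submodules, so the repository has to be cloned recursively: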
git clone --recursive https://github.com/dmlc/xgboost
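
If the repository was cloned without --recursive, the submodule directories exist but are empty. The sketch below lists such directories by reading .gitmodules; the helper name empty_submodules is hypothetical, and it assumes the checkout lives in ./xgboost. The repair command it suggests, git submodule update --init --recursive, is the standard git way to fetch submodules after the fact.

```shell
#!/bin/sh
# empty_submodules REPO: print each submodule path listed in REPO/.gitmodules
# whose directory is missing or empty (the symptom of a non-recursive clone).
empty_submodules() {
    repo=$1
    [ -f "$repo/.gitmodules" ] || return 0
    sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$repo/.gitmodules" |
    while IFS= read -r path; do
        if [ -z "$(ls -A "$repo/$path" 2>/dev/null)" ]; then
            echo "$path"
        fi
    done
}

# Assumed checkout location; adjust to where the repo was cloned.
if [ -d xgboost ]; then
    missing=$(empty_submodules xgboost)
    if [ -n "$missing" ]; then
        echo "empty submodules: $missing"
        echo "fix with: cd xgboost && git submodule update --init --recursive"
    fi
fi
```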

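Building the shared library

I built on a CentOS 7 server, running the following command from the official steps:
cd xgboost; cp make/minimum.mk ./config.mk; make -j4
On other systems, follow the corresponding instructions: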

Building on Ubuntu/Debian

On Ubuntu, one builds xgboost by

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost; make -j4

Building on OSX

On OSX, one builds xgboost by

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost; cp make/minimum.mk ./config.mk; make -j4
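Once the step above succeeds, move on to the second step, the compilation. If make reports errors, install the dependencies named in the error messages and run make again.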
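
One way to confirm that the first step really succeeded is to look for the shared library the official text names (libxgboost.so, or libxgboost.dll on Windows). This is only a sketch: the helper name find_native_lib is made up, the checkout is assumed to be in ./xgboost, and the search covers the whole tree rather than assuming a particular output directory.

```shell
#!/bin/sh
# find_native_lib DIR: print the path of libxgboost.so / libxgboost.dll found
# under DIR; return non-zero when the build produced neither.
find_native_lib() {
    found=$(find "$1" \( -name 'libxgboost.so' -o -name 'libxgboost.dll' \) -type f 2>/dev/null | head -n 1)
    if [ -n "$found" ]; then
        echo "$found"
    else
        return 1
    fi
}

# Assumed checkout location; adjust to where the repo was cloned.
if [ -d xgboost ]; then
    if lib=$(find_native_lib xgboost); then
        echo "shared library built: $lib"
    else
        echo "no shared library found; check the make output" >&2
    fi
fi
```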

Compiling the jvm-package

Follow the official steps at https://xgboost.readthedocs.io/en/latest/jvm/index.html

Installation

Currently, XGBoost4J only support installation from source. Building XGBoost4J using Maven requires Maven 3 or newer and Java 7+.

Before you install XGBoost4J, you need to define environment variable JAVA_HOME as your JDK directory to ensure that your compiler can find jni.h correctly, since XGBoost4J relies on JNI to implement the interaction between the JVM and native libraries.

After your JAVA_HOME is defined correctly, it is as simple as run mvn package under jvm-packages directory to install XGBoost4J. You can also skip the tests by running mvn -DskipTests=true package, if you are sure about the correctness of your local setup.

To publish the artifacts to your local maven repository, run

mvn install

Or, if you would like to skip tests, run

mvn -DskipTests install

This command will publish the xgboost binaries, the compiled java classes as well as the java sources to your local repository. Then you can use XGBoost4J in your Java projects by including the following dependency in pom.xml:

<dependency>
  <groupId>ml.dmlc</groupId>
  <artifactId>xgboost4j</artifactId>
  <version>0.7</version>
</dependency>

After integrating with Dataframe/Dataset APIs of Spark 2.0, XGBoost4J-Spark only supports compile with Spark 2.x. You can build XGBoost4J-Spark as a component of XGBoost4J by running mvn package, and you can specify the version of spark with mvn -Dspark.version=2.0.0 package. (To continue working with Spark 1.x, the users are supposed to update pom.xml by modifying the properties like spark.version, scala.version, and scala.binary.version. Users also need to change the implementation by replacing SparkSession with SQLContext and the type of API parameters from Dataset[_] to Dataframe.)


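The default Spark version is 2.1.0 and the default Scala version is 2.11.8.
mvn package
Wait patiently for the build to finish.
To use different Spark and Scala versions, change spark.version, scala.version, and scala.binary.version in pom.xml, then run mvn install.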
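
The official text quoted above requires JAVA_HOME to point at a JDK so that the compiler can find jni.h. A small pre-flight check before running mvn might look like the sketch below. The helper name check_java_home is hypothetical; the include/jni.h location is where a JDK normally ships that header.

```shell
#!/bin/sh
# check_java_home DIR: succeed when DIR is non-empty and contains include/jni.h,
# i.e. it looks like a JDK rather than a bare JRE.
check_java_home() {
    [ -n "$1" ] && [ -f "$1/include/jni.h" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME=$JAVA_HOME provides jni.h"
else
    echo "JAVA_HOME is unset or does not contain include/jni.h" >&2
fi
```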

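Usage

After the build, everything ran fine on the CentOS 7 server. When I uploaded xgboost4j-spark-0.7-jar-with-dependencies.jar to the production CentOS 6 servers and ran it there, it failed with the following error: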

/lib64/libc.so.6: version `GLIBC_2.14' not found

A search showed the cause: the system's glibc is too old, because the software was compiled against a newer glibc.
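
The mismatch can be confirmed by comparing the glibc version a native library asks for with the one the system provides. The sketch below parses objdump -T output for the highest GLIBC_x.y symbol version. The helper name max_glibc and the library path libxgboost4j.so are assumptions for illustration (the native library would first have to be extracted from the jar), and GNU sort -V is assumed to be available.

```shell
#!/bin/sh
# max_glibc: read `objdump -T` output on stdin and print the highest
# GLIBC_x.y symbol version it references.
max_glibc() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -V | tail -n 1
}

# Hypothetical path to the native library extracted from the jar.
lib=libxgboost4j.so

if [ -f "$lib" ] && command -v objdump >/dev/null 2>&1; then
    echo "library needs glibc >= $(objdump -T "$lib" | max_glibc)"
    # The first line of ldd --version reports the system glibc version.
    ldd --version | head -n 1
fi
```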
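So I tried building xgboost on a CentOS 6 server instead, and hit the gcc-version-too-old error again. I upgraded gcc as shown below, then repeated the build steps above, which succeeded: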
wget http://people.centos.org/tru/devtools-2/devtools-2.repo
mv devtools-2.repo /etc/yum.repos.d
yum install devtoolset-2-gcc devtoolset-2-binutils devtoolset-2-gcc-c++
mv /usr/bin/gcc /usr/bin/gcc-4.4.7
mv /usr/bin/g++ /usr/bin/g++-4.4.7
mv /usr/bin/c++ /usr/bin/c++-4.4.7
ln -s /opt/rh/devtoolset-2/root/usr/bin/gcc /usr/bin/gcc
ln -s /opt/rh/devtoolset-2/root/usr/bin/c++ /usr/bin/c++
ln -s /opt/rh/devtoolset-2/root/usr/bin/g++ /usr/bin/g++
gcc --version
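
A less invasive variant, offered only as a sketch, is to leave /usr/bin/gcc in place and put the devtoolset-2 directory at the front of PATH for the build shell. The directory is the one used in the ln -s lines above; the helper name prepend_toolchain and the DEVTOOLSET_BIN variable are made up for illustration.

```shell
#!/bin/sh
# Directory where devtoolset-2 installs its compilers (from the ln -s lines).
DEVTOOLSET_BIN=${DEVTOOLSET_BIN:-/opt/rh/devtoolset-2/root/usr/bin}

# prepend_toolchain DIR PATHVALUE: print PATHVALUE with DIR in front when DIR
# holds an executable gcc, otherwise print PATHVALUE unchanged.
prepend_toolchain() {
    if [ -x "$1/gcc" ]; then
        echo "$1:$2"
    else
        echo "$2"
    fi
}

# Affects only the current shell and the builds started from it.
PATH=$(prepend_toolchain "$DEVTOOLSET_BIN" "$PATH")
export PATH
```

With this approach the system compiler stays untouched, and running gcc --version in the same shell shows which compiler the build will pick up.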




