XGBoost used to have poor support for the Windows platform (there was no GPU build at all), so I never installed it. Now that its GPU version runs on Windows, I have finally installed it.
For information on the GPU version of XGBoost, see: https://github.com/dmlc/xgboost/tree/master/plugin/updater_gpu
1. Installation
There are two ways to get the Windows GPU version of XGBoost:
1) The lazy way: install a prebuilt binary directly. Download the package here: http://www.picnet.com.au/blogs/guido/post/2016/09/22/xgboost-windows-x64-binaries-for-download/
- git clone https://github.com/dmlc/xgboost.git xgboost_install_dir
- copy libxgboost.dll (downloaded from the page above) into the xgboost_install_dir\python-package\xgboost\ directory
- cd xgboost_install_dir\python-package\
- python setup.py install
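After these steps, a quick sanity check is to try importing the package; if libxgboost.dll was copied to the right place, the import should succeed without errors:

python -c "import xgboost"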
2) Download the source code and build it yourself.
git clone --recursive https://github.com/dmlc/xgboost
Then launch CMake and generate the VS2013 project files for the xgboost directory you just cloned. Note: remember to turn on the PLUGIN_UPDATER_GPU option here; otherwise what you build will still be the CPU-only version.
To use the plugin xgboost must be built using cmake specifying the option PLUGIN_UPDATER_GPU=ON. The location of the CUB library must also be specified with the cmake variable CUB_DIRECTORY. CMake will prepare a build system depending on which platform you are on.
The CUB mentioned here is CUB 1.6.4 - https://nvlabs.github.io/cub/. Install it first, otherwise the build will fail later on.
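If you prefer the command line over the CMake GUI, the invocation looks roughly like this (the CUB path below is just a placeholder for wherever you unpacked CUB 1.6.4):

cd xgboost
mkdir build
cd build
cmake .. -G"Visual Studio 12 2013 Win64" -DPLUGIN_UPDATER_GPU=ON -DCUB_DIRECTORY=C:/cub-1.6.4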
Once the VS2013 project files have been generated, you can build XGBoost with VS2013.
2. Running a test
XGBoost ships with a dedicated GPU test example: xgboost\demo\gpu_acceleration\bosch.py. It trains on a dataset from a recent Kaggle competition, split into 5 folds for cross-validation. On my machine each boosting round takes about 23 seconds on the CPU; with the fastest GPU algorithm it takes just over 3 seconds per round. The official bosch.py has a few problems and won't run against the newer sklearn, so I made some changes. The code is as follows:
#encoding=utf-8
import pandas as pd
import xgboost as xgb
import time
import random
from sklearn.model_selection import StratifiedKFold
import numpy as np

# For sampling rows from the input file
random_seed = 9
subset = 0.3  # the original value was 0.4; decreased to 0.3 so it runs on my PC
root_dir = "H:/github_samples/data/datasets/"
n_rows = 1183747  # total number of data rows in train_numeric.csv; the more rows we skip below, the smaller the dataset and the less memory needed
train_rows = int(n_rows * subset)
random.seed(random_seed)
# randomly choose the rows to skip (use xrange here on Python 2)
skip = sorted(random.sample(range(1, n_rows + 1), n_rows - train_rows))
data = pd.read_csv(root_dir + "train_numeric.csv", index_col=0, dtype=np.float32, skiprows=skip)
y = data['Response'].values
# y = np.load("y.npy")
print(y.shape)
# np.save("y", y)
del data['Response']
X = data.values
# X = np.load("X.npy")
print(X.shape)
# np.save("X", X)

param = {}
param['objective'] = 'binary:logistic'
param['eval_metric'] = 'auc'
param['max_depth'] = 5
param['eta'] = 0.3
param['silent'] = 0
# param['nthread'] = 7  # number of CPU threads
param['updater'] = 'grow_gpu_hist'   # GPU histogram algorithm (the fast one)
# param['updater'] = 'grow_gpu'      # exact GPU algorithm
# param['updater'] = 'grow_colmaker' # CPU algorithm, for comparison
num_round = 20

# cv = StratifiedKFold(y, n_folds=5)  # old sklearn API
cv = StratifiedKFold(n_splits=5)
print(cv.get_n_splits(X, y))
for i, (train, test) in enumerate(cv.split(X, y)):
    print(len(train), len(test))
    dtrain = xgb.DMatrix(X[train], label=y[train])
    tmp = time.time()
    bst = xgb.train(param, dtrain, num_round)
    boost_time = time.time() - tmp
    res = bst.eval(xgb.DMatrix(X[test], label=y[test]))
    print("Fold {}: {}, Boost Time {}".format(i, res, str(boost_time)))
    del bst
I had also planned to run a test on the MNIST dataset afterwards, but as soon as the program started my GPU ran out of memory. I wanted to write an example that reads the data into memory in batches, but I've been too busy lately, so that will have to wait for another day. If you're interested in the GPU version of XGBoost, you can read more here: https://github.com/dmlc/xgboost/tree/master/plugin/updater_gpu
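For reference, here is a minimal sketch of what the batched-reading idea might look like, using pandas' chunksize to stream train_numeric.csv and keep only a random subset of each chunk, so the whole file never has to sit in memory at once (the path and subset rate are carried over from the script above; treat this as a sketch, not the full example):

import numpy as np
import pandas as pd

def load_subset(path, subset=0.3, chunk_size=100000, seed=9):
    # Stream the CSV in chunks and keep roughly `subset` of each chunk's rows.
    rng = np.random.RandomState(seed)
    parts = []
    for chunk in pd.read_csv(path, index_col=0, dtype=np.float32, chunksize=chunk_size):
        mask = rng.rand(len(chunk)) < subset
        parts.append(chunk[mask])
    return pd.concat(parts)

data = load_subset("H:/github_samples/data/datasets/train_numeric.csv")

Note that this only caps host memory during loading; the GPU still has to hold the resulting DMatrix, so the subset rate is what ultimately controls GPU memory use.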
That's all.