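Methods such as GBDT+LR and XGBoost blending/stacking rely on the apply() method.
For each sample in the training data X_train, it returns the position (index) of the leaf node that the sample falls into in every tree of the trained model.
In the words of the docstring: Return the predicted leaf every tree for each sample.
For example, see: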
https://www.cnblogs.com/wkang/p/9657032.html
https://github.com/lytforgood/MachineLearningTrick
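To make the GBDT+LR idea concrete, here is a minimal pure-Python sketch of what typically happens to the leaf indices afterwards: each tree's leaf index is one-hot encoded and the per-tree encodings are concatenated into a sparse feature vector for the linear model. The helper name `one_hot_leaves` and the toy leaf matrix are made up for illustration; in practice a library encoder (for instance scikit-learn's OneHotEncoder) would normally do this step.

```python
def one_hot_leaves(leaves):
    """One-hot encode a leaf-index matrix (hypothetical helper).

    leaves: list of rows, one row per sample, one leaf index per tree.
    Returns a list of 0/1 vectors, one per sample.
    """
    n_trees = len(leaves[0])
    # Distinct leaf indices observed in each tree (numbering may have gaps).
    vocab = [sorted({row[t] for row in leaves}) for t in range(n_trees)]
    # Starting column of each tree's block in the concatenated encoding.
    offsets = [0]
    for v in vocab[:-1]:
        offsets.append(offsets[-1] + len(v))
    width = offsets[-1] + len(vocab[-1])

    encoded = []
    for row in leaves:
        vec = [0] * width
        for t, leaf in enumerate(row):
            vec[offsets[t] + vocab[t].index(leaf)] = 1
        encoded.append(vec)
    return encoded


# Toy output of apply(): 3 samples, 2 trees (values are invented).
leaves = [[1, 4], [2, 4], [1, 6]]
features = one_hot_leaves(leaves)
```

Each row of `features` has exactly one 1 per tree, so the logistic regression effectively learns one weight per leaf.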
The apply method in xgboost:
def apply(self, X, ntree_limit=0):
    """Return the predicted leaf every tree for each sample.

    Parameters
    ----------
    X : array_like, shape=[n_samples, n_features]
        Input features matrix.
    ntree_limit : int
        Limit number of trees in the prediction; defaults to 0 (use all trees).

    Returns
    -------
    X_leaves : array_like, shape=[n_samples, n_trees]
        For each datapoint x in X and for each tree, return the index of the
        leaf x ends up in. Leaves are numbered within
        ``[0; 2**(self.max_depth+1))``, possibly with gaps in the numbering.
    """
    test_dmatrix = DMatrix(X, missing=self.missing, nthread=self.n_jobs)
    return self.get_booster().predict(test_dmatrix,
                                      pred_leaf=True,
                                      ntree_limit=ntree_limit)
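The return statement shows that apply() simply calls predict() with pred_leaf=True. It follows that the same result can be obtained in lightGBM by calling its predict() method with pred_leaf=True.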