自学习CTR预估中GBDT与LR融合方案 ,有意用简单暴利的python实现一版GBDT/XGboost做特征选择,融合LR进行CTR的代码demo。
1. GBDT + LR
python3.5.3 + scikit-learn0.18.1
from scipy.sparse.construct import hstack
from sklearn.model_selection import train_test_split
from sklearn.datasets.svmlight_format import load_svmlight_file
from sklearn.ensemble.gradient_boosting import GradientBoostingClassifier
from sklearn.linear_model.logistic import LogisticRegression
from sklearn.metrics.ranking import roc_auc_score
from sklearn.preprocessing.data import OneHotEncoder
import numpy as np
def gbdt_lr_train(libsvmFileName):
# load样本数据
X_all, y_all = load_svmlight_file(libsvmFileName)
# 训练/测试数据分割
X_train, X_test, y_train, y_test = train_test_split(X_all, y_all, test_size = 0.3, random_state = 42)
# 定义GBDT模