Machine Learning 8: Ensemble Learning

1. Overview of Ensemble Learning Methods

Ensemble learning comes in three main flavors: Bagging, Boosting (the most commonly used), and Stacking (stacking of models).

1.Bagging


Random Forest (an extended variant of Bagging)

Advantages of random forests:
1. They perform well on many datasets and hold a clear advantage over many other algorithms;
2. They are easy to parallelize, which is a big advantage on large datasets;
3. They can handle high-dimensional data, without requiring feature selection or data normalization.

Many weak classifiers, combined through ensemble learning, become one strong classifier; a minimal Bagging sketch follows.
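
To make this concrete, here is a minimal Bagging / random-forest sketch with scikit-learn (a self-contained toy example; the dataset and settings are my own, not from the original post):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging: train many trees on bootstrap samples of the data and vote
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
bag.fit(X_tr, y_tr)
print("Bagging accuracy:", bag.score(X_te, y_te))

# Random forest: Bagging plus random feature sub-sampling at every split
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(X_tr, y_tr)
print("Random forest accuracy:", rf.score(X_te, y_te))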

2.Boosting


The AdaBoost Algorithm

Adaptive Boosting:
Algorithm idea: each round increases the weights of the samples that the current weak learner misclassifies and decreases the weights of the correctly classified ones (in the original illustration, the red points shrink as their weights drop), then trains the next weak learner on the re-weighted data. The final strong classifier is a weighted vote of all the weak learners; a minimal sketch of this weight update follows.
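
A minimal NumPy sketch of the weight-update idea described above (function names are my own; labels are assumed to be in {-1, +1}, as in the make_hastie_10_2 data used later in this post):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """y must be in {-1, +1}. Returns the weak learners and their weights."""
    n = len(y)
    w = np.full(n, 1.0 / n)            # start with uniform sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y])     # weighted error of this weak learner
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak learner
        # misclassified samples get larger weights, correct ones smaller
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # weighted vote of all weak learners
    agg = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(agg)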

GBDT算法

GBDT (Gradient Boosting Decision Tree) is an iterative decision-tree algorithm built from many decision trees. Its core idea is that the final prediction is the sum of the outputs of all the trees, so the trees in GBDT are regression trees (summed outputs must be continuous values). It belongs to the Boosting family and is widely regarded as having very strong generalization ability.
GBDT is built from three concepts: Regression Decision Tree (DT), Gradient Boosting (GB), and Shrinkage.
GBDT algorithm idea: at each step, the negative gradient of the loss function evaluated at the current model is used as an approximation of the residual, and a new regression tree is fitted to it; a minimal sketch follows.

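A minimal sketch of this idea for squared loss, where the negative gradient is exactly the ordinary residual (a simplified illustration with my own function names, not a full GBDT implementation):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_fit(X, y, n_trees=100, lr=0.1, max_depth=3):
    """Gradient boosting for squared loss: each tree fits the current residuals."""
    f0 = y.mean()                      # initial constant prediction
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_trees):
        residual = y - pred            # negative gradient of 0.5 * (y - f)^2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)
        pred += lr * tree.predict(X)   # shrinkage: scale each tree by the learning rate
        trees.append(tree)
    return f0, trees

def gbdt_predict(X, f0, trees, lr=0.1):
    # final prediction is the sum of all trees (plus the initial constant)
    return f0 + lr * sum(t.predict(X) for t in trees)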

XGBoost

A very commonly used boosting method; XGBoost is among the fastest and best-known open-source boosted-tree toolkits.

LightGBM

Advantages: faster training, lower memory usage, slightly better accuracy, distributed support, and the ability to handle massive datasets quickly.
Main improvements: LightGBM = XGBoost + GOSS + EFB + Histogram
1. Gradient-based One-Side Sampling (GOSS):
Main idea: reduce the cost of computing the split gain by sampling the data. GOSS keeps all samples with large gradients and randomly samples the ones with small gradients; to avoid distorting the data distribution, the sampled small-gradient samples are multiplied by a constant when the gain is computed. A small gradient means the sample's training error is small, i.e., the sample has already been learned well. A small sketch of this sampling step is shown below.
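
In place of the original example figure, a minimal NumPy sketch of the GOSS sampling step (the ratios a and b and the function name are illustrative assumptions):

import numpy as np

def goss_sample(grad, a=0.2, b=0.1, rng=np.random.default_rng(0)):
    """Keep the top a fraction of samples by |gradient|, randomly sample a b fraction
    of the rest, and up-weight the sampled small-gradient part by (1 - a) / b."""
    n = len(grad)
    order = np.argsort(-np.abs(grad))
    top_k = int(a * n)
    rand_k = int(b * n)
    top_idx = order[:top_k]                                   # large-gradient samples: all kept
    rest = order[top_k:]
    rand_idx = rng.choice(rest, size=rand_k, replace=False)   # small-gradient samples: subsampled
    idx = np.concatenate([top_idx, rand_idx])
    weights = np.ones(len(idx))
    weights[top_k:] = (1 - a) / b                             # compensate so the gain stays unbiased
    return idx, weights
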
2. Exclusive Feature Bundling (EFB):
High-dimensional features are often mutually exclusive (e.g., two features rarely take non-zero values at the same time) or nearly so; such features can be bundled into one, which reduces the number of features.
3. Histogram algorithm:
Basic idea: discretize each continuous feature into k discrete values and build a histogram of width k (i.e., k bins). Finding the best split point then only requires scanning the k bins rather than every data point; a small sketch follows.
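
A minimal NumPy sketch of histogram-based split finding for a single feature (gradient/hessian formulation, simplified; names and the bin count are my own choices):

import numpy as np

def best_split_histogram(x, grad, hess, n_bins=32, lam=1.0):
    """Bucket one feature into n_bins, accumulate gradient/hessian sums per bin,
    then scan the bins (not the raw data) to find the best split."""
    bins = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bin_id = np.digitize(x, bins)
    G = np.bincount(bin_id, weights=grad, minlength=n_bins)
    H = np.bincount(bin_id, weights=hess, minlength=n_bins)
    G_tot, H_tot = G.sum(), H.sum()
    best_gain, best_bin = 0.0, None
    GL = HL = 0.0
    for b in range(n_bins - 1):          # candidate split between bin b and bin b+1
        GL += G[b]; HL += H[b]
        GR, HR = G_tot - GL, H_tot - HL
        gain = GL**2 / (HL + lam) + GR**2 / (HR + lam) - G_tot**2 / (H_tot + lam)
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_bin, best_gain
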
4. Leaf-wise growth with a maximum-depth limit: instead of growing the tree level by level, LightGBM splits the leaf with the largest gain at each step (leaf-wise), and a maximum-depth limit keeps the tree from overfitting.


3.Stacking

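The original figure for Stacking is not reproduced here. The idea: several different base learners are trained first, and a second-level meta-learner is then trained on their (out-of-fold) predictions. A minimal scikit-learn sketch (my own toy example):

from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

stack = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(n_estimators=100, random_state=0)),
                ('gbdt', GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression(),  # meta-learner trained on the base predictions
    cv=5,                                  # out-of-fold predictions feed the meta-learner
)
print(cross_val_score(stack, X, y, cv=3, scoring='accuracy').mean())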

2. Ensemble Learning Code

import warnings
warnings.filterwarnings("ignore")  # ignore harmless warnings that do not affect the program
import pandas as pd
from sklearn.model_selection import train_test_split  

# Generate 12,000 samples and split into training and test sets at a 3:1 ratio

from sklearn.datasets import make_hastie_10_2
data, target = make_hastie_10_2()

X_train, X_test, y_train, y_test = train_test_split(data, target, random_state=123)
print(X_train.shape, X_test.shape)
#  (9000, 10) (3000, 10)  -> 10 features

Comparison of Six Models

Compare the six models, all with default parameters, and look at their cross-validation scores.

from sklearn.linear_model import LogisticRegression  # logistic regression
from sklearn.ensemble import RandomForestClassifier  # random forest
from sklearn.ensemble import AdaBoostClassifier # AdaBoost
from sklearn.ensemble import GradientBoostingClassifier # GBDT
from xgboost import XGBClassifier #XGBoost
from lightgbm import LGBMClassifier  # LightGBM
from sklearn.model_selection import cross_val_score
import time

# classifiers clf1..clf6
clf1 = LogisticRegression()  
clf2 = RandomForestClassifier()
clf3 = AdaBoostClassifier()
clf4 = GradientBoostingClassifier()
clf5 = XGBClassifier()
clf6 = LGBMClassifier()

for clf, label in zip([clf1, clf2, clf3, clf4, clf5, clf6], ['Logistic Regression', 'Random Forest', 'AdaBoost', 'GBDT', 'XGBoost','LightGBM']):
    start = time.time()  # start time
    scores = cross_val_score(clf, X_train, y_train, scoring='accuracy', cv=5)
    end = time.time()  # end time
    running_time = end - start
    print("Accuracy: %0.8f (+/- %0.2f), time %0.2f s. Model [%s]" % (scores.mean(), scores.std(), running_time, label))
Accuracy: 0.49411111 (+/- 0.01), time 0.06 s. Model [Logistic Regression]
Accuracy: 0.88533333 (+/- 0.01), time 13.73 s. Model [Random Forest]
Accuracy: 0.87533333 (+/- 0.01), time 2.79 s. Model [AdaBoost]
Accuracy: 0.91122222 (+/- 0.00), time 9.22 s. Model [GBDT]
[15:30:36] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.5.1/src/learner.cc:1115: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior. (this warning is printed once per CV fold)
Accuracy: 0.92366667 (+/- 0.00), time 5.75 s. Model [XGBoost]
Accuracy: 0.92800000 (+/- 0.00), time 0.60 s. Model [LightGBM]
    

Comparing the six models: logistic regression is by far the fastest but has the lowest accuracy, while LightGBM is both fast and the most accurate here. This is why LightGBM is now one of the most widely used algorithms for structured (tabular) data.

Using XGBoost

1. The native XGBoost API

import xgboost as xgb
# record the program's running time
import time
start_time = time.time()

# build the xgb DMatrix objects
xgb_train = xgb.DMatrix(X_train, y_train)
xgb_test = xgb.DMatrix(X_test, label=y_test)
## parameters
params = {
    'booster': 'gbtree',
    # 'silent': 1,  # 1 silences the run-time messages; usually better left at 0
    #'nthread': 7,  # number of CPU threads, defaults to the maximum available
    'eta': 0.007,  # works like a learning rate
    'min_child_weight': 3,
    # Defaults to 1: the minimum sum of instance weight (h) required in each leaf.
    # For 0-1 classification with imbalanced classes, if h is around 0.01,
    # min_child_weight = 1 means a leaf must contain at least about 100 samples.
    # This parameter strongly affects the result: it bounds the sum of second-order
    # gradients in a leaf, and the smaller it is, the easier it is to overfit.
    'max_depth': 6,  # tree depth; larger values overfit more easily
    'gamma': 0.1,  # minimum loss reduction required to split a leaf further; larger is more conservative (typically 0.1 or 0.2)
    'subsample': 0.7,  # row subsampling of the training data
    'colsample_bytree': 0.7,  # column subsampling when building each tree
    'lambda': 2,  # L2 regularization on the leaf weights; larger values make the model less prone to overfitting
    #'alpha': 0,  # L1 regularization term
    #'scale_pos_weight': 1,  # a value > 0 helps convergence when the classes are imbalanced
    #'objective': 'multi:softmax',  # for multi-class problems
    #'num_class': 10,  # number of classes, used together with multi:softmax
    'seed': 1000,  # random seed
    #'eval_metric': 'auc'
}
plst = list(params.items())  # convert the dict to a list of pairs
num_rounds = 500  # number of boosting rounds
watchlist = [(xgb_train, 'train'), (xgb_test, 'val')]  # evaluation sets whose results are printed

# train the model and (optionally) save it
# early_stopping_rounds: with a large number of rounds, stop early if the validation
# score has not improved for the given number of rounds
model = xgb.train(plst, xgb_train, num_rounds, watchlist, early_stopping_rounds=100)
#model.save_model('./model/xgb.model')  # save the trained model
print("best best_ntree_limit", model.best_ntree_limit)
y_pred = model.predict(xgb_test, ntree_limit=model.best_ntree_limit)
print('error=%f' %(sum(1 for i in range(len(y_pred)) if int(y_pred[i] > 0.5) != y_test[i]) / float(len(y_pred))))
# print the elapsed time
cost_time = time.time() - start_time
print("xgboost success!", '\n', "cost time:", cost_time, "(s)......")
    [0]	train-rmse:1.11000	val-rmse:1.10422
    [1]	train-rmse:1.10734	val-rmse:1.10182
    [2]	train-rmse:1.10465	val-rmse:1.09932
    [3]	train-rmse:1.10207	val-rmse:1.09694
    [4]	train-rmse:1.09944	val-rmse:1.09453
    [5]	train-rmse:1.09682	val-rmse:1.09211
    [6]	train-rmse:1.09424	val-rmse:1.08975
    [7]	train-rmse:1.09175	val-rmse:1.08745
    [8]	train-rmse:1.08923	val-rmse:1.08511
    [9]	train-rmse:1.08664	val-rmse:1.08275
    [10]	train-rmse:1.08410	val-rmse:1.08039
    [11]	train-rmse:1.08167	val-rmse:1.07811
    [12]	train-rmse:1.07922	val-rmse:1.07581
    [13]	train-rmse:1.07675	val-rmse:1.07353
    [14]	train-rmse:1.07434	val-rmse:1.07129
    [15]	train-rmse:1.07190	val-rmse:1.06903
    [16]	train-rmse:1.06943	val-rmse:1.06677
    [17]	train-rmse:1.06713	val-rmse:1.06467
    [18]	train-rmse:1.06476	val-rmse:1.06252
    [19]	train-rmse:1.06244	val-rmse:1.06040
    [20]	train-rmse:1.06015	val-rmse:1.05831
    [21]	train-rmse:1.05791	val-rmse:1.05630
    [22]	train-rmse:1.05558	val-rmse:1.05421
    [23]	train-rmse:1.05328	val-rmse:1.05215
    [24]	train-rmse:1.05102	val-rmse:1.05011
    [25]	train-rmse:1.04873	val-rmse:1.04802
    [26]	train-rmse:1.04649	val-rmse:1.04600
    [27]	train-rmse:1.04429	val-rmse:1.04398
    [28]	train-rmse:1.04214	val-rmse:1.04205
    [29]	train-rmse:1.03999	val-rmse:1.04009
    [30]	train-rmse:1.03787	val-rmse:1.03816
    [31]	train-rmse:1.03570	val-rmse:1.03616
    [32]	train-rmse:1.03362	val-rmse:1.03429
    [33]	train-rmse:1.03153	val-rmse:1.03241
    [34]	train-rmse:1.02947	val-rmse:1.03053
    [35]	train-rmse:1.02745	val-rmse:1.02868
    [36]	train-rmse:1.02537	val-rmse:1.02678
    [37]	train-rmse:1.02329	val-rmse:1.02480
    [38]	train-rmse:1.02124	val-rmse:1.02294
    [39]	train-rmse:1.01930	val-rmse:1.02125
    [40]	train-rmse:1.01727	val-rmse:1.01946
    [41]	train-rmse:1.01531	val-rmse:1.01769
    [42]	train-rmse:1.01328	val-rmse:1.01584
    [43]	train-rmse:1.01131	val-rmse:1.01403
    [44]	train-rmse:1.00931	val-rmse:1.01216
    [45]	train-rmse:1.00737	val-rmse:1.01039
    [46]	train-rmse:1.00548	val-rmse:1.00869
    [47]	train-rmse:1.00349	val-rmse:1.00691
    [48]	train-rmse:1.00159	val-rmse:1.00517
    [49]	train-rmse:0.99967	val-rmse:1.00345
    [50]	train-rmse:0.99774	val-rmse:1.00165
    [51]	train-rmse:0.99584	val-rmse:0.99994
    [52]	train-rmse:0.99397	val-rmse:0.99818
    [53]	train-rmse:0.99209	val-rmse:0.99649
    [54]	train-rmse:0.99020	val-rmse:0.99482
    [55]	train-rmse:0.98833	val-rmse:0.99308
    [56]	train-rmse:0.98649	val-rmse:0.99140
    [57]	train-rmse:0.98468	val-rmse:0.98977
    [58]	train-rmse:0.98293	val-rmse:0.98820
    [59]	train-rmse:0.98113	val-rmse:0.98655
    [60]	train-rmse:0.97933	val-rmse:0.98492
    [61]	train-rmse:0.97754	val-rmse:0.98337
    [62]	train-rmse:0.97566	val-rmse:0.98167
    [63]	train-rmse:0.97399	val-rmse:0.98020
    [64]	train-rmse:0.97229	val-rmse:0.97869
    [65]	train-rmse:0.97051	val-rmse:0.97715
    [66]	train-rmse:0.96875	val-rmse:0.97555
    [67]	train-rmse:0.96706	val-rmse:0.97396
    [68]	train-rmse:0.96543	val-rmse:0.97249
    [69]	train-rmse:0.96373	val-rmse:0.97092
    [70]	train-rmse:0.96207	val-rmse:0.96938
    [71]	train-rmse:0.96044	val-rmse:0.96796
    [72]	train-rmse:0.95881	val-rmse:0.96649
    [73]	train-rmse:0.95715	val-rmse:0.96497
    [74]	train-rmse:0.95554	val-rmse:0.96354
    [75]	train-rmse:0.95389	val-rmse:0.96215
    [76]	train-rmse:0.95216	val-rmse:0.96062
    [77]	train-rmse:0.95056	val-rmse:0.95913
    [78]	train-rmse:0.94890	val-rmse:0.95758
    [79]	train-rmse:0.94729	val-rmse:0.95609
    [80]	train-rmse:0.94567	val-rmse:0.95468
    [81]	train-rmse:0.94416	val-rmse:0.95334
    [82]	train-rmse:0.94261	val-rmse:0.95194
    [83]	train-rmse:0.94103	val-rmse:0.95055
    [84]	train-rmse:0.93942	val-rmse:0.94913
    [85]	train-rmse:0.93790	val-rmse:0.94772
    [86]	train-rmse:0.93637	val-rmse:0.94639
    [87]	train-rmse:0.93485	val-rmse:0.94504
    [88]	train-rmse:0.93331	val-rmse:0.94367
    [89]	train-rmse:0.93183	val-rmse:0.94238
    [90]	train-rmse:0.93032	val-rmse:0.94103
    [91]	train-rmse:0.92883	val-rmse:0.93970
    [92]	train-rmse:0.92737	val-rmse:0.93836
    [93]	train-rmse:0.92584	val-rmse:0.93707
    [94]	train-rmse:0.92442	val-rmse:0.93583
    [95]	train-rmse:0.92303	val-rmse:0.93458
    [96]	train-rmse:0.92167	val-rmse:0.93333
    [97]	train-rmse:0.92022	val-rmse:0.93210
    [98]	train-rmse:0.91876	val-rmse:0.93081
    [99]	train-rmse:0.91732	val-rmse:0.92943
    [100]	train-rmse:0.91587	val-rmse:0.92822
    [101]	train-rmse:0.91452	val-rmse:0.92695
    [102]	train-rmse:0.91312	val-rmse:0.92576
    [103]	train-rmse:0.91172	val-rmse:0.92450
    [104]	train-rmse:0.91039	val-rmse:0.92334
    [105]	train-rmse:0.90900	val-rmse:0.92216
    [106]	train-rmse:0.90758	val-rmse:0.92102
    [107]	train-rmse:0.90620	val-rmse:0.91982
    [108]	train-rmse:0.90483	val-rmse:0.91866
    [109]	train-rmse:0.90349	val-rmse:0.91743
    [110]	train-rmse:0.90210	val-rmse:0.91619
    [111]	train-rmse:0.90079	val-rmse:0.91500
    [112]	train-rmse:0.89943	val-rmse:0.91385
    [113]	train-rmse:0.89816	val-rmse:0.91276
    [114]	train-rmse:0.89691	val-rmse:0.91167
    [115]	train-rmse:0.89564	val-rmse:0.91063
    [116]	train-rmse:0.89438	val-rmse:0.90948
    [117]	train-rmse:0.89305	val-rmse:0.90836
    [118]	train-rmse:0.89179	val-rmse:0.90727
    [119]	train-rmse:0.89047	val-rmse:0.90606
    [120]	train-rmse:0.88915	val-rmse:0.90497
    [121]	train-rmse:0.88791	val-rmse:0.90391
    [122]	train-rmse:0.88666	val-rmse:0.90275
    [123]	train-rmse:0.88538	val-rmse:0.90158
    [124]	train-rmse:0.88409	val-rmse:0.90045
    [125]	train-rmse:0.88287	val-rmse:0.89940
    [126]	train-rmse:0.88154	val-rmse:0.89830
    [127]	train-rmse:0.88038	val-rmse:0.89720
    [128]	train-rmse:0.87913	val-rmse:0.89609
    [129]	train-rmse:0.87789	val-rmse:0.89501
    [130]	train-rmse:0.87664	val-rmse:0.89395
    [131]	train-rmse:0.87543	val-rmse:0.89289
    [132]	train-rmse:0.87419	val-rmse:0.89177
    [133]	train-rmse:0.87299	val-rmse:0.89080
    [134]	train-rmse:0.87177	val-rmse:0.88981
    [135]	train-rmse:0.87063	val-rmse:0.88880
    [136]	train-rmse:0.86944	val-rmse:0.88777
    [137]	train-rmse:0.86829	val-rmse:0.88667
    [138]	train-rmse:0.86718	val-rmse:0.88571
    [139]	train-rmse:0.86601	val-rmse:0.88461
    [140]	train-rmse:0.86483	val-rmse:0.88355
    [141]	train-rmse:0.86364	val-rmse:0.88260
    [142]	train-rmse:0.86244	val-rmse:0.88166
    [143]	train-rmse:0.86135	val-rmse:0.88068
    [144]	train-rmse:0.86024	val-rmse:0.87967
    [145]	train-rmse:0.85906	val-rmse:0.87862
    [146]	train-rmse:0.85793	val-rmse:0.87754
    [147]	train-rmse:0.85677	val-rmse:0.87655
    [148]	train-rmse:0.85570	val-rmse:0.87570
    [149]	train-rmse:0.85458	val-rmse:0.87474
    [150]	train-rmse:0.85347	val-rmse:0.87378
    [151]	train-rmse:0.85242	val-rmse:0.87292
    [152]	train-rmse:0.85134	val-rmse:0.87200
    [153]	train-rmse:0.85023	val-rmse:0.87108
    [154]	train-rmse:0.84911	val-rmse:0.87012
    [155]	train-rmse:0.84802	val-rmse:0.86925
    [156]	train-rmse:0.84700	val-rmse:0.86835
    [157]	train-rmse:0.84590	val-rmse:0.86743
    [158]	train-rmse:0.84482	val-rmse:0.86651
    [159]	train-rmse:0.84374	val-rmse:0.86555
    [160]	train-rmse:0.84267	val-rmse:0.86459
    [161]	train-rmse:0.84166	val-rmse:0.86370
    [162]	train-rmse:0.84059	val-rmse:0.86280
    [163]	train-rmse:0.83950	val-rmse:0.86185
    [164]	train-rmse:0.83843	val-rmse:0.86096
    [165]	train-rmse:0.83736	val-rmse:0.86002
    [166]	train-rmse:0.83640	val-rmse:0.85921
    [167]	train-rmse:0.83533	val-rmse:0.85831
    [168]	train-rmse:0.83429	val-rmse:0.85740
    [169]	train-rmse:0.83318	val-rmse:0.85650
    [170]	train-rmse:0.83215	val-rmse:0.85553
    [171]	train-rmse:0.83112	val-rmse:0.85465
    [172]	train-rmse:0.83010	val-rmse:0.85378
    [173]	train-rmse:0.82908	val-rmse:0.85298
    [174]	train-rmse:0.82806	val-rmse:0.85203
    [175]	train-rmse:0.82705	val-rmse:0.85117
    [176]	train-rmse:0.82604	val-rmse:0.85034
    [177]	train-rmse:0.82509	val-rmse:0.84950
    [178]	train-rmse:0.82406	val-rmse:0.84869
    [179]	train-rmse:0.82307	val-rmse:0.84776
    [180]	train-rmse:0.82202	val-rmse:0.84692
    [181]	train-rmse:0.82106	val-rmse:0.84610
    [182]	train-rmse:0.82008	val-rmse:0.84532
    [183]	train-rmse:0.81906	val-rmse:0.84441
    [184]	train-rmse:0.81812	val-rmse:0.84361
    [185]	train-rmse:0.81716	val-rmse:0.84278
    [186]	train-rmse:0.81622	val-rmse:0.84202
    [187]	train-rmse:0.81528	val-rmse:0.84120
    [188]	train-rmse:0.81436	val-rmse:0.84043
    [189]	train-rmse:0.81345	val-rmse:0.83962
    [190]	train-rmse:0.81251	val-rmse:0.83876
    [191]	train-rmse:0.81163	val-rmse:0.83807
    [192]	train-rmse:0.81071	val-rmse:0.83732
    [193]	train-rmse:0.80981	val-rmse:0.83657
    [194]	train-rmse:0.80887	val-rmse:0.83574
    [195]	train-rmse:0.80799	val-rmse:0.83500
    [196]	train-rmse:0.80707	val-rmse:0.83423
    [197]	train-rmse:0.80613	val-rmse:0.83340
    [198]	train-rmse:0.80524	val-rmse:0.83263
    [199]	train-rmse:0.80434	val-rmse:0.83187
    [200]	train-rmse:0.80341	val-rmse:0.83110
    [201]	train-rmse:0.80253	val-rmse:0.83037
    [202]	train-rmse:0.80169	val-rmse:0.82971
    [203]	train-rmse:0.80081	val-rmse:0.82901
    [204]	train-rmse:0.79989	val-rmse:0.82820
    [205]	train-rmse:0.79902	val-rmse:0.82740
    [206]	train-rmse:0.79810	val-rmse:0.82662
    [207]	train-rmse:0.79720	val-rmse:0.82591
    [208]	train-rmse:0.79630	val-rmse:0.82514
    [209]	train-rmse:0.79539	val-rmse:0.82439
    [210]	train-rmse:0.79449	val-rmse:0.82371
    [211]	train-rmse:0.79358	val-rmse:0.82297
    [212]	train-rmse:0.79266	val-rmse:0.82221
    [213]	train-rmse:0.79181	val-rmse:0.82154
    [214]	train-rmse:0.79094	val-rmse:0.82078
    [215]	train-rmse:0.79004	val-rmse:0.82007
    [216]	train-rmse:0.78916	val-rmse:0.81934
    [217]	train-rmse:0.78831	val-rmse:0.81866
    [218]	train-rmse:0.78744	val-rmse:0.81797
    [219]	train-rmse:0.78657	val-rmse:0.81720
    [220]	train-rmse:0.78569	val-rmse:0.81650
    [221]	train-rmse:0.78481	val-rmse:0.81576
    [222]	train-rmse:0.78401	val-rmse:0.81510
    [223]	train-rmse:0.78317	val-rmse:0.81442
    [224]	train-rmse:0.78234	val-rmse:0.81369
    [225]	train-rmse:0.78151	val-rmse:0.81299
    [226]	train-rmse:0.78065	val-rmse:0.81228
    [227]	train-rmse:0.77982	val-rmse:0.81157
    [228]	train-rmse:0.77894	val-rmse:0.81082
    [229]	train-rmse:0.77807	val-rmse:0.81016
    [230]	train-rmse:0.77723	val-rmse:0.80948
    [231]	train-rmse:0.77640	val-rmse:0.80883
    [232]	train-rmse:0.77556	val-rmse:0.80811
    [233]	train-rmse:0.77478	val-rmse:0.80753
    [234]	train-rmse:0.77391	val-rmse:0.80681
    [235]	train-rmse:0.77306	val-rmse:0.80612
    [236]	train-rmse:0.77220	val-rmse:0.80543
    [237]	train-rmse:0.77139	val-rmse:0.80475
    [238]	train-rmse:0.77065	val-rmse:0.80412
    [239]	train-rmse:0.76984	val-rmse:0.80342
    [240]	train-rmse:0.76899	val-rmse:0.80278
    [241]	train-rmse:0.76826	val-rmse:0.80216
    [242]	train-rmse:0.76748	val-rmse:0.80156
    [243]	train-rmse:0.76669	val-rmse:0.80094
    [244]	train-rmse:0.76589	val-rmse:0.80029
    [245]	train-rmse:0.76510	val-rmse:0.79968
    [246]	train-rmse:0.76430	val-rmse:0.79907
    [247]	train-rmse:0.76350	val-rmse:0.79839
    [248]	train-rmse:0.76277	val-rmse:0.79776
    [249]	train-rmse:0.76205	val-rmse:0.79716
    [250]	train-rmse:0.76125	val-rmse:0.79652
    [251]	train-rmse:0.76044	val-rmse:0.79594
    [252]	train-rmse:0.75961	val-rmse:0.79526
    [253]	train-rmse:0.75883	val-rmse:0.79455
    [254]	train-rmse:0.75804	val-rmse:0.79391
    [255]	train-rmse:0.75733	val-rmse:0.79328
    [256]	train-rmse:0.75655	val-rmse:0.79261
    [257]	train-rmse:0.75579	val-rmse:0.79197
    [258]	train-rmse:0.75506	val-rmse:0.79140
    [259]	train-rmse:0.75429	val-rmse:0.79079
    [260]	train-rmse:0.75358	val-rmse:0.79017
    [261]	train-rmse:0.75279	val-rmse:0.78952
    [262]	train-rmse:0.75203	val-rmse:0.78888
    [263]	train-rmse:0.75121	val-rmse:0.78819
    [264]	train-rmse:0.75048	val-rmse:0.78750
    [265]	train-rmse:0.74974	val-rmse:0.78687
    [266]	train-rmse:0.74903	val-rmse:0.78629
    [267]	train-rmse:0.74825	val-rmse:0.78565
    [268]	train-rmse:0.74748	val-rmse:0.78506
    [269]	train-rmse:0.74678	val-rmse:0.78448
    [270]	train-rmse:0.74609	val-rmse:0.78392
    [271]	train-rmse:0.74541	val-rmse:0.78334
    [272]	train-rmse:0.74472	val-rmse:0.78274
    [273]	train-rmse:0.74404	val-rmse:0.78212
    [274]	train-rmse:0.74327	val-rmse:0.78156
    [275]	train-rmse:0.74258	val-rmse:0.78095
    [276]	train-rmse:0.74189	val-rmse:0.78042
    [277]	train-rmse:0.74118	val-rmse:0.77985
    [278]	train-rmse:0.74044	val-rmse:0.77924
    [279]	train-rmse:0.73974	val-rmse:0.77862
    [280]	train-rmse:0.73906	val-rmse:0.77810
    [281]	train-rmse:0.73836	val-rmse:0.77758
    [282]	train-rmse:0.73765	val-rmse:0.77707
    [283]	train-rmse:0.73695	val-rmse:0.77647
    [284]	train-rmse:0.73624	val-rmse:0.77587
    [285]	train-rmse:0.73555	val-rmse:0.77527
    [286]	train-rmse:0.73486	val-rmse:0.77468
    [287]	train-rmse:0.73419	val-rmse:0.77419
    [288]	train-rmse:0.73352	val-rmse:0.77366
    [289]	train-rmse:0.73280	val-rmse:0.77304
    [290]	train-rmse:0.73211	val-rmse:0.77246
    [291]	train-rmse:0.73140	val-rmse:0.77185
    [292]	train-rmse:0.73073	val-rmse:0.77128
    [293]	train-rmse:0.72999	val-rmse:0.77067
    [294]	train-rmse:0.72931	val-rmse:0.77017
    [295]	train-rmse:0.72864	val-rmse:0.76965
    [296]	train-rmse:0.72794	val-rmse:0.76912
    [297]	train-rmse:0.72727	val-rmse:0.76858
    [298]	train-rmse:0.72660	val-rmse:0.76802
    [299]	train-rmse:0.72588	val-rmse:0.76744
    [300]	train-rmse:0.72522	val-rmse:0.76698
    [301]	train-rmse:0.72461	val-rmse:0.76649
    [302]	train-rmse:0.72391	val-rmse:0.76595
    [303]	train-rmse:0.72326	val-rmse:0.76543
    [304]	train-rmse:0.72259	val-rmse:0.76487
    [305]	train-rmse:0.72195	val-rmse:0.76435
    [306]	train-rmse:0.72132	val-rmse:0.76377
    [307]	train-rmse:0.72066	val-rmse:0.76324
    [308]	train-rmse:0.72004	val-rmse:0.76272
    [309]	train-rmse:0.71935	val-rmse:0.76219
    [310]	train-rmse:0.71868	val-rmse:0.76165
    [311]	train-rmse:0.71806	val-rmse:0.76111
    [312]	train-rmse:0.71741	val-rmse:0.76065
    [313]	train-rmse:0.71674	val-rmse:0.76006
    [314]	train-rmse:0.71608	val-rmse:0.75959
    [315]	train-rmse:0.71547	val-rmse:0.75906
    [316]	train-rmse:0.71478	val-rmse:0.75852
    [317]	train-rmse:0.71414	val-rmse:0.75798
    [318]	train-rmse:0.71352	val-rmse:0.75745
    [319]	train-rmse:0.71287	val-rmse:0.75697
    [320]	train-rmse:0.71226	val-rmse:0.75643
    [321]	train-rmse:0.71162	val-rmse:0.75597
    [322]	train-rmse:0.71096	val-rmse:0.75548
    [323]	train-rmse:0.71036	val-rmse:0.75501
    [324]	train-rmse:0.70976	val-rmse:0.75457
    [325]	train-rmse:0.70910	val-rmse:0.75402
    [326]	train-rmse:0.70848	val-rmse:0.75350
    [327]	train-rmse:0.70789	val-rmse:0.75308
    [328]	train-rmse:0.70721	val-rmse:0.75264
    [329]	train-rmse:0.70661	val-rmse:0.75216
    [330]	train-rmse:0.70597	val-rmse:0.75162
    [331]	train-rmse:0.70536	val-rmse:0.75114
    [332]	train-rmse:0.70476	val-rmse:0.75069
    [333]	train-rmse:0.70416	val-rmse:0.75022
    [334]	train-rmse:0.70352	val-rmse:0.74974
    [335]	train-rmse:0.70289	val-rmse:0.74921
    [336]	train-rmse:0.70229	val-rmse:0.74870
    [337]	train-rmse:0.70173	val-rmse:0.74823
    [338]	train-rmse:0.70112	val-rmse:0.74774
    [339]	train-rmse:0.70053	val-rmse:0.74730
    [340]	train-rmse:0.69992	val-rmse:0.74682
    [341]	train-rmse:0.69928	val-rmse:0.74625
    [342]	train-rmse:0.69869	val-rmse:0.74576
    [343]	train-rmse:0.69812	val-rmse:0.74532
    [344]	train-rmse:0.69751	val-rmse:0.74484
    [345]	train-rmse:0.69691	val-rmse:0.74435
    [346]	train-rmse:0.69631	val-rmse:0.74392
    [347]	train-rmse:0.69571	val-rmse:0.74345
    [348]	train-rmse:0.69513	val-rmse:0.74295
    [349]	train-rmse:0.69460	val-rmse:0.74253
    [350]	train-rmse:0.69400	val-rmse:0.74205
    [351]	train-rmse:0.69339	val-rmse:0.74158
    [352]	train-rmse:0.69281	val-rmse:0.74109
    [353]	train-rmse:0.69223	val-rmse:0.74060
    [354]	train-rmse:0.69158	val-rmse:0.74009
    [355]	train-rmse:0.69102	val-rmse:0.73968
    [356]	train-rmse:0.69046	val-rmse:0.73923
    [357]	train-rmse:0.68992	val-rmse:0.73881
    [358]	train-rmse:0.68933	val-rmse:0.73831
    [359]	train-rmse:0.68873	val-rmse:0.73781
    [360]	train-rmse:0.68812	val-rmse:0.73739
    [361]	train-rmse:0.68752	val-rmse:0.73687
    [362]	train-rmse:0.68697	val-rmse:0.73640
    [363]	train-rmse:0.68644	val-rmse:0.73595
    [364]	train-rmse:0.68587	val-rmse:0.73549
    [365]	train-rmse:0.68528	val-rmse:0.73510
    [366]	train-rmse:0.68469	val-rmse:0.73460
    [367]	train-rmse:0.68413	val-rmse:0.73410
    [368]	train-rmse:0.68362	val-rmse:0.73371
    [369]	train-rmse:0.68303	val-rmse:0.73321
    [370]	train-rmse:0.68246	val-rmse:0.73276
    [371]	train-rmse:0.68188	val-rmse:0.73234
    [372]	train-rmse:0.68133	val-rmse:0.73191
    [373]	train-rmse:0.68082	val-rmse:0.73148
    [374]	train-rmse:0.68026	val-rmse:0.73110
    [375]	train-rmse:0.67970	val-rmse:0.73072
    [376]	train-rmse:0.67908	val-rmse:0.73022
    [377]	train-rmse:0.67856	val-rmse:0.72977
    [378]	train-rmse:0.67799	val-rmse:0.72935
    [379]	train-rmse:0.67742	val-rmse:0.72894
    [380]	train-rmse:0.67687	val-rmse:0.72850
    [381]	train-rmse:0.67631	val-rmse:0.72807
    [382]	train-rmse:0.67577	val-rmse:0.72763
    [383]	train-rmse:0.67523	val-rmse:0.72717
    [384]	train-rmse:0.67475	val-rmse:0.72677
    [385]	train-rmse:0.67423	val-rmse:0.72637
    [386]	train-rmse:0.67368	val-rmse:0.72598
    [387]	train-rmse:0.67312	val-rmse:0.72557
    [388]	train-rmse:0.67265	val-rmse:0.72520
    [389]	train-rmse:0.67212	val-rmse:0.72478
    [390]	train-rmse:0.67160	val-rmse:0.72435
    [391]	train-rmse:0.67107	val-rmse:0.72397
    [392]	train-rmse:0.67054	val-rmse:0.72352
    [393]	train-rmse:0.67000	val-rmse:0.72310
    [394]	train-rmse:0.66944	val-rmse:0.72270
    [395]	train-rmse:0.66893	val-rmse:0.72232
    [396]	train-rmse:0.66842	val-rmse:0.72194
    [397]	train-rmse:0.66787	val-rmse:0.72160
    [398]	train-rmse:0.66731	val-rmse:0.72117
    [399]	train-rmse:0.66681	val-rmse:0.72068
    [400]	train-rmse:0.66629	val-rmse:0.72026
    [401]	train-rmse:0.66578	val-rmse:0.71988
    [402]	train-rmse:0.66529	val-rmse:0.71948
    [403]	train-rmse:0.66479	val-rmse:0.71908
    [404]	train-rmse:0.66428	val-rmse:0.71869
    [405]	train-rmse:0.66381	val-rmse:0.71831
    [406]	train-rmse:0.66330	val-rmse:0.71789
    [407]	train-rmse:0.66278	val-rmse:0.71750
    [408]	train-rmse:0.66230	val-rmse:0.71715
    [409]	train-rmse:0.66180	val-rmse:0.71676
    [410]	train-rmse:0.66126	val-rmse:0.71635
    [411]	train-rmse:0.66074	val-rmse:0.71596
    [412]	train-rmse:0.66029	val-rmse:0.71567
    [413]	train-rmse:0.65978	val-rmse:0.71532
    [414]	train-rmse:0.65927	val-rmse:0.71490
    [415]	train-rmse:0.65876	val-rmse:0.71450
    [416]	train-rmse:0.65825	val-rmse:0.71415
    [417]	train-rmse:0.65780	val-rmse:0.71382
    [418]	train-rmse:0.65730	val-rmse:0.71341
    [419]	train-rmse:0.65680	val-rmse:0.71302
    [420]	train-rmse:0.65630	val-rmse:0.71267
    [421]	train-rmse:0.65581	val-rmse:0.71230
    [422]	train-rmse:0.65533	val-rmse:0.71196
    [423]	train-rmse:0.65483	val-rmse:0.71158
    [424]	train-rmse:0.65438	val-rmse:0.71127
    [425]	train-rmse:0.65392	val-rmse:0.71088
    [426]	train-rmse:0.65342	val-rmse:0.71052
    [427]	train-rmse:0.65295	val-rmse:0.71014
    [428]	train-rmse:0.65251	val-rmse:0.70984
    [429]	train-rmse:0.65206	val-rmse:0.70955
    [430]	train-rmse:0.65159	val-rmse:0.70922
    [431]	train-rmse:0.65111	val-rmse:0.70883
    [432]	train-rmse:0.65062	val-rmse:0.70844
    [433]	train-rmse:0.65017	val-rmse:0.70810
    [434]	train-rmse:0.64969	val-rmse:0.70771
    [435]	train-rmse:0.64922	val-rmse:0.70731
    [436]	train-rmse:0.64873	val-rmse:0.70700
    [437]	train-rmse:0.64824	val-rmse:0.70668
    [438]	train-rmse:0.64777	val-rmse:0.70635
    [439]	train-rmse:0.64724	val-rmse:0.70601
    [440]	train-rmse:0.64678	val-rmse:0.70562
    [441]	train-rmse:0.64631	val-rmse:0.70523
    [442]	train-rmse:0.64583	val-rmse:0.70490
    [443]	train-rmse:0.64535	val-rmse:0.70455
    [444]	train-rmse:0.64487	val-rmse:0.70419
    [445]	train-rmse:0.64441	val-rmse:0.70384
    [446]	train-rmse:0.64396	val-rmse:0.70349
    [447]	train-rmse:0.64350	val-rmse:0.70313
    [448]	train-rmse:0.64300	val-rmse:0.70278
    [449]	train-rmse:0.64254	val-rmse:0.70250
    [450]	train-rmse:0.64208	val-rmse:0.70216
    [451]	train-rmse:0.64157	val-rmse:0.70178
    [452]	train-rmse:0.64109	val-rmse:0.70144
    [453]	train-rmse:0.64065	val-rmse:0.70109
    [454]	train-rmse:0.64020	val-rmse:0.70072
    [455]	train-rmse:0.63973	val-rmse:0.70036
    [456]	train-rmse:0.63926	val-rmse:0.69998
    [457]	train-rmse:0.63884	val-rmse:0.69967
    [458]	train-rmse:0.63840	val-rmse:0.69938
    [459]	train-rmse:0.63792	val-rmse:0.69903
    [460]	train-rmse:0.63743	val-rmse:0.69866
    [461]	train-rmse:0.63695	val-rmse:0.69831
    [462]	train-rmse:0.63655	val-rmse:0.69796
    [463]	train-rmse:0.63613	val-rmse:0.69760
    [464]	train-rmse:0.63567	val-rmse:0.69726
    [465]	train-rmse:0.63523	val-rmse:0.69695
    [466]	train-rmse:0.63475	val-rmse:0.69664
    [467]	train-rmse:0.63430	val-rmse:0.69625
    [468]	train-rmse:0.63389	val-rmse:0.69597
    [469]	train-rmse:0.63342	val-rmse:0.69563
    [470]	train-rmse:0.63299	val-rmse:0.69538
    [471]	train-rmse:0.63254	val-rmse:0.69505
    [472]	train-rmse:0.63207	val-rmse:0.69470
    [473]	train-rmse:0.63164	val-rmse:0.69443
    [474]	train-rmse:0.63121	val-rmse:0.69414
    [475]	train-rmse:0.63082	val-rmse:0.69383
    [476]	train-rmse:0.63035	val-rmse:0.69350
    [477]	train-rmse:0.62995	val-rmse:0.69321
    [478]	train-rmse:0.62950	val-rmse:0.69284
    [479]	train-rmse:0.62902	val-rmse:0.69252
    [480]	train-rmse:0.62865	val-rmse:0.69225
    [481]	train-rmse:0.62821	val-rmse:0.69197
    [482]	train-rmse:0.62776	val-rmse:0.69165
    [483]	train-rmse:0.62734	val-rmse:0.69125
    [484]	train-rmse:0.62691	val-rmse:0.69097
    [485]	train-rmse:0.62651	val-rmse:0.69066
    [486]	train-rmse:0.62608	val-rmse:0.69035
    [487]	train-rmse:0.62563	val-rmse:0.69003
    [488]	train-rmse:0.62519	val-rmse:0.68971
    [489]	train-rmse:0.62474	val-rmse:0.68936
    [490]	train-rmse:0.62429	val-rmse:0.68902
    [491]	train-rmse:0.62389	val-rmse:0.68866
    [492]	train-rmse:0.62348	val-rmse:0.68834
    [493]	train-rmse:0.62301	val-rmse:0.68802
    [494]	train-rmse:0.62263	val-rmse:0.68772
    [495]	train-rmse:0.62219	val-rmse:0.68738
    [496]	train-rmse:0.62175	val-rmse:0.68707
    [497]	train-rmse:0.62135	val-rmse:0.68680
    [498]	train-rmse:0.62096	val-rmse:0.68650
    [499]	train-rmse:0.62056	val-rmse:0.68624
    best best_ntree_limit 500
   error=0.837333
   xgboost success! 
     cost time: 7.657426118850708 (s)......
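
Note that make_hastie_10_2 labels are -1/+1 and no objective was set in params, so xgb.train falls back to its default squared-error regression (hence the train-rmse/val-rmse log above), and the 0.5-threshold error printed above is not a meaningful classification error. A sketch of re-running the same native API as a proper binary classification (my own adjustment, not part of the original post):

# Re-run as an actual binary classification: labels mapped from {-1, +1} to {0, 1}
y_train01 = (y_train > 0).astype(int)
y_test01 = (y_test > 0).astype(int)
xgb_train = xgb.DMatrix(X_train, label=y_train01)
xgb_test = xgb.DMatrix(X_test, label=y_test01)

params_bin = dict(params, objective='binary:logistic', eval_metric='error')
model = xgb.train(params_bin, xgb_train, num_rounds,
                  evals=[(xgb_train, 'train'), (xgb_test, 'val')],
                  early_stopping_rounds=100)
y_prob = model.predict(xgb_test)               # predicted probabilities
print('error=%f' % ((y_prob > 0.5).astype(int) != y_test01).mean())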

2. Using the scikit-learn interface

The parameter names that change are:

eta -> learning_rate

lambda -> reg_lambda

alpha -> reg_alpha

from sklearn.model_selection import train_test_split
from sklearn import metrics
from xgboost import XGBClassifier

clf = XGBClassifier(
    #     silent=0,  # 1 silences the run-time messages, i.e. whether to print messages while training
    #nthread=4,  # number of CPU threads, defaults to the maximum available
    learning_rate=0.3,  # works like a learning rate
    min_child_weight=1,
    # Defaults to 1: the minimum sum of instance weight (h) required in each leaf.
    # For 0-1 classification with imbalanced classes, if h is around 0.01,
    # min_child_weight = 1 means a leaf must contain at least about 100 samples.
    # This parameter strongly affects the result: the smaller it is, the easier it is to overfit.
    max_depth=6,  # tree depth; larger values overfit more easily
    gamma=0,  # minimum loss reduction required to split a leaf further; larger is more conservative (typically 0.1 or 0.2)
    subsample=1,  # row subsampling ratio of the training instances
    max_delta_step=0,  # maximum delta step allowed for each tree's weight estimate
    colsample_bytree=1,  # column subsampling when building each tree
    reg_lambda=1,  # L2 regularization on the leaf weights; larger values make the model less prone to overfitting
    #reg_alpha=0,  # L1 regularization term
    #scale_pos_weight=1,  # a value > 0 helps convergence when the classes are imbalanced; balances positive/negative weights
    #objective='multi:softmax',  # for multi-class problems: the learning task and objective
    #num_class=10,  # number of classes, used together with multi:softmax
    n_estimators=100,  # number of trees
    seed=1000  # random seed
    #eval_metric='auc'
)
clf.fit(X_train, y_train)

y_true, y_pred = y_test, clf.predict(X_test)
print("Accuracy : %.4g" % metrics.accuracy_score(y_true, y_pred))
[16:03:06] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.5.1/src/learner.cc:1115: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
Accuracy : 0.9273

Using LightGBM

1. The native API

import lightgbm as lgb
from sklearn.metrics import mean_squared_error
# Load your own data here if needed
# print('Load data...')
# df_train = pd.read_csv('../regression/regression.train', header=None, sep='\t')
# df_test = pd.read_csv('../regression/regression.test', header=None, sep='\t')
#
# y_train = df_train[0].values
# y_test = df_test[0].values
# X_train = df_train.drop(0, axis=1).values
# X_test = df_test.drop(0, axis=1).values

# create the LightGBM Dataset objects
lgb_train = lgb.Dataset(X_train, y_train)  # saving the data in LightGBM's binary format makes loading faster
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)  # create the validation data

# write the parameters as a dict
params = {
    'task': 'train',
    'boosting_type': 'gbdt',  # type of boosting
    'objective': 'regression',  # objective function
    'metric': {'l2', 'auc'},  # evaluation metrics
    'num_leaves': 31,  # number of leaves per tree
    'learning_rate': 0.05,  # learning rate
    'feature_fraction': 0.9,  # fraction of features used to build each tree
    'bagging_fraction': 0.8,  # fraction of samples used to build each tree
    'bagging_freq': 5,  # k means bagging is performed every k iterations
    'verbose': 1  # <0: fatal only, =0: errors (warnings), >0: info
}

print('Start training...')
# train (cv and train)
gbm = lgb.train(params, lgb_train, num_boost_round=500, valid_sets=lgb_eval, early_stopping_rounds=5)  # training takes the parameter dict and the dataset

print('Save model...')
gbm.save_model('model.txt')  # save the trained model to a file

print('Start predicting...')
# predict on the test set
y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)  # if early stopping was enabled, best_iteration gives the prediction from the best iteration
# evaluate the model
print('error=%f' %
      (sum(1
           for i in range(len(y_pred)) if int(y_pred[i] > 0.5) != y_test[i]) /
       float(len(y_pred))))
Start training...
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000448 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2550
[LightGBM] [Info] Number of data points in the train set: 9000, number of used features: 10
[LightGBM] [Info] Start training from score 0.012000
[1]	valid_0's auc: 0.814399	valid_0's l2: 0.965563
Training until validation scores don't improve for 5 rounds
[2]	valid_0's auc: 0.84729	valid_0's l2: 0.934647
[3]	valid_0's auc: 0.872805	valid_0's l2: 0.905265
[4]	valid_0's auc: 0.884117	valid_0's l2: 0.877875
[5]	valid_0's auc: 0.895115	valid_0's l2: 0.852189
[6]	valid_0's auc: 0.905545	valid_0's l2: 0.826298
[7]	valid_0's auc: 0.909113	valid_0's l2: 0.803776
[8]	valid_0's auc: 0.913303	valid_0's l2: 0.781627
[9]	valid_0's auc: 0.917894	valid_0's l2: 0.760624
[10]	valid_0's auc: 0.919443	valid_0's l2: 0.742882
[11]	valid_0's auc: 0.921543	valid_0's l2: 0.723811
[12]	valid_0's auc: 0.923021	valid_0's l2: 0.707255
[13]	valid_0's auc: 0.9257	valid_0's l2: 0.69078
[14]	valid_0's auc: 0.928892	valid_0's l2: 0.675987
[15]	valid_0's auc: 0.930132	valid_0's l2: 0.661313
[16]	valid_0's auc: 0.931587	valid_0's l2: 0.646023
[17]	valid_0's auc: 0.932941	valid_0's l2: 0.634004
[18]	valid_0's auc: 0.934165	valid_0's l2: 0.622429
[19]	valid_0's auc: 0.935885	valid_0's l2: 0.610132
[20]	valid_0's auc: 0.936883	valid_0's l2: 0.599122
[21]	valid_0's auc: 0.93814	valid_0's l2: 0.589571
[22]	valid_0's auc: 0.940452	valid_0's l2: 0.580309
[23]	valid_0's auc: 0.941039	valid_0's l2: 0.571361
[24]	valid_0's auc: 0.943049	valid_0's l2: 0.562062
[25]	valid_0's auc: 0.9446	valid_0's l2: 0.551967
[26]	valid_0's auc: 0.946498	valid_0's l2: 0.543442
[27]	valid_0's auc: 0.94763	valid_0's l2: 0.535659
[28]	valid_0's auc: 0.94871	valid_0's l2: 0.527913
[29]	valid_0's auc: 0.949753	valid_0's l2: 0.521228
[30]	valid_0's auc: 0.950816	valid_0's l2: 0.513909
[31]	valid_0's auc: 0.95184	valid_0's l2: 0.507784
[32]	valid_0's auc: 0.953109	valid_0's l2: 0.501336
[33]	valid_0's auc: 0.954351	valid_0's l2: 0.494439
[34]	valid_0's auc: 0.955716	valid_0's l2: 0.488722
[35]	valid_0's auc: 0.956098	valid_0's l2: 0.483373
[36]	valid_0's auc: 0.956495	valid_0's l2: 0.477602
[37]	valid_0's auc: 0.956717	valid_0's l2: 0.473033
[38]	valid_0's auc: 0.957213	valid_0's l2: 0.468013
[39]	valid_0's auc: 0.957812	valid_0's l2: 0.463634
[40]	valid_0's auc: 0.957862	valid_0's l2: 0.459433
[41]	valid_0's auc: 0.958249	valid_0's l2: 0.455687
[42]	valid_0's auc: 0.958799	valid_0's l2: 0.450696
[43]	valid_0's auc: 0.959311	valid_0's l2: 0.446838
[44]	valid_0's auc: 0.959835	valid_0's l2: 0.44233
[45]	valid_0's auc: 0.960234	valid_0's l2: 0.438117
[46]	valid_0's auc: 0.960826	valid_0's l2: 0.43469
[47]	valid_0's auc: 0.961647	valid_0's l2: 0.430488
[48]	valid_0's auc: 0.962359	valid_0's l2: 0.427449
[49]	valid_0's auc: 0.962506	valid_0's l2: 0.424433
[50]	valid_0's auc: 0.962897	valid_0's l2: 0.420571
[51]	valid_0's auc: 0.963657	valid_0's l2: 0.417288
[52]	valid_0's auc: 0.964224	valid_0's l2: 0.414743
[53]	valid_0's auc: 0.964903	valid_0's l2: 0.412255
[54]	valid_0's auc: 0.965508	valid_0's l2: 0.40907
[55]	valid_0's auc: 0.966194	valid_0's l2: 0.406477
[56]	valid_0's auc: 0.966759	valid_0's l2: 0.403771
[57]	valid_0's auc: 0.966901	valid_0's l2: 0.400885
[58]	valid_0's auc: 0.967291	valid_0's l2: 0.398386
[59]	valid_0's auc: 0.967779	valid_0's l2: 0.395949
[60]	valid_0's auc: 0.968119	valid_0's l2: 0.393905
[61]	valid_0's auc: 0.968517	valid_0's l2: 0.391743
[62]	valid_0's auc: 0.968891	valid_0's l2: 0.389717
[63]	valid_0's auc: 0.969304	valid_0's l2: 0.387769
[64]	valid_0's auc: 0.969598	valid_0's l2: 0.385498
[65]	valid_0's auc: 0.969953	valid_0's l2: 0.383139
[66]	valid_0's auc: 0.970443	valid_0's l2: 0.38094
[67]	valid_0's auc: 0.970888	valid_0's l2: 0.378793
[68]	valid_0's auc: 0.971189	valid_0's l2: 0.376754
[69]	valid_0's auc: 0.971377	valid_0's l2: 0.37495
[70]	valid_0's auc: 0.971692	valid_0's l2: 0.37324
[71]	valid_0's auc: 0.971954	valid_0's l2: 0.371629
[72]	valid_0's auc: 0.972278	valid_0's l2: 0.370046
[73]	valid_0's auc: 0.972622	valid_0's l2: 0.368577
[74]	valid_0's auc: 0.972986	valid_0's l2: 0.366746
[75]	valid_0's auc: 0.973308	valid_0's l2: 0.365326
[76]	valid_0's auc: 0.973449	valid_0's l2: 0.364078
[77]	valid_0's auc: 0.973681	valid_0's l2: 0.362431
[78]	valid_0's auc: 0.973941	valid_0's l2: 0.361071
[79]	valid_0's auc: 0.97428	valid_0's l2: 0.359825
[80]	valid_0's auc: 0.974554	valid_0's l2: 0.358506
[81]	valid_0's auc: 0.974731	valid_0's l2: 0.357538
[82]	valid_0's auc: 0.975094	valid_0's l2: 0.355998
[83]	valid_0's auc: 0.97531	valid_0's l2: 0.354819
[84]	valid_0's auc: 0.975363	valid_0's l2: 0.353645
[85]	valid_0's auc: 0.9756	valid_0's l2: 0.352575
[86]	valid_0's auc: 0.975688	valid_0's l2: 0.351995
[87]	valid_0's auc: 0.975909	valid_0's l2: 0.350867
[88]	valid_0's auc: 0.97603	valid_0's l2: 0.350146
[89]	valid_0's auc: 0.976171	valid_0's l2: 0.34933
[90]	valid_0's auc: 0.976264	valid_0's l2: 0.348303
[91]	valid_0's auc: 0.976501	valid_0's l2: 0.347415
[92]	valid_0's auc: 0.976681	valid_0's l2: 0.346621
[93]	valid_0's auc: 0.976794	valid_0's l2: 0.345989
[94]	valid_0's auc: 0.976892	valid_0's l2: 0.345124
[95]	valid_0's auc: 0.977077	valid_0's l2: 0.344425
[96]	valid_0's auc: 0.9771	valid_0's l2: 0.343969
[97]	valid_0's auc: 0.977176	valid_0's l2: 0.343221
[98]	valid_0's auc: 0.977239	valid_0's l2: 0.342578
[99]	valid_0's auc: 0.977433	valid_0's l2: 0.341817
[100]	valid_0's auc: 0.977516	valid_0's l2: 0.341229
[101]	valid_0's auc: 0.977559	valid_0's l2: 0.340357
[102]	valid_0's auc: 0.977707	valid_0's l2: 0.339484
[103]	valid_0's auc: 0.977742	valid_0's l2: 0.339004
[104]	valid_0's auc: 0.977806	valid_0's l2: 0.338581
[105]	valid_0's auc: 0.977983	valid_0's l2: 0.338095
[106]	valid_0's auc: 0.978113	valid_0's l2: 0.337505
[107]	valid_0's auc: 0.978251	valid_0's l2: 0.336939
[108]	valid_0's auc: 0.978479	valid_0's l2: 0.336443
[109]	valid_0's auc: 0.978611	valid_0's l2: 0.336062
[110]	valid_0's auc: 0.978694	valid_0's l2: 0.335636
[111]	valid_0's auc: 0.97885	valid_0's l2: 0.335083
[112]	valid_0's auc: 0.979037	valid_0's l2: 0.334435
[113]	valid_0's auc: 0.979209	valid_0's l2: 0.333876
[114]	valid_0's auc: 0.97939	valid_0's l2: 0.333341
[115]	valid_0's auc: 0.979513	valid_0's l2: 0.332968
[116]	valid_0's auc: 0.979615	valid_0's l2: 0.332583
[117]	valid_0's auc: 0.979741	valid_0's l2: 0.332138
[118]	valid_0's auc: 0.979883	valid_0's l2: 0.331546
[119]	valid_0's auc: 0.979971	valid_0's l2: 0.331399
[120]	valid_0's auc: 0.980002	valid_0's l2: 0.331036
[121]	valid_0's auc: 0.980098	valid_0's l2: 0.330674
[122]	valid_0's auc: 0.980204	valid_0's l2: 0.330228
[123]	valid_0's auc: 0.980204	valid_0's l2: 0.330131
[124]	valid_0's auc: 0.980271	valid_0's l2: 0.329895
[125]	valid_0's auc: 0.980441	valid_0's l2: 0.329194
[126]	valid_0's auc: 0.980456	valid_0's l2: 0.328811
[127]	valid_0's auc: 0.980472	valid_0's l2: 0.328493
[128]	valid_0's auc: 0.980519	valid_0's l2: 0.328459
[129]	valid_0's auc: 0.980578	valid_0's l2: 0.32832
[130]	valid_0's auc: 0.980635	valid_0's l2: 0.328198
[131]	valid_0's auc: 0.980771	valid_0's l2: 0.327791
[132]	valid_0's auc: 0.980872	valid_0's l2: 0.327462
[133]	valid_0's auc: 0.980884	valid_0's l2: 0.327269
[134]	valid_0's auc: 0.980951	valid_0's l2: 0.327037
[135]	valid_0's auc: 0.980989	valid_0's l2: 0.326838
[136]	valid_0's auc: 0.981031	valid_0's l2: 0.32665
[137]	valid_0's auc: 0.981025	valid_0's l2: 0.326543
[138]	valid_0's auc: 0.981099	valid_0's l2: 0.326342
[139]	valid_0's auc: 0.981079	valid_0's l2: 0.326256
[140]	valid_0's auc: 0.981083	valid_0's l2: 0.326143
[141]	valid_0's auc: 0.981149	valid_0's l2: 0.32578
[142]	valid_0's auc: 0.981222	valid_0's l2: 0.325428
[143]	valid_0's auc: 0.98131	valid_0's l2: 0.325142
[144]	valid_0's auc: 0.981348	valid_0's l2: 0.324963
[145]	valid_0's auc: 0.981395	valid_0's l2: 0.324856
[146]	valid_0's auc: 0.981461	valid_0's l2: 0.324682
[147]	valid_0's auc: 0.981544	valid_0's l2: 0.324538
[148]	valid_0's auc: 0.981605	valid_0's l2: 0.324309
[149]	valid_0's auc: 0.981641	valid_0's l2: 0.324249
[150]	valid_0's auc: 0.981707	valid_0's l2: 0.324083
[151]	valid_0's auc: 0.981747	valid_0's l2: 0.323942
[152]	valid_0's auc: 0.981823	valid_0's l2: 0.323728
[153]	valid_0's auc: 0.981888	valid_0's l2: 0.323549
[154]	valid_0's auc: 0.981936	valid_0's l2: 0.323444
[155]	valid_0's auc: 0.982036	valid_0's l2: 0.323267
[156]	valid_0's auc: 0.982064	valid_0's l2: 0.323081
[157]	valid_0's auc: 0.982064	valid_0's l2: 0.323087
[158]	valid_0's auc: 0.982105	valid_0's l2: 0.322942
[159]	valid_0's auc: 0.982106	valid_0's l2: 0.322876
[160]	valid_0's auc: 0.982114	valid_0's l2: 0.322758
[161]	valid_0's auc: 0.982147	valid_0's l2: 0.322571
[162]	valid_0's auc: 0.982193	valid_0's l2: 0.322484
[163]	valid_0's auc: 0.982217	valid_0's l2: 0.322336
[164]	valid_0's auc: 0.982271	valid_0's l2: 0.322134
[165]	valid_0's auc: 0.982268	valid_0's l2: 0.322089
[166]	valid_0's auc: 0.982285	valid_0's l2: 0.322113
[167]	valid_0's auc: 0.982278	valid_0's l2: 0.322163
[168]	valid_0's auc: 0.982341	valid_0's l2: 0.322001
[169]	valid_0's auc: 0.982368	valid_0's l2: 0.322046
[170]	valid_0's auc: 0.982379	valid_0's l2: 0.321915
[171]	valid_0's auc: 0.982344	valid_0's l2: 0.32184
[172]	valid_0's auc: 0.982431	valid_0's l2: 0.32163
[173]	valid_0's auc: 0.982449	valid_0's l2: 0.321532
[174]	valid_0's auc: 0.982469	valid_0's l2: 0.321488
[175]	valid_0's auc: 0.982556	valid_0's l2: 0.321291
[176]	valid_0's auc: 0.982616	valid_0's l2: 0.320934
[177]	valid_0's auc: 0.982631	valid_0's l2: 0.320862
[178]	valid_0's auc: 0.982641	valid_0's l2: 0.32074
[179]	valid_0's auc: 0.982714	valid_0's l2: 0.320641
[180]	valid_0's auc: 0.982727	valid_0's l2: 0.320571
[181]	valid_0's auc: 0.982733	valid_0's l2: 0.320354
[182]	valid_0's auc: 0.982752	valid_0's l2: 0.32015
[183]	valid_0's auc: 0.982776	valid_0's l2: 0.320041
[184]	valid_0's auc: 0.982755	valid_0's l2: 0.320013
[185]	valid_0's auc: 0.982758	valid_0's l2: 0.319983
[186]	valid_0's auc: 0.98273	valid_0's l2: 0.320012
[187]	valid_0's auc: 0.98274	valid_0's l2: 0.319916
[188]	valid_0's auc: 0.982794	valid_0's l2: 0.319746
[189]	valid_0's auc: 0.982785	valid_0's l2: 0.31972
[190]	valid_0's auc: 0.982773	valid_0's l2: 0.319747
[191]	valid_0's auc: 0.982783	valid_0's l2: 0.319851
[192]	valid_0's auc: 0.982751	valid_0's l2: 0.319971
[193]	valid_0's auc: 0.982685	valid_0's l2: 0.320043
Early stopping, best iteration is:
[188]	valid_0's auc: 0.982794	valid_0's l2: 0.319746
Save model...
Start predicting...
error=0.664000
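
As with the native XGBoost example, the labels here are -1/+1 while the objective is 'regression', so the thresholded error above is only a rough proxy. A sketch of the same run as a binary task (my own adjustment; newer LightGBM versions pass early stopping via callbacks instead):

# Re-run as a binary task: LightGBM's 'binary' objective expects labels in {0, 1}
y_train01 = (y_train > 0).astype(int)
y_test01 = (y_test > 0).astype(int)
lgb_train = lgb.Dataset(X_train, y_train01)
lgb_eval = lgb.Dataset(X_test, y_test01, reference=lgb_train)

params_bin = dict(params, objective='binary', metric=['auc', 'binary_error'])
gbm = lgb.train(params_bin, lgb_train, num_boost_round=500,
                valid_sets=lgb_eval,
                early_stopping_rounds=5)  # newer versions: callbacks=[lgb.early_stopping(5)]
y_prob = gbm.predict(X_test, num_iteration=gbm.best_iteration)
print('error=%f' % ((y_prob > 0.5).astype(int) != y_test01).mean())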

2. The scikit-learn interface

from sklearn import metrics
from lightgbm import LGBMClassifier

clf = LGBMClassifier(
    boosting_type='gbdt',  # type of boosting: gbdt, dart, goss, rf
    num_leaves=31,  # maximum number of leaves per tree; compare with 2^max_depth in xgboost
    max_depth=-1,  # maximum tree depth (-1 means no limit)
    learning_rate=0.1,  # learning rate
    n_estimators=100,  # number of boosted trees, i.e. the number of training rounds
    subsample_for_bin=200000,
    objective=None,
    class_weight=None,
    min_split_gain=0.0,  # minimum gain required to make a split
    min_child_weight=0.001,  # minimum sum of instance weight in a child node
    min_child_samples=20,
    subsample=1.0,  # row subsampling ratio of the training data
    subsample_freq=0,  # frequency of subsampling
    colsample_bytree=1.0,  # column subsampling ratio when building each tree
    reg_alpha=0.0,  # L1 regularization coefficient
    reg_lambda=0.0,  # L2 regularization coefficient
    random_state=None,
    n_jobs=-1,
    silent=True,
)
clf.fit(X_train, y_train, eval_metric='auc')
# to monitor a validation set pass eval_set=...; verbose=False suppresses the per-round output
# clf.fit(X_train, y_train)

y_true, y_pred = y_test, clf.predict(X_test)
print("Accuracy : %.4g" % metrics.accuracy_score(y_true, y_pred))
Accuracy : 0.9347

eval_metric is the evaluation metric. It does not change the objective being optimized; it is used to evaluate the model (for example on a validation set) during and after training. For example, when logloss is used as the objective, it is often paired with evaluation metrics such as auc or accuracy.

Evaluation metrics. Usage: eval_metric = 'error'

Regression tasks (default: rmse)
	rmse -- root mean squared error
	mae -- mean absolute error
	
Classification tasks (default: error)
	auc -- area under the ROC curve
	error -- error rate (binary classification)
	merror -- error rate (multi-class)
	logloss -- negative log-likelihood (binary classification)
	mlogloss -- negative log-likelihood (multi-class)
	map -- mean average precision
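
A short usage sketch (my own, not from the quoted article), written against the xgboost 1.5-style scikit-learn API used in this post; in newer xgboost versions eval_metric is passed to the constructor instead of fit:

from xgboost import XGBClassifier

clf = XGBClassifier(n_estimators=100, learning_rate=0.3)
# track AUC and error on a held-out set while training
clf.fit(X_train, y_train,
        eval_set=[(X_test, y_test)],
        eval_metric=['auc', 'error'],
        verbose=False)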

Copyright note: the eval_metric section above is from an original article by CSDN blogger 「缘 源 园」; original link: https://blog.csdn.net/weixin_48135624/article/details/115173785
