XGBoost and LightGBM

Author: chenguiyuan1234

Article link: https://blog.csdn.net/chenguiyuan1234/article/details/87913290

This article draws mainly on the following pages:
https://cloud.tencent.com/developer/article/1389899
https://cloud.tencent.com/developer/article/1052678
https://cloud.tencent.com/developer/article/1052664

XGBoost
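1. Key parameters
booster [default=gbtree]: which booster to use, gbtree (tree-based) or gblinear (linear model).
nthread: number of threads used for training.
eta [default=0.3]: shrinkage step size (learning rate) applied to each boosting step; smaller values help prevent overfitting.
max_depth [default=6]: maximum depth of a tree.
min_child_weight [default=1]: minimum sum of instance weights required in a child node.
subsample [default=1]: fraction of the full training set sampled to train each tree.
lambda [default=1 for gbtree, 0 for gblinear]: L2 regularization coefficient.
alpha [default=0]: L1 regularization coefficient.
objective [default=reg:linear, named reg:squarederror in newer releases]: defines the learning task and the corresponding learning objective.
The available objectives include:
"reg:linear": linear regression.
"reg:logistic": logistic regression.
"binary:logistic": logistic regression for binary classification; the output is a probability.
"binary:logitraw": logistic regression for binary classification; the output is the raw score wTx before the logistic transformation.
"count:poisson": Poisson regression for count data; the output is the mean of the Poisson distribution. In Poisson regression, max_delta_step defaults to 0.7.
"multi:softmax": multiclass classification with the softmax objective; the parameter num_class (number of classes) must also be set.
"multi:softprob": same as softmax, but the output holds ndata * nclass values, which can be reshaped into a matrix with ndata rows and nclass columns. Each row gives the probability that the sample belongs to each class.
"rank:pairwise": ranking task, trained by minimizing the pairwise loss.
eval_metric [default according to objective]: evaluation metric applied to the validation data.
"rmse": root mean square error.
"logloss": negative log-likelihood.
"error": binary classification error rate.
"merror": multiclass classification error rate.
"mlogloss": multiclass logloss.
"auc": area under the curve, for ranking evaluation.
"ndcg": normalized discounted cumulative gain.
"map": mean average precision.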
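
As a rough illustration of how the parameters above fit together, the sketch below builds a parameter dict for a multiclass task and passes it to xgb.train. The helper names make_params and train_demo, the random data, and the specific values are assumptions made for this example, not part of the original article. The reshape call is harmless if the installed version already returns a two-dimensional array.

```python
def make_params(num_class, eta=0.3, max_depth=6):
    """Build an XGBoost parameter dict for multiclass classification."""
    return {
        "booster": "gbtree",
        "eta": eta,                     # shrinkage step size
        "max_depth": max_depth,         # maximum tree depth
        "min_child_weight": 1,
        "subsample": 1.0,
        "lambda": 1.0,                  # L2 regularization
        "alpha": 0.0,                   # L1 regularization
        "objective": "multi:softprob",  # per-class probabilities
        "num_class": num_class,         # required for multi:* objectives
        "eval_metric": "mlogloss",
    }


def train_demo(num_class=3, num_round=10, seed=0):
    """Train on random data (illustration only) and return class probabilities."""
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(seed)
    X = rng.normal(size=(150, 4))
    y = rng.integers(0, num_class, size=150)
    dtrain = xgb.DMatrix(X[:100], label=y[:100])
    dvalid = xgb.DMatrix(X[100:], label=y[100:])

    booster = xgb.train(
        make_params(num_class),
        dtrain,
        num_boost_round=num_round,
        # eval_metric is reported on each of these sets every round.
        evals=[(dtrain, "train"), (dvalid, "valid")],
    )
    # One row per sample, one column per class.
    return booster.predict(dvalid).reshape(-1, num_class)
```

Calling train_demo() should return an array with one row per validation sample, with each row summing to roughly 1.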

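2. Usage
a. Loading data
XGBoost can load data from:
a text file in libsvm format;
a two-dimensional NumPy array;
an XGBoost binary buffer file.
The loaded data is stored in a DMatrix object.
import xgboost as xgb
train = xgb.DMatrix('train.txt')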
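
The three input sources can be sketched as follows. The helpers write_libsvm and load_three_ways, the file names, and the toy data are hypothetical and exist only for illustration. The "?format=libsvm" suffix is what recent XGBoost releases expect for text input; older releases may accept the bare path. Feature indices are written 0-based here, which is an assumption of this sketch.

```python
def write_libsvm(path, rows, labels):
    """Write dense rows as libsvm text: '<label> <index>:<value> ...'."""
    with open(path, "w") as f:
        for label, row in zip(labels, rows):
            # Zero entries are omitted because libsvm is a sparse format.
            feats = " ".join(f"{i}:{v}" for i, v in enumerate(row) if v != 0)
            f.write(f"{label} {feats}".rstrip() + "\n")


def load_three_ways(txt_path="train.txt", bin_path="train.buffer"):
    """Load a DMatrix from libsvm text, a NumPy array, and a binary buffer."""
    import numpy as np
    import xgboost as xgb

    X = np.array([[1.0, 0.0, 2.5], [0.0, 3.0, 0.0]])
    y = np.array([1, 0])

    # 1) libsvm-format text file
    write_libsvm(txt_path, X.tolist(), y.tolist())
    dtext = xgb.DMatrix(f"{txt_path}?format=libsvm")

    # 2) two-dimensional NumPy array, with labels passed separately
    darr = xgb.DMatrix(X, label=y)

    # 3) XGBoost binary buffer, written from an existing DMatrix
    darr.save_binary(bin_path)
    dbin = xgb.DMatrix(bin_path)

    return dtext, darr, dbin
```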
