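I. Spark XGBoost model
1 XGBoost default parameters:
XGBoost parameter reference: https://blog.csdn.net/yyy430/article/details/85179638 This link gives a fairly complete summary, but it describes the Python version of XGBoost; the default parameters of the Spark version differ from it in places.
1.1 The default parameters are as follows: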
/*
Default parameters
eta -> 0.3
, gamma -> 0
, maxDepth -> 6
, minChildWeight -> 1
, maxDeltaStep -> 0
, growPolicy -> "depthwise"
, maxBins -> 16 // the Python default is 256
, subsample -> 1
, colsampleBytree -> 1
, colsampleBylevel -> 1
, lambda -> 1
, alpha -> 0
, treeMethod -> "auto"
, sketchEps -> 0.03
, scalePosWeight -> 1.0
, sampleType -> "uniform"
, normalizeType -> "tree"
, rateDrop -> 0.0
, skipDrop -> 0.0
, lambdaBias -> 0
, treeLimit -> 0
*/
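Because the Spark defaults differ from the Python ones (maxBins is the example noted above), it can help to keep the defaults in one place and override only what a job needs. The following is a minimal sketch in plain Scala that does not call the XGBoost library at all; the object `XgbDefaults`, the method `withOverrides`, and the value `example` are hypothetical names used only for illustration, and only a subset of the defaults listed above is copied in.

```scala
// Hypothetical helper, not part of xgboost4j-spark.
object XgbDefaults {
  // A subset of the Spark-side defaults listed above.
  val sparkDefaults: Map[String, Any] = Map(
    "eta" -> 0.3,
    "gamma" -> 0.0,
    "maxDepth" -> 6,
    "minChildWeight" -> 1.0,
    "maxBins" -> 16, // the Python default is 256
    "subsample" -> 1.0,
    "treeMethod" -> "auto"
  )

  // With ++ the right-hand map wins on key collisions,
  // so every override replaces the matching default.
  def withOverrides(overrides: Map[String, Any]): Map[String, Any] =
    sparkDefaults ++ overrides

  // Example: match the Python maxBins and lower the learning rate.
  val example: Map[String, Any] =
    withOverrides(Map("maxBins" -> 256, "eta" -> 0.1))
}
```

Keys that are not overridden, such as maxDepth, keep the values from the list above, which makes any deviation from the Spark defaults explicit in the calling code.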
2 Parameters of the Spark XGBoost model
val paraMap = List(
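// parameter explanations: https://blog.csdn.net/Leo_Sheng/article/details/80852328
"eta" -> 0.3f // learning rate
,"gamma" -> 0.1 // minimum loss reduction required to split further (controls pruning); larger is more conservative, typically around 0.1 or 0.2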
,"max_depth" -> max_depth
,"num_round"->max_iter
,"objective" -> "binary:logistic"