XGBoost + Focal Loss
The core idea of Focal Loss is to reshape the loss function so that hard, misclassified examples contribute more than easy ones. For background, see "Focal Loss理解 - 三年一梦 - 博客园".
To use focal loss with xgboost, implement it as a custom objective (loss) function.
First, a handy Python package for symbolic differentiation: sympy. With it, the first and second derivatives can be worked out with little effort.
from sympy import *
import numpy as np
# y is the true label
# p is the predicted probability (after the sigmoid transform)
# a is the focal loss focusing parameter
# b is the parameter that handles class imbalance
y,p,a,b=symbols('y p a b')
loss=b*(-y*log(p)*(1-p)**a)-(1-y)*log(1-p)*p**a # loss combining the focal loss term and the imbalance weight
grad=diff(loss,p)*p*(1-p) # first derivative (w.r.t. the raw margin, via the sigmoid chain rule)
print('grad:')
print(grad)
hess=diff(grad,p)*p*(1-p) # second derivative
print('hess:')
print(hess)
### Output ###
grad:
p*(1 - p)*(a*b*y*(1 - p)**a*log(p)/(1 - p) - a*p**a*(1 - y)*log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p))
hess:
p*(1 - p)*(p*(1 - p)*(-a**2*b*y*(1 - p)**a*log(p)/(1 - p)**2 - a**2*p**a*(1 - y)*log(1 - p)/p**2 + a*b*y*(1 - p)**a*log(p)/(1 - p)**2 + 2*a*b*y*(1 - p)**a/(p*(1 - p)) + 2*a*p**a*(1 - y)/(p*(1 - p)) + a*p**a*(1 - y)*log(1 - p)/p**2 + b*y*(1 - p)**a/p**2 + p**a*(1 - y)/(1 - p)**2) - p*(a*b*y*(1 - p)**a*log(p)/(1 - p) - a*p**a*(1 - y)*log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p)) + (1 - p)*(a*b*y*(1 - p)**a*log(p)/(1 - p) - a*p**a*(1 - y)*log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p)))
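As a quick sanity check on the derivation, substituting a=0 and b=1 should collapse these expressions to the standard logistic-loss derivatives, grad = p - y and hess = p(1 - p). A small sketch using the same sympy setup:

```python
from sympy import symbols, log, diff, simplify

y, p, a, b = symbols('y p a b')
loss = b*(-y*log(p)*(1 - p)**a) - (1 - y)*log(1 - p)*p**a
grad = diff(loss, p)*p*(1 - p)
hess = diff(grad, p)*p*(1 - p)

# with a=0, b=1 the focal loss degenerates to plain logistic loss
print(simplify(grad.subs({a: 0, b: 1})))  # reduces to p - y
print(simplify(hess.subs({a: 0, b: 1})))  # reduces to p*(1 - p)
```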
Set up a custom objective in xgboost and paste in the generated first and second derivatives.
With b=1 and a=0 this reduces to the original logistic loss, so comparing against xgboost's built-in objective tells us whether the derivation is correct. The AUC values printed below confirm that it is.
# focal loss
b=1 # weight of a positive (label 1) sample's loss relative to a negative's
a=0 # focal loss focusing parameter, typically 2
def logistic_obj(y_hat, dtrain):
    y = dtrain.get_label()
    p = y_hat # note: some xgboost versions pass raw margins here, in which case p should be sigmoid(y_hat)
    grad = p*(1 - p)*(a*b*y*(1 - p)**a*np.log(p)/(1 - p) - a*p**a*(1 - y)*np.log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p))
    hess = p*(1 - p)*(p*(1 - p)*(-a**2*b*y*(1 - p)**a*np.log(p)/(1 - p)**2 - a**2*p**a*(1 - y)*np.log(1 - p)/p**2 + a*b*y*(1 - p)**a*np.log(p)/(1 - p)**2 + 2*a*b*y*(1 - p)**a/(p*(1 - p)) + 2*a*p**a*(1 - y)/(p*(1 - p)) + a*p**a*(1 - y)*np.log(1 - p)/p**2 + b*y*(1 - p)**a/p**2 + p**a*(1 - y)/(1 - p)**2) - p*(a*b*y*(1 - p)**a*np.log(p)/(1 - p) - a*p**a*(1 - y)*np.log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p)) + (1 - p)*(a*b*y*(1 - p)**a*np.log(p)/(1 - p) - a*p**a*(1 - y)*np.log(1 - p)/p - b*y*(1 - p)**a/p + p**a*(1 - y)/(1 - p)))
    return grad, hess
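Before trusting the pasted expressions, it is worth sanity-checking the first derivative against a finite difference of the loss. A standalone sketch (the values a=2.0, b=1.5 and the sample points are arbitrary choices for the check; the second derivative can be checked the same way against finite differences of the gradient):

```python
import numpy as np

a, b = 2.0, 1.5  # arbitrary test values for the focusing and imbalance parameters

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def focal_loss(y, p):
    return b * (-y * np.log(p) * (1 - p)**a) - (1 - y) * np.log(1 - p) * p**a

def focal_grad(y, p):
    # the sympy-generated first derivative w.r.t. the raw margin
    return p * (1 - p) * (a*b*y*(1 - p)**a*np.log(p)/(1 - p)
                          - a*p**a*(1 - y)*np.log(1 - p)/p
                          - b*y*(1 - p)**a/p
                          + p**a*(1 - y)/(1 - p))

# central finite difference of the loss w.r.t. the margin x, where p = sigmoid(x)
x = np.array([-1.0, 0.3, 2.0])
y = np.array([1.0, 0.0, 1.0])
eps = 1e-6
numeric = (focal_loss(y, sigmoid(x + eps)) - focal_loss(y, sigmoid(x - eps))) / (2 * eps)
analytic = focal_grad(y, sigmoid(x))
print(np.allclose(analytic, numeric, atol=1e-5))  # True
```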
# data section
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import StratifiedKFold
train_data=pd.read_csv('xxx.csv') # load the full dataset
# feature_in_model: list of feature columns used by the model
# target: name of the label column
x=train_data[feature_in_model]
y=train_data[target]
folds = list(StratifiedKFold(n_splits=5, shuffle=True,random_state=2020).split(x, y))
train_idx=folds[0][0]
valid_idx=folds[0][1]
X_train,y_train= x.iloc[train_idx],y.iloc[train_idx]
X_valid,y_valid= x.iloc[valid_idx],y.iloc[valid_idx]
# dmatrix
dtrain=xgb.DMatrix(data=X_train,label=y_train)
dvalid=xgb.DMatrix(data=X_valid,label=y_valid)
# params
params = {
'booster': 'gbtree',
'objective': 'binary:logistic',
'eval_metric':'auc', # note: 'metric' is a LightGBM key; xgboost only uses 'eval_metric'
'max_depth':5,
'min_child_weight':100,
# 'lambda':100,
'gamma':0.1,
'subsample':1,
'colsample_bytree':1,
'scale_pos_weight':1,
'tree_method':'hist',
'nthread':32,
'eta':0.02
}
# train
print("---------no focal loss-----------")
xgbM = xgb.train(params=params, dtrain=dtrain, num_boost_round=50,evals=[(dtrain, 'train'),(dvalid,'valid')],verbose_eval=10)
print("---------focal loss-----------")
xgbM = xgb.train(params=params, dtrain=dtrain, num_boost_round=50,evals=[(dtrain, 'train'),(dvalid,'valid')],verbose_eval=10,obj=logistic_obj)
#### Output ####
---------no focal loss-----------
[0] train-auc:0.630931 train-logloss:0.69206 valid-auc:0.622562 valid-logloss:0.692136
[10] train-auc:0.646874 train-logloss:0.682963 valid-auc:0.636531 valid-logloss:0.683895
[20] train-auc:0.65542 train-logloss:0.676095 valid-auc:0.644896 valid-logloss:0.677697
[30] train-auc:0.664053 train-logloss:0.67044 valid-auc:0.653452 valid-logloss:0.672515
[40] train-auc:0.671685 train-logloss:0.665367 valid-auc:0.660784 valid-logloss:0.667895
[49] train-auc:0.675656 train-logloss:0.661691 valid-auc:0.665224 valid-logloss:0.6644
---------focal loss-----------
[0] train-auc:0.630931 train-logloss:0.69206 valid-auc:0.622562 valid-logloss:0.692136
[10] train-auc:0.646874 train-logloss:0.682963 valid-auc:0.636531 valid-logloss:0.683895
[20] train-auc:0.65542 train-logloss:0.676095 valid-auc:0.644896 valid-logloss:0.677697
[30] train-auc:0.664053 train-logloss:0.67044 valid-auc:0.653452 valid-logloss:0.672515
[40] train-auc:0.671685 train-logloss:0.665367 valid-auc:0.660784 valid-logloss:0.667895
[49] train-auc:0.675656 train-logloss:0.661691 valid-auc:0.665224 valid-logloss:0.6644
You can set a=2 to train xgboost with an actual focal loss; try it yourself.
That's it!
Questions are welcome in the comments~