kdd2017
Contents
1. objective and eval_metric for XGBRegressor
The model's objective function and evaluation metric were changed as follows:
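import numpy as np

def objective(preds, dtrain):
    # dtrain is used directly as the label array (the .get_label() call is disabled)
    labels = dtrain  # .get_label()
    grad = (preds - labels) * -1  # ~~~ (remainder omitted in the original)
    hess = np.ones_like(grad)  # second derivative treated as a constant
    return grad, hess

def eval_metric(preds, dtrain):
    # Mean absolute percentage error; assumes truth contains no zeros
    truth = dtrain
    return 'error', np.mean(np.abs(preds - truth) / truth)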
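Replacing the objective function and evaluation metric with versions based on the competition metric had no effect on model training.
The competition metric, an absolute percentage error, cannot be differentiated directly, so it was squared first. With respect to the prediction, the first derivative of the squared form differs from that of the RMSE objective only by a coefficient, and the second derivative is likewise a constant.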
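As a rough illustration of that derivation: for a label d and a prediction p, the squared percentage error is ((d - p) / d)^2, so its gradient with respect to p is 2 (p - d) / d^2 and its second derivative is 2 / d^2. The sketch below only writes this out in NumPy; it is not the code used in these experiments. The function name `squared_pct_error_objective` is hypothetical, the argument order `(y_true, y_pred)` is an assumption modelled on the convention the scikit-learn style wrapper uses for callable objectives, and labels are assumed to be nonzero.

```python
import numpy as np

def squared_pct_error_objective(y_true, y_pred):
    """Gradient and Hessian of ((y_true - y_pred) / y_true) ** 2 w.r.t. y_pred.

    Assumption: every entry of y_true is nonzero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Per-sample coefficient that separates this loss from plain squared error.
    coeff = 1.0 / np.square(y_true)
    grad = 2.0 * (y_pred - y_true) * coeff  # squared-error gradient times 1 / d^2
    hess = 2.0 * coeff                      # constant in y_pred, varies per sample
    return grad, hess

# Small usage example with made-up numbers.
labels = np.array([10.0, 20.0, 50.0])
preds = np.array([12.0, 15.0, 50.0])
grad, hess = squared_pct_error_objective(labels, preds)
```

Note that the coefficient 1 / d^2 is constant in the prediction but differs from sample to sample, so samples with small labels get larger gradients than they would under plain squared error.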
The competition's evaluation metric is as follows:
Evaluation Metrics
We choose Mean Absolute Percentage Error (MAPE) to evaluate the result.
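Task 1: Let $d_{rt}$ and $p_{rt}$ be the actual and predicted average travel time for route $r$ during time window $t$. The MAPE for travel time prediction is defined as:
$$\mathrm{MAPE} = \frac{1}{R}\sum_{r=1}^{R}\left(\frac{1}{T}\sum_{t=1}^{T}\left|\frac{d_{rt}-p_{rt}}{d_{rt}}\right|\right)$$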
R and T are the number of routes and number of to-predict time windows in the testing period respectively.
Task 2: Let $C$ be the number of tollgate-direction pairs (as aforementioned: 1-entry, 1-exit, 2-entry, 3-entry and 3-exit), $T$ be the number of time windows in the testing period, and $f_{ct}$ and $p_{ct}$ be the actual and predicted traffic volume for a specific tollgate-direction pair $c$ during time window $t$. The MAPE for traffic volume prediction is defined as:
$$\mathrm{MAPE} = \frac{1}{C}\sum_{c=1}^{C}\left(\frac{1}{T}\sum_{t=1}^{T}\left|\frac{f_{ct}-p_{ct}}{f_{ct}}\right|\right)$$
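Both definitions have the same shape: average the absolute percentage error over time windows within each route (or tollgate-direction pair), then average over the routes (or pairs). A minimal NumPy sketch of that computation follows. The function name `mape` and the 2-D array layout (one row per route or pair, one column per time window) are assumptions made for illustration, and actual values are assumed to be nonzero.

```python
import numpy as np

def mape(actual, predicted):
    """MAPE over a 2-D array: rows are routes (or tollgate-direction pairs),
    columns are time windows. Assumes no zeros in `actual`."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Inner average over time windows t, one value per row.
    per_row = np.mean(np.abs((actual - predicted) / actual), axis=1)
    # Outer average over routes r (or pairs c).
    return float(np.mean(per_row))

# Made-up example: 2 routes, 2 time windows each.
actual = [[100.0, 200.0], [50.0, 40.0]]
predicted = [[110.0, 180.0], [50.0, 50.0]]
score = mape(actual, predicted)
```

In this made-up example the per-row errors are 0.1 and 0.125, so the score is roughly 0.1125.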
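Filling in missing values, then using a random forest model: failed!
Weather data (sea-level pressure minus real-time pressure): failed!
Filtering out training data before 06:00:00: failed!
Smooth the mean?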
Formula input examples
$$\Gamma(n) = (n-1)! \quad \forall n \in \mathbb{N}$$
$$\sum_{i=0}^{n} i^2 = \frac{(n^2+n)(2n+1)}{6}$$