Home Credit Default Risk: default risk prediction, Kaggle competition, advanced part, LB 0.792


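Using only the application_train dataset, the AUC reaches 0.76381. Adding the other two groups of datasets brings the AUC to about 0.8, which means those datasets are worth roughly 0.04 of AUC. This post analyzes those two groups of data and puts them to use.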

The Bureau (credit bureau) data mainly covers credit status, loan timing, loan purpose, total amount, and so on.
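To make use of the Bureau data, the per-loan records have to be rolled up to one row per applicant before they can be joined to the application table. The snippet below is a minimal sketch of that idea, not the feature set actually used in this post. The column names (`SK_ID_CURR`, `CREDIT_ACTIVE`, `DAYS_CREDIT`, `AMT_CREDIT_SUM`) are assumed to match the competition's bureau.csv, the toy rows are made up, and the output feature names are hypothetical.

```python
import pandas as pd

# Toy stand-in for bureau.csv; values are invented for illustration only.
bureau = pd.DataFrame({
    "SK_ID_CURR": [100, 100, 101],
    "CREDIT_ACTIVE": ["Active", "Closed", "Active"],
    "DAYS_CREDIT": [-200, -1000, -50],      # days before the application
    "AMT_CREDIT_SUM": [5000.0, 12000.0, 3000.0],
})

# Toy stand-in for application_train.csv.
app = pd.DataFrame({"SK_ID_CURR": [100, 101, 102], "TARGET": [0, 1, 0]})

# One row per applicant: loan count, active loan count,
# most recent bureau credit, and total credit amount.
bureau_agg = (
    bureau.assign(IS_ACTIVE=bureau["CREDIT_ACTIVE"].eq("Active").astype(int))
    .groupby("SK_ID_CURR")
    .agg(
        BUREAU_LOAN_COUNT=("DAYS_CREDIT", "size"),
        BUREAU_ACTIVE_COUNT=("IS_ACTIVE", "sum"),
        BUREAU_DAYS_CREDIT_MAX=("DAYS_CREDIT", "max"),
        BUREAU_AMT_CREDIT_SUM_TOTAL=("AMT_CREDIT_SUM", "sum"),
    )
    .reset_index()
)

# Left join keeps applicants with no bureau history (their features are NaN).
features = app.merge(bureau_agg, on="SK_ID_CURR", how="left")
print(features)
```

The left join matters here: applicants with no bureau record stay in the table with missing values, which tree-based models can usually handle directly.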

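Home Credit's product logic works like this.

  • First comes POS, the entry-level product, similar to a consumer loan. It can only be used for purchases, the amount is limited, the risk is the lowest, and the customer provides the least information.
  • Next is the credit card, which this competition also calls a revolving loan. It is a revolving credit line, mainly used for purchases.
  • Only last comes the cash loan, where the customer receives cash. It carries the highest risk.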

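Application train and test contain only credit card and cash loan data. Perhaps POS is out of the prediction scope because its entry threshold is too low, its risk too small, and its usable information the scarcest. The historical data contains all three (POS, Credit Card, Cash Loans), but Credit Card has the fewest records.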
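The product mix described above can be checked by counting contract types in each table. This is a minimal sketch on made-up rows. It assumes the contract type lives in a `NAME_CONTRACT_TYPE` column in both the application and previous-application tables, and that POS loans appear there as "Consumer loans"; `product_mix` is a hypothetical helper, not something from the original analysis.

```python
import pandas as pd

# Toy stand-ins; the real tables would be loaded from the competition CSVs.
application = pd.DataFrame(
    {"NAME_CONTRACT_TYPE": ["Cash loans", "Cash loans", "Revolving loans"]}
)
previous = pd.DataFrame(
    {"NAME_CONTRACT_TYPE": ["Consumer loans", "Cash loans",
                            "Consumer loans", "Revolving loans"]}
)

def product_mix(df, column="NAME_CONTRACT_TYPE"):
    """Share of each contract type in the table, as a dict."""
    return df[column].value_counts(normalize=True).round(4).to_dict()

app_mix = product_mix(application)
prev_mix = product_mix(previous)
print("application:", app_mix)
print("previous:", prev_mix)
```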

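The datasets above are distributed in much the same way: in every one of them about 85% of the IDs belong to the training set and about 15% to the test set. In addition, POS and credit share no IDs at all.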


    length of train is 307511 the percent is 86.32 %
    length of test is 48744 the percent is 13.68 %
    
    length of bureau is 305811
    intersection with train is 263491 the percent is 86.16 %
    intersection with test is 42320 the percent is 13.84 %
    
    length of previous is 338857
    intersection with train is 291057 the percent is 85.89 %
    intersection with test is 47800 the percent is 14.11 %
    
    length of POS is 337252
    intersection with train is 289444 the percent is 85.82 %
    intersection with test is 47808 the percent is 14.18 %
    
    length of credit is 103558
    intersection with train is 86905 the percent is 83.92 %
    intersection with test is 16653 the percent is 16.08 %
    
    length of installment is 339587
    intersection with train is 291643 the percent is 85.88 %
    intersection with test is 47944 the percent is 14.12 %
    intersection between credit and POS is 0
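Figures like the ones above can be produced by intersecting each table's unique IDs with the train and test ID sets. The following is a minimal sketch of that check on toy data, not the original script. It assumes `SK_ID_CURR` is the applicant key; the key used for the credit versus POS comparison is not stated in this post, so `SK_ID_PREV` is used here purely as an assumption. `coverage` is a hypothetical helper.

```python
import pandas as pd

def coverage(table, train_ids, test_ids, key="SK_ID_CURR"):
    """Count a table's unique keys and how many fall in train and in test."""
    ids = set(table[key].unique())
    n_train = len(ids & train_ids)
    n_test = len(ids & test_ids)
    return {
        "unique_ids": len(ids),
        "in_train": n_train,
        "in_test": n_test,
        "pct_train": round(100 * n_train / len(ids), 2),
        "pct_test": round(100 * n_test / len(ids), 2),
    }

# Toy stand-ins for the competition tables (values are invented).
train = pd.DataFrame({"SK_ID_CURR": [1, 2, 3, 4]})
test = pd.DataFrame({"SK_ID_CURR": [5, 6]})
bureau = pd.DataFrame({"SK_ID_CURR": [1, 1, 2, 5]})
pos = pd.DataFrame({"SK_ID_PREV": [10, 11]})
credit = pd.DataFrame({"SK_ID_PREV": [12]})

train_ids = set(train["SK_ID_CURR"].unique())
test_ids = set(test["SK_ID_CURR"].unique())

stats = coverage(bureau, train_ids, test_ids)
print("length of bureau is", stats["unique_ids"])
print("intersection with train is", stats["in_train"],
      "the percent is", stats["pct_train"], "%")
print("intersection with test is", stats["in_test"],
      "the percent is", stats["pct_test"], "%")

# Overlap between credit and POS, compared on the assumed key.
overlap = len(set(pos["SK_ID_PREV"].unique()) & set(credit["SK_ID_PREV"].unique()))
print("intersection between credit and POS is", overlap)
```

Working on unique IDs rather than raw rows is what makes the train and test counts add up to the table's length, as they do in the output above.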