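Preface
Insurance is an important part of the financial system and plays a major role in social development and in safeguarding people's livelihoods. Insurance fraud has become increasingly common in recent years; in some lines of insurance, fraudulent claims already account for 20% or more of the total amount paid out. Identifying insurance fraud has therefore become a key application scenario in the insurance industry.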
Dataset
The dataset can be found on Alibaba Cloud Tianchi (阿里云天池).
1. Data loading
import pandas as pd
# Load the training data
train = pd.read_csv('train.csv')
train
| | policy_id | age | customer_months | policy_bind_date | policy_state | policy_csl | policy_deductable | policy_annual_premium | umbrella_limit | insured_zip | ... | witnesses | police_report_available | total_claim_amount | injury_claim | property_claim | vehicle_claim | auto_make | auto_model | auto_year | fraud |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 122576 | 37 | 189 | 2013-08-21 | C | 500/1000 | 1000 | 1465.71 | 5000000 | 455456 | ... | 3 | ? | 54930 | 6029 | 5752 | 44452 | Nissan | Maxima | 2000 | 0 |
1 | 937713 | 44 | 234 | 1998-01-04 | B | 250/500 | 500 | 821.24 | 0 | 591805 | ... | 1 | YES | 50680 | 5376 | 10156 | 37347 | Honda | Civic | 1996 | 0 |
2 | 680237 | 33 | 23 | 1996-02-06 | B | 500/1000 | 1000 | 1844.00 | 0 | 442490 | ... | 1 | NO | 47829 | 4460 | 9247 | 33644 | Jeep | Wrangler | 2002 | 0 |
3 | 513080 | 42 | 210 | 2008-11-14 | A | 500/1000 | 500 | 1867.29 | 0 | 439408 | ... | 2 | YES | 68862 | 11043 | 5955 | 53548 | Suburu | Legacy | 2003 | 1 |
4 | 192875 | 29 | 81 | 2002-01-08 | A | 100/300 | 1000 | 816.25 | 0 | 640575 | ... | 1 | YES | 59726 | 5617 | 10301 | 41550 | Ford | F150 | 2004 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
695 | 1008425 | 37 | 196 | 1997-06-29 | C | 250/500 | 500 | 1301.20 | 0 | 474615 | ... | 3 | NO | 61433 | 10436 | 11432 | 39745 | Nissan | Pathfinder | 2011 | 1 |
696 | 770702 | 43 | 229 | 2001-05-29 | A | 250/500 | 500 | 1434.94 | 8000000 | 444476 | ... | 1 | ? | 68623 | 6798 | 14557 | 50606 | Volkswagen | Passat | 2013 | 1 |
697 | 755099 | 35 | 209 | 2003-01-11 | C | 100/300 | 500 | 1639.46 | 0 | 639608 | ... | 0 | YES | 58033 | 9129 | 4598 | 40740 | Mercedes | C300 | 2002 | 0 |
698 | 693804 | 44 | 275 | 2003-07-22 | B | 500/1000 | 2000 | 1042.29 | 0 | 432061 | ... | 0 | NO | 35253 | 7359 | 3464 | 24677 | Audi | A3 | 2007 | 1 |
699 | 598086 | 47 | 263 | 1996-08-15 | C | 500/1000 | 500 |
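The preview above shows `?` in `police_report_available` and ISO-formatted dates in `policy_bind_date`. One possible way to handle both at load time is sketched below. It assumes that `?` always marks a missing value, which the source does not state, and the helper name `load_claims` and the in-memory sample are hypothetical, introduced only for illustration.

```python
import io

import pandas as pd


def load_claims(source):
    """Read a claims CSV, treating '?' as missing and parsing the bind date.

    `source` may be a file path such as 'train.csv' or any file-like object.
    Assumption: '?' is a placeholder for a missing value.
    """
    return pd.read_csv(
        source,
        na_values=["?"],                   # '?' appears in police_report_available
        parse_dates=["policy_bind_date"],  # ISO dates such as 2013-08-21
    )


# Tiny in-memory sample that copies a few columns from the rows shown above.
sample_csv = io.StringIO(
    "policy_id,policy_bind_date,police_report_available,fraud\n"
    "122576,2013-08-21,?,0\n"
    "937713,1998-01-04,YES,0\n"
    "513080,2008-11-14,YES,1\n"
)
sample = load_claims(sample_csv)

print(sample.dtypes)
print(sample["police_report_available"].isna().sum())  # 1
```

Calling `load_claims('train.csv')` would apply the same settings to the real file. Whether `?` should be treated as missing or kept as its own category is a modelling choice that depends on the later steps.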
test = pd.read_csv('./test.csv')
test
| | policy_id | age | customer_months | policy_bind_date | policy_state | policy_csl | policy_deductable | policy_annual_premium | umbrella_limit | insured_zip | ... | bodily_injuries | witnesses | police_report_available | total_claim_amount | injury_claim | property_claim | vehicle_claim | auto_make | auto_model | auto_year |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 681822 | 60 | 473 | 2002-12-17 | B | 500/1000 | 1000 | 1134.96 | 0 | 445975 | ... | 0 | 3 | ? | 53253 | 5212 | 10251 | 39503 | Saab | 95 | 2006 |
1 | 301288 | 36 | 173 | 1994-01-15 | B | 100/300 | 1000 | 916.20 | 0 | 469238 | ... | 0 | 0 | NO | 69401 | 8309 | 8439 | 50012 | Mercedes | ML350 | 2008 |
2 | 212001 | 36 | 147 | 1995-12-19 | B | 500/1000 | 1000 | 1175.74 | 5000000 | 595953 | ... | 2 | 0 | NO | 63919 | 5572 | 11477 | 42801 | Dodge | Neon | 2009 |
3 | 797680 | 24 | 71 | 1992-06-20 | C | 500/1000 | 500 | 1472.40 | 0 | 613103 | ... | 0 | 0 | NO | 63173 | 12027 | 6500 | 43423 | Dodge | RAM | 2012 |
4 | 789334 | 39 | 230 | 1996-11-28 | C | 250/500 | 1000 | 1159.44 | 4000000 | 581581 | ... | 0 | 0 | ? | 8847 | 904 | 1786 | 6138 | Accura | RSX | 2003 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
295 | 663065 | 36 | 30 | 1999-08-18 | B | 500/1000 | 2000 | 1384.15 | 9000000 | 593323 | ... | 0 | 1 | YES | 4507 | 970 | 477 | 3339 | Dodge | Neon | 2002 |
296 | 283767 | 47 | 285 | 2009-12-23 | C | 250/500 | 500 | 1590.78 | 7000000 | 447235 | ... | 0 | 3 | YES | 45909 | 5599 | 5627 | 34598 | Jeep | Grand Cherokee | 1999 |
297 | 325099 | 39 | 256 | 1999-04-08 | C | 500/1000 | 2000 | 1265.24 | 0 | 592069 | ... | 0 | 0 | ? | 42293 | 5773 | 5491 | 34805 | Dodge | RAM | 1997 |
298 | 465673 | 35 | 54 | 2010-09-08 | C | 100/300 | 500 | 1229.74 | 0 | 451451 | ... | 2 | 0 | ? | 76875 | 14955 | 7312 | 59418 | Nissan | Maxima | 2012 |
299 | 913900 | 34 | 154 | 1990-09-27 | C | 100/300 | 500 | 1744.33 | 0 | 462941 | ... | 1 | 1 | YES | 76269 |
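Judging from the two previews, the training set ends with a `fraud` column that the test set header does not show. A quick check along the following lines can confirm which columns differ and how often the label is positive. This is a minimal sketch: the helper name `compare_splits` and the small stand-in frames are hypothetical, built from a few of the values displayed above rather than from the full files.

```python
import pandas as pd


def compare_splits(train_df, test_df, target="fraud"):
    """Return the columns found only in train and the mean of the target column."""
    only_in_train = [c for c in train_df.columns if c not in test_df.columns]
    positive_rate = train_df[target].mean()
    return only_in_train, positive_rate


# Stand-in frames using a few values from the previews above (not the full data).
demo_train = pd.DataFrame(
    {
        "policy_id": [122576, 937713, 680237, 513080],
        "age": [37, 44, 33, 42],
        "fraud": [0, 0, 0, 1],
    }
)
demo_test = pd.DataFrame({"policy_id": [681822, 301288], "age": [60, 36]})

only_in_train, positive_rate = compare_splits(demo_train, demo_test)
print(only_in_train)   # ['fraud']
print(positive_rate)   # 0.25
```

With the real data, `compare_splits(train, test)` would report the actual label share. The 0.25 here reflects only the four sample rows and says nothing about the full dataset.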