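After cleaning the data in my project, I ran feature selection and sorted the selected features by IV (information value) in descending order. sorted returns its result as a list, which needs to be converted to a pandas DataFrame so that it is easier to store later.
First, sort the features by IV: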
dic_sort = sorted(result_list.items(), key=lambda item: item[1], reverse=True)
After sorting, dic_sort looks like this:
[('m_cnt_grp_partner_Loan_all_all', 2.045825052735046),
('i_cnt_grp_partner_Loan_all_all', 1.9682290399223903),
('i_cnt_partner_Loan_Offloan_365day', 1.116658285932447),
('i_cnt_partner_Loan_Offloan_540day', 1.116658285932447),
('m_up_creditquota', 1.0425101803971297),
('m_up_zhimapoint', 1.0425101803971297),
('m_cnt_grp_total_mobile_Loan_all_all', 1.0377289408959118),
('i_cnt_partner_Loan_Offloan_180day', 1.0287554995963795),
('i_cnt_grp_total_Loan_all_all', 1.0021418968609577),
('m_cnt_partner_Loan_P2pweb_1080day', 0.9704039423892912),
('m_cnt_partner_Loan_P2pweb_1800day', 0.9704039423892912)]
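As a minimal sketch of the sorting step, the snippet below applies the same sorted call to a small made-up dictionary. The feature names and IV values are hypothetical stand-ins, not the project's data, and result_list is assumed to be a dict mapping feature name to IV, as the .items() call suggests.

```python
# Hypothetical IV values keyed by feature name (illustrative only).
result_list = {"feat_a": 0.5, "feat_b": 1.25, "feat_c": 0.75}

# items() yields (name, iv) pairs; sort on the second element, largest first.
dic_sort = sorted(result_list.items(), key=lambda item: item[1], reverse=True)

# The result is a plain list of tuples, not a dict.
print(dic_sort)
```

Because sorted is stable even with reverse=True, features with identical IV values keep the relative order they had in the dictionary.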
To convert the list to a DataFrame:
import pandas as pd
df = pd.DataFrame(dic_sort, columns=['one', 'two'])
Inspect the first five rows with df.head():
                                  one       two
0      m_cnt_grp_partner_Loan_all_all  2.045825
1      i_cnt_grp_partner_Loan_all_all  1.968229
2  i_cnt_partner_Loan_Offloan_1800day  1.116658
3   i_cnt_partner_Loan_Offloan_365day  1.116658
4   i_cnt_partner_Loan_Offloan_540day  1.116658
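Since the goal of the conversion is storage, here is a hedged sketch of one way to finish the job. The column names feature and iv are my own illustrative choice, the input pairs are hypothetical, and the CSV is written to an in-memory buffer so the example runs without touching the disk.

```python
import io

import pandas as pd

# Hypothetical sorted (feature, IV) pairs standing in for dic_sort.
dic_sort = [("feat_b", 1.25), ("feat_c", 0.75), ("feat_a", 0.5)]

# Each tuple becomes one row; the column names are illustrative choices.
df = pd.DataFrame(dic_sort, columns=["feature", "iv"])

# Write to an in-memory buffer here; pass a file path instead to save to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the CSV back to confirm the round trip keeps the order and values.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

Passing index=False keeps the integer row index out of the file, so reading it back yields the same two columns.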