[Original] Data Mining in Practice (Financial Risk Control) - Task 5
pre = (pre1 + pre2 + pre3 + … + pren) / n
pre = 0.3*pre1 + 0.3*pre2 + 0.4*pre3
from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
clf1 = LogisticRegression(ra
2020-09-27 22:24:28 131
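The two averaging formulas in the excerpt above (a simple mean over n models and a 0.3/0.3/0.4 weighted blend) can be sketched with NumPy as follows. This is a minimal illustration, not the post's own code: the helper names `simple_average` and `weighted_average` and the sample probabilities are assumptions made for the example.

```python
import numpy as np


def simple_average(preds):
    """Simple mean of per-model predictions: (pre1 + ... + pren) / n."""
    return np.mean(np.asarray(preds, dtype=float), axis=0)


def weighted_average(preds, weights):
    """Weighted blend of per-model predictions; weights are expected to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    return np.average(np.asarray(preds, dtype=float), axis=0, weights=weights)


# Hypothetical default probabilities from three models for four loans
pre1 = np.array([0.2, 0.8, 0.5, 0.1])
pre2 = np.array([0.4, 0.6, 0.5, 0.3])
pre3 = np.array([0.3, 0.7, 0.5, 0.2])

pre_mean = simple_average([pre1, pre2, pre3])
pre_weighted = weighted_average([pre1, pre2, pre3], [0.3, 0.3, 0.4])
print(pre_mean)
print(pre_weighted)
```

Weights are usually chosen from each model's validation performance; the 0.3/0.3/0.4 split here simply mirrors the excerpt.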
[Original] Data Mining in Practice (Financial Risk Control) - Task 4
from sklearn.model_selection import KFold
# Split the dataset to make cross-validation easier
X_train = data.loc[data['sample'] == 'train', :].drop(['id', 'issueDate', 'isDefault', 'sample'], axis=1)
X_test = data.loc[data['sample'] == 'test', :].drop(['id', 'issueDate', 'isDefault', 'sample'], axis=1)
y
2020-09-24 23:51:07 1684
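The train/test split in the excerpt above feeds into K-fold cross-validation. A minimal sketch of that flow on a toy frame follows. The `loanAmnt` feature, the `y_train` name, and the `n_splits`/`random_state` values are assumptions, since the excerpt is cut off before the labels and the fold setup.

```python
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical toy frame: the four columns named in the excerpt plus one feature
data = pd.DataFrame({
    "id": range(12),
    "issueDate": ["2020-01-01"] * 12,
    "loanAmnt": [1000.0 * (i + 1) for i in range(12)],  # assumed feature column
    "isDefault": [0, 1] * 5 + [None, None],              # test rows have no label
    "sample": ["train"] * 10 + ["test"] * 2,
})

drop_cols = ["id", "issueDate", "isDefault", "sample"]
is_train = data["sample"] == "train"
X_train = data.loc[is_train, :].drop(drop_cols, axis=1)
X_test = data.loc[~is_train, :].drop(drop_cols, axis=1)
y_train = data.loc[is_train, "isDefault"]  # assumed name for the training labels

# 5-fold split with shuffling; each fold holds out 2 of the 10 training rows
kf = KFold(n_splits=5, shuffle=True, random_state=2020)
fold_sizes = []
for trn_idx, val_idx in kf.split(X_train, y_train):
    X_trn, X_val = X_train.iloc[trn_idx], X_train.iloc[val_idx]
    y_trn, y_val = y_train.iloc[trn_idx], y_train.iloc[val_idx]
    fold_sizes.append((len(X_trn), len(X_val)))

print(fold_sizes)
```

In a real run, a model would be fitted on `X_trn`/`y_trn` and scored on `X_val`/`y_val` inside the loop.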
[Original] Data Mining in Practice (Financial Risk Control) - Task 3
have_null_fea_dict = (data_train.isnull().sum() / len(data_train)).to_dict()
fea_null_moreThanHalf = {}
for key, value in have_null_fea_dict.items():
    if value > 0.5:
        fea_null_moreThanHalf[key] = value
2020-09-21 23:45:02 137
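The missing-value check in the excerpt above can also be written without an explicit loop. The sketch below is an equivalent vectorized form, run on a small made-up frame; the column names `a`, `b`, `c` and their values are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with known missing-value ratios
data_train = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, np.nan],  # 75% missing
    "b": [1.0, 2.0, np.nan, 4.0],        # 25% missing
    "c": [1.0, 2.0, 3.0, 4.0],           # 0% missing
})

# isnull().mean() gives the fraction of missing values per column,
# which equals isnull().sum() / len(data_train)
missing_ratio = data_train.isnull().mean()

# Keep only the features that are more than half empty
fea_null_moreThanHalf = missing_ratio[missing_ratio > 0.5].to_dict()
print(fea_null_moreThanHalf)
```

Features flagged this way are common candidates for dropping or for special handling during feature engineering.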
[Original] Data Mining in Practice (Financial Risk Control) - Task 2
data_train = pd.read_csv('./train.csv')
data_test_a = pd.read_csv('./testA.csv')
data_train_sample = pd.read_csv("./train.csv", nrows=5)
# Set the chunksize parameter to control how much data each iteration reads
chunker = pd.read_csv("./train.csv", chunksize=5)
for item in chunker:
    print(type(item))  #<
2020-09-18 23:55:03 219
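The `nrows` and `chunksize` options used in the excerpt above can be tried without the competition files. The sketch below reads from an in-memory CSV instead of `./train.csv`; the two columns and twelve rows are made up for the example.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for ./train.csv (12 data rows)
csv_text = "id,loanAmnt\n" + "\n".join(f"{i},{1000 * i}" for i in range(12))

# nrows=5 reads only the first five data rows
data_train_sample = pd.read_csv(io.StringIO(csv_text), nrows=5)

# chunksize=5 returns an iterator that yields DataFrames of up to five rows
chunker = pd.read_csv(io.StringIO(csv_text), chunksize=5)
chunk_shapes = []
total_rows = 0
for item in chunker:
    chunk_shapes.append(item.shape)
    total_rows += len(item)

print(data_train_sample.shape)
print(chunk_shapes)
```

Each chunk is an ordinary `DataFrame`, so per-chunk aggregation keeps memory use bounded on files too large to load at once.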