Implementing Random Forest in Python

Python小萝卜

Categories: python, machine learning. Tags: random forest, model saving, model loading

Article link: https://blog.csdn.net/qq_23860475/article/details/81509707

Default parameters

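class sklearn.ensemble.RandomForestClassifier(n_estimators=10, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=1, random_state=None, verbose=0, warm_start=False, class_weight=None)

This signature comes from an older scikit-learn release. In current releases n_estimators defaults to 100, max_features defaults to 'sqrt' (the 'auto' option has been removed), min_impurity_split no longer exists, and n_jobs defaults to None.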
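Because the defaults differ between scikit-learn releases, it can help to read them from the installed version rather than from a copied signature. A minimal sketch, assuming scikit-learn is installed; the selection of parameter names printed here is just an illustrative subset:

```python
from sklearn.ensemble import RandomForestClassifier

# Build an unfitted estimator and read back the defaults of the installed release.
default_params = RandomForestClassifier().get_params()

# Print a few of the parameters discussed above (illustrative subset).
for name in ("n_estimators", "criterion", "max_features", "bootstrap", "oob_score", "n_jobs"):
    print(f"{name} = {default_params[name]!r}")
```

The printed values depend on the installed version, so no expected output is shown.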

Parameter references

https://blog.csdn.net/w952470866/article/details/78987265

https://www.cnblogs.com/jasonfreak/p/5720137.html

https://www.cnblogs.com/pinard/p/6160412.html?utm_source=itdadao&utm_medium=referral

Training phase

import pandas as pd
import joblib  # sklearn.externals.joblib was removed from scikit-learn; use the standalone package
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score  # replaces the removed sklearn.cross_validation module

train = pd.read_excel('C:/Users/yangge/Desktop/fenx/buyData.xlsx')
target = 'isbad'  # the isbad column holds the binary classification label
x_columns = [x for x in train.columns if x not in [target]]
X = train[x_columns]
y = train[target]

rf0 = RandomForestClassifier(oob_score=True, random_state=10, n_estimators=500)
rf0.fit(X, y)
# scores = cross_val_score(rf0, X, y, cv=5)
# print("Cross-validation accuracy:", scores)

# The predictions below are made on the training data, so the resulting metrics are optimistic;
# the out-of-bag (OOB) score is the only out-of-sample estimate printed here.
y_predprob = rf0.predict_proba(X)[:, 1]
y_pred = rf0.predict(X)
fpr, tpr, thresholds = metrics.roc_curve(y, y_predprob, pos_label=1)
auc = metrics.auc(fpr, tpr)
cm = metrics.confusion_matrix(y, y_pred)  # confusion matrix
feature_importances = sorted(zip(map(lambda x: round(x, 4), rf0.feature_importances_), x_columns), reverse=True)
print("Feature importances:", feature_importances)
print("Generalization (OOB score):", rf0.oob_score_)
print("AUC:", metrics.roc_auc_score(y, y_predprob))
print("Accuracy:", metrics.accuracy_score(y, y_pred))
print("Recall:", metrics.recall_score(y, y_pred))
print("F1 score:", metrics.f1_score(y, y_pred))
joblib.dump(rf0, "rfTrainModel.m")  # save the model
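The training script scores the model on the same rows it was fitted on, so a held-out split or cross-validation gives a more honest picture. The following is a sketch only: it uses synthetic data from `make_classification` as a stand-in for the Excel file, and the names `X_demo`, `y_demo`, and `rf_demo` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for buyData.xlsx (assumption: numeric features, binary target).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=10
)

# Hold out 25% of the rows; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=10
)

rf_demo = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=10)
rf_demo.fit(X_train, y_train)

# AUC on the rows used for fitting versus the rows the model never saw.
train_auc = roc_auc_score(y_train, rf_demo.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, rf_demo.predict_proba(X_test)[:, 1])

# 5-fold cross-validated accuracy on the full synthetic set.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=10), X_demo, y_demo, cv=5
)

print("Training AUC:", round(train_auc, 4))
print("Held-out AUC:", round(test_auc, 4))
print("OOB accuracy:", round(rf_demo.oob_score_, 4))
print("CV accuracy per fold:", cv_scores.round(4))
```

On real data the held-out and OOB figures are the ones to compare against each other; the training figure mostly shows how closely the forest memorised its input.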

Prediction phase

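import pandas as pd
import joblib  # replaces the removed sklearn.externals.joblib

# The test sheet must contain the same feature columns, in the same order, as the training data
x = pd.read_excel('C:/Users/yangge/Desktop/fenx/buyDataTest.xlsx')
rf0 = joblib.load("rfTrainModel.m")
y_pred = rf0.predict(x)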
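A loaded model expects the same feature layout it was trained on, which is easy to get wrong when the prediction sheet has extra or reordered columns. Below is a small round-trip sketch under stated assumptions: the column names `f1`, `f2`, `f3`, and `extra` are hypothetical, the data is random, and the model is written to a temporary directory rather than the working directory.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(10)

# Hypothetical training table standing in for the real sheet.
train_df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
train_df["isbad"] = (train_df["f1"] + train_df["f2"] > 0).astype(int)

feature_cols = [c for c in train_df.columns if c != "isbad"]
model = RandomForestClassifier(n_estimators=50, random_state=10)
model.fit(train_df[feature_cols], train_df["isbad"])

# Save and reload the model, as in the training and prediction scripts.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "rfTrainModel.m")
    joblib.dump(model, model_path)
    loaded = joblib.load(model_path)

# New data arrives with shuffled columns and one extra column (assumption for the demo).
new_df = pd.DataFrame(rng.normal(size=(5, 4)), columns=["f3", "extra", "f1", "f2"])
aligned = new_df[feature_cols]  # select and reorder to the training layout

pred_labels = loaded.predict(aligned)
pred_probs = loaded.predict_proba(aligned)[:, 1]
print(pred_labels)
print(pred_probs.round(3))
```

Keeping the list of training columns next to the saved model is one simple way to make this alignment repeatable. Note also that a joblib file should only be loaded from a trusted source and with a scikit-learn version compatible with the one that wrote it.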

Bias and variance

https://www.cnblogs.com/daguankele/p/6561419.html Variance and bias in machine learning

https://blog.csdn.net/hit0803107/article/details/71108563 Understanding bias and variance

https://blog.csdn.net/accumulate_zhang/article/details/63251337 Understanding a model's bias and variance

https://blog.csdn.net/simple_the_best/article/details/71167786 Understanding bias and variance in machine learning

https://blog.csdn.net/liweibin1994/article/details/76859743 Machine learning: bias, variance, underfitting, and overfitting
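One way to connect these ideas to random forests is to compare a single fully grown decision tree with a forest of such trees. A lone deep tree tends to fit its training data almost perfectly but vary a lot from sample to sample, and averaging many trees is commonly described as reducing that variance. The sketch below is illustrative only: the data is synthetic, and the names `X_bv`, `y_bv`, and `results` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic data: flip_y randomly reassigns the labels of about 10% of the samples.
X_bv, y_bv = make_classification(
    n_samples=600, n_features=10, n_informative=5, flip_y=0.1, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X_bv, y_bv, test_size=0.3, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Record (train accuracy, test accuracy) for each model.
results = {}
for name, estimator in models.items():
    estimator.fit(X_tr, y_tr)
    results[name] = (estimator.score(X_tr, y_tr), estimator.score(X_te, y_te))
    print(f"{name}: train={results[name][0]:.3f}, test={results[name][1]:.3f}")
```

The gap between training and test accuracy is a rough indicator of overfitting. On a single split it is only suggestive, so repeating the comparison over several random seeds gives a more reliable picture.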

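If this helped you, please give it a like: the hand that gives roses keeps their fragrance!

Keep looking up at the sky, and your ideals will draw ever closer to reality!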
