Plotting ROC Curves, Computing AUC, and PR Curves (Precision and Recall): Examples

#1 ROC evaluation and curve, AUC value, accuracy

 

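The arguments to auc are the false positive rate and the true positive rate, so it is usually used together with metrics.roc_curve. The following snippet is from the official sklearn documentation examples: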

>>> fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
>>> metrics.auc(fpr, tpr)   
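
To make the relationship between the two calls concrete, here is a minimal pure-Python sketch of the same idea: sweep the decision threshold from high to low, record a (false positive rate, true positive rate) point at each step, then integrate with the trapezoidal rule. The helper names roc_points and trapezoid_auc and the toy labels and scores are illustrative assumptions, not part of sklearn, and the sketch skips the optimizations sklearn applies (such as dropping collinear points).

```python
def roc_points(y_true, scores, pos_label=1):
    """Return (fpr, tpr) lists obtained by lowering the threshold step by step."""
    # Sort samples by score, highest first.
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    n_pos = sum(1 for y in y_true if y == pos_label)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == pos_label:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    return sum((x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2 for i in range(1, len(x)))


# Toy data (hypothetical): the positive class is labelled 2, as in pos_label=2 above.
y = [1, 1, 2, 2]
pred = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y, pred, pos_label=2)
print(fpr, tpr, trapezoid_auc(fpr, tpr))
```

The point of the sketch is that auc itself knows nothing about classifiers: it only integrates whatever x and y arrays it receives, which is why roc_curve has to be called first.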

 

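import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

# X_train, y_train, X_test, y_test are assumed to come from an earlier train/test split (not shown here).
colors = ['r', 'g', 'b', 'y', 'k', 'c', 'm', 'brown', 'r']
lw = 1
Cs = [1e-6, 1e-4, 1e0]  # inverse regularization strengths to compare

plt.figure(figsize=(12,8))
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve for different classifiers')

# Diagonal reference line: a random classifier (AUC = 0.5)
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')

labels = []
for idx, C in enumerate(Cs):
    clf = LogisticRegression(C=C)
    clf.fit(X_train, y_train)
    print("C: {}, parameters {} and intercept {}".format(C, clf.coef_, clf.intercept_))
    preds = clf.predict_proba(X_test)[:, 1]  # probability of the positive class
    print("clf.predict_proba(X_test=", clf.predict_proba(X_test))
    print("y_test=", y_test)
    fpr, tpr, _ = roc_curve(y_test, preds)
    # Accuracy at a 0.5 threshold: round the probabilities to 0/1 and compare with the labels
    correct_prediction = np.equal(np.round(preds), y_test)
    print("accuracy=", np.mean(correct_prediction))
    roc_auc = auc(fpr, tpr)
    plt.plot(fpr, tpr, lw=lw, color=colors[idx])
    labels.append("C: {}, AUC = {}".format(C, np.round(roc_auc, 4)))

# The first legend entry belongs to the diagonal plotted before the loop
plt.legend(['random AUC = 0.5'] + labels)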

 

 

 

#result

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning:

Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.

C: 1e-06, parameters [[-0.00424654 -0.00232424 -0.00354647 -0.00199886 -0.00186031]] and intercept [-0.03324687]
clf.predict_proba(X_test= [[0.50885744 0.49114256]
 [0.50879396 0.49120604]
 [0.50869638 0.49130362]
 ...
 [0.50899422 0.49100578]
 [0.5086329  0.4913671 ]
 [0.50862148 0.49137852]]
y_test= 8067      0
368101    0
70497     0
226567    1
73186     1
         ..
98574     0
334252    1
293289    0
167582    0
231389    0
Name: is_duplicate, Length: 133416, dtype: int64
accuracy= 0.629422258199916
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning:

Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.

C: 0.0001, parameters [[-0.16061857 -0.09048901 -0.13250978 -0.07998846  0.68641435]] and intercept [-0.70401556]
clf.predict_proba(X_test= [[0.53112308 0.46887692]
 [0.58009823 0.41990177]
 [0.65390497 0.34609503]
 ...
 [0.52987886 0.47012114]
 [0.64214599 0.35785401]
 [0.6472506  0.3527494 ]]
y_test= 8067      0
368101    0
70497     0
226567    1
73186     1
         ..
98574     0
334252    1
293289    0
167582    0
231389    0
Name: is_duplicate, Length: 133416, dtype: int64
accuracy= 0.629422258199916
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning:

Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.

C: 1.0, parameters [[-10.25339741  -0.91965268   6.77946546  -7.16268424   3.29874476]] and intercept [-1.34016168]
clf.predict_proba(X_test= [[0.24412863 0.75587137]
 [0.49234752 0.50765248]
 [0.85791823 0.14208177]
 ...
 [0.3597533  0.6402467 ]
 [0.81310546 0.18689454]
 [0.8053035  0.1946965 ]]
y_test= 8067      0
368101    0
70497     0
226567    1
73186     1
         ..
98574     0
334252    1
293289    0
167582    0
231389    0
Name: is_duplicate, Length: 133416, dtype: int64
accuracy= 0.6547715416441806

Out[27]:

<matplotlib.legend.Legend at 0x2c7f8a48>
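
In the output above, the first two values of C report exactly the same accuracy (0.629422258199916), and every positive-class probability shown for them is below 0.5. One plausible reading, which the truncated output cannot confirm, is that both models predict class 0 for every test sample, so the accuracy is simply the share of negatives. AUC can still differ in that situation because it depends only on how the scores rank positives against negatives. The sketch below illustrates this with made-up numbers; the helpers accuracy_at_half and rank_auc and both score lists are hypothetical.

```python
def accuracy_at_half(y_true, probs):
    """Accuracy when a sample is predicted positive only if its probability exceeds 0.5."""
    preds = [1 if p > 0.5 else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def rank_auc(y_true, probs):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for p, y in zip(probs, y_true) if y == 1]
    neg = [p for p, y in zip(probs, y_true) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores: every probability is below 0.5 in both lists.
y_true = [0, 0, 0, 1, 1]
weak = [0.490, 0.492, 0.491, 0.491, 0.493]   # scores barely separate the classes
better = [0.35, 0.42, 0.36, 0.47, 0.46]      # positives always score higher

for name, probs in [("weak", weak), ("better", better)]:
    print(name, accuracy_at_half(y_true, probs), rank_auc(y_true, probs))
```

Both score lists give the same accuracy because no sample crosses the 0.5 threshold, while the ranking-based AUC separates them. This is one reason to report AUC alongside accuracy when comparing regularization strengths.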

 

 

#2 Plotting the PR curve: code

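# Evaluation with precision_recall_curve
# cv is assumed to be an already fitted search object exposing best_estimator_
# (for example a GridSearchCV); its definition is not shown in this post.
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, auc

pr, re, _ = precision_recall_curve(y_test, cv.best_estimator_.predict_proba(X_test)[:, 1])
plt.figure(figsize=(12,8))
plt.plot(re, pr)
plt.title('PR Curve (AUC {})'.format(auc(re, pr)))
plt.xlabel('Recall')
plt.ylabel('Precision')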
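
For readers who want to see what the points on a PR curve mean, the following is a minimal pure-Python sketch that computes precision and recall at each distinct score threshold. The helper pr_points and the toy data are illustrative assumptions. The sketch also differs from sklearn's precision_recall_curve in ordering and end points: as far as I know, sklearn appends a final point with precision 1 and recall 0 and returns thresholds in increasing order.

```python
def pr_points(y_true, scores):
    """Return (recall, precision) lists, one point per distinct threshold, highest first."""
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    n_pos = sum(1 for y in y_true if y == 1)
    recall, precision = [], []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Everything scored at or above the current threshold is predicted positive.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall.append(tp / n_pos)          # share of true positives that were found
        precision.append(tp / (tp + fp))   # share of predicted positives that are correct
    return recall, precision


# Toy data (hypothetical): labels are 0/1, scores are positive-class probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
recall, precision = pr_points(y_true, scores)
for r, p in zip(recall, precision):
    print("recall={:.2f} precision={:.2f}".format(r, p))
```

Lowering the threshold can only keep recall the same or raise it, while precision may move in either direction, which is why PR curves often look jagged compared with ROC curves.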

#result
