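Matthews Correlation Coefficient in Python

Introduction to the Matthews correlation coefficient

The description of the Matthews correlation coefficient (MCC) used here comes from Wikipedia; see the article there for the full details.

Python code and data format

The Python code is as follows: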

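import pandas as pd
from math import sqrt


def get_data():
    # Read the four confusion-matrix counts from data.csv
    df = pd.read_csv('data.csv')
    # Convert to plain Python ints so the large products below cannot overflow int64
    TP = int(df.iloc[0]["zero"])
    FP = int(df.iloc[1]["zero"])
    FN = int(df.iloc[0]["one"])
    TN = int(df.iloc[1]["one"])
    return TP, FP, FN, TN


def calculate_data(TP, FP, FN, TN):
    numerator = (TP * TN) - (FP * FN)  # numerator of the MCC formula
    denominator = sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))  # denominator of the MCC formula
    if denominator == 0:
        denominator = 1  # convention: a zero denominator is set to one, which gives MCC = 0
    result = numerator / denominator
    return result


if __name__ == '__main__':
    TP, FP, FN, TN = get_data()
    result = calculate_data(TP, FP, FN, TN)
    print(result)  # print the result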

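Data table format:
[Image: layout of data.csv. As read by the code above, the file has two columns named zero and one and two data rows: row 0 holds TP (zero) and FN (one), row 1 holds FP (zero) and TN (one).]

Note

The rest of this article quotes the original text from the web page.

Excerpt from the original text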

While there is no perfect way of describing the confusion matrix of true and false positives and negatives by a single number, the Matthews correlation coefficient is generally regarded as being one of the best such measures. Other measures, such as the proportion of correct predictions (also termed accuracy), are not useful when the two classes are of very different sizes. For example, assigning every object to the larger set achieves a high proportion of correct predictions, but is not generally a useful classification.

The MCC can be calculated directly from the confusion matrix using the formula:
$$\text{MCC} = \frac{\mathit{TP} \times \mathit{TN} - \mathit{FP} \times \mathit{FN}}{\sqrt{(\mathit{TP} + \mathit{FP})(\mathit{TP} + \mathit{FN})(\mathit{TN} + \mathit{FP})(\mathit{TN} + \mathit{FN})}}$$
In this equation, TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. If any of the four sums in the denominator is zero, the denominator can be arbitrarily set to one; this results in a Matthews correlation coefficient of zero, which can be shown to be the correct limiting value.
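As a rough illustration of the formula and of the zero-denominator convention, the sketch below compares accuracy with the MCC for a classifier that assigns every object to the larger class. The function names and the counts (95 negatives, 5 positives) are made up for this example and do not come from the original article.

```python
from math import sqrt


def mcc(tp, fp, fn, tn):
    """MCC from confusion-matrix counts; a zero denominator is replaced by one."""
    numerator = tp * tn - fp * fn
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    denominator = sqrt(product) if product else 1.0
    return numerator / denominator


def accuracy(tp, fp, fn, tn):
    """Proportion of correct predictions."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical data: 95 negatives, 5 positives, everything predicted negative.
always_negative = dict(tp=0, fp=0, fn=5, tn=95)
print(accuracy(**always_negative))  # 0.95
print(mcc(**always_negative))       # 0.0
```

The accuracy looks high even though the classifier never finds a positive, while the MCC is zero because TP + FP = 0 triggers the convention described above.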

The MCC can be calculated with the formula:
$$\text{MCC} = \sqrt{\mathit{PPV} \times \mathit{TPR} \times \mathit{TNR} \times \mathit{NPV}} - \sqrt{\mathit{FDR} \times \mathit{FNR} \times \mathit{FPR} \times \mathit{FOR}}$$
using the positive predictive value, the true positive rate, the true negative rate, the negative predictive value, the false discovery rate, the false negative rate, the false positive rate, and the false omission rate.
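To check that the rate-based formula agrees with the count-based one, here is a minimal sketch. The helper names and the example counts are assumptions chosen for illustration, and the code assumes none of the four marginal sums is zero.

```python
from math import sqrt, isclose


def mcc_from_counts(tp, fp, fn, tn):
    """MCC computed directly from the confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    return numerator / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))


def mcc_from_rates(tp, fp, fn, tn):
    """MCC computed from the eight rates named in the text."""
    ppv = tp / (tp + fp)       # positive predictive value
    tpr = tp / (tp + fn)       # true positive rate
    tnr = tn / (tn + fp)       # true negative rate
    npv = tn / (tn + fn)       # negative predictive value
    fdr = fp / (tp + fp)       # false discovery rate
    fnr = fn / (tp + fn)       # false negative rate
    fpr = fp / (tn + fp)       # false positive rate
    f_or = fn / (tn + fn)      # false omission rate ("for" is a Python keyword)
    return sqrt(ppv * tpr * tnr * npv) - sqrt(fdr * fnr * fpr * f_or)


# Hypothetical counts, not taken from the article.
counts = dict(tp=6, fp=2, fn=1, tn=3)
print(isclose(mcc_from_counts(**counts), mcc_from_rates(**counts)))  # True
```

The two agree because the first square root simplifies to TP × TN divided by the shared denominator, and the second to FP × FN divided by the same denominator.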

The original formula as given by Matthews was:
$$
\begin{aligned}
N &= \mathit{TN} + \mathit{TP} + \mathit{FN} + \mathit{FP} \\
S &= \frac{\mathit{TP} + \mathit{FN}}{N} \\
P &= \frac{\mathit{TP} + \mathit{FP}}{N} \\
\text{MCC} &= \frac{\mathit{TP}/N - S \times P}{\sqrt{PS(1-S)(1-P)}}
\end{aligned}
$$
This is equal to the formula given above. As a correlation coefficient, the Matthews correlation coefficient is the geometric mean of the regression coefficients of the problem and its dual. The component regression coefficients of the Matthews correlation coefficient are Markedness (Δp) and Youden’s J statistic (Informedness or Δp’). Markedness and Informedness correspond to different directions of information flow and generalize Youden’s J statistic, the δp statistics and (as their geometric mean) the Matthews Correlation Coefficient to more than two classes.
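The equivalences stated here can be checked numerically. The sketch below evaluates Matthews's original form and the geometric mean of informedness and markedness, then compares both with the count-based formula. The function names and example counts are hypothetical, informedness is taken as TPR + TNR - 1 and markedness as PPV + NPV - 1, and all marginal sums are assumed to be non-zero.

```python
from math import sqrt, copysign, isclose


def mcc_from_counts(tp, fp, fn, tn):
    """Count-based MCC, used here as the reference value."""
    numerator = tp * tn - fp * fn
    return numerator / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))


def mcc_original(tp, fp, fn, tn):
    """Matthews's original form in terms of N, S and P."""
    n = tn + tp + fn + fp
    s = (tp + fn) / n          # fraction of actual positives
    p = (tp + fp) / n          # fraction of predicted positives
    return (tp / n - s * p) / sqrt(p * s * (1 - s) * (1 - p))


def mcc_geometric_mean(tp, fp, fn, tn):
    """Signed geometric mean of informedness and markedness."""
    informedness = tp / (tp + fn) + tn / (tn + fp) - 1   # Youden's J
    markedness = tp / (tp + fp) + tn / (tn + fn) - 1
    # Both factors share the sign of TP*TN - FP*FN, so their product is non-negative.
    return copysign(sqrt(informedness * markedness), informedness)


# Hypothetical counts, not taken from the article.
counts = dict(tp=6, fp=2, fn=1, tn=3)
reference = mcc_from_counts(**counts)
print(isclose(reference, mcc_original(**counts)))        # True
print(isclose(reference, mcc_geometric_mean(**counts)))  # True
```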

Some scientists claim the Matthews correlation coefficient to be the most informative single score to establish the quality of a binary classifier prediction in a confusion matrix context.

In abstract terms, the confusion matrix is as follows:
[Image: abstract confusion matrix]
