Matthews Correlation Coefficient: Python Code

Latest recommended article published 2023-01-30 09:32:54

i7366464


Article link: https://blog.csdn.net/qq_45656422/article/details/115520738


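Introduction to the Matthews correlation coefficient

The description of the Matthews correlation coefficient (MCC) used here comes from Wikipedia, which readers can consult directly.

Python code and data format

The Python code is below: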

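import pandas as pd
from math import sqrt


def get_data():
    # Read the 2x2 confusion matrix from data.csv (columns "zero" and "one").
    df = pd.read_csv('data.csv')
    # Convert to plain Python ints so large counts cannot overflow int64.
    TP = int(df.iloc[0]["zero"])
    FP = int(df.iloc[1]["zero"])
    FN = int(df.iloc[0]["one"])
    TN = int(df.iloc[1]["one"])
    return TP, FP, FN, TN


def calculate_data(TP, FP, FN, TN):
    numerator = (TP * TN) - (FP * FN)  # numerator of the MCC formula
    denominator = sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))  # denominator of the MCC formula
    if denominator == 0:
        # One of the four sums is zero: the MCC is taken to be 0 by convention.
        return 0.0
    return numerator / denominator


if __name__ == '__main__':
    TP, FP, FN, TN = get_data()
    result = calculate_data(TP, FP, FN, TN)
    print(result)  # print the result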

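Data table format:
[Image: layout of data.csv. Judging from the code above, the file has two columns named zero and one and two data rows: row 0 holds TP and FN, row 1 holds FP and TN.]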
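Because the screenshot of the table is not available, the following is only a sketch of a file layout that would be compatible with `get_data()`. The counts (6, 2, 1, 3) are invented for illustration, and the standard-library `csv` module stands in for pandas so the snippet has no dependencies; `rows[0]["zero"]` plays the role of `df.iloc[0]["zero"]`.

```python
import csv
import io

# Hypothetical contents of data.csv: a header row plus two data rows.
# Row 0 -> TP (column "zero") and FN (column "one")
# Row 1 -> FP (column "zero") and TN (column "one")
sample_csv = "zero,one\n6,2\n1,3\n"

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Same cell-to-count mapping as get_data() in the article.
TP = int(rows[0]["zero"])
FP = int(rows[1]["zero"])
FN = int(rows[0]["one"])
TN = int(rows[1]["one"])

print(TP, FP, FN, TN)
```

Saving `sample_csv` as `data.csv` next to the script should let the article's code run unchanged, assuming this inferred layout matches the original table.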

Note

This article quotes the original text from the web page throughout.

Excerpt from the original text

While there is no perfect way of describing the confusion matrix of true and false positives and negatives by a single number, the Matthews correlation coefficient is generally regarded as being one of the best such measures. Other measures, such as the proportion of correct predictions (also termed accuracy), are not useful when the two classes are of very different sizes. For example, assigning every object to the larger set achieves a high proportion of correct predictions, but is not generally a useful classification.
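To make the contrast between accuracy and the MCC concrete, here is a minimal sketch on invented counts: 5 positives and 95 negatives, with a classifier that labels almost everything negative. The helper names `accuracy` and `mcc` are illustrative and not part of the article's code.

```python
from math import sqrt


def accuracy(tp, fp, fn, tn):
    # Proportion of correct predictions.
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    # Matthews correlation coefficient; assumes all four marginal sums are non-zero.
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator


# Hypothetical, heavily imbalanced data: the classifier finds only 1 of 5 positives.
tp, fp, fn, tn = 1, 1, 4, 94

acc = accuracy(tp, fp, fn, tn)
score = mcc(tp, fp, fn, tn)
print(f"accuracy = {acc:.3f}, MCC = {score:.3f}")
```

On these counts the accuracy is 0.95 while the MCC is only about 0.29, which reflects how little of the minority class the classifier actually recovers.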

The MCC can be calculated directly from the confusion matrix using the formula:
${\text{MCC}}={\frac {{\mathit {TP}}\times {\mathit {TN}}-{\mathit {FP}}\times {\mathit {FN}}}{\sqrt {({\mathit {TP}}+{\mathit {FP}})({\mathit {TP}}+{\mathit {FN}})({\mathit {TN}}+{\mathit {FP}})({\mathit {TN}}+{\mathit {FN}})}}}$
In this equation, TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. If any of the four sums in the denominator is zero, the denominator can be arbitrarily set to one; this results in a Matthews correlation coefficient of zero, which can be shown to be the correct limiting value.
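The zero-denominator convention described above can be sketched as follows. The function name `mcc_with_zero_guard` is an illustrative choice, and the counts are invented; the degenerate case shown is the "assign everything to the larger class" classifier.

```python
from math import sqrt


def mcc_with_zero_guard(tp, fp, fn, tn):
    # The four sums that appear in the denominator of the MCC formula.
    sums = (tp + fp, tp + fn, tn + fp, tn + fn)
    numerator = tp * tn - fp * fn
    if 0 in sums:
        # Convention from the quoted text: set the denominator to one.
        # The numerator is also zero in this case, so the result is 0.
        denominator = 1.0
    else:
        denominator = sqrt(sums[0] * sums[1] * sums[2] * sums[3])
    return numerator / denominator


# Hypothetical data: every object is assigned to the larger (negative) class,
# so there are no predicted positives and TP + FP == 0.
all_negative = mcc_with_zero_guard(tp=0, fp=0, fn=5, tn=95)

# A non-degenerate case for comparison.
regular = mcc_with_zero_guard(tp=6, fp=1, fn=2, tn=3)

print(all_negative, regular)
```

Checking the sums for zero before taking the square root avoids a `ZeroDivisionError` and gives the trivial classifier a score of 0, even though its accuracy would be 0.95.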

The MCC can be calculated with the formula:
${\text{MCC}}={\sqrt {{\mathit {PPV}}\times {\mathit {TPR}}\times {\mathit {TNR}}\times {\mathit {NPV}}}}-{\sqrt {{\mathit {FDR}}\times {\mathit {FNR}}\times {\mathit {FPR}}\times {\mathit {FOR}}}}$
using the positive predictive value, the true positive rate, the true negative rate, the negative predictive value, the false discovery rate, the false negative rate, the false positive rate, and the false omission rate.
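As a quick numerical check that the rate-based formula agrees with the confusion-matrix formula, the sketch below computes both on invented counts. The function names are illustrative, and the code assumes none of the four marginal sums is zero.

```python
from math import sqrt


def mcc_from_counts(tp, fp, fn, tn):
    # Direct confusion-matrix formula.
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator


def mcc_from_rates(tp, fp, fn, tn):
    ppv = tp / (tp + fp)   # positive predictive value
    tpr = tp / (tp + fn)   # true positive rate
    tnr = tn / (tn + fp)   # true negative rate
    npv = tn / (tn + fn)   # negative predictive value
    fdr = fp / (tp + fp)   # false discovery rate
    fnr = fn / (tp + fn)   # false negative rate
    fpr = fp / (tn + fp)   # false positive rate
    fomr = fn / (tn + fn)  # false omission rate ("for" is a Python keyword)
    return sqrt(ppv * tpr * tnr * npv) - sqrt(fdr * fnr * fpr * fomr)


# Hypothetical counts used only to compare the two formulas.
counts = dict(tp=6, fp=1, fn=2, tn=3)
direct = mcc_from_counts(**counts)
via_rates = mcc_from_rates(**counts)
print(direct, via_rates)
```

Both values come out at roughly 0.478 for these counts, differing only by floating-point rounding.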

The original formula as given by Matthews was:
${\begin{aligned}N&={\mathit {TN}}+{\mathit {TP}}+{\mathit {FN}}+{\mathit {FP}}\\S&={\frac {{\mathit {TP}}+{\mathit {FN}}}{N}}\\P&={\frac {{\mathit {TP}}+{\mathit {FP}}}{N}}\\{\text{MCC}}&={\frac {{\mathit {TP}}/N-S\times P}{\sqrt {PS(1-S)(1-P)}}}\end{aligned}}$
This is equal to the formula given above. As a correlation coefficient, the Matthews correlation coefficient is the geometric mean of the regression coefficients of the problem and its dual. The component regression coefficients of the Matthews correlation coefficient are Markedness (Δp) and Youden’s J statistic (Informedness or Δp’). Markedness and Informedness correspond to different directions of information flow and generalize Youden’s J statistic, the δp statistics and (as their geometric mean) the Matthews Correlation Coefficient to more than two classes.
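The two statements above, that Matthews' original formula equals the confusion-matrix formula and that the MCC is the geometric mean of Informedness and Markedness, can be checked numerically. This is a sketch on invented counts with illustrative function names; it assumes all marginal sums are non-zero, and it restores the sign explicitly because a plain geometric mean is non-negative.

```python
from math import copysign, sqrt


def mcc_matthews_original(tp, fp, fn, tn):
    n = tn + tp + fn + fp
    s = (tp + fn) / n  # fraction of actual positives
    p = (tp + fp) / n  # fraction of predicted positives
    return (tp / n - s * p) / sqrt(p * s * (1 - s) * (1 - p))


def mcc_from_informedness_markedness(tp, fp, fn, tn):
    informedness = tp / (tp + fn) + tn / (tn + fp) - 1  # Youden's J
    markedness = tp / (tp + fp) + tn / (tn + fn) - 1
    # Both terms share the sign of TP*TN - FP*FN, so their product is
    # non-negative; max() only guards against floating-point noise.
    magnitude = sqrt(max(informedness * markedness, 0.0))
    return copysign(magnitude, informedness)


# Hypothetical counts: one better-than-chance and one worse-than-chance classifier.
good = (6, 1, 2, 3)
bad = (1, 5, 4, 2)

for tp, fp, fn, tn in (good, bad):
    print(mcc_matthews_original(tp, fp, fn, tn),
          mcc_from_informedness_markedness(tp, fp, fn, tn))
```

For the first set of counts both functions give about 0.478, and for the second about -0.507, matching the confusion-matrix formula in each case.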

Some scientists claim the Matthews correlation coefficient to be the most informative single score to establish the quality of a binary classifier prediction in a confusion matrix context.

In abstract terms, the confusion matrix is as follows: [Image: confusion matrix]
