# Deep Learning for Disease Diagnosis: Course 1, Week 2 Assignment. How to Compute Classification Metrics, a Detailed Tutorial


1. Packages
2. Overview
3. Metrics
3.1 True Positives, False Positives, True Negatives, and False Negatives
3.2 Accuracy
3.3 Prevalence
3.4 Sensitivity and Specificity
3.5 PPV and NPV
3.6 ROC Curve
4. Confidence Intervals
5. Precision-Recall Curve
6. F1 Score
7. Calibration

1. Accuracy
2. Prevalence
3. Specificity & Sensitivity
4. PPV and NPV
5. ROC curve and AUCROC (c-statistic)
6. Confidence Intervals
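
All of the metrics above can be read off a confusion matrix. As a quick illustrative sketch (the toy labels below are made up, not from the assignment data), scikit-learn computes the four counts directly:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y = np.array([1, 1, 0, 0, 1, 0])      # toy ground-truth labels
y_hat = np.array([1, 0, 1, 0, 1, 0])  # toy thresholded predictions

# for binary labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
print(tn, fp, fn, tp)  # 2 1 1 2
```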

### 1.3 Load predictions and ground-truth labels from CSV

```python
import numpy as np
import pandas as pd

train_results = pd.read_csv("train_preds.csv")
# the validation predictions are assumed to live in a parallel CSV
valid_results = pd.read_csv("valid_preds.csv")

# ground-truth labels and predicted scores for the validation set
y = valid_results[class_labels].values
pred = valid_results[pred_labels].values
```


### 1.4.1 True positives, false positives, true negatives, and false negatives

• If the model output ≥ the threshold $th$, the prediction is pred = 1
• If the model output < the threshold $th$, the prediction is pred = 0

```python
import numpy as np

def true_positives(y, pred, th=0.5):
    # threshold the raw scores
    thresholded_preds = pred >= th
    # TP: label is positive and prediction is positive
    TP = np.sum((y == 1) & (thresholded_preds == 1))
    return TP

def true_negatives(y, pred, th=0.5):
    thresholded_preds = pred >= th
    # TN: label is negative and prediction is negative
    TN = np.sum((y == 0) & (thresholded_preds == 0))
    return TN

def false_positives(y, pred, th=0.5):
    thresholded_preds = pred >= th
    # FP: label is negative but prediction is positive
    FP = np.sum((y == 0) & (thresholded_preds == 1))
    return FP

def false_negatives(y, pred, th=0.5):
    thresholded_preds = pred >= th
    # FN: label is positive but prediction is negative
    FN = np.sum((y == 1) & (thresholded_preds == 0))
    return FN
```
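
As a quick sanity check on the thresholding logic, the four counts can be computed by hand on a tiny made-up array (the scores below are illustrative, not from the assignment):

```python
import numpy as np

y = np.array([1, 1, 0, 0, 1, 0])                   # ground truth
scores = np.array([0.9, 0.3, 0.8, 0.2, 0.7, 0.1])  # model outputs
th = 0.5

thresholded = scores >= th
TP = np.sum((y == 1) & (thresholded == 1))
TN = np.sum((y == 0) & (thresholded == 0))
FP = np.sum((y == 0) & (thresholded == 1))
FN = np.sum((y == 1) & (thresholded == 0))
print(TP, TN, FP, FN)  # 2 2 1 1 -- the four counts partition all 6 examples
```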


```python
util.get_performance_metrics(y, pred, class_labels)
```

### 1.4.2 Accuracy

$$accuracy = \frac{\text{true positives} + \text{true negatives}}{\text{true positives} + \text{true negatives} + \text{false positives} + \text{false negatives}}$$

```python
def get_accuracy(y, pred, th=0.5):
    TP = true_positives(y, pred, th)
    FP = false_positives(y, pred, th)
    TN = true_negatives(y, pred, th)
    FN = false_negatives(y, pred, th)
    # accuracy: fraction of all examples classified correctly
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    return accuracy
```
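
Accuracy alone can be misleading on imbalanced data, which is why the prevalence-aware metrics below matter. A small made-up illustration:

```python
import numpy as np

# 100 patients, only 10 diseased; a degenerate model that always outputs 0
y = np.array([1] * 10 + [0] * 90)
pred = np.zeros(100)

accuracy = np.mean(y == (pred >= 0.5))
print(accuracy)  # 0.9, even though every diseased patient is missed
```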


### 1.4.3 Prevalence

1. In a medical context, prevalence is the proportion of people in a population who have a disease (or condition).
2. In machine learning terms, it is the proportion of positive examples:

$$prevalence = \frac{1}{N} \sum_{i} y_i$$

where $N$ is the total number of examples.

```python
def get_prevalence(y):
    # prevalence: mean of the binary labels, i.e. the fraction of positives
    prevalence = np.mean(y)
    return prevalence
```


```python
# Test
y_test = np.array([1, 0, 0, 1, 1, 0, 0, 0, 0, 1])
print(f"computed prevalence: {get_prevalence(y_test)}")
```


Output: `computed prevalence: 0.4` (4 positives out of 10).

### 1.4.4 Sensitivity and Specificity

• Sensitivity is the probability that the model predicts positive given that the case is actually positive.
• Specificity is the probability that the model predicts negative given that the case is actually negative.

$$sensitivity = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$

$$specificity = \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}}$$

```python
def get_sensitivity(y, pred, th=0.5):
    TP = true_positives(y, pred, th)
    FN = false_negatives(y, pred, th)
    # sensitivity: fraction of actual positives predicted positive
    sensitivity = TP / (TP + FN)
    return sensitivity

def get_specificity(y, pred, th=0.5):
    TN = true_negatives(y, pred, th)
    FP = false_positives(y, pred, th)
    # specificity: fraction of actual negatives predicted negative
    specificity = TN / (TN + FP)
    return specificity
```
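
These two quantities are just per-class recall, so they can be cross-checked against scikit-learn's `recall_score` on toy data (the labels below are made up):

```python
import numpy as np
from sklearn.metrics import recall_score

y = np.array([1, 1, 0, 0, 1, 0])
y_hat = np.array([1, 0, 1, 0, 1, 0])

sensitivity = recall_score(y, y_hat, pos_label=1)  # recall of the positive class
specificity = recall_score(y, y_hat, pos_label=0)  # recall of the negative class
print(sensitivity, specificity)
```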


### 1.4.5 PPV and NPV

• Positive predictive value (PPV) is the probability that a subject with a positive screening test truly has the disease.
• Negative predictive value (NPV) is the probability that a subject with a negative screening test truly does not have the disease.

$$PPV = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

$$NPV = \frac{\text{true negatives}}{\text{true negatives} + \text{false negatives}}$$

```python
def get_ppv(y, pred, th=0.5):
    TP = true_positives(y, pred, th)
    FP = false_positives(y, pred, th)
    # PPV: fraction of predicted positives that are actually positive
    PPV = TP / (TP + FP)
    return PPV

def get_npv(y, pred, th=0.5):
    TN = true_negatives(y, pred, th)
    FN = false_negatives(y, pred, th)
    # NPV: fraction of predicted negatives that are actually negative
    NPV = TN / (TN + FN)
    return NPV
```
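
Unlike sensitivity and specificity, PPV depends on prevalence. A hedged sketch via Bayes' rule (the `ppv_from_rates` helper and its numbers are illustrative, not part of the assignment):

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    # PPV = P(disease | positive test), rewritten with Bayes' rule
    true_pos_mass = sensitivity * prevalence
    false_pos_mass = (1 - specificity) * (1 - prevalence)
    return true_pos_mass / (true_pos_mass + false_pos_mass)

# the same test (90% sensitive, 90% specific) at two prevalences
print(ppv_from_rates(0.9, 0.9, 0.5))   # 0.9
print(ppv_from_rates(0.9, 0.9, 0.01))  # ~0.083: most positives are false alarms
```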


### 1.4.6 ROC Curve and AUC

```python
util.get_curve(y, pred, class_labels)
```


The area under the ROC curve, also called the AUROC or C-statistic, is a measure of goodness of fit.

```python
from sklearn.metrics import roc_auc_score

auc = roc_auc_score(y[:, 0], pred[:, 0])
```
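
On a toy example (the classic four-point arrays from the scikit-learn docs), the AUC has an easy interpretation: the fraction of (positive, negative) pairs the model ranks correctly.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_toy = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# 3 of the 4 (positive, negative) pairs are ranked correctly -> AUC = 0.75
print(roc_auc_score(y_toy, scores))  # 0.75
```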


### 1.5 Confidence Intervals

1. Perform stratified bootstrap sampling n times:

```python
def bootstrap_auc(y, pred, classes, bootstraps=100, fold_size=1000):
    statistics = np.zeros((len(classes), bootstraps))

    for c in range(len(classes)):
        df = pd.DataFrame(columns=['y', 'pred'])
        df.loc[:, 'y'] = y[:, c]
        df.loc[:, 'pred'] = pred[:, c]
        # separate positive and negative examples for stratified sampling
        df_pos = df[df.y == 1]
        df_neg = df[df.y == 0]
        prevalence = len(df_pos) / len(df)
        for i in range(bootstraps):
            # sample positives and negatives in proportion to the prevalence
            pos_sample = df_pos.sample(n=int(fold_size * prevalence), replace=True)
            neg_sample = df_neg.sample(n=int(fold_size * (1 - prevalence)), replace=True)

            y_sample = np.concatenate([pos_sample.y.values, neg_sample.y.values])
            pred_sample = np.concatenate([pos_sample.pred.values, neg_sample.pred.values])
            score = roc_auc_score(y_sample, pred_sample)
            statistics[c][i] = score
    return statistics

statistics = bootstrap_auc(y, pred, class_labels)
```


2. Compute the mean and confidence interval for each class from `statistics`:
• Mean: the average of the 100 bootstrap scores.
• Confidence interval: for a 5%-95% interval, take the corresponding quantiles with `np.quantile()`.

```python
def print_confidence_intervals(class_labels, statistics):
    df = pd.DataFrame(columns=["Mean AUC (CI 5%-95%)"])
    for i in range(len(class_labels)):
        mean = statistics.mean(axis=1)[i]
        max_ = np.quantile(statistics, .95, axis=1)[i]
        min_ = np.quantile(statistics, .05, axis=1)[i]
        df.loc[class_labels[i]] = ["%.2f (%.2f-%.2f)" % (mean, min_, max_)]
    return df
```


The `print_confidence_intervals` function lives in `utils.py`.
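
The percentile step on its own, with simulated bootstrap scores (drawing them from a normal distribution is purely an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# pretend these are 100 bootstrap AUC estimates for one class
boot_scores = rng.normal(loc=0.85, scale=0.02, size=100)

mean = boot_scores.mean()
lo, hi = np.quantile(boot_scores, [0.05, 0.95])
print("%.2f (%.2f-%.2f)" % (mean, lo, hi))
```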

### 1.6 Precision-Recall Curve

• Precision measures how relevant the returned results are; it is equivalent to the PPV defined above.
• Recall measures how many of the truly relevant results are returned; it is equivalent to the sensitivity defined above.

The precision-recall curve can be computed with `sklearn.metrics.precision_recall_curve`.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import (roc_curve, roc_auc_score,
                             precision_recall_curve, average_precision_score)

def get_curve(gt, pred, target_names, curve='roc'):
    for i in range(len(target_names)):
        if curve == 'roc':
            curve_function = roc_curve
            auc_roc = roc_auc_score(gt[:, i], pred[:, i])
            label = target_names[i] + " AUC: %.3f " % auc_roc
            xlabel = "False positive rate"
            ylabel = "True positive rate"
            a, b, _ = curve_function(gt[:, i], pred[:, i])
            plt.figure(1, figsize=(7, 7))
            plt.plot([0, 1], [0, 1], 'k--')
            plt.plot(a, b, label=label)
            plt.xlabel(xlabel)
            plt.ylabel(ylabel)
            plt.legend(loc='upper center', bbox_to_anchor=(1.3, 1),
                       fancybox=True, ncol=1)
        elif curve == 'prc':
            precision, recall, _ = precision_recall_curve(gt[:, i], pred[:, i])
            average_precision = average_precision_score(gt[:, i], pred[:, i])
            label = target_names[i] + " Avg.: %.3f " % average_precision
            plt.figure(1, figsize=(7, 7))
            plt.step(recall, precision, where='post', label=label)
            plt.xlabel('Recall')
            plt.ylabel('Precision')
            plt.ylim([0.0, 1.05])
            plt.xlim([0.0, 1.0])
            plt.legend(loc='upper center', bbox_to_anchor=(1.3, 1),
                       fancybox=True, ncol=1)

util.get_curve(y, pred, class_labels, curve='prc')
```


### 1.7 F1 Score

The F1 score is the harmonic mean of precision and recall; it reaches its best value at 1 (perfect precision and recall) and its worst at 0.

$$F_1 = \frac{2 \cdot precision \cdot recall}{precision + recall}$$

```python
from sklearn.metrics import f1_score

f1_score(y[:, 0], pred[:, 0] > 0.5)
```
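
A quick check on toy labels that `f1_score` agrees with the harmonic-mean formula:

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

y_toy = np.array([1, 1, 0, 0, 1, 0])
y_hat = np.array([1, 0, 1, 0, 1, 0])

p = precision_score(y_toy, y_hat)  # 2/3
r = recall_score(y_toy, y_hat)     # 2/3
f1 = f1_score(y_toy, y_hat)
print(np.isclose(f1, 2 * p * r / (p + r)))  # True
```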


### 1.8 Calibration Curve

```python
from sklearn.calibration import calibration_curve

def plot_calibration_curve(y, pred):
    plt.figure(figsize=(20, 20))
    for i in range(len(class_labels)):
        plt.subplot(4, 4, i + 1)
        fraction_of_positives, mean_predicted_value = calibration_curve(y[:, i], pred[:, i], n_bins=20)
        plt.plot([0, 1], [0, 1], linestyle='--')
        plt.plot(mean_predicted_value, fraction_of_positives, marker='.')
        plt.xlabel("Predicted Value")
        plt.ylabel("Fraction of Positives")
        plt.title(class_labels[i])
    plt.tight_layout()
    plt.show()

plot_calibration_curve(y, pred)
```


To recalibrate the scores, fit a per-class logistic regression (Platt scaling) on the training-set predictions, then apply it to the validation predictions:

```python
from sklearn.linear_model import LogisticRegression as LR

y_train = train_results[class_labels].values
pred_train = train_results[pred_labels].values
pred_calibrated = np.zeros_like(pred)

for i in range(len(class_labels)):
    lr = LR(solver='liblinear', max_iter=10000)
    # fit a 1-D logistic regression mapping raw scores to calibrated probabilities
    lr.fit(pred_train[:, i].reshape(-1, 1), y_train[:, i])
    pred_calibrated[:, i] = lr.predict_proba(pred[:, i].reshape(-1, 1))[:, 1]

plot_calibration_curve(y, pred_calibrated)
```

