A walkthrough of model_selection.py
Module
This module contains functions and classes for model evaluation and selection.
1.cm_element(y_true, y_pred)
Function:
(1) Purpose:
It computes the elements of a confusion matrix and returns tp, tn, fp, fn.
def cm_element(y_true, y_pred):
(2) Parameters (function inputs):
y_true
: array-like
Target values of samples.
y_pred
: array-like
Predicted class labels.
(3) Returns:
tp
: int
True positive.
tn
: int
True negative.
fp
: int
False positive.
fn
: int
False negative.
# Elements of the confusion matrix
# How tp, tn, fp, fn are counted (class labels are expected to be 1 and -1)
tp, tn, fp, fn = 0, 0, 0, 0
for i in range(y_true.shape[0]):
    # True positive
    if y_true[i] == 1 and y_pred[i] == 1:
        tp = tp + 1
    # True negative
    elif y_true[i] == -1 and y_pred[i] == -1:
        tn = tn + 1
    # False positive
    elif y_true[i] == -1 and y_pred[i] == 1:
        fp = fp + 1
    # False negative
    elif y_true[i] == 1 and y_pred[i] == -1:
        fn = fn + 1
return tp, tn, fp, fn
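To make the counting rule concrete, here is a minimal sketch on a tiny hand-made example. The helper name count_confusion is hypothetical, and it takes plain Python lists rather than the arrays the original function expects. Note that a sample whose label is neither 1 nor -1 matches none of the four branches and is simply not counted.

```python
def count_confusion(y_true, y_pred):
    """Count tp, tn, fp, fn for labels in {1, -1} (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1  # true positive
        elif t == -1 and p == -1:
            tn += 1  # true negative
        elif t == -1 and p == 1:
            fp += 1  # false positive
        elif t == 1 and p == -1:
            fn += 1  # false negative
    return tp, tn, fp, fn


# Toy labels, made up for illustration only.
y_true = [1, 1, -1, -1, 1, -1]
y_pred = [1, -1, -1, 1, 1, -1]
print(count_confusion(y_true, y_pred))  # (2, 2, 1, 1)
```

The four counts always sum to the number of samples only when every label is 1 or -1, which is a quick sanity check worth running on real data.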
2.performance_eval(tp, tn, fp, fn)
Function:
(1) Purpose:
It computes common evaluation metrics from the elements of a confusion matrix (tp, tn, fp, fn): accuracy, recall_p, precision_p, f1_p, recall_n, precision_n, and f1_n.
def performance_eval(tp, tn, fp, fn):
(2) Parameters (function inputs):
tp
: int
True positive.
tn
: int
True negative.
fp
: int
False positive.
fn
: int
False negative.
(3) Returns (every metric is multiplied by 100, i.e. reported as a percentage):
accuracy :
float
Overall accuracy of the model.
accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_p :
float
Recall of positive class. rec_p=tp / (tp + fn)
precision_p :
float
Precision of positive class. prec_p =tp / (tp + fp)
f1_p :
float
F1-measure of positive class. f1_p=(2 * r_p * p_p) / (p_p + r_p)
recall_n :
float
Recall of negative class. rec_n=tn / (tn + fp)
precision_n :
float
Precision of negative class. prec_n=tn / (tn + fn)
f1_n :
float
F1-measure of negative class. f1_n = (2 * r_n * p_n) / (p_n + r_n)
(4) Code walkthrough:
TODO: This method should be reviewed!
# Compute the totals of samples predicted as positive and as negative
positives = tp + fp
negatives = tn + fn
# Metric functions
rec_p = lambda tp, fn: 0.0 if tp + fn == 0 else tp / (tp + fn)
# If tp + fn == 0 (both tp and fn are 0), rec_p = 0; otherwise rec_p = tp / (tp + fn)
prec_p = lambda tp, fp: 0.0 if tp + fp == 0 else tp / (tp + fp)
#prec_p =tp / (tp + fp)
f1_p = lambda r_p, p_p: 0.0 if r_p == 0.0 or p_p == 0.0 else (2 * r_p * p_p) / (p_p + r_p)
#f1_p=(2 * r_p * p_p) / (p_p + r_p)
rec_n = lambda tn, fp: 0.0 if tn + fp == 0 else tn / (tn + fp)
#rec_n=tn / (tn + fp)
prec_n = lambda tn, fn: 0.0 if tn + fn == 0 else tn / (tn + fn)
#prec_n=tn / (tn + fn)
f1_n = lambda r_n, p_n: 0.0 if r_n == 0.0 or p_n == 0.0 else (2 * r_n * p_n) / (p_n + r_n)
#f1_n = (2 * r_n * p_n) / (p_n + r_n)
accuracy = (tp + tn) / (positives + negatives)
# Positive class
recall_p = rec_p(tp, fn)
precision_p = prec_p(tp, fp)
f1m_p = f1_p(recall_p, precision_p)
# Negative class
recall_n = rec_n(tn, fp)
precision_n = prec_n(tn, fn)
f1m_n = f1_n(recall_n, precision_n)
return accuracy * 100, recall_p * 100, precision_p * 100, f1m_p * 100, recall_n * 100, precision_n * 100, f1m_n * 100
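The formulas above can be checked by hand with a small worked example. The sketch below is not the module's code: safe_div and metrics_from_counts are hypothetical names, and unlike the original it also guards the accuracy division against an empty confusion matrix. The counts are illustrative only.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return 0.0 if den == 0 else num / den


def metrics_from_counts(tp, tn, fp, fn):
    """Hypothetical helper: percentage metrics from confusion-matrix counts."""
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    # Positive class
    recall_p = safe_div(tp, tp + fn)
    precision_p = safe_div(tp, tp + fp)
    f1_p = safe_div(2 * recall_p * precision_p, recall_p + precision_p)
    # Negative class
    recall_n = safe_div(tn, tn + fp)
    precision_n = safe_div(tn, tn + fn)
    f1_n = safe_div(2 * recall_n * precision_n, recall_n + precision_n)
    values = (accuracy, recall_p, precision_p, f1_p,
              recall_n, precision_n, f1_n)
    return tuple(100 * v for v in values)


# Illustrative counts, not taken from any real experiment.
metrics = metrics_from_counts(tp=40, tn=45, fp=5, fn=10)
print([round(m, 2) for m in metrics])
# [85.0, 80.0, 88.89, 84.21, 90.0, 81.82, 85.71]
```

Here precision_n is 45 / (45 + 10), which is why the negative-class precision must be computed from tn and fn rather than tp and fn.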
3.eval_metrics(y_true, y_pred)
Function:
(1) Purpose:
It computes common evaluation metrics such as accuracy, recall, precision, and F1-measure, together with the elements of the confusion matrix.
def eval_metrics(y_true, y_pred):
(2) Parameters (function inputs):
y_true
: array-like
Target values of samples.
y_pred
: array-like
Predicted class labels.
(3) Returns:
tp
: int
True positive.
tn
: int
True negative.
fp
: int
False positive.
fn
: int
False negative.
accuracy :
float
Overall accuracy of the model.
accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_p :
float
Recall of positive class. rec_p=tp / (tp + fn)
precision_p :
float
Precision of positive class. prec_p =tp / (tp + fp)
f1_p :
float
F1-measure of positive class. f1_p=(2 * r_p * p_p) / (p_p + r_p)
recall_n :
float
Recall of negative class. rec_n=tn / (tn + fp)
precision_n :
float
Precision of negative class. prec_n=tn / (tn + fn)
f1_n :
float
F1-measure of negative class. f1_n = (2 * r_n * p_n) / (p_n + r_n)
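(4) Code walkthrough:
# Call cm_element(y_true, y_pred), defined above, to count tp, tn, fp, fn
tp, tn, fp, fn = cm_element(y_true, y_pred)
# Call performance_eval(tp, tn, fp, fn), defined above, to compute accuracy, recall_p, precision_p, f1_p, recall_n, precision_n, f1_n
accuracy, recall_p, precision_p, f1_p, recall_n, precision_n, f1_n = performance_eval(tp, tn, fp, fn)
# Return everything in the order listed under Returns
return tp, tn, fp, fn, accuracy, recall_p, precision_p, f1_p, recall_n, precision_n, f1_n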
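Because eval_metrics hands back eleven values in a fixed order, callers end up indexing into the result by position (index 4 is accuracy, index 10 is f1_n). One possible way to make that order explicit, shown here only as a sketch, is to wrap the tuple in a namedtuple. EvalResult is a hypothetical name that does not exist in the module, and the numbers are made up.

```python
from collections import namedtuple

# Field order mirrors the Returns list of eval_metrics above.
EvalResult = namedtuple(
    "EvalResult",
    ["tp", "tn", "fp", "fn", "accuracy",
     "recall_p", "precision_p", "f1_p",
     "recall_n", "precision_n", "f1_n"],
)

# Made-up values purely for illustration (not real results).
raw = (40, 45, 5, 10, 85.0, 80.0, 88.9, 84.2, 90.0, 81.8, 85.7)
result = EvalResult(*raw)

print(result.accuracy == raw[4])         # True
print(EvalResult._fields.index("f1_n"))  # 10
```

A namedtuple still supports positional indexing, so code that reads result[4] keeps working while new code can use result.accuracy.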
4.class Validator
Class purpose: validator.
It evaluates a TSVM-based estimator based on the specified evaluation method.
(1) Signature: class Validator(X_train, y_train, validator_type, estimator)
(2) Usage: Validator(X_train, y_train, validator_type, estimator)
(3) Parameters:
X_train :
array-like, shape (n_samples, n_features)
Training feature vectors, where n_samples is the number of samples
and n_features is the number of features.
y_train :
array-like, shape (n_samples,)
Target values or class labels.
validator_type :
tuple
A two-element tuple which contains type of evaluation method and its
parameter. Example: ('CV', 5) -> 5-fold cross-validation,
('t_t_split', 30) -> 30% of samples for test set.
estimator :
estimator object
A TSVM-based estimator which inherits from the BaseTSVM class.
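To see what the two validator_type examples mean in terms of sample counts, the sketch below works out the approximate train and test sizes for one evaluation round. split_sizes is a hypothetical helper written for this post, not part of the module, and the library's own rounding may differ when the sample count does not divide evenly.

```python
def split_sizes(n_samples, validator_type):
    """Return (train_size, test_size) for one round (hypothetical helper)."""
    method, param = validator_type
    if method == "CV":
        # k-fold: each round holds out roughly 1/k of the samples
        test_size = n_samples // param
    elif method == "t_t_split":
        # param is the percentage of samples reserved for the test set
        test_size = n_samples * param // 100
    else:
        raise ValueError(f"unknown evaluation method: {method!r}")
    return n_samples - test_size, test_size


print(split_sizes(200, ("CV", 5)))          # (160, 40)
print(split_sizes(200, ("t_t_split", 30)))  # (140, 60)
```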
Methods of the class:
【1】def __init__(self, X_train, y_train, validator_type, estimator):
Initialize self. The parameters are described above.
self.train_data = X_train
self.labels_data = y_train
self.validator = validator_type
self.estimator = estimator
【2】def cv_validator(self, dict_param):
(1) Purpose:
It evaluates a TSVM-based estimator using the cross-validation method.
(2) Parameters:
dict_param
: dict
Values of hyper-parameters for a TSVM-based estimator.
(3) Returns:
float
Mean accuracy of the model.
float
Standard deviation of accuracy.
dict
Evaluation metrics such as Recall, Precision and F1-measure for
both classes as well as elements of the confusion matrix.
(4) Code walkthrough:
# np is NumPy; KFold is presumably sklearn.model_selection.KFold
self.estimator.set_params(**dict_param)
k_fold = KFold(self.validator[1])
# Store result after each run
mean_accuracy = []
# Positive class
mean_recall_p, mean_precision_p, mean_f1_p = [], [], []
# Negative class
mean_recall_n, mean_precision_n, mean_f1_n = [], [], []
# Count elements of confusion matrix
tp, tn, fp, fn = 0, 0, 0, 0
for train_index, test_index in k_fold.split(self.train_data):
    # Extract data based on index created by k_fold
    X_train = np.take(self.train_data, train_index, axis=0)
    X_test = np.take(self.train_data, test_index, axis=0)
    y_train = np.take(self.labels_data, train_index, axis=0)
    y_test = np.take(self.labels_data, test_index, axis=0)
    self.estimator.fit(X_train, y_train)
    output = self.estimator.predict(X_test)
    # eval_metrics returns (tp, tn, fp, fn, accuracy, recall_p, precision_p,
    # f1_p, recall_n, precision_n, f1_n), so index 4 is the accuracy
    accuracy_test = eval_metrics(y_test, output)
    mean_accuracy.append(accuracy_test[4])
    # Positive class
    mean_recall_p.append(accuracy_test[5])
    mean_precision_p.append(accuracy_test[6])
    mean_f1_p.append(accuracy_test[7])
    # Negative class
    mean_recall_n.append(accuracy_test[8])
    mean_precision_n.append(accuracy_test[9])
    mean_f1_n.append(accuracy_test[10])
    # Count
    tp = tp + accuracy_test[0]
    tn = tn + accuracy_test[1]
    fp = fp + accuracy_test[2]
    fn = fn + accuracy_test[3]
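The loop above follows the usual k-fold pattern: split, fit on the training part, score on the held-out part, then aggregate. The sketch below reproduces that pattern in plain Python so it runs without NumPy or scikit-learn. Everything in it is an assumption made for illustration: kfold_indices imitates consecutive, unshuffled folds, majority_label is a toy stand-in for the estimator's fit and predict, the labels are made up, and the population standard deviation is used for the spread.

```python
from statistics import mean, pstdev


def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) for consecutive, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_label(labels):
    """Toy stand-in for fit/predict: always predict the majority label."""
    positives = sum(1 for y in labels if y == 1)
    return 1 if positives >= len(labels) / 2 else -1


# Made-up labels for illustration only.
labels = [1, 1, -1, 1, -1, -1, 1, 1, -1, 1]
fold_accuracy = []
for train_idx, test_idx in kfold_indices(len(labels), 5):
    guess = majority_label([labels[i] for i in train_idx])
    hits = sum(1 for i in test_idx if labels[i] == guess)
    fold_accuracy.append(100 * hits / len(test_idx))

print(fold_accuracy)         # [100.0, 50.0, 0.0, 100.0, 50.0]
print(mean(fold_accuracy))   # 60.0
print(pstdev(fold_accuracy))
```

The per-fold accuracies vary widely here because each fold holds only two samples, which is exactly why the method reports a standard deviation next to the mean.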