Starting from the user guide:
By default, the score computed at each CV iteration is the score
method of the estimator. It is possible to change this by using the
scoring parameter:
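To make this concrete, here is a minimal sketch (the classifier and dataset are only placeholders) of the default scoring versus an explicit scoring parameter:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0)

# Default: each CV iteration calls clf.score, i.e. accuracy for a classifier
default_scores = cross_val_score(clf, X, y, cv=5)

# Override the default with the scoring parameter
f1_scores = cross_val_score(clf, X, y, cv=5, scoring="f1_macro")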
From the DecisionTreeClassifier documentation:
Returns the mean accuracy on the given test data and labels. In
multi-label classification, this is the subset accuracy which is a
harsh metric since you require for each sample that each label set be
correctly predicted.
Don't be misled by "mean accuracy"; it is just the regular way people compute accuracy. Follow the link to the source:
# body of ClassifierMixin.score, which DecisionTreeClassifier inherits
from .metrics import accuracy_score
return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
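A quick way to check this, assuming any fitted classifier (DecisionTreeClassifier is used here only as an example), is that score and accuracy_score give the same number:

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# score is just accuracy_score applied to the predictions
assert clf.score(X, y) == accuracy_score(y, clf.predict(X))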
And now the source of metrics.accuracy_score:
def accuracy_score(y_true, y_pred, normalize=True, sample_weight=None):
    ...
    # Compute accuracy for each possible representation
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    if y_type.startswith('multilabel'):
        differing_labels = count_nonzero(y_true - y_pred, axis=1)
        score = differing_labels == 0
    else:
        score = y_true == y_pred

    return _weighted_sum(score, sample_weight, normalize)
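Before moving on to _weighted_sum, note the multilabel branch: a sample counts as correct only when its entire label set is predicted exactly, which is what the docstring means by a harsh metric. A small sketch with label-indicator arrays:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([[1, 1, 0],
                   [0, 1, 0]])
# The second sample gets one label wrong, so the whole sample counts as incorrect
y_pred = np.array([[1, 1, 0],
                   [0, 1, 1]])

print(accuracy_score(y_true, y_pred))  # 0.5: only the first sample matches exactly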
def _weighted_sum(sample_score, sample_weight, normalize=False):
    if normalize:
        return np.average(sample_score, weights=sample_weight)
    elif sample_weight is not None:
        return np.dot(sample_score, sample_weight)
    else:
        return sample_score.sum()
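Since normalize and sample_weight are passed straight through from accuracy_score, all three branches can be reached directly from the public function:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])

print(accuracy_score(y_true, y_pred))                   # 0.75 -> np.average branch
print(accuracy_score(y_true, y_pred, normalize=False))  # 3    -> plain sum of correct predictions
print(accuracy_score(y_true, y_pred,
                     sample_weight=[1, 1, 1, 2]))       # 0.8  -> weighted average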
Note: for accuracy_score the normalize parameter defaults to True, so _weighted_sum simply returns the np.average of the boolean NumPy array, i.e. the mean of the correct predictions.
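Stripped of scikit-learn entirely, that last step amounts to averaging a boolean array:

import numpy as np

correct = np.array([True, True, False, True])  # per-sample "was this prediction right?"
print(np.average(correct))                      # 0.75, the mean accuracy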