# ML with sklearn: A detailed guide to the parameters and usage of common functions in sklearn.metrics (e.g., confusion_matrix)

# Common function parameters in sklearn.metrics

## The confusion_matrix function explained

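For binary classification, rows are the true labels and columns are the predicted labels:

|            | Predicted 0         | Predicted 1         |
|------------|---------------------|---------------------|
| **True 0** | TN (`C[0, 0]`)      | FP (`C[0, 1]`)      |
| **True 1** | FN (`C[1, 0]`)      | TP (`C[1, 1]`)      |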
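The true-versus-predicted layout above can be checked directly. The following is a minimal sketch, assuming made-up labels that are not from the original article, which unpacks the four cells of a binary confusion matrix in row-major order.

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels (an assumption for this sketch): 1 = positive, 0 = negative.
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1]

# Rows index the true class, columns index the predicted class.
cm = confusion_matrix(y_true, y_pred)

# ravel() flattens row by row, so the order is C[0,0], C[0,1], C[1,0], C[1,1].
tn, fp, fn, tp = cm.ravel()

print(cm)
print("TN, FP, FN, TP =", tn, fp, fn, tp)
```

Because the flattening is row-major, the unpacking order `tn, fp, fn, tp` only holds for the binary case with labels sorted as `[0, 1]`.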

Source of `confusion_matrix`, found at `sklearn.metrics._classification`:

```python
@_deprecate_positional_args
def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None):
    """Compute confusion matrix to evaluate the accuracy of a classification.

    By definition a confusion matrix :math:`C` is such that :math:`C_{i, j}`
    is equal to the number of observations known to be in group :math:`i` and
    predicted to be in group :math:`j`.

    Thus in binary classification, the count of true negatives is
    :math:`C_{0,0}`, false negatives is :math:`C_{1,0}`, true positives is
    :math:`C_{1,1}` and false positives is :math:`C_{0,1}`.

    Read more in the :ref:`User Guide`.

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground truth (correct) target values.

    y_pred : array-like of shape (n_samples,)
        Estimated targets as returned by a classifier.

    labels : array-like of shape (n_classes), default=None
        List of labels to index the matrix. This may be used to reorder
        or select a subset of labels.
        If ``None`` is given, those that appear at least once
        in ``y_true`` or ``y_pred`` are used in sorted order.

    sample_weight : array-like of shape (n_samples,), default=None
        Sample weights.

        .. versionadded:: 0.18

    normalize : {'true', 'pred', 'all'}, default=None
        Normalizes confusion matrix over the true (rows), predicted (columns)
        conditions or all the population. If None, confusion matrix will not be
        normalized.

    Returns
    -------
    C : ndarray of shape (n_classes, n_classes)
        Confusion matrix whose i-th row and j-th column entry indicates the
        number of samples with true label being i-th class and predicted
        label being j-th class.

    References
    ----------
    .. [1] Wikipedia entry for the Confusion matrix
           (Wikipedia and other references may use a different
           convention for axes)

    Examples
    --------
    >>> from sklearn.metrics import confusion_matrix
    >>> y_true = [2, 0, 2, 2, 0, 1]
    >>> y_pred = [0, 0, 2, 2, 0, 2]
    >>> confusion_matrix(y_true, y_pred)
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])

    >>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    >>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
    >>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])

    In the binary case, we can extract true positives, etc as follows:

    >>> tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
    >>> (tn, fp, fn, tp)
    (0, 2, 1, 1)
    """
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    if y_type not in ("binary", "multiclass"):
        raise ValueError("%s is not supported" % y_type)

    if labels is None:
        labels = unique_labels(y_true, y_pred)
    else:
        labels = np.asarray(labels)
        n_labels = labels.size
        if n_labels == 0:
            raise ValueError("'labels' should contains at least one label.")
        elif y_true.size == 0:
            # Builtin int; the np.int alias was removed in NumPy 1.24.
            return np.zeros((n_labels, n_labels), dtype=int)
        elif np.all([l not in y_true for l in labels]):
            raise ValueError("At least one label specified must be in y_true")

    if sample_weight is None:
        sample_weight = np.ones(y_true.shape[0], dtype=np.int64)
    else:
        sample_weight = np.asarray(sample_weight)

    check_consistent_length(y_true, y_pred, sample_weight)

    if normalize not in ['true', 'pred', 'all', None]:
        raise ValueError("normalize must be one of {'true', 'pred', "
                         "'all', None}")

    n_labels = labels.size
    label_to_ind = {y: x for x, y in enumerate(labels)}
    # convert yt, yp into index
    y_pred = np.array([label_to_ind.get(x, n_labels + 1) for x in y_pred])
    y_true = np.array([label_to_ind.get(x, n_labels + 1) for x in y_true])

    # intersect y_pred, y_true with labels, eliminate items not in labels
    ind = np.logical_and(y_pred < n_labels, y_true < n_labels)
    y_pred = y_pred[ind]
    y_true = y_true[ind]
    # also eliminate weights of eliminated items
    sample_weight = sample_weight[ind]

    # Choose the accumulator dtype to always have high precision
    if sample_weight.dtype.kind in {'i', 'u', 'b'}:
        dtype = np.int64
    else:
        dtype = np.float64

    cm = coo_matrix((sample_weight, (y_true, y_pred)),
                    shape=(n_labels, n_labels), dtype=dtype,
                    ).toarray()

    with np.errstate(all='ignore'):
        if normalize == 'true':
            cm = cm / cm.sum(axis=1, keepdims=True)
        elif normalize == 'pred':
            cm = cm / cm.sum(axis=0, keepdims=True)
        elif normalize == 'all':
            cm = cm / cm.sum()
        cm = np.nan_to_num(cm)

    return cm
```
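The `labels`, `normalize`, and `sample_weight` parameters described above are easier to follow with a concrete call. The sketch below reuses the cat/ant/bird data from the docstring example; the weight values are an assumption chosen only for illustration, and `normalize` requires a scikit-learn version that accepts it (it appears in the signature shown above).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

# Raw counts, with rows and columns ordered as given in `labels`.
counts = confusion_matrix(y_true, y_pred, labels=labels)

# normalize="true": each row (true class) is divided by its row sum.
row_norm = confusion_matrix(y_true, y_pred, labels=labels, normalize="true")

# normalize="pred": each column (predicted class) is divided by its column sum.
# "bird" is never predicted, so its column is 0/0; nan_to_num turns that into 0.
col_norm = confusion_matrix(y_true, y_pred, labels=labels, normalize="pred")

# A subset of labels drops every sample whose true or predicted label
# is not listed (here, the single "bird" sample).
subset = confusion_matrix(y_true, y_pred, labels=["ant", "cat"])

# Hypothetical weights: the last sample counts 2.5 times.
# Non-integer weights make the result a float array.
weights = [1, 1, 1, 1, 1, 2.5]
weighted = confusion_matrix(y_true, y_pred, labels=labels,
                            sample_weight=weights)

np.set_printoptions(precision=3, suppress=True)
for name, matrix in [("counts", counts), ("row_norm", row_norm),
                     ("col_norm", col_norm), ("subset", subset),
                     ("weighted", weighted)]:
    print(name)
    print(matrix)
```

Row normalization puts per-class recall on the diagonal, and column normalization puts per-class precision there, which is often the more readable view when classes are imbalanced.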
