1.confusion_matrix
利用混淆矩阵进行评估
混淆矩阵说白了就是一张表格-
所有正确的预测结果都在对角线上,所以从混淆矩阵中可以很方便直观的看出哪里有错误,因为他们呈现在对角线外面。
举个直观的例子
这个表格是一个混淆矩阵
正确的值是上边的表格,混淆矩阵是下面的表格,这就表示,apple应该有两个,但是只预测对了一个,其中一个判断为banana了,banana应该有8ge,但是5个预测对了3个判断为pear了,pear有应该有6个,但是2个判断为apple了,可见对角线上是正确的预测值,对角线之外的都是错误的。
这个混淆矩阵的实现代码
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
y_test=["a","b","p","b","b","b","b","p","b","p","b","b","p","p","p","a"]
y_pred=["a","b","p","p","p","p","b","p","b","p","b","b","a","a","p","b"]
confusion_matrix(y_test, y_pred,labels=["a", "b","p"])
#array([[1, 1, 0],
[0, 5, 3],
[2, 0, 4]], dtype=int64)
print(classification_report(y_test,y_pred))
##
precision recall f1-score support
a 0.33 0.50 0.40 2
b 0.83 0.62 0.71 8
p 0.57 0.67 0.62 6
avg / total 0.67 0.62 0.64 16
我传到github上面了
复现代码1
# Import necessary modules
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
# Create training and test set
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.4,random_state=42)
# Instantiate a k-NN classifier: knn
knn = KNeighborsClassifier(6)
# Fit the classifier to the training data
knn.fit(X_train,y_train)
# Predict the labels of the test data: y_pred
y_pred = knn.predict(X_test)
# Generate the confusion matrix and classification report
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
复现代码2
补充知识
先给一个二分类的例子
其他同理
TP(True Positive):将正类预测为正类数,真实为0,预测也为0
FN(False Negative):将正类预测为负类数,真实为0,预测为1
FP(False Positive):将负类预测为正类数, 真实为1,预测为0
TN(True Negative):将负类预测为负类数,真实为1,预测也为1
因此:预测性分类模型&