Testing a Classifier's Accuracy: Computing the Classifier's Accuracy on the Test Set

Article link: https://blog.csdn.net/gaoyueace/article/details/78726433

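For a classifier, the error rate is the number of incorrect results it gives divided by the total number of test samples. A perfect classifier has an error rate of 0, while a classifier with an error rate of 1 never gives a correct result. The test function is: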

def datingClassTest():
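    hoRatio = 0.10  # fraction of the data held out for testing
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')  # parse the text file into NumPy format
    # datingDataMat is the feature matrix, datingLabels holds the labels
    normMat, ranges, minVals = autoNorm(datingDataMat)  # normalize datingDataMat
    # normMat holds the normalized feature values, ranges is (max - min) per feature, minVals is the per-feature minimum
    m = normMat.shape[0]  # number of rows in normMat
    numTestVecs = int(m*hoRatio)  # number of test rows
    errorCount = 0.0  # number of misclassified samples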
    for i in range(numTestVecs):
        classifierResult = classify0(normMat[i,:], normMat[numTestVecs:m, :], datingLabels[numTestVecs:m], 3)
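        # classify0 is the kNN classifier; normMat[i,:] is the input vector to classify, normMat[numTestVecs:m, :] is the training set (the remaining 90%)
        # datingLabels[numTestVecs:m] holds the training labels; 3 is the number of nearest neighbors to use
        print("the classifier came back with: %d, the real answer is: %d" %(classifierResult, datingLabels[i]))
        if classifierResult != datingLabels[i]:
            errorCount += 1.0  # prediction differs from the true label, so count one more error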
    print("the total error rate is : %f" %(errorCount/float(numTestVecs)))

datingClassTest()

>>
the classifier came back with: 3, the real answer is: 3
the classifier came back with: 2, the real answer is: 2
.
.
.
the classifier came back with: 1, the real answer is: 1
the classifier came back with: 3, the real answer is: 1
the total error rate is : 0.050000
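
The function above depends on file2matrix, autoNorm, classify0, and the data file, none of which are shown in this post. To try the same hold-out idea without them, here is a minimal sketch in plain Python. The names knn_classify and holdout_error_rate and the toy data are hypothetical, made up for illustration; they are not the original helpers. Normalization is skipped because the toy features already lie in [0, 1].

```python
from collections import Counter


def knn_classify(query, train_x, train_y, k):
    """Return the majority label among the k training points nearest to query.

    Hypothetical stand-in for classify0, not the original implementation.
    """
    def sq_dist(a, b):
        # Squared Euclidean distance; the ordering is the same as for the true distance.
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Indices of the training points, nearest first.
    order = sorted(range(len(train_x)), key=lambda j: sq_dist(query, train_x[j]))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]


def holdout_error_rate(data, labels, ho_ratio=0.10, k=3):
    """Test on the first ho_ratio of the rows and train on the remaining rows."""
    m = len(data)
    num_test = int(m * ho_ratio)
    errors = 0
    for i in range(num_test):
        predicted = knn_classify(data[i], data[num_test:], labels[num_test:], k)
        if predicted != labels[i]:
            errors += 1
    # Error rate = misclassified test samples / total test samples.
    return errors / num_test


# Toy data (assumption, for illustration only): two well-separated clusters.
cluster1 = [(0.10 + 0.01 * j, 0.20 - 0.01 * j) for j in range(9)]
cluster2 = [(0.80 + 0.01 * j, 0.90 - 0.01 * j) for j in range(9)]
# The first two rows become the test set; the second one is deliberately
# labeled 1 although it sits inside the class-2 cluster.
data = [(0.12, 0.15), (0.85, 0.88)] + cluster1 + cluster2
labels = [1, 1] + [1] * 9 + [2] * 9

rate = holdout_error_rate(data, labels)
print("the total error rate is : %f" % rate)
```

With 20 rows and a 10% hold-out there are 2 test rows, and the deliberately mislabeled one is misclassified, so the sketch reports an error rate of 0.5. As in datingClassTest, the test rows are simply the first rows of the data, which is presumably only a fair sample when the rows are in no particular order.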