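Taking a broader view of recommender systems: producing recommendations from estimated preference values is not strictly necessary. In many scenarios a list of recommendations ordered from best to worst is enough, with no need to include the estimated preference values. In fact, sometimes even the exact order of the list matters little, and a handful of good results is all that is required.
From this more general perspective, classic information retrieval metrics can be used to evaluate a recommender system: precision and recall. Precision is the fraction of the top recommendations that are good; recall is the fraction of all good recommendations that appear among the top recommendations.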
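To make the two metrics concrete, here is a minimal sketch that computes precision and recall for the top N recommendations of a single user. The class `PrecisionRecallSketch`, its method names, and the item IDs are hypothetical and not part of Mahout. The textbook definitions used here may differ in detail from what a library evaluator computes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not a Mahout class): precision and recall at N for one user. */
class PrecisionRecallSketch {

    /** Counts how many of the first n recommended items are in the relevant set. */
    static int hitsAtN(List<Long> recommended, Set<Long> relevant, int n) {
        int hits = 0;
        int limit = Math.min(n, recommended.size());
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(recommended.get(i))) {
                hits++;
            }
        }
        return hits;
    }

    /** Fraction of the top-n recommendations that are relevant. */
    static double precisionAtN(List<Long> recommended, Set<Long> relevant, int n) {
        return n == 0 ? 0.0 : (double) hitsAtN(recommended, relevant, n) / n;
    }

    /** Fraction of all relevant items that appear in the top-n recommendations. */
    static double recallAtN(List<Long> recommended, Set<Long> relevant, int n) {
        return relevant.isEmpty()
                ? 0.0
                : (double) hitsAtN(recommended, relevant, n) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up item IDs: a ranked recommendation list and the items the user really likes.
        List<Long> recommended = Arrays.asList(104L, 106L, 101L);
        Set<Long> relevant = new HashSet<>(Arrays.asList(104L, 105L, 107L, 108L));

        // One of the top 2 items (104) is relevant, out of 4 relevant items in total.
        System.out.println(precisionAtN(recommended, relevant, 2)); // prints 0.5
        System.out.println(recallAtN(recommended, relevant, 2));    // prints 0.25
    }
}
```

A recommender can score well on one metric and poorly on the other, which is why the two are usually reported together.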
A simple demo that evaluates precision and recall:
package com.xh.recommender;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.common.RandomUtils;
import java.io.File;
import java.net.URL;
/**
 * @author xiaohe
 * @version V1.0.0
 * @Description: Evaluates precision and recall
 * @date: 2018-8-25 18:52
 * @Copyright:
 */
public class IREvaluatorIntro {

    public static void main(String[] args) throws Exception {
        // Force the same random seed on every run so that results are repeatable.
        // Use this only in demos and test cases, never in production code.
        RandomUtils.useTestSeed();

        final String filePath = "intro.csv";
        // Look the data file up on the classpath, relative to this class's loader.
        URL url = IREvaluatorIntro.class.getClassLoader().getResource(filePath);
        if (url == null) {
            System.err.println("Cannot find '" + filePath + "' on the classpath!");
            System.exit(1);
        }
        File modelFile = new File(url.getFile());
        if (!modelFile.exists()) {
            System.err.println("Data file '" + modelFile + "' does not exist!");
            System.exit(1);
        }

        // Load the data file.
        DataModel model = new FileDataModel(modelFile);
        RecommenderIRStatsEvaluator evaluator =
                new GenericRecommenderIRStatsEvaluator();

        // Builds the recommender under evaluation.
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
            // The Recommender is built from the DataModel supplied by the evaluator.
            @Override
            public Recommender buildRecommender(DataModel model) throws TasteException {
                UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
                UserNeighborhood neighborhood =
                        new NearestNUserNeighborhood(2, similarity, model);
                return new GenericUserBasedRecommender(model, neighborhood, similarity);
            }
        };

        // Evaluate precision and recall at 2 recommendations, using all of the data (1.0).
        IRStatistics stats = evaluator.evaluate(recommenderBuilder,
                null, model, null, 2,
                GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
                1.0);

        // Precision: the fraction of the recommendations that are good.
        System.out.println(stats.getPrecision());
        // Recall: the fraction of the good recommendations that were returned.
        System.out.println(stats.getRecall());
    }
}
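The demo passes `CHOOSE_THRESHOLD`, which lets the evaluator choose for each user the preference value above which an item counts as "good". As far as I understand it, that threshold is the user's mean preference plus one standard deviation. The sketch below illustrates that rule under this assumption. The class `RelevanceThresholdSketch`, the choice of the sample standard deviation, and the example preference values are hypothetical and are not taken from Mahout's source.

```java
/** Hypothetical illustration (not Mahout code) of a per-user relevance threshold. */
class RelevanceThresholdSketch {

    /** Returns mean + one sample standard deviation of the user's preference values. */
    static double meanPlusOneStdDev(double[] prefs) {
        int n = prefs.length;
        if (n == 0) {
            return Double.NaN; // no preferences, no meaningful threshold
        }
        double sum = 0.0;
        for (double p : prefs) {
            sum += p;
        }
        double mean = sum / n;

        double squaredDiffs = 0.0;
        for (double p : prefs) {
            squaredDiffs += (p - mean) * (p - mean);
        }
        // Assumption: sample standard deviation (divide by n - 1).
        double stdDev = n > 1 ? Math.sqrt(squaredDiffs / (n - 1)) : 0.0;
        return mean + stdDev;
    }

    public static void main(String[] args) {
        // Made-up preference values for one user.
        double[] prefs = {1.0, 3.0, 5.0};
        // mean = 3.0, sample standard deviation = 2.0
        System.out.println(meanPlusOneStdDev(prefs)); // prints 5.0
    }
}
```

Under this rule only items the user rated well above their own average count as relevant, so a user who rates everything similarly ends up with very few relevant items.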
Project structure: