Evaluator.py
This Python file builds on the two files written earlier, EvaluatedAlgorithm and EvaluationData, and combines their functionality in one place.
The code below imports the two classes:
from EvaluatedAlgorithm import EvaluatedAlgorithm
from EvaluationData import EvaluationData
The __init__ function
The constructor wraps the dataset and rankings in an EvaluationData object and stores it as self.dataset.
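def __init__(self, dataset, rankings):
    # Wrap the dataset and rankings in an EvaluationData object
    ed = EvaluationData(dataset, rankings)
    self.dataset = ed
    # List that AddAlgorithm appends to; it must exist before the first call
    self.algorithms = []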
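The AddAlgorithm function
Wraps the algorithm in an EvaluatedAlgorithm object and appends it to the algorithms list, since more than one algorithm will be evaluated later.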
def AddAlgorithm(self, algorithm, name):
    alg = EvaluatedAlgorithm(algorithm, name)
    self.algorithms.append(alg)
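AddAlgorithm only works if self.algorithms already exists as a list on the object. The following is a minimal sketch of that registration pattern, not the author's class: StubEvaluatedAlgorithm and MiniEvaluator are hypothetical stand-ins so the example runs without the other two files.

```python
# Hypothetical stand-in for EvaluatedAlgorithm: it only stores the name.
class StubEvaluatedAlgorithm:
    def __init__(self, algorithm, name):
        self.algorithm = algorithm
        self.name = name

    def GetName(self):
        return self.name


# Hypothetical, stripped-down evaluator that only handles registration.
class MiniEvaluator:
    def __init__(self):
        # Created per instance, so separate evaluators never share entries.
        self.algorithms = []

    def AddAlgorithm(self, algorithm, name):
        self.algorithms.append(StubEvaluatedAlgorithm(algorithm, name))


evaluator = MiniEvaluator()
evaluator.AddAlgorithm(object(), "SVD")     # object() stands in for a real algorithm
evaluator.AddAlgorithm(object(), "Random")
names = [alg.GetName() for alg in evaluator.algorithms]
print(names)
```

Algorithms are kept in registration order, which is also the order in which they would later be evaluated.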
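The Evaluate function
This is the evaluation function. It runs every registered algorithm against the dataset and takes a doTopN flag: when doTopN is true, all recommendation metrics are computed (RMSE, MAE, HR, ARHR, Diversity, Novelty); when it is false, only the accuracy metrics MAE and RMSE are computed. All computed values are then printed as a table.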
def Evaluate(self, doTopN):
    results = {}
    for algorithm in self.algorithms:
        print("Evaluating ", algorithm.GetName(), "...")
        results[algorithm.GetName()] = algorithm.Evaluate(self.dataset, doTopN)
    # Print results
    if (doTopN):
        print("{:<10} {:<10} {:<10} {:<10} {:<10} {:<10} {:<10}".format(
            "Algorithm", "RMSE", "MAE", "HR", "ARHR", "Diversity", "Novelty"))
        for (name, metrics) in results.items():
            print("{:<10} {:<10.4f} {:<10.4f} {:<10.4f} {:<10.4f} {:<10.4f} {:<10.4f}".format(
                name, metrics["RMSE"], metrics["MAE"], metrics["HR"], metrics["ARHR"],
                metrics["Diversity"], metrics["Novelty"]))
    else:
        print("{:<10} {:<10} {:<10}".format("Algorithm", "RMSE", "MAE"))
        for (name, metrics) in results.items():
            print("{:<10} {:<10.4f} {:<10.4f}".format(name, metrics["RMSE"], metrics["MAE"]))
    print("\nLegend:\n")
    print("RMSE: Root Mean Squared Error. Lower values mean better accuracy.")
    print("MAE: Mean Absolute Error. Lower values mean better accuracy.")
    if (doTopN):
        print("HR: Hit Rate; how often we are able to recommend a left-out rating. Higher is better.")
        print("ARHR: Average Reciprocal Hit Rank - Hit rate that takes the ranking into account. Higher is better." )
        print("Diversity: 1-S, where S is the average similarity score between every possible pair of recommendations")
        print(" for a given user. Higher means more diverse.")
        print("Novelty: Average popularity rank of recommended items. Higher means more novel.")
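To make the legend concrete, here is a small sketch of how RMSE, MAE, HR, and ARHR could be computed from scratch. It is not the code inside EvaluatedAlgorithm; the function names, the (actual, estimated) pair layout, and the toy data are assumptions made for illustration.

```python
import math


def mae(pairs):
    # Mean Absolute Error over (actual, estimated) rating pairs.
    return sum(abs(actual - est) for actual, est in pairs) / len(pairs)


def rmse(pairs):
    # Root Mean Squared Error over (actual, estimated) rating pairs.
    return math.sqrt(sum((actual - est) ** 2 for actual, est in pairs) / len(pairs))


def hit_rate_and_arhr(top_n, left_out):
    # top_n: user -> ranked list of recommended items (best first)
    # left_out: user -> the single item held out for that user
    hits = 0
    reciprocal_sum = 0.0
    for user, held_out_item in left_out.items():
        ranked = top_n.get(user, [])
        if held_out_item in ranked:
            hits += 1
            reciprocal_sum += 1.0 / (ranked.index(held_out_item) + 1)
    total = len(left_out)
    return hits / total, reciprocal_sum / total


# Toy data, made up for this example.
pairs = [(4.0, 3.5), (3.0, 3.0), (5.0, 4.0), (2.0, 3.0)]
top_n = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"], "u3": ["g", "h", "i"]}
left_out = {"u1": "b", "u2": "x", "u3": "g"}

hr, arhr = hit_rate_and_arhr(top_n, left_out)
print("{:<10} {:<10} {:<10} {:<10} {:<10}".format("Algorithm", "RMSE", "MAE", "HR", "ARHR"))
print("{:<10} {:<10.4f} {:<10.4f} {:<10.4f} {:<10.4f}".format(
    "Toy", rmse(pairs), mae(pairs), hr, arhr))
```

On this toy data the errors are 0.5, 0, 1, and 1, so MAE is 0.625 and RMSE is 0.75. Two of the three held-out items are recommended (at ranks 2 and 1), giving a hit rate of 2/3 and an ARHR of 0.5.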
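The SampleTopNRecs function
For each algorithm in the algorithms list, this function trains on the full training set, predicts ratings for the test user, sorts the predictions by estimated rating, and prints the highest-rated movies as recommendations.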
def SampleTopNRecs(self, ml, testSubject=0, k=10):
    for algo in self.algorithms:
        print("\nUsing ", algo.GetName())
        # Train on the full training set
        trainSet = self.dataset.GetFullTrainSet()
        algo.GetAlgorithm().fit(trainSet)
        print("Computing Recommendations")
        # Predict ratings for the items the test user has not rated
        testSet = self.dataset.GetAntiTestSetForUser(testSubject)
        predictions = algo.GetAlgorithm().test(testSet)
        recommendations = []
        print("\nFor user ", testSubject, " we recommend:")
        for userID, movieID, actualRating, estimatedRating, _ in predictions:
            intMovieID = int(movieID)
            recommendations.append((intMovieID, estimatedRating))
        # Sort by estimated rating, highest first
        recommendations.sort(key=lambda x: x[1], reverse=True)
        # Print the top k recommendations
        number = 0
        for ratings in recommendations[:k]:
            number += 1
            print(number, " - ", ml.getMovieName(ratings[0]), round(ratings[1], 1))
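The ranking step in SampleTopNRecs is a plain sort on the estimated rating followed by a slice. The sketch below shows that step in isolation on made-up (movieID, estimatedRating) tuples, and also shows heapq.nlargest from the standard library as an alternative that avoids sorting the whole list. The data and variable names are assumptions for illustration only.

```python
import heapq

# Made-up (movieID, estimatedRating) pairs standing in for real predictions.
recommendations = [(1, 3.2), (2, 4.8), (3, 4.1), (4, 2.5), (5, 4.5)]
k = 3

# Approach used in SampleTopNRecs: sort everything, then slice.
by_sort = sorted(recommendations, key=lambda x: x[1], reverse=True)[:k]

# Alternative: ask only for the k largest entries.
top_k = heapq.nlargest(k, recommendations, key=lambda x: x[1])

for number, (movie_id, rating) in enumerate(top_k, start=1):
    print(number, " - ", movie_id, round(rating, 1))
```

Both approaches return the same three entries here: movies 2, 5, and 3, in that order. For the list sizes in a small movie dataset the difference is negligible, so the full sort is a reasonable choice.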
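Summary
This Python file brings the functionality of the two earlier files together: it scores each algorithm on the recommendation metrics, then prints the movies that each algorithm recommends for a given user.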