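Mahout provides a set of powerful collaborative filtering algorithms. To try them out, create a new Maven-based project; the pom.xml below declares the dependencies it needs.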
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>mahouttest</groupId>
      <artifactId>mahouttest</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <packaging>jar</packaging>
      <name>mahouttest</name>
      <url>http://maven.apache.org</url>
      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      </properties>
      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>4.8.1</version>
          <scope>test</scope>
        </dependency>
        <dependency>
          <groupId>org.apache.mahout</groupId>
          <artifactId>mahout-core</artifactId>
          <version>0.8-SNAPSHOT</version>
          <type>jar</type>
          <scope>compile</scope>
        </dependency>
      </dependencies>
    </project>
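This pulls in the latest Mahout build (0.8-SNAPSHOT), which must already be installed in your local Maven repository.
Next, prepare the test data. We use the example from "Mahout in Action", saved as data/intro.csv (the path the code below reads). Each line is a userID,itemID,preference triple: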
1,101,5
1,102,3
1,103,2.5
2,101,2
2,102,2.5
2,103,5
2,104,2
3,101,2.5
3,104,4
3,105,4.5
3,107,5
4,101,5
4,103,3
4,104,4.5
4,106,4
5,101,4
5,102,3
5,103,2
5,104,4
5,105,3.5
5,106,4
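To make the file format concrete, here is a minimal sketch that parses such lines into a nested map of userID to (itemID to preference). It does not use Mahout at all; the class name `PreferenceCsvSketch` and the `parse` method are hypothetical names chosen for this illustration, and the FileDataModel used later does its own parsing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PreferenceCsvSketch {

    // Hypothetical helper: parses "userID,itemID,preference" lines into
    // a map of userID -> (itemID -> preference), keeping insertion order.
    static Map<Long, Map<Long, Double>> parse(List<String> lines) {
        Map<Long, Map<Long, Double>> prefs = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split(",");
            long userId = Long.parseLong(parts[0]);
            long itemId = Long.parseLong(parts[1]);
            double value = Double.parseDouble(parts[2]);
            prefs.computeIfAbsent(userId, k -> new LinkedHashMap<>()).put(itemId, value);
        }
        return prefs;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("1,101,5", "1,102,3", "1,103,2.5", "2,101,2");
        // prints {1={101=5.0, 102=3.0, 103=2.5}, 2={101=2.0}}
        System.out.println(parse(sample));
    }
}
```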
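[Figure: relationships between the users and items above]
Below we run this data through three different Mahout recommenders to see how Mahout's recommendation interfaces are used.
1. User-based collaborative filtering: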
    // Classes come from the org.apache.mahout.cf.taste packages (imports omitted).
    DataModel model = new FileDataModel(new File("data/intro.csv"));
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    // Use the 2 users most similar to the target user as the neighborhood.
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
    Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
    // Ask for 1 recommendation for user 1.
    List<RecommendedItem> recommendations = recommender.recommend(1, 1);
    for (RecommendedItem recommendation : recommendations) {
        System.out.println(recommendation);
    }
The output is: RecommendedItem[item:104, value:4.257081]
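To see where a value like 4.257081 can come from, the following standalone sketch recomputes it by hand, without Mahout. It assumes that the estimate is a similarity-weighted average over the neighbors who rated the item, and that users 4 and 5 are user 1's two nearest neighbors (on this data their Pearson correlations with user 1 are 1.0 and about 0.9449). `UserBasedSketch`, `prefs`, `pearson`, and `estimate` are hypothetical names for this illustration, not Mahout's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UserBasedSketch {

    // Builds an itemID -> preference map from alternating (itemID, preference) arguments.
    static Map<Integer, Double> prefs(double... itemThenValue) {
        Map<Integer, Double> m = new HashMap<>();
        for (int i = 0; i < itemThenValue.length; i += 2) {
            m.put((int) itemThenValue[i], itemThenValue[i + 1]);
        }
        return m;
    }

    // Pearson correlation computed only over the items both users have rated.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sumA = 0, sumB = 0;
        int n = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                sumA += e.getValue();
                sumB += other;
                n++;
            }
        }
        if (n == 0) {
            return Double.NaN; // no items in common
        }
        double meanA = sumA / n, meanB = sumB / n;
        double cov = 0, varA = 0, varB = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue;
            }
            double da = e.getValue() - meanA, db = other - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return cov / Math.sqrt(varA * varB); // NaN if either side has zero variance
    }

    // Similarity-weighted average of the neighbors' preferences for one item.
    static double estimate(Map<Integer, Double> user, List<Map<Integer, Double>> neighbors, int item) {
        double weighted = 0, totalSim = 0;
        for (Map<Integer, Double> neighbor : neighbors) {
            Double pref = neighbor.get(item);
            double sim = pearson(user, neighbor);
            if (pref == null || Double.isNaN(sim)) {
                continue; // neighbor cannot contribute to this item
            }
            weighted += sim * pref;
            totalSim += sim;
        }
        return weighted / totalSim;
    }

    public static void main(String[] args) {
        Map<Integer, Double> u1 = prefs(101, 5, 102, 3, 103, 2.5);
        Map<Integer, Double> u4 = prefs(101, 5, 103, 3, 104, 4.5, 106, 4);
        Map<Integer, Double> u5 = prefs(101, 4, 102, 3, 103, 2, 104, 4, 105, 3.5, 106, 4);
        System.out.println(pearson(u1, u4));                          // 1.0
        System.out.println(pearson(u1, u5));                          // about 0.9449
        System.out.println(estimate(u1, Arrays.asList(u4, u5), 104)); // about 4.2571
    }
}
```

The same calculation gives 4.0 for item 106 and 3.5 for item 105, which is consistent with 104 being the single item recommended to user 1.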
2. Item-based collaborative filtering:
    DataModel model = new FileDataModel(new File("data/intro.csv"));
    // The same Pearson class, used here to compare items rather than users.
    ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
    Recommender recommender = new GenericItemBasedRecommender(model, similarity);
    // Ask for 1 recommendation for user 1.
    List<RecommendedItem> recommendations = recommender.recommend(1, 1);
    for (RecommendedItem recommendation : recommendations) {
        System.out.println(recommendation);
    }
The output is: RecommendedItem[item:104, value:5.0]
3. The Slope One recommender:
    DataModel model = new FileDataModel(new File("data/intro.csv"));
    // Slope One needs no similarity or neighborhood; it works from item-to-item rating differences.
    Recommender recommender = new SlopeOneRecommender(model);
    // Ask for 1 recommendation for user 1.
    List<RecommendedItem> recommendations = recommender.recommend(1, 1);
    for (RecommendedItem recommendation : recommendations) {
        System.out.println(recommendation);
    }
The output is: RecommendedItem[item:105, value:5.75]
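The idea behind Slope One can be sketched in a few lines of plain Java: for each item the user has rated, take the average difference between the target item and that item across users who rated both, add it to the user's own rating, and average the results weighted by the number of co-rating users. The code below is an illustrative sketch, not Mahout's implementation; `SlopeOneSketch`, `prefs`, `sampleData`, and the `minSupport` parameter are hypothetical names introduced here. With every difference included, the estimate for user 1 and item 105 is 4.75; ignoring differences backed by a single user gives 5.75, the value printed above. That Mahout skips such single-user differences is an inference from these numbers, not something confirmed here.

```java
import java.util.HashMap;
import java.util.Map;

class SlopeOneSketch {

    // Builds an itemID -> preference map from alternating (itemID, preference) arguments.
    static Map<Integer, Double> prefs(double... itemThenValue) {
        Map<Integer, Double> m = new HashMap<>();
        for (int i = 0; i < itemThenValue.length; i += 2) {
            m.put((int) itemThenValue[i], itemThenValue[i + 1]);
        }
        return m;
    }

    // The five users from the sample data, keyed by userID.
    static Map<Integer, Map<Integer, Double>> sampleData() {
        Map<Integer, Map<Integer, Double>> data = new HashMap<>();
        data.put(1, prefs(101, 5, 102, 3, 103, 2.5));
        data.put(2, prefs(101, 2, 102, 2.5, 103, 5, 104, 2));
        data.put(3, prefs(101, 2.5, 104, 4, 105, 4.5, 107, 5));
        data.put(4, prefs(101, 5, 103, 3, 104, 4.5, 106, 4));
        data.put(5, prefs(101, 4, 102, 3, 103, 2, 104, 4, 105, 3.5, 106, 4));
        return data;
    }

    // Count-weighted Slope One estimate of `user`'s preference for `target`.
    // Differences backed by fewer than `minSupport` co-rating users are ignored.
    static double estimate(Map<Integer, Map<Integer, Double>> data, int user, int target, int minSupport) {
        double weightedSum = 0;
        int totalWeight = 0;
        for (Map.Entry<Integer, Double> rated : data.get(user).entrySet()) {
            double diffSum = 0;
            int count = 0;
            for (Map<Integer, Double> other : data.values()) {
                Double t = other.get(target);
                Double i = other.get(rated.getKey());
                if (t != null && i != null) {
                    diffSum += t - i; // how much higher this user rates the target
                    count++;
                }
            }
            if (count > 0 && count >= minSupport) {
                weightedSum += count * (rated.getValue() + diffSum / count);
                totalWeight += count;
            }
        }
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Double>> data = sampleData();
        System.out.println(estimate(data, 1, 105, 1)); // 4.75
        System.out.println(estimate(data, 1, 105, 2)); // 5.75
    }
}
```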