Installing Mahout 0.9 on Windows
Requirements: Eclipse, JDK, Maven, Hadoop
For details, see: http://blog.fens.me/hadoop-mahout-maven-eclipse/
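1. Install Maven
Download the latest version of Maven: http://maven.apache.org/download.cgi
Extract it to E:\maven-3.2.1
Add E:\maven-3.2.1\bin to the Path environment variable.
Run mvn -version in cmd to verify that the installation succeeded.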
2. Install the Maven plugin for Eclipse
Download the plugin from http://www.eclipse.org/m2e/
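3. Build the Mahout development environment with Maven
1. Create a standard Java project with Maven
2. Import the project into Eclipse
3. Add the Mahout dependencies by editing pom.xml
4. Download the dependencies
C:\Users\Administrator\workspace is my Java workspace directory.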
1) Create a standard Java project with Maven
C:\Users\Administrator\workspace> mvn archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=org.conan.mymahout -DartifactId=myMahout -DpackageName=org.conan.mymahout -Dversion=1.0-SNAPSHOT -DinteractiveMode=false
Enter the project directory and run the mvn command
C:\Users\Administrator\workspace> cd myMahout
C:\Users\Administrator\workspace\myMahout> mvn clean install
2) Import the project into Eclipse
We now have a basic Maven project, which we import into Eclipse. The Maven plugin should already be installed at this point.
3) Add the Mahout dependencies by editing pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.conan.mymahout</groupId>
  <artifactId>myMahout</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myMahout</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <mahout.version>0.9</mahout.version> <!-- the Mahout version goes here -->
  </properties>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.mahout</groupId>
      <artifactId>mahout-core</artifactId>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.mahout</groupId>
      <artifactId>mahout-integration</artifactId>
      <version>${mahout.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.mortbay.jetty</groupId>
          <artifactId>jetty</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.cassandra</groupId>
          <artifactId>cassandra-all</artifactId>
        </exclusion>
        <exclusion>
          <groupId>me.prettyprint</groupId>
          <artifactId>hector-core</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</project>
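4) Download the dependencies
C:\Users\Administrator\workspace\myMahout> mvn clean install
Maven downloads the project's dependencies automatically and adds them to the project's library path.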
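A quick way to confirm that the dependencies really ended up on the project's classpath is to look up one Mahout class by name. The sketch below is only an illustration: the class name MahoutClasspathCheck and the helper isOnClasspath are made up for this example, and the probed class is simply the FileDataModel class that the UserCF program imports.

```java
public class MahoutClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class.forName(className, false, MahoutClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A class from mahout-core that the UserCF example in this post imports.
        String probe = "org.apache.mahout.cf.taste.impl.model.file.FileDataModel";
        System.out.println(probe + (isOnClasspath(probe) ? " found" : " NOT found"));
    }
}
```

If it prints "NOT found" when run inside the myMahout project, the dependency download most likely did not complete.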
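Experiment
Implementing user-based collaborative filtering (UserCF) with Mahout
· 1. Prepare the data file: item.csv
· 2. Java program: UserCF.java
· 3. Run the program
· 4. Interpret the recommendations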
1. Prepare the data
Create datafile/item.csv in the project root directory. Each line is userID,itemID,rating.
~ mkdir datafile
~ vi datafile/item.csv

1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0
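Since vi is usually not available in a plain Windows cmd window, the same file can also be generated from Java. This is just a convenience sketch: the class WriteItemCsv and its write method are hypothetical names introduced here, and the rows are copied from the listing above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class WriteItemCsv {

    // The 21 ratings listed above, one "userID,itemID,rating" triple per line.
    static final List<String> ROWS = Arrays.asList(
        "1,101,5.0", "1,102,3.0", "1,103,2.5",
        "2,101,2.0", "2,102,2.5", "2,103,5.0", "2,104,2.0",
        "3,101,2.5", "3,104,4.0", "3,105,4.5", "3,107,5.0",
        "4,101,5.0", "4,103,3.0", "4,104,4.5", "4,106,4.0",
        "5,101,4.0", "5,102,3.0", "5,103,2.0", "5,104,4.0", "5,105,3.5", "5,106,4.0");

    // Creates <dir>/item.csv (and the directory if needed) and returns the file's path.
    static Path write(Path dir) throws IOException {
        Files.createDirectories(dir);
        return Files.write(dir.resolve("item.csv"), ROWS, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Relative path: resolved against the working directory (the project root in Eclipse by default).
        System.out.println("Wrote " + write(Paths.get("datafile")).toAbsolutePath());
    }
}
```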
2. Java program
Create a new Java class: org.conan.mymahout.recommendation.UserCF.java
package org.conan.mymahout.recommendation;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.EuclideanDistanceSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserCF {

    // Number of nearest neighbours considered for each user
    final static int NEIGHBORHOOD_NUM = 2;
    // Maximum number of items recommended to each user
    final static int RECOMMENDER_NUM = 3;
    // Ratings file prepared in step 1, relative to the working directory
    static String InputFile = "datafile/item.csv";

    public static void main(String[] args) throws IOException, TasteException {
        // Load the userID,itemID,rating triples
        DataModel model = new FileDataModel(new File(InputFile));
        // User-to-user similarity based on Euclidean distance
        UserSimilarity user = new EuclideanDistanceSimilarity(model);
        // Keep only the NEIGHBORHOOD_NUM most similar users as neighbours
        NearestNUserNeighborhood neighbor = new NearestNUserNeighborhood(
                NEIGHBORHOOD_NUM, user, model);
        Recommender r = new GenericUserBasedRecommender(model, neighbor, user);
        LongPrimitiveIterator iter = model.getUserIDs();
        while (iter.hasNext()) {
            long uid = iter.nextLong();
            List<RecommendedItem> list = r.recommend(uid, RECOMMENDER_NUM);
            System.out.printf("uid:%s", uid);
            // Print each recommended item with its estimated rating
            for (RecommendedItem ritem : list) {
                System.out.printf("(%s,%f)", ritem.getItemID(),
                        ritem.getValue());
            }
            System.out.println();
        }
    }
}
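To get a feel for what the similarity step does with this data, here is a small stand-alone sketch that does not use Mahout at all. It assumes the similarity between two users is 1 / (1 + d / sqrt(n)), where d is the Euclidean distance over the n items both users have rated; treat that formula as an assumption about Mahout's internals rather than a statement of its API. The class and method names are made up for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EuclideanSimilaritySketch {

    // Builds an itemID -> rating map from alternating (itemID, rating) arguments.
    static Map<Long, Double> ratings(double... pairs) {
        Map<Long, Double> m = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            m.put((long) pairs[i], pairs[i + 1]);
        }
        return m;
    }

    // Assumed form: 1 / (1 + d / sqrt(n)), with d the Euclidean distance over the
    // n items rated by both users. Returns NaN when the users share no items.
    static double similarity(Map<Long, Double> a, Map<Long, Double> b) {
        double sumSquaredDiff = 0.0;
        int n = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                double diff = e.getValue() - other;
                sumSquaredDiff += diff * diff;
                n++;
            }
        }
        if (n == 0) {
            return Double.NaN;
        }
        return 1.0 / (1.0 + Math.sqrt(sumSquaredDiff) / Math.sqrt(n));
    }

    public static void main(String[] args) {
        // Ratings of users 1, 4 and 5 taken from item.csv
        Map<Long, Double> u1 = ratings(101, 5.0, 102, 3.0, 103, 2.5);
        Map<Long, Double> u4 = ratings(101, 5.0, 103, 3.0, 104, 4.5, 106, 4.0);
        Map<Long, Double> u5 = ratings(101, 4.0, 102, 3.0, 103, 2.0, 104, 4.0, 105, 3.5, 106, 4.0);
        System.out.printf("sim(1,4) = %.4f%n", similarity(u1, u4));
        System.out.printf("sim(1,5) = %.4f%n", similarity(u1, u5));
    }
}
```

Under this assumption, users 4 and 5 come out as the two users most similar to user 1, which is why NEIGHBORHOOD_NUM = 2 selects them as user 1's neighbours.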
3. Run the program
Right-click UserCF.java in Eclipse and run it as a Java application.
Output:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
uid:1(104,4.274336)(106,4.000000)
uid:2(105,4.055916)
uid:3(103,3.360987)(102,2.773169)
uid:4(102,3.000000)
uid:5
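4. Interpret the recommendations
Up to RECOMMENDER_NUM (3) items are requested for each user. Each pair in the output is (item ID, estimated rating), most relevant first.
1. User ID 1 is recommended two items: 104 and 106
2. User ID 2 is recommended only one item: 105
3. User ID 3 is recommended two items: 103 and 102
4. User ID 4 is recommended only one item: 102
5. User ID 5 gets no recommendations, because no item qualifies
A user receives fewer than three items when the two nearest neighbours have rated few items that the user has not already rated.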
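The scores next to each item can be reproduced by hand. The sketch below recomputes the two values for user 1 as a similarity-weighted average of the neighbours' ratings. It rests on two assumptions that are not stated in the original post: that user 1's two neighbours are users 4 and 5, and that the similarity is 1 / (1 + d / sqrt(n)) with d the Euclidean distance over the n co-rated items. The class and method names are hypothetical.

```java
public class WeightedPredictionSketch {

    // Similarity-weighted average of the neighbours' ratings for a single item.
    static double predict(double[] similarities, double[] neighbourRatings) {
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weighted += similarities[i] * neighbourRatings[i];
            total += similarities[i];
        }
        return weighted / total;
    }

    public static void main(String[] args) {
        // Users 1 and 4 share items 101 and 103 (rating differences 0 and 0.5).
        double sim14 = 1.0 / (1.0 + Math.sqrt(0.25) / Math.sqrt(2));
        // Users 1 and 5 share items 101, 102 and 103 (differences 1, 0 and 0.5).
        double sim15 = 1.0 / (1.0 + Math.sqrt(1.25) / Math.sqrt(3));
        double[] sims = {sim14, sim15};

        // Item 104: user 4 rated it 4.5, user 5 rated it 4.0.
        System.out.printf("item 104 -> %.6f%n", predict(sims, new double[] {4.5, 4.0}));
        // Item 106: both neighbours rated it 4.0.
        System.out.printf("item 106 -> %.6f%n", predict(sims, new double[] {4.0, 4.0}));
    }
}
```

Under these assumptions the two values agree with the (104,4.274336) and (106,4.000000) pairs printed for uid:1, which suggests the estimated rating is a weighted average over the neighbours who rated the item.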