I. Mahout Configuration
1. Download mahout-0.9
sudo wget https://archive.apache.org/dist/mahout/0.9/mahout-distribution-0.9.tar.gz
2. Extract the archive and rename the directory
sudo tar -zxvf mahout-distribution-0.9.tar.gz
sudo mv mahout-distribution-0.9 mahout
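The extract-and-rename step can be wrapped so that it refuses to overwrite an existing install. This is only a sketch: `install_mahout` is a hypothetical helper, not something Mahout ships, and it assumes the tarball has a single top-level directory.

```shell
#!/usr/bin/env bash
# Sketch: unpack a tarball into a destination directory under a fixed name.
# install_mahout is a hypothetical helper, not part of Mahout.
install_mahout() {
  local archive="$1"          # e.g. mahout-distribution-0.9.tar.gz
  local dest="$2"             # e.g. /usr/local
  local name="${3:-mahout}"   # final directory name

  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  if [ -e "$dest/$name" ]; then
    echo "already exists: $dest/$name" >&2
    return 1
  fi

  # Assumes one top-level directory: take the first path component
  # of the first entry in the archive listing.
  local top
  top="$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)"
  if [ -z "$top" ]; then
    echo "could not read archive: $archive" >&2
    return 1
  fi

  tar -xzf "$archive" -C "$dest" || return 1
  mv "$dest/$top" "$dest/$name"
}

# Usage (run with sufficient permissions for the destination):
#   install_mahout mahout-distribution-0.9.tar.gz /usr/local
```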
3. Configure environment variables
sudo vim /etc/profile
export MAHOUT_HOME=/usr/local/mahout
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
Reload the profile so the new variables take effect in the current shell
source /etc/profile
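4. Test the environment by running the command below. The profile as written does not add $MAHOUT_HOME/bin to PATH, so if the shell cannot find the command, run $MAHOUT_HOME/bin/mahout instead.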
mahout
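Before relying on the `mahout` command, it can help to confirm that the variables from step 3 actually point at an install. A minimal sketch, assuming the launcher script lives at `$MAHOUT_HOME/bin/mahout` (as it does in the 0.9 distribution); `check_mahout_env` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the Mahout environment variables.
# check_mahout_env is a hypothetical helper, not part of Mahout.
check_mahout_env() {
  if [ -z "${MAHOUT_HOME:-}" ]; then
    echo "MAHOUT_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$MAHOUT_HOME/bin/mahout" ]; then
    echo "no executable at $MAHOUT_HOME/bin/mahout" >&2
    return 1
  fi

  echo "MAHOUT_HOME=$MAHOUT_HOME"
  echo "MAHOUT_CONF_DIR=${MAHOUT_CONF_DIR:-unset}"

  # Report whether a bare `mahout` would resolve in this shell.
  if command -v mahout >/dev/null 2>&1; then
    echo "mahout on PATH: yes"
  else
    echo "mahout on PATH: no"
  fi
}

# Usage:
#   source /etc/profile && check_mahout_env
```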
II. Algorithm Tests
1. Get the test data
wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
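A truncated or HTML-error download is easy to miss, so a quick shape check before uploading may save a failed job. The sketch below assumes the file holds 600 whitespace-separated rows of 60 values each, which is what the UCI description of this dataset states; those numbers do not come from this post, and `validate_series_file` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: check that a whitespace-separated data file has the expected shape.
# Defaults (600 rows x 60 columns) are an assumption taken from the UCI
# dataset description; validate_series_file is a hypothetical helper.
validate_series_file() {
  local file="$1"
  local rows="${2:-600}"
  local cols="${3:-60}"

  awk -v rows="$rows" -v cols="$cols" '
    NF != cols {
      printf "line %d has %d fields, expected %d\n", NR, NF, cols
      bad = 1
    }
    END {
      if (NR != rows) {
        printf "found %d rows, expected %d\n", NR, rows
        bad = 1
      }
      exit (bad ? 1 : 0)
    }' "$file"
}

# Usage:
#   validate_series_file synthetic_control.data && echo "shape looks right"
```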
2. Upload the data to the Hadoop cluster
hadoop fs -mkdir testdata
hadoop fs -put synthetic_control.data testdata
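Running the two upload commands a second time fails because the directory and file already exist. One way to make the step repeatable is to probe with `hadoop fs -test` first. This is a sketch, and `upload_if_missing` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: create the HDFS directory and upload the file only when missing.
# upload_if_missing is a hypothetical helper, not a Hadoop command.
upload_if_missing() {
  local file="$1"   # local file, e.g. synthetic_control.data
  local dir="$2"    # HDFS directory, e.g. testdata
  local target
  target="$dir/$(basename "$file")"

  # -test -d exits 0 when the path exists and is a directory.
  hadoop fs -test -d "$dir" || hadoop fs -mkdir "$dir" || return 1

  # -test -e exits 0 when the path exists.
  if hadoop fs -test -e "$target"; then
    echo "already present: $target"
  else
    hadoop fs -put "$file" "$dir"
  fi
}

# Usage:
#   upload_if_missing synthetic_control.data testdata
```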
3. Test the algorithms
cd /usr/local/mahout/
# canopy
hadoop jar mahout-examples-0.9-job.jar org.apache.mahout.clustering.syntheticcontrol.canopy.Job
# kmeans
hadoop jar mahout-examples-0.9-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
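The two invocations differ only in the package name, so they can be driven from one small function. A sketch, assuming the examples jar sits directly under `$MAHOUT_HOME` as in the commands above; `run_clustering_example` is a hypothetical helper and only accepts the two algorithms shown in this post.

```shell
#!/usr/bin/env bash
# Sketch: run one of the synthetic-control example jobs by short name.
# run_clustering_example is a hypothetical helper, not part of Mahout.
run_clustering_example() {
  local algo="$1"
  # Assumes the jar location used in this post; falls back to /usr/local/mahout.
  local jar="${MAHOUT_HOME:-/usr/local/mahout}/mahout-examples-0.9-job.jar"

  case "$algo" in
    canopy|kmeans) ;;
    *)
      echo "unsupported algorithm: $algo" >&2
      return 2
      ;;
  esac

  hadoop jar "$jar" "org.apache.mahout.clustering.syntheticcontrol.${algo}.Job"
}

# Usage:
#   run_clustering_example canopy
#   run_clustering_example kmeans
```

In the 0.9 examples these jobs appear to write to an HDFS directory named output, so `hadoop fs -ls output` is a reasonable first place to look afterwards; treat that path as an assumption and check the job log if it is empty.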
Hadoop environment setup: see http://blog.csdn.net/baalhuo/article/details/51440765
mahout_quick_guide: go to it
Mahout 0.9 API Doc
Mahout 0.10.x API Doc