Algorithms Implemented in Mahout

https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms lists some of the algorithms that Mahout has implemented or is in the process of implementing.

Classification

Logistic Regression (SGD)

Bayesian

Support Vector Machines (SVM) (open: MAHOUT-14, MAHOUT-232 and MAHOUT-334)

Perceptron and Winnow (open: MAHOUT-85)

Neural Network (open, but MAHOUT-228 might help)

Random Forests (integrated - MAHOUT-122, MAHOUT-140, MAHOUT-145)

Restricted Boltzmann Machines (open, MAHOUT-375, GSOC2010)

Online Passive Aggressive (integrated, MAHOUT-702)

Boosting (awaiting patch commit, MAHOUT-716)

Hidden Markov Models (HMM) (MAHOUT-627, MAHOUT-396, MAHOUT-734) - Training is done in Map-Reduce

Clustering

Reference Reading

Canopy Clustering (MAHOUT-3 - integrated)

K-Means Clustering (MAHOUT-5 - integrated)

Fuzzy K-Means (MAHOUT-74 - integrated)

Expectation Maximization (EM) (MAHOUT-28)

Mean Shift Clustering (MAHOUT-15 - integrated)

Hierarchical Clustering (MAHOUT-19)

Dirichlet Process Clustering (MAHOUT-30 - integrated)

Latent Dirichlet Allocation (MAHOUT-123 - integrated)

Spectral Clustering (MAHOUT-363 - integrated)

Minhash Clustering (MAHOUT-344 - integrated)

Top Down Clustering (MAHOUT-843 - integrated)

Pattern Mining

Parallel FP Growth Algorithm (Also known as Frequent Itemset mining)

Regression

Locally Weighted Linear Regression (open)

Dimension reduction

Singular Value Decomposition and other Dimension Reduction Techniques (available since 0.3)

Stochastic Singular Value Decomposition with PCA workflow (PCA and dimensionality reduction workflow is now integrated with SSVD)

Principal Components Analysis (PCA) (open)

Independent Component Analysis (open)

Gaussian Discriminative Analysis (GDA) (open)

Evolutionary Algorithms

  • NOTE: Watchmaker support has been removed as of 0.7

see also: MAHOUT-56 (integrated)

You will find here information, examples, use cases, etc. related to Evolutionary Algorithms.

Recommenders / Collaborative Filtering

Mahout contains both simple non-distributed recommender implementations and distributed Hadoop-based recommenders.
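To make the idea of a simple non-distributed recommender concrete, here is a minimal in-memory sketch of user-based collaborative filtering. It does not use Mahout's API: the function names `cosine` and `recommend`, the sample `ratings` data, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two sparse rating dicts ({item: rating})."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted neighbour ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0.0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    # Weighted average rating per candidate item, best first.
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Hypothetical sample data: user -> {item: rating}.
ratings = {
    "alice": {"book_a": 5.0, "book_b": 3.0},
    "bob": {"book_a": 5.0, "book_b": 3.0, "book_c": 4.0},
    "carol": {"book_d": 5.0},
}
recommendations = recommend(ratings, "alice")
```

Everything here fits in one process's memory, which is the defining trait of the non-distributed case; the Hadoop-based recommenders split the same kind of computation across many machines.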

Vector Similarity

Mahout contains implementations that allow one to compare one or more vectors with another set of vectors.  This can be useful if one is, for instance, trying to calculate the pairwise similarity between all documents (or a subset of docs) in a corpus.

  • RowSimilarityJob – Builds an inverted index and then computes distances between items that have co-occurrences.  This is a fully distributed calculation.

  • VectorDistanceJob – Does a map side join between a set of "seed" vectors and all of the input vectors.
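The two jobs above can be sketched in memory as follows. This is an illustration of the ideas only, not Mahout's code: the real jobs run as distributed Hadoop computations, and the function names `row_similarities` and `seed_distances`, the sample `docs` data, and the choice of cosine similarity and Euclidean distance are assumptions.

```python
from collections import defaultdict
from itertools import combinations
import math


def row_similarities(rows):
    """Cosine similarity for every pair of rows sharing at least one column.

    rows: dict mapping row id -> sparse vector as {column id: value}.
    """
    # Inverted index: column id -> list of (row id, value) entries.
    index = defaultdict(list)
    for row_id, vector in rows.items():
        for col, value in vector.items():
            if value != 0:
                index[col].append((row_id, value))

    # Accumulate dot products only for rows that co-occur in some column.
    dots = defaultdict(float)
    for entries in index.values():
        for (a, va), (b, vb) in combinations(sorted(entries), 2):
            dots[(a, b)] += va * vb

    norms = {r: math.sqrt(sum(v * v for v in vec.values()))
             for r, vec in rows.items()}
    return {(a, b): dot / (norms[a] * norms[b]) for (a, b), dot in dots.items()}


def seed_distances(seeds, vectors):
    """Euclidean distance from every seed vector to every input vector."""
    result = {}
    for seed_id, seed in seeds.items():
        for vec_id, vec in vectors.items():
            cols = set(seed) | set(vec)
            result[(seed_id, vec_id)] = math.sqrt(
                sum((seed.get(c, 0.0) - vec.get(c, 0.0)) ** 2 for c in cols))
    return result


# Hypothetical sample data: document id -> {term: weight}.
docs = {
    "d1": {"apple": 1.0, "pear": 1.0},
    "d2": {"apple": 1.0},
    "d3": {"kiwi": 2.0},
}
sims = row_similarities(docs)
dists = seed_distances({"s": {"apple": 1.0}}, docs)
```

Because pairs are generated from the inverted index, rows with no column in common (such as `d3` here) never produce a pair, which is what keeps the pairwise computation tractable on sparse data.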

Other


This article is from the "某人说我技术宅" blog. Please keep this attribution when reposting: http://1992mrwang.blog.51cto.com/3265935/1337941
