Which machine learning algorithms does Spark provide?
Algorithms:
MLlib contains many algorithms and utilities, including:
- Classification: logistic regression, naive Bayes, ...
- Regression: generalized linear regression, isotonic regression, ...
- Decision trees, random forests, and gradient-boosted trees
- Recommendation: alternating least squares (ALS)
- Clustering: K-means, Gaussian mixtures (GMMs), ...
- Topic modeling: latent Dirichlet allocation (LDA)
- Feature transformations: standardization, normalization, hashing, ...
- Model evaluation and hyper-parameter tuning
- ML Pipeline construction
- ML persistence: saving and loading models and Pipelines
- Survival analysis: accelerated failure time model
- Frequent itemset and sequential pattern mining: FP-growth, association rules, PrefixSpan
- Distributed linear algebra: singular value decomposition (SVD), principal component analysis (PCA), ...
- Statistics: summary statistics, hypothesis testing, ...
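
Two of the feature transformations listed above, standardization and normalization, are easy to confuse, so here is a minimal plain-Python sketch of the difference. It does not use the MLlib API: the function names `standardize` and `normalize` and the sample data are hypothetical, chosen only for illustration. Standardization rescales each feature (column) to zero mean and unit standard deviation, while normalization rescales each sample (row) to unit norm. In MLlib these roles are played by the `StandardScaler` and `Normalizer` transformers, which run the same kind of computation on distributed data.

```python
import math


def standardize(rows):
    """Rescale each feature (column) to zero mean and unit sample standard deviation."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # Sample standard deviation: divide by (n - 1), not n.
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(dims)
    ]
    # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dims)]
        for r in rows
    ]


def normalize(rows, p=2.0):
    """Rescale each sample (row) so that its p-norm equals 1."""
    result = []
    for r in rows:
        norm = sum(abs(x) ** p for x in r) ** (1.0 / p)
        # An all-zero row has no direction; leave it as zeros.
        result.append([x / norm if norm > 0 else 0.0 for x in r])
    return result


# Hypothetical data: three samples with two features each.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
standardized = standardize(data)  # operates per column
normalized = normalize(data)      # operates per row
```

Standardization needs statistics computed over the whole dataset, so it is a fit-then-transform step, whereas normalization can be applied to each row independently.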