Spark Machine Learning Overview

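Spark's ML (Machine Learning) library provides implementations of the mainstream statistical and data-mining algorithms. In this post William gives an overview of them; detailed walkthroughs of individual algorithms will follow in later posts.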

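Classification and Regression Algorithms

| Algorithm | Spark algorithm class | Spark model class |
|---|---|---|
| SVM (support vector machine) | SVMWithSGD | SVMModel |
| Logistic regression | LogisticRegressionWithLBFGS; LogisticRegressionWithSGD | LogisticRegressionModel |
| Linear regression | LinearRegressionWithSGD | LinearRegressionModel |
| Streaming linear regression | StreamingLinearRegressionWithSGD | LinearRegressionModel |
| Ridge regression | RidgeRegressionWithSGD | RidgeRegressionModel |
| Lasso regression | LassoWithSGD | LassoModel |
| Naive Bayes | NaiveBayes | NaiveBayesModel |
| Decision tree | DecisionTree | DecisionTreeModel |
| Random forest | RandomForest | RandomForestModel |
| Gradient-Boosted Trees | GradientBoostedTrees | GradientBoostedTreesModel |
| Isotonic regression | IsotonicRegression | IsotonicRegressionModel |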

Collaborative Filtering Algorithms

| Algorithm | Spark algorithm class | Spark model class |
|---|---|---|
| Alternating least squares (ALS) | ALS | MatrixFactorizationModel |

Clustering Algorithms

| Algorithm | Spark algorithm class | Spark model class |
|---|---|---|
| k-means | KMeans | KMeansModel |
| Gaussian mixture | GaussianMixture | GaussianMixtureModel |
| Power iteration clustering (PIC) | PowerIterationClustering | PowerIterationClusteringModel |
| Latent Dirichlet allocation (LDA) | LDA | DistributedLDAModel |
| Streaming k-means | StreamingKMeans | KMeansModel |

Dimensionality Reduction Algorithms

| Algorithm | Spark algorithm class |
|---|---|
| Singular value decomposition (SVD) | RowMatrix.computeSVD |
| Principal component analysis (PCA) | RowMatrix.computePrincipalComponents |

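Feature Extraction and Transformation

| Algorithm | Spark algorithm class | Spark model class |
|---|---|---|
| TF-IDF | HashingTF; IDF | |
| Word2Vec | Word2Vec | Word2VecModel |
| Standard Scaler | StandardScaler | StandardScalerModel |
| Normalizer | Normalizer | |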
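
To make the TF-IDF row more concrete, here is a minimal pure-Python sketch of the two steps that `HashingTF` and `IDF` correspond to: hashing tokens into a fixed number of buckets to get term frequencies, then rescaling by a smoothed inverse document frequency, log((m + 1) / (df + 1)). It does not call Spark. The function names (`hashing_tf`, `fit_idf`, `tf_idf`), the `num_features` default, and the use of `zlib.crc32` as the hash are assumptions made for this illustration, so bucket indices will not match what Spark produces.

```python
import math
import zlib
from collections import Counter


def hashing_tf(tokens, num_features=16):
    """Map a token list to a sparse term-frequency vector {bucket: count}."""
    counts = Counter()
    for token in tokens:
        # crc32 is a deterministic stand-in hash chosen for this sketch;
        # it is not the hash function Spark uses.
        bucket = zlib.crc32(token.encode("utf-8")) % num_features
        counts[bucket] += 1
    return dict(counts)


def fit_idf(tf_vectors, num_features=16):
    """Return a dense list of smoothed IDF weights, one per bucket."""
    num_docs = len(tf_vectors)
    doc_freq = [0] * num_features
    for vector in tf_vectors:
        for bucket in vector:  # each bucket counts once per document
            doc_freq[bucket] += 1
    return [math.log((num_docs + 1) / (df + 1)) for df in doc_freq]


def tf_idf(tf_vector, idf):
    """Scale each term frequency by the IDF weight of its bucket."""
    return {bucket: tf * idf[bucket] for bucket, tf in tf_vector.items()}


# Hypothetical toy corpus, for illustration only.
docs = [["spark", "mllib", "spark"], ["spark", "kmeans"], ["als", "mllib"]]
tf_vectors = [hashing_tf(doc) for doc in docs]
idf_weights = fit_idf(tf_vectors)
for vector in tf_vectors:
    print(tf_idf(vector, idf_weights))
```

A bucket that appears in every document gets weight log(1) = 0, which is how very common terms are pushed down. With a small `num_features`, different tokens can collide in the same bucket; that is the usual trade-off of the hashing trick.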

Frequent Itemset Mining

| Algorithm | Spark algorithm class |
|---|---|
| FP-growth | FPGrowth |
| Association rules | AssociationRules |
| PrefixSpan | PrefixSpan |
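
As a rough illustration of what frequent itemset mining and association rules compute, the pure-Python sketch below counts itemset supports by brute force and then derives rules with a single-item consequent, where confidence = support(itemset) / support(antecedent). This is not FP-growth and does not use Spark; it is only practical for tiny inputs. The function names and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Brute-force enumeration by itemset size, not FP-growth.
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted({item for basket in baskets for item in basket})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for basket in baskets if itemset <= basket)
            if count / n >= min_support:
                result[itemset] = count / n
                found = True
        if not found:
            # No frequent itemset of this size means none of a larger size.
            break
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, confidence) with one-item consequents."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            # Every subset of a frequent itemset is itself frequent,
            # so the antecedent is always present in `itemsets`.
            confidence = support / itemsets[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, frozenset([item]), confidence))
    return rules


# Hypothetical toy transactions, for illustration only.
transactions = [["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]]
freq = frequent_itemsets(transactions, min_support=0.5)
for antecedent, consequent, confidence in association_rules(freq, 0.8):
    print(sorted(antecedent), "=>", sorted(consequent), confidence)
```

FP-growth reaches the same frequent itemsets without enumerating every candidate, which is what makes it usable on large datasets.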