spark
Star先生
Technology as the foundation, industry first!
"Machine Learning with Spark" Study Notes: Recommendation Models
Prepare data: data source, download, upload data to HDFS. These steps are straightforward for programmers familiar with Hadoop, so they are not repeated here. The data is located on HDFS at hdfs://master:9000/user/root/inpu
Original · 2016-02-02 21:21:38 · 1498 views · 0 comments
"Machine Learning with Spark" Study Notes: Classification
In this article, you will learn the basics of classification models and how they can be used in a variety of contexts. Classification generically refers to classifying things into distinct categories o
Original · 2016-02-13 22:09:54 · 1033 views · 0 comments
"Machine Learning with Spark" Study Notes: Clustering
Next, we will consider the case when we do not have labeled data available. This is called unsupervised learning, as the model is not supervised with the true target label. The unsupervised case is very
Original · 2016-02-14 23:01:44 · 1789 views · 0 comments
"Machine Learning with Spark" Study Notes: Text Mining
We will introduce more advanced text processing techniques available in MLlib to work with large-scale text datasets. In this article, we will: Work through detailed examples that illustrate data proces
Original · 2016-02-15 21:11:11 · 2475 views · 0 comments
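Spark Study Notes (1): Spark Architecture
Spark's architecture follows the Master-Slave model of distributed computing. The Master is the cluster node that runs the Master process, and a Slave is a cluster node that runs a Worker process. The Master acts as the controller of the whole cluster and is responsible for keeping it running normally; a Worker is a compute node that receives commands from the master node and reports its status; an Executor is responsible for executing tasks; the Client, acting as the user's client, submits applications; and the Driver controls the execution of a single application. The details are as follows
Original · 2016-03-10 11:15:39 · 3811 views · 0 comments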