spark
风吴痕
pyspark-RDD API
References: 1. http://spark.apache.org/docs/latest/quick-start.html 2. https://github.com/mahmoudparsian/pyspark-tutorial 3. https://github.com/jkthompson/pyspark-pictures Installation reference: start Spark by running /home/wu/d…
Translated · 2017-10-13 16:51:48 · 4676 views · 0 comments
pyspark-Spark Streaming Programming Guide
References: 1. http://spark.apache.org/docs/latest/streaming-programming-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 Spark Streaming Programming Guide: Overview, A Quick Example, Ba…
Translated · 2017-10-17 19:47:57 · 8668 views · 1 comment
pyspark-Structured Streaming Programming Guide
References: 1. http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 Structured Streaming Programming Guide: Overview, Qui…
Translated · 2017-10-17 20:17:12 · 557 views · 0 comments
pyspark-Clustering
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-clustering.html K-means: from pyspark…
Translated · 2017-10-18 11:09:15 · 3489 views · 0 comments
pyspark-Collaborative Filtering
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-collaborative-filtering.html from pys…
Translated · 2017-10-18 11:13:38 · 1046 views · 0 comments
pyspark-Dimensionality Reduction
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html SVD Ex…
Translated · 2017-10-18 11:35:53 · 1551 views · 0 comments
pyspark-Evaluation Metrics
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/mllib-evaluation-metrics.html Classificati…
Translated · 2017-10-18 11:38:26 · 3342 views · 0 comments
pyspark-DataFrame API
References: 1. http://spark.apache.org/docs/latest/quick-start.html 2. https://github.com/mahmoudparsian/pyspark-tutorial 3. https://github.com/jkthompson/pyspark-pictures 4. http://spark.apache.org/docs/la…
Translated · 2017-10-16 08:25:16 · 2244 views · 0 comments
pyspark-HDFS Data Operations
References: 1. http://spark.apache.org/docs/1.2.0/api/python/pyspark.html 2. http://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrame I. SparkContext API 1. Read HDFS data and convert it to nu…
Original · 2017-10-16 17:04:26 · 31423 views · 0 comments
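Installing Hadoop and Spark in Cluster Mode
References: 1. http://www.itnose.net/detail/6478156.html (Spark on YARN distributed cluster installation) 2. http://blog.csdn.net/dream_an/article/details/52946840 (Hadoop 2.7.3 fully distributed cluster deployment process) Hadoop cluster installation, reference: http://blog.csdn.net/dream_…
Original · 2017-10-17 08:22:03 · 954 views · 0 comments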
TensorFlowOnSpark-Standalone
GetStarted_Standalone (standalone version). Deployment references from others: http://mp.weixin.qq.com/s/sLyjwU-FZqoOtiGasJnTZw https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_standalone http://blog.csdn.net/u013041398/articl…
Original · 2017-10-17 08:32:45 · 965 views · 0 comments
TensorFlowOnSpark-Cluster
References: https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN http://www.cnblogs.com/heimianshusheng/p/6768019.html (Setting up a TensorFlowOnSpark platform on a Hadoop distributed cluster in YARN mode) http://www.tuicool.com/articles…
Original · 2017-10-17 08:43:49 · 3035 views · 2 comments
pyspark-Spark Programming Guide
Reference: http://spark.apache.org/docs/latest/rdd-programming-guide.html Spark Programming Guide: Overview, Linking with Spark, Initializing Spark, Using the Shell, Resilient Distributed Datasets (R…
Translated · 2017-10-17 19:14:40 · 1943 views · 0 comments
pyspark-Quick Start
Reference: http://spark.apache.org/docs/latest/quick-start.html Quick Start: Interactive Analysis with the Spark Shell, Basics, More on Dataset Operations, Caching, Self-Contained Applications, Where…
Translated · 2017-10-17 15:09:36 · 6681 views · 0 comments
pyspark-mllib
Reference: https://github.com/jadianes/spark-py-notebooks MLlib: Basic Statistics and Exploratory Data Analysis. #!/usr/bin/python # -*- coding: UTF-8 -*- import urllib from pyspark import SparkCo…
Translated · 2017-10-17 11:06:14 · 573 views · 0 comments
pyspark-Spark SQL, DataFrames and Datasets Guide
References: 1. https://github.com/apache/spark/tree/v2.2.0 2. http://spark.apache.org/docs/latest/sql-programming-guide.html Spark SQL, DataFrames and Datasets Guide: Overview, SQL, Datasets and Da…
Translated · 2017-10-17 20:06:52 · 2187 views · 0 comments
pyspark Environment Setup
Reference: 1. https://jingyan.baidu.com/article/86fae346b696633c49121a30.html Usage references: 1. https://www.gitbook.com/book/aiyanbo/spark-programming-guide-zh-cn/details 2. https://github.com/search?utf8=%E…
Original · 2017-10-13 10:53:48 · 5141 views · 1 comment
pyspark-MLlib(Data Types)
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0
Translated · 2017-10-17 20:23:40 · 701 views · 0 comments
pyspark-MLlib(Classification and Regression)
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-classification-regression.html Classifica…
Translated · 2017-10-18 10:52:34 · 2114 views · 0 comments
pyspark-ML Pipelines
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-pipeline.html Code examples, Example: Est…
Translated · 2017-10-18 11:04:09 · 1207 views · 0 comments
pyspark-ML Tuning
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-tuning.html Cross-Validation: from pyspark.…
Translated · 2017-10-18 11:06:31 · 1371 views · 0 comments
pyspark-Frequent Pattern Mining
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-frequent-pattern-mining.html from pyspark…
Translated · 2017-10-18 11:16:03 · 763 views · 0 comments
pyspark-ml-features
References: 1. http://spark.apache.org/docs/latest/ml-guide.html 2. https://github.com/apache/spark/tree/v2.2.0 3. http://spark.apache.org/docs/latest/ml-features.html Extracting, transforming…
Translated · 2017-10-18 11:19:18 · 4391 views · 0 comments
pyspark-Tutorial
Reference: 1. https://github.com/mahmoudparsian/pyspark-tutorial Download, Install Spark and Run PySpark; Basics of PySpark; PySpark Examples and Tutorials; DNA Base Counting; Classic Word Count; Find…
Translated · 2017-10-16 11:07:30 · 2853 views · 0 comments
pyspark-csv To DataFrame
Reference: https://github.com/seahboonsiew/pyspark-csv
CSV data overview:
# blah.csv
Name, Model, Size, Width, Dt
Jag, 63, 4, 4, '2014-12-23'
Pog, 7.0, 5, 5, '2014-12-23'
Peek, 68 xp, 5, 5.5, ''
Usage…
Translated · 2017-10-16 13:54:43 · 1584 views · 0 comments
pyspark-RDD
Reference: https://github.com/jadianes/spark-py-notebooks RDD creation. #!/usr/bin/python # -*- coding: UTF-8 -*- import urllib from pyspark import SparkContext,SparkConf f = urllib.urlretrieve ("ht…
Translated · 2017-10-17 10:06:16 · 498 views · 0 comments
pyspark-Data Frames
Reference: https://github.com/jadianes/spark-py-notebooks Spark SQL and Data Frames. #!/usr/bin/python # -*- coding: UTF-8 -*- import urllib from pyspark import SparkContext,SparkConf f = urllib.url…
Translated · 2017-10-17 11:37:28 · 436 views · 0 comments