Reposting is welcome; please credit the source and place a link to the original article in a prominent position on the page.
Original link: http://www.cnblogs.com/zdfjf/p/5175566.html
We know that Eclipse has a Hadoop plugin that lets you manage files on HDFS, create MapReduce programs, and run them with "Run On Hadoop". So can we run a Spark program directly from Eclipse as well, submitting it to a cluster in YARN-Client mode or running it in Standalone mode?
The answer is yes. Below I will show how to run Spark's wordcount program from Eclipse. I am using Hadoop 2.6.2 and Spark 1.5.2.
1. Running in Standalone mode
1.1 Create an ordinary Java project; the code is given directly below.
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package com.frank.spark;

import scala.Tuple2;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.functio
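The listing above is cut off mid-import. For reference, here is a minimal self-contained sketch of a Spark 1.5.2 Java wordcount, modeled on the standard JavaWordCount example that ships with Spark. The master URL `spark://master:7077`, the class/package names, and the jar path passed to `setJars` are assumptions for illustration; replace them with the values for your own cluster and project when running from Eclipse in Standalone mode.

```java
package com.frank.spark;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

import scala.Tuple2;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

public final class JavaWordCount {
  private static final Pattern SPACE = Pattern.compile(" ");

  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("Usage: JavaWordCount <file>");
      System.exit(1);
    }

    SparkConf sparkConf = new SparkConf()
        .setAppName("JavaWordCount")
        // Assumed standalone master URL; change to your cluster's master.
        .setMaster("spark://master:7077")
        // When submitting from Eclipse, ship the project jar to the workers
        // (path below is a placeholder for your exported jar).
        .setJars(new String[] { "/path/to/your/wordcount.jar" });
    JavaSparkContext ctx = new JavaSparkContext(sparkConf);

    // Read the input file (e.g. an HDFS path passed as args[0]).
    JavaRDD<String> lines = ctx.textFile(args[0], 1);

    // Split each line into words on single spaces.
    JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
      @Override
      public Iterable<String> call(String s) {
        return Arrays.asList(SPACE.split(s));
      }
    });

    // Map each word to a (word, 1) pair.
    JavaPairRDD<String, Integer> ones = words.mapToPair(
        new PairFunction<String, String, Integer>() {
          @Override
          public Tuple2<String, Integer> call(String s) {
            return new Tuple2<String, Integer>(s, 1);
          }
        });

    // Sum the counts for each word.
    JavaPairRDD<String, Integer> counts = ones.reduceByKey(
        new Function2<Integer, Integer, Integer>() {
          @Override
          public Integer call(Integer i1, Integer i2) {
            return i1 + i2;
          }
        });

    // Collect and print the results on the driver.
    List<Tuple2<String, Integer>> output = counts.collect();
    for (Tuple2<?, ?> tuple : output) {
      System.out.println(tuple._1() + ": " + tuple._2());
    }
    ctx.stop();
  }
}
```

Note the two additions compared with running via spark-submit: `setMaster` points the driver at the standalone master, and `setJars` distributes your compiled classes to the executors, since Eclipse's "Run" launches the driver locally on your machine.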