-
Table of Contents
1. spark-shell
-
At the command line (e.g. Windows cmd), run:
spark-shell
-
Add external jar dependencies:
spark-shell --jars /path/myjar1.jar,/path/myjar2.jar
-
Specify resources:
spark-shell --master yarn-client --driver-memory 16g --num-executors 60 --executor-memory 20g --executor-cores 2
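Note that `yarn-client` as a master URL is deprecated since Spark 2.0; the equivalent modern form (a sketch with the same resource numbers) splits it into a master and a deploy mode:

```shell
spark-shell --master yarn --deploy-mode client \
  --driver-memory 16g --num-executors 60 \
  --executor-memory 20g --executor-cores 2
```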
-
Automatically created objects: on startup, spark-shell creates a SparkSession named `spark` and a SparkContext named `sc`, ready to use.
-
Set the log level:
spark.sparkContext.setLogLevel("ERROR")
-
IntelliJ configuration
-
Edit the pom.xml file and add the dependencies:
<properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.8</scala.version>
    <spark.version>2.2.0</spark.version>
    <hadoop.version>2.7.1</hadoop.version>
    <scala.compat.version>2.11</scala.compat.version>
</properties>

<!-- Declare and import the shared dependencies -->
<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
</dependencies>
-
Defining spark and sc
-
Define spark:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("Word Count").getOrCreate()
-
Define sc:
val sc = spark.sparkContext
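With `spark` and `sc` defined as above, the pieces can be exercised end to end with a minimal word count. This is a sketch you can paste into spark-shell; the `master("local[*]")` setting and the in-memory input via `parallelize` are assumptions for running outside a cluster (a real job would read input with `sc.textFile(...)`):

```scala
import org.apache.spark.sql.SparkSession

// local[*] is an assumption for running outside a cluster;
// inside spark-shell, `spark` and `sc` already exist and this builder is unnecessary
val spark = SparkSession.builder().appName("Word Count").master("local[*]").getOrCreate()
val sc = spark.sparkContext

// Hypothetical in-memory input; use sc.textFile("...") for real files
val lines = sc.parallelize(Seq("hello spark", "hello world"))

val counts = lines
  .flatMap(_.split("\\s+")) // split each line into words
  .map((_, 1))              // pair each word with a count of 1
  .reduceByKey(_ + _)       // sum counts per word
  .collect()
  .toMap
// counts: Map(hello -> 2, spark -> 1, world -> 1)
```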
-
[Minimal Spark Tutorial] Getting Started in Practice
First published 2022-04-09 23:41:17