Carbondata 1.4.0+Spark 2.2.1 On Yarn集成安装

微信公众号(SZBigdata-Club):后续博客的文档都会转到微信公众号中。 
1、公众号会持续给大家推送技术文档、学习视频、技术书籍、数据集等。 
2、接受大家投稿支持。 
3、对于各公司hr招聘的,可以私下联系我,把招聘信息发给我我会在公众号中进行推送。 

è¿éåå¾çæè¿°
技术交流群:59701880 深圳广州hadoop好友会 

è¿éåå¾çæè¿°

由于项目需要,近期一直在研究华为开源的carbondata项目,目前carbondata已经成为apache的顶级项目,项目活跃程度也比较高。

Carbondata介绍

纵观目前的大数据生态圈,针对于ad-hoc查询并没有什么好的解决方案,且查询效率不高。Carbondata针对ad-hoc给出了解决方案,但是目前版本这个方案并不完善。为什么说并不完善,后续会给大家解答,我们先来说下carbondata的优点

查询

  • 两级索引,减少IO:适合ad-hoc查询,任意维度组合查询场景
    • 第一级:文件索引,用于过滤hdfs block,避免扫描不必要的文件
    • 第二级:Blocklet索引,用于过滤文件内部的Blocklet,避免骚烤不必要的Blocklet
  • 延迟解码,向量化处理:适合全表扫描、汇总分析场景

数据管理

  • 数据增量入库,多级排序可调:有用户权衡入库时间和查询性能
    • 字典编码配置(Dictionary Encoding Configuration)
    • 反向索引配置(Inverted Index Configuration)
    • 排序列配置(Sort Columns Configuration)
  • 增量更新,批量合并:支持快速更新事实表或维表,闲时做Compaction合并

大规模

  • 计算与存储分离:支持从GB到PB大规模数据,万亿数据秒级响应

部署

  • Hadoop Native格式:与大数据生态无缝集成,利用已有hadoop集群资产
  • 与Spark2.x版本集成 carbondata 1.4.0新增了挺多强大的功能,主要关注的点是在datamap上,不过目前datamap对于单表的支持是比较好的,但是对于多表join的datamap支持还不够完善,还是有蛮多问题的。

单表的datamap目前包括:preaggregate、timeseries、bloomfilter、lucene

 

1

2

3

4

5

6

7

8

9

10

11

12

13

 

#创建datamap

CREATE DATAMAP [IF NOT EXISTS] ${dataMapName}

ON TABLE '${tableName}'

USING "datamap_provider" #preaggregate,timeseries,bloomfilter,lucene

DMPROPERTIES ('key'='value', ...)

AS

SELECT statement

#删除datamap

DROP DATAMAP ${dataMapName} ON TABLE ${tableName}

#查询datamap

SHOW DATAMAP ON TABLE '${tableName}'

多表datamap目前1.4.0官方文档并没有给出,我是通过carbondata jira的问题找到其创建方式的,多表的datamap预计在1.5.0版本才能使用,目前问题较多,这里暂时不建议使用,等1.5.0版本出来之后再试试看。1.5.0版本官方等版本计划是8-9月份完成。

 

1

2

3

4

 

CREATE DATAMAP [IF NOT EXISTS] ${dataMapName}

USING 'mv'

AS

SELECT statement

与Spark2.2.1+Hadoop2.7.2 On Yarn集成

前置条件

  • Hadoop HDFS 和 Yarn 需要安装和运行。
  • Spark 需要在所有的集群节点上安装并且运行。
  • CarbonData 用户需要有权限访问 HDFS.

可以通过https://dist.apache.org/repos/dist/release/carbondata/1.4.0/下载你对应的spark版本,目前carbondata支持的spark版本有两个:2.1.0、2.2.1

下载apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar

 

1

 

wget https://dist.apache.org/repos/dist/release/carbondata/1.4.0/apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar

依赖包拷贝

 

1

2

3

4

5

6

 

# 替换jersey-client、jersey-core包

cp jersey-client-1.9.jar jersey-core-1.9.jar $SPARK_HOME/jars

rm jersey-client-2.x.jar jersey-core-2.x.jar

# 拷贝apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar到$SPARK_HOME/jars

cp apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar $SPARK_HOME/jars

创建carbonlib目录且生成tar.gz压缩包

 

1

2

3

4

5

6

7

8

9

10

 

#1、创建carbonlib目录

mkdir $SPARK_HOME/carbonlib

#2、拷贝carbondata jar包到carbonlib

cp apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar $SPARK_HOME/carbonlib

#3、生成tar.gz包

cd $SPARK_HOME

tar -zcvf carbondata.tar.gz carbonlib/

mv carbondata.tar.gz carbonlib/

[注]:如果想要测试多表的datamap mv功能,需要编译carbondata-apache-carbondata-1.4.0-rc2/datamap/mv下的core和plan两个工程,生成carbondata-mv-core-1.4.0.jar以及carbondata-mv-plan-1.4.0.jar并拷贝到SPARKHOME/jars和SPARKHOME/jars和SPARK_HOME/carbonlib,并重新执行上述第三步的内容。

拷贝carbondata.properties

拷贝carbon.properties.template到$SPARK_HOME/conf目录且命名为carbon.properties

 

1

 

cp ./conf/carbon.properties.template $SPARK_HOME/conf/carbon.properties

 

配置spark-default.conf

 

1

2

3

4

5

6

7

8

9

10

11

12

13

14

 

spark.master yarn-client

spark.yarn.dist.files /usr/hdp/2.6.2.0-205/spark2/conf/carbon.properties

spark.yarn.dist.archives /usr/hdp/2.6.2.0-205/spark2/carbonlib/carbondata.tar.gz

spark.executor.extraJavaOptions -Dcarbon.properties.filepath=carbon.properties -XX:+OmitStackTraceInFastThrow -XX:+UseGCOverheadLimit

spark.executor.extraClassPath carbondata.tar.gz/carbonlib/*:/home/hadoop/hive/lib/*

spark.driver.extraClassPath /usr/hdp/2.6.2.0-205/spark2/carbonlib/*

spark.driver.extraJavaOptions -Dcarbon.properties.filepath=/usr/hdp/2.6.2.0-205/spark2/conf/carbon.properties -Dhdp.version=current

spark.yarn.executor.memoryOverhead 1024

spark.yarn.driver.memoryOverhead 1024

spark.yarn.am.extraJavaOptions -Dhdp.version=current

spark.yarn.scheduler.heartbeat.interval-ms 7200000

spark.executor.heartbeatInterval 7200000

spark.network.timeout 300000

验证

启动thrift server服务

 

1

 

$SPARK_HOME/bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer --num-executors 5 --driver-memory 6g --executor-memor^C10g --executor-cores 5 carbonlib/apache-carbondata-1.4.0-bin-spark2.2.1-hadoop2.7.2.jar hdfs://hacluster/user/hive/warehouse/carbon.store

执行后,任务会提交到yarn上,可以通过8088端口的web界面查看到。

如果在不配置spark-default.conf的情况下,其实也是可以运行的,不过是以单机模式运行,不会提交成yarn任务,通过4040端口可以查看到spark任务的运行情况

使用beeline链接

 

1

 

$SPARK_HOME/bin/beeline -u jdbc:hive2://<thriftserver_host>:10000 -n hdfs

以上就是基于Spark on yarn模式的carbondata集成,carbondata同时也支持Spark独立模式的集成安装。具体可参考下面的链接。

参考连接:
http://carbondata.iteblog.com/installation-guide.html
https://carbondata.apache.org/installation-guide.html

/home/hadoopmaster/jdk1.8.0_161/bin/java -javaagent:/home/hadoopmaster/idea-IC-221.6008.13/lib/idea_rt.jar=34515:/home/hadoopmaster/idea-IC-221.6008.13/bin -Dfile.encoding=UTF-8 -classpath /home/hadoopmaster/jdk1.8.0_161/jre/lib/charsets.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/deploy.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/cldrdata.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/dnsns.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/jaccess.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/jfxrt.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/localedata.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/nashorn.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunec.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunjce_provider.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/sunpkcs11.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/ext/zipfs.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/javaws.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jce.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jfr.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jfxswt.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/jsse.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/management-agent.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/plugin.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/resources.jar:/home/hadoopmaster/jdk1.8.0_161/jre/lib/rt.jar:/root/IdeaProjects/kkk/out/production/kkk:/home/hadoopmaster/scala-2.12.15/lib/scala-reflect.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-xml_2.12-1.0.6.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-parser-combinators_2.12-1.0.7.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-swing_2.12-2.0.3.jar:/home/hadoopmaster/scala-2.12.15/lib/scala-library.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xz-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jta-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jpam-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/ST4-4.0.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guice-3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/ivy-2.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/oro-2.0.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/blas-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/core-1.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/gson-2.2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/tink-1.6.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jsp-api-2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/okio-1.14.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/opencsv-2.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/shims-0.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xmlenc-0.52.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arpack-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guava-14.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jline-2.14.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jsr305-3.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/lapack-2.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/log4j-1.2.17.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/minlog-1.3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/stream-2.9.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/velocity-1.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/generex-1.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-api-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/janino-3.0.16.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jdo-api-3.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/objenesis-2.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/paranamer-2.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/py4j-0.10.9.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/pyrolite-4.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/HikariCP-2.5.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-io-2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-cli-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javax.inject-1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/libfb303-0.9.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/lz4-java-1.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/okhttp-3.12.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/snakeyaml-1.27.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/stax-api-1.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/JTransforms-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aopalliance-1.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-ipc-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/breeze_2.12-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-cli-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-net-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/derby-10.14.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-jdbc-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-utils-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/httpcore-4.4.14.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jaxb-api-2.2.11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-hk2-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jodd-core-3.5.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-core-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/super-csv-2.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xml-apis-1.4.01.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zookeeper-3.6.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/JLargeArrays-1.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/activation-1.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/automaton-1.11-8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-dbcp-1.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-lang-2.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-text-1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-serde-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javolution-5.5.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/libthrift-0.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-shims-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/slf4j-api-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zjsonpatch-0.3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zstd-jni-1.5.0-4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/chill-java-0.10.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/chill_2.12-0.10.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/guice-servlet-3.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-auth-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-hdfs-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hk2-locator-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/httpclient-4.5.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-xc-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-util-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/joda-time-2.10.10.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kryo-shaded-4.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-jmx-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-jvm-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/rocksdbjni-6.20.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xercesImpl-2.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aircompressor-0.21.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/algebra_2.12-2.0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/annotations-17.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/antlr4-runtime-4.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/api-util-1.0.0-M20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-format-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-vector-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/avro-mapred-1.10.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-codec-1.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-pool-1.5.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/compress-lzf-1.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-beeline-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javax.jdo-3.2.0-m3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jaxb-runtime-2.3.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-client-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-common-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-server-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/leveldbjni-all-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-core-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-json-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/RoaringBitmap-0.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/antlr-runtime-3.5.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-math3-3.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-client-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-core-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/javassist-3.25.0-GA.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jul-to-slf4j-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/protobuf-java-2.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/snappy-java-1.1.8.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/transaction-api-1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/bonecp-0.8.0.RELEASE.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-crypto-1.1.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-digester-1.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-lang3-3.12.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-client-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-exec-2.3.9-core.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-metastore-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-jaxrs-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.inject-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/orc-mapreduce-1.6.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-xml_2.12-1.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/shapeless_2.12-2.3.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-sql_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/threeten-extra-1.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/zookeeper-jute-3.6.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-compress-1.21.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-logging-1.1.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-recipes-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-api-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-0.23-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jcl-over-slf4j-1.7.30.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-column-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-common-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-hadoop-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-library-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-reflect-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-core_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-hive_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-repl_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-tags_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-yarn_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/api-asn1-api-1.0.0-M20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/breeze-macros_2.12-1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/cats-kernel_2.12-2.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-httpclient-3.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/flatbuffers-java-1.9.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-llap-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-service-rpc-3.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-storage-api-2.7.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jetty-sslengine-6.1.26.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/metrics-graphite-4.2.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/netty-all-4.1.68.Final.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-jackson-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-compiler-2.12.15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mesos_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mllib_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-util_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/xbean-asm9-shaded-4.20.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/apacheds-i18n-2.0.0-M15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arpack_combined_all-0.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-memory-core-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-beanutils-1.9.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-compiler-3.0.16.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/curator-framework-2.7.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-core-4.1.17.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-common-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-core-asl-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-databind-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.ws.rs-api-2.1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-client-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/macro-compat_2.12-1.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-encoding-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-graphx_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-sketch_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-unsafe_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/univocity-parsers-2.9.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/arrow-memory-netty-2.0.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-rdbms-4.1.19.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-annotations-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-client-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-kvstore_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-macros_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-collections-3.2.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/commons-configuration-1.6.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/datanucleus-api-jdo-4.2.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-mapper-asl-1.9.13.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.servlet-api-4.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-ast_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-catalyst_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-launcher_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/audience-annotations-0.5.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-shims-scheduler-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hive-vector-code-gen-2.3.9.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-annotations-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.xml.bind-api-2.3.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-core_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-streaming_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spire-platform_2.12-0.17.0.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-apps-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-core-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-node-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-rbac-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/logging-interceptor-3.12.12.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/mesos-1.4.0-shaded-protobuf.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/osgi-resource-locator-1.0.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-kubernetes_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-tags_2.12-3.2.1-tests.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/aopalliance-repackaged-2.6.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/htrace-core-3.1.0-incubating.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/istack-commons-runtime-3.0.8.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.annotation-api-1.3.5.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jakarta.validation-api-2.0.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-scalap_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-batch-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-mllib-local_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-container-servlet-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/json4s-jackson_2.12-3.7.0-M11.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-common-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-events-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-policy-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-dataformat-yaml-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-datatype-jsr310-2.11.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-metrics-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-server-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-network-common_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jackson-module-scala_2.12-2.12.3.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-discovery-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/parquet-format-structures-1.12.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-network-shuffle_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-app-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-extensions-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-networking-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-scheduling-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-core-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-yarn-server-web-proxy-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/jersey-container-servlet-core-2.34.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-autoscaling-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-flowcontrol-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-collection-compat_2.12-2.1.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/spark-hive-thriftserver_2.12-3.2.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-certificates-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-coordination-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-storageclass-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/scala-parser-combinators_2.12-1.1.2.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-common-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-apiextensions-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-shuffle-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/hadoop-mapreduce-client-jobclient-2.7.4.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/kubernetes-model-admissionregistration-5.4.1.jar:/home/hadoopmaster/spark-3.2.1-bin-hadoop2.7/jars/dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar kkk.WordCount Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 25/06/02 20:17:03 INFO SparkContext: Running Spark version 3.2.1 25/06/02 20:17:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 25/06/02 20:17:04 INFO ResourceUtils: ============================================================== 25/06/02 20:17:04 INFO ResourceUtils: No custom resources configured for spark.driver. 25/06/02 20:17:04 INFO ResourceUtils: ============================================================== 25/06/02 20:17:04 INFO SparkContext: Submitted application: WordCount 25/06/02 20:17:04 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) 25/06/02 20:17:04 INFO ResourceProfile: Limiting resource is cpu 25/06/02 20:17:04 INFO ResourceProfileManager: Added ResourceProfile id: 0 25/06/02 20:17:04 INFO SecurityManager: Changing view acls to: root 25/06/02 20:17:04 INFO SecurityManager: Changing modify acls to: root 25/06/02 20:17:04 INFO SecurityManager: Changing view acls groups to: 25/06/02 20:17:04 INFO SecurityManager: Changing modify acls groups to: 25/06/02 20:17:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 25/06/02 20:17:05 INFO Utils: Successfully started service 'sparkDriver' on port 41615. 25/06/02 20:17:05 INFO SparkEnv: Registering MapOutputTracker 25/06/02 20:17:05 INFO SparkEnv: Registering BlockManagerMaster 25/06/02 20:17:05 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 25/06/02 20:17:05 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 25/06/02 20:17:05 INFO SparkEnv: Registering BlockManagerMasterHeartbeat 25/06/02 20:17:05 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-68486311-3f69-48e8-8f69-e7afffcf5979 25/06/02 20:17:05 INFO MemoryStore: MemoryStore started with capacity 258.5 MiB 25/06/02 20:17:05 INFO SparkEnv: Registering OutputCommitCoordinator 25/06/02 20:17:06 INFO Utils: Successfully started service 'SparkUI' on port 4040. 25/06/02 20:17:06 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://hadoopmaster:4040 25/06/02 20:17:06 INFO Executor: Starting executor ID driver on host hadoopmaster 25/06/02 20:17:06 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41907. 25/06/02 20:17:06 INFO NettyBlockTransferService: Server created on hadoopmaster:41907 25/06/02 20:17:06 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 25/06/02 20:17:06 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManagerMasterEndpoint: Registering block manager hadoopmaster:41907 with 258.5 MiB RAM, BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:06 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, hadoopmaster, 41907, None) 25/06/02 20:17:08 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 244.0 KiB, free 258.2 MiB) 25/06/02 20:17:09 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KiB, free 258.2 MiB) 25/06/02 20:17:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoopmaster:41907 (size: 23.4 KiB, free: 258.5 MiB) 25/06/02 20:17:09 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:9 Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hadoopmaster/words.txt at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:300) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.rdd.RDD.partitions(RDD.scala:296) at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4(Partitioner.scala:78) at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4$adapted(Partitioner.scala:78) at scala.collection.immutable.List.map(List.scala:293) at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78) at org.apache.spark.rdd.PairRDDFunctions.$anonfun$reduceByKey$4(PairRDDFunctions.scala:322) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:414) at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:322) at kkk.WordCount$.main(WordCount.scala:10) at kkk.WordCount.main(WordCount.scala) 25/06/02 20:17:09 INFO SparkContext: Invoking stop() from shutdown hook 25/06/02 20:17:09 INFO SparkUI: Stopped Spark web UI at http://hadoopmaster:4040 25/06/02 20:17:09 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 25/06/02 20:17:09 INFO MemoryStore: MemoryStore cleared 25/06/02 20:17:09 INFO BlockManager: BlockManager stopped 25/06/02 20:17:09 INFO BlockManagerMaster: BlockManagerMaster stopped 25/06/02 20:17:09 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 25/06/02 20:17:10 INFO SparkContext: Successfully stopped SparkContext 25/06/02 20:17:10 INFO ShutdownHookManager: Shutdown hook called 25/06/02 20:17:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-213e3a91-227c-4e0a-8254-ab5c0f00786d Process finished with exit code 1
最新发布
06-04
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值