Set up a Spark cluster and run a Spark SQL example

Prepare the Spark standalone cluster environment

Upload spark-2.1.1-bin-hadoop2.7.tgz to /root.

tar zxvf spark-2.1.1-bin-hadoop2.7.tgz

This is a demonstration; a production setup differs slightly.

cd spark-2.1.1-bin-hadoop2.7/conf

cp spark-env.sh.template spark-env.sh

echo SPARK_MASTER_HOST=0.0.0.0 >> spark-env.sh
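SPARK_MASTER_HOST controls the address the master binds to; 0.0.0.0 binds on all interfaces, which is convenient for a demo box. spark-env.sh accepts further standalone-mode settings if needed; a minimal sketch with illustrative values (they mirror the 8-core/14 GB worker seen in the logs below, but are not part of the original post):

# spark-env.sh -- optional standalone settings (illustrative values)
SPARK_MASTER_HOST=0.0.0.0   # address the master binds to
SPARK_MASTER_PORT=7077      # master RPC port (the default)
SPARK_WORKER_CORES=8        # cores each worker advertises
SPARK_WORKER_MEMORY=14g     # memory each worker advertises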

cd ../sbin

Stop and disable firewalld so the master's ports are reachable from other machines (acceptable for a demo; open the specific ports instead in production):

systemctl stop firewalld.service

systemctl disable firewalld.service

On the development machine, make sure `telnet <master-ip> 7077` succeeds.
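If telnet is not installed, nc can probe the port just as well once the master is up (a hypothetical check, using the master IP that appears in the logs below):

nc -vz 192.168.1.110 7077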

cd spark-2.1.1-bin-hadoop2.7/sbin

[root@t430 sbin]# ./start-all.sh

starting org.apache.spark.deploy.master.Master, logging to /root/spark-2.1.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-t430.out

localhost: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark-2.1.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-t430.out

[root@t430 sbin]# cat /root/spark-2.1.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-t430.out

Spark Command: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/jre/bin/java -cp /root/spark-2.1.1-bin-hadoop2.7/conf/:/root/spark-2.1.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host t430 --port 7077 --webui-port 8080

========================================

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

17/05/27 11:01:43 INFO Master: Started daemon with process name: 25154@t430

17/05/27 11:01:43 INFO SignalUtils: Registered signal handler for TERM

17/05/27 11:01:43 INFO SignalUtils: Registered signal handler for HUP

17/05/27 11:01:43 INFO SignalUtils: Registered signal handler for INT

17/05/27 11:01:43 WARN Utils: Your hostname, t430 resolves to a loopback address: 127.0.0.1; using 192.168.1.110 instead (on interface wlp3s0)

17/05/27 11:01:43 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

17/05/27 11:01:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

17/05/27 11:01:44 INFO SecurityManager: Changing view acls to: root

17/05/27 11:01:44 INFO SecurityManager: Changing modify acls to: root

17/05/27 11:01:44 INFO SecurityManager: Changing view acls groups to:

17/05/27 11:01:44 INFO SecurityManager: Changing modify acls groups to:

17/05/27 11:01:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()

17/05/27 11:01:44 INFO Utils: Successfully started service 'sparkMaster' on port 7077.

17/05/27 11:01:44 INFO Master: Starting Spark master at spark://t430:7077

17/05/27 11:01:44 INFO Master: Running Spark version 2.1.1

17/05/27 11:01:44 INFO Utils: Successfully started service 'MasterUI' on port 8080.

17/05/27 11:01:44 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://192.168.1.110:8080

17/05/27 11:01:44 INFO Utils: Successfully started service on port 6066.

17/05/27 11:01:44 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066

17/05/27 11:01:44 INFO Master: I have been elected leader! New state: ALIVE

17/05/27 11:01:47 INFO Master: Registering worker 192.168.1.110:15325 with 8 cores, 14.4 GB RAM
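The last line confirms a worker has registered. The Master web UI, bound to port 8080 per the MasterWebUI line above, shows the same worker; a quick check from the shell, assuming curl is available:

curl -s http://192.168.1.110:8080 | grep -i worker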

Export the job jar

We use JavaSparkHiveExample as the example.

Package: org.apache.spark.examples.sql.hive

Check JavaSparkHiveExample.java and pay attention to the lines below. Note that a master set in code via .config() takes precedence over the --master flag passed to spark-submit, so the "local" value marked //HERE must be removed or changed for a cluster run, and the LOAD DATA path must match where the data files are uploaded later:

.config("spark.master", "local") //HERE

spark.sql("LOAD DATA LOCAL INPATH '/tmp/examples/src/main/resources/kv1.txt' INTO TABLE src");

Figure 1: Right-click JavaSparkHiveExample.java and choose Export.

Figure 2: Note the exported jar file.

Upload the job and data files

Upload the project's examples directory, in its entirety, to /tmp on the server; this matches the path used in the code change above.

Upload the exported sparksql.jar to /home.
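Assuming the development machine can reach the server over SSH, both uploads can be done with scp (IP taken from this walkthrough; adjust paths to your project layout):

scp -r examples root@192.168.1.110:/tmp/
scp sparksql.jar root@192.168.1.110:/home/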

[root@t430 ~]# ls -R /tmp/examples/

/tmp/examples/:

src

/tmp/examples/src:

main

/tmp/examples/src/main:

resources

/tmp/examples/src/main/resources:

employees.json full_user.avsc kv1.txt people.json people.txt user.avsc users.avro users.parquet

[root@t430 ~]# ls /home/sparksql.jar

/home/sparksql.jar

[root@t430 ~]#

Submit the job jar to the cluster

Once the resource files and job jar have been uploaded to the cluster, run the following command:

[root@t430 spark-2.1.1-bin-hadoop2.7]# pwd

/root/spark-2.1.1-bin-hadoop2.7

[root@t430 spark-2.1.1-bin-hadoop2.7]#

bin/spark-submit --class org.apache.spark.examples.sql.hive.JavaSparkHiveExample --master spark://192.168.1.110:7077 --executor-memory 10G --total-executor-cores 6 /home/sparksql.jar
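A note on the flags: --master points at the master RPC port from the startup log, --executor-memory caps each executor's heap, and --total-executor-cores limits the total cores the job may claim across the cluster. This submits in client mode, so the driver runs on the submitting machine. To run the driver inside the cluster instead, standalone cluster mode can go through the REST port 6066 that the master log above shows listening; a sketch (the jar path must then be readable from the worker nodes):

bin/spark-submit --deploy-mode cluster --master spark://192.168.1.110:6066 --class org.apache.spark.examples.sql.hive.JavaSparkHiveExample /home/sparksql.jar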

The job's output follows (excerpt).

17/05/27 15:34:11 INFO CodeGenerator: Code generated in 8.29917 ms

+---+------+---+------+

|key| value|key| value|

+---+------+---+------+

| 2| val_2| 2| val_2|

| 2| val_2| 2| val_2|

| 4| val_4| 4| val_4|

| 4| val_4| 4| val_4|

| 5| val_5| 5| val_5|

| 5| val_5| 5| val_5|

| 5| val_5| 5| val_5|

| 5| val_5| 5| val_5|

| 5| val_5| 5| val_5|

| 5| val_5| 5| val_5|

| 8| val_8| 8| val_8|

| 8| val_8| 8| val_8|

| 9| val_9| 9| val_9|

| 9| val_9| 9| val_9|

| 10|val_10| 10|val_10|

| 10|val_10| 10|val_10|

| 11|val_11| 11|val_11|

| 11|val_11| 11|val_11|

| 12|val_12| 12|val_12|

| 12|val_12| 12|val_12|

+---+------+---+------+

only showing top 20 rows

17/05/27 15:34:11 INFO SparkUI: Stopped Spark web UI at http://192.168.1.110:4040

17/05/27 15:34:11 INFO StandaloneSchedulerBackend: Shutting down all executors

17/05/27 15:34:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down

17/05/27 15:34:11 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

17/05/27 15:34:11 INFO MemoryStore: MemoryStore cleared

17/05/27 15:34:11 INFO BlockManager: BlockManager stopped

17/05/27 15:34:11 INFO BlockManagerMaster: BlockManagerMaster stopped

17/05/27 15:34:11 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!

17/05/27 15:34:11 INFO SparkContext: Successfully stopped SparkContext

17/05/27 15:34:11 INFO ShutdownHookManager: Shutdown hook called

17/05/27 15:34:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-2c2b1724-6cc4-4e3f-8677-097a49c32709

[root@t430 spark-2.1.1-bin-hadoop2.7]#

Summary

This post demonstrated running the Spark SQL Hive example on a Spark standalone cluster; HDFS was not used.

Reposted from: https://www.cnblogs.com/wifi0/p/6950162.html
