Spark's deploy mode

Spark has two deploy modes: client and cluster. The official documentation [1] describes them as follows.

A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. Master node in a standalone EC2 cluster). In this setup, client mode is appropriate. In client mode, the driver is launched directly within the spark-submit process, which acts as a client to the cluster. The input and output of the application are attached to the console. Thus, this mode is especially suitable for applications that involve the REPL (e.g. Spark shell).

Alternatively, if your application is submitted from a machine far from the worker machines (e.g. locally on your laptop), it is common to use cluster mode to minimize network latency between the drivers and the executors. Note that cluster mode is currently not supported for Mesos clusters. Currently only YARN supports cluster mode for Python applications.

Running spark-submit --help gives the following description of the option:

--deploy-mode: Whether to launch the driver program locally ("client") or on one of the worker machines inside the cluster ("cluster") (default: client)
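As a rough illustration of where this option sits in a spark-submit invocation, the sketch below assembles the argument list in Python. The helper name build_submit_command, the application path and the master value are hypothetical placeholders chosen for this example. Only --master and --deploy-mode are actual spark-submit options. The helper just builds the command and does not launch anything.

```python
VALID_DEPLOY_MODES = ("client", "cluster")


def build_submit_command(app_path, master, deploy_mode="client"):
    """Return a spark-submit argument list for the given deploy mode.

    Hypothetical helper for illustration only: it assembles the command
    and never runs it. The default mirrors spark-submit's own default.
    """
    if deploy_mode not in VALID_DEPLOY_MODES:
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        app_path,
    ]


# Placeholder values, not taken from the article.
cmd = build_submit_command("my_app.py", "yarn", deploy_mode="cluster")
print(" ".join(cmd))
```

Leaving deploy_mode out gives the same result as passing "client", which matches the default stated in the help text.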

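Clearly, if we submit the program from the master node of the cluster, the driver program runs on the master, and client mode is used.

But what if the program is submitted from some other node in the cluster?

This comes down to the question of where the driver program runs.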

So what is the driver program?

The driver program is the process that runs the main() function of the application and creates the SparkContext [2].

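In client mode (the default), the driver program runs on whichever node you submit from. In cluster mode, it runs on one of the worker machines inside the cluster instead.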
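The rule can be summed up as a small decision function. This is only a sketch of the behaviour described in the help text. The function name driver_location and the host name used in the example are made up for illustration and are not part of Spark.

```python
def driver_location(deploy_mode, submit_host):
    """Describe where the driver program runs for a given deploy mode."""
    if deploy_mode == "client":
        # The driver is launched inside the spark-submit process itself,
        # so it runs on the machine the application was submitted from.
        return submit_host
    if deploy_mode == "cluster":
        # The driver is launched on one of the worker machines; which one
        # is not decided by the submitting host.
        return "one of the worker machines inside the cluster"
    raise ValueError("deploy_mode must be 'client' or 'cluster'")


# "node-3" is a hypothetical host name.
print(driver_location("client", "node-3"))
print(driver_location("cluster", "node-3"))
```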


References:

[1] http://spark.apache.org/docs/latest/submitting-applications.html (Accessed: 2016-06-02)

[2] http://spark.apache.org/docs/latest/cluster-overview.html (Accessed: 2016-06-02)
