Spark面试术语总结

Are you more knowledgeable today than you were yesterday? Have you made more progress than yesterday? Rather than wasting time accomplishing nothing, it is better to sit down and learn something solid. Your progress may be slow, but as long as you don't give up, you will get there.
————— To everyone who is working hard
Today's notes:
Glossary
The following glossary summarizes terms you'll see used to refer to cluster concepts:

Application: User program built on Spark. Consists of a driver program and executors on the cluster.
Application jar: A jar containing the user's Spark application. In some cases users will want to create an "uber jar" containing their application along with its dependencies. The user's jar should never include Hadoop or Spark libraries; however, these will be added at runtime.
Driver program: The process running the main() function of the application and creating the SparkContext.
Cluster manager: An external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN).
Deploy mode: Distinguishes where the driver process runs. In "cluster" mode, the framework launches the driver inside of the cluster. In "client" mode, the submitter launches the driver outside of the cluster.
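The deploy mode is usually chosen at submit time via the --deploy-mode flag of spark-submit. A hedged illustration (the class name, jar name, and YARN master are placeholder assumptions, not from this article):

```shell
# Client mode: the driver runs on the machine where spark-submit is invoked,
# so you see driver output (and driver logs) in your local terminal.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.WordCount \
  my-app.jar

# Cluster mode: the framework launches the driver inside the cluster,
# and spark-submit returns after handing the application over.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.WordCount \
  my-app.jar
```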
Worker node: Any node that can run application code in the cluster.
Executor: A process launched for an application on a worker node, which runs tasks and keeps data in memory or on disk across them. Each application has its own executors.
Task: A unit of work that will be sent to one executor.
Job: A parallel computation consisting of multiple tasks that gets spawned in response to a Spark action (e.g. save, collect); you'll see this term used in the driver's logs.
Stage: Each job gets divided into smaller sets of tasks called stages that depend on each other (similar to the map and reduce stages in MapReduce); you'll see this term used in the driver's logs.
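As a rough mental model (this is an illustration only, not Spark's real DAGScheduler code): a job is cut into stages at shuffle boundaries, i.e. wherever a wide dependency such as reduceByKey or groupByKey appears, while narrow transformations like map and filter are pipelined into the same stage. A minimal sketch:

```python
# Toy model of stage splitting: each wide (shuffle) transformation closes the
# current stage and starts a new one. Illustrative only, not Spark internals.

WIDE = {"reduceByKey", "groupByKey", "join", "repartition"}  # shuffle boundaries

def split_into_stages(transformations):
    """Group a linear chain of transformations into stages.

    Narrow transformations (map, filter, ...) are pipelined into the same
    stage; each wide transformation ends the current stage.
    """
    stages, current = [], []
    for t in transformations:
        current.append(t)
        if t in WIDE:          # shuffle boundary: close this stage
            stages.append(current)
            current = []
    if current:                # whatever remains forms the final stage
        stages.append(current)
    return stages

job = ["map", "filter", "reduceByKey", "map", "collect"]
print(split_into_stages(job))
# [['map', 'filter', 'reduceByKey'], ['map', 'collect']]
```

This matches the glossary definition above: one job, two stages that depend on each other across the shuffle.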
To summarize:
Application = 1 Driver + N Executors
Driver: a process; its main() creates the SparkContext
client: the driver runs outside the cluster, on the submitting machine
cluster: the driver runs inside the cluster
Executor: a process that runs tasks (e.g. map/filter)
Worker ==> NM (the YARN NodeManager)
Job ==> triggered by an action operator
Stage ==> one Job may be split into multiple stages
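The "Job ==> action" line above can be sketched with a toy lazy collection: transformations only record work, and nothing runs until an action is called. This is an illustrative stand-in for RDD behavior, not the real Spark API:

```python
# Toy illustration of lazy transformations vs. eager actions. Real RDDs are
# distributed and partitioned; this only mimics the job-triggering behavior.

class ToyRDD:
    def __init__(self, data, ops=None):
        self.data = data
        self.ops = ops or []          # recorded transformations, not yet run

    def map(self, f):                 # transformation: lazy, returns a new ToyRDD
        return ToyRDD(self.data, self.ops + [("map", f)])

    def filter(self, p):              # transformation: lazy
        return ToyRDD(self.data, self.ops + [("filter", p)])

    def collect(self):                # action: triggers the "job" now
        out = self.data
        for kind, fn in self.ops:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

rdd = ToyRDD([1, 2, 3, 4]).map(lambda x: x * 10).filter(lambda x: x > 15)
# No computation has happened yet; collect() is the action that runs the job.
print(rdd.collect())  # [20, 30, 40]
```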
Official docs: https://spark.apache.org/docs/latest/cluster-overview.html
Navigation path: https://spark.apache.org/ » Documentation » Latest Release (Spark 2.4.0) » Deploying » Overview
