Livy Installation and Usage Notes

1.  Why use Livy
  • Have long running SparkContexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or Dataframes across multiple jobs and clients
  • Multiple SparkContexts can be managed simultaneously, and they run on the cluster (YARN/Mesos) instead of the Livy Server for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code, or via Java/Scala client API (see the REST sketch after this list)
  • Ensure security via secure authenticated communication
  • Apache License, 100% open source
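
To make the "snippets of code" path from the list above concrete, here is a minimal sketch using Livy's REST API: the first call creates an interactive session, the second runs a snippet in it (the session id in the second URL must match the id returned by the first call):

```
# Create an interactive Scala session
curl -X POST --data '{"kind": "spark"}' -H "Content-Type: application/json" localhost:8998/sessions

# Run a code snippet in session 0 (use the id returned above)
curl -X POST --data '{"code": "sc.parallelize(1 to 100).count()"}' -H "Content-Type: application/json" localhost:8998/sessions/0/statements
```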

2. Livy run modes (local and YARN)
Then we upload the Spark example jar /usr/lib/spark/lib/spark-examples.jar to HDFS and point to it. If you are using Livy in local mode rather than YARN mode, just keep the local path /usr/lib/spark/lib/spark-examples.jar.
(In cluster mode, Livy reads files from HDFS, so the dependency jars should be uploaded to HDFS first.)
It is strongly recommended to configure Spark to submit applications in YARN cluster mode. That makes sure that user sessions have their resources properly accounted for in the YARN cluster, and that the host running the Livy server doesn't become overloaded when multiple user sessions are running. (In other words, deploy in YARN mode to reduce pressure on the Livy server when there are multiple sessions.)
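
For reference, switching Livy to YARN cluster mode comes down to two settings in livy.conf. A minimal sketch, based on the keys in the livy.conf.template shipped with Livy:

```
# conf/livy.conf
# Which Spark master Livy sessions should use (default: local).
livy.spark.master = yarn
# Which deploy mode Livy sessions should use.
livy.spark.deploy-mode = cluster
```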

3. REST API
1. Submit a Spark job
curl -X POST --data '{"file": "/opt/jars/testLivy.jar", "className": "com.testLivy.TestLivyJob"}' -H "Content-Type: application/json" localhost:8998/batches

2. Check the status (possible states: not_started, starting, idle, running, busy, shutting_down, error, dead, success)
curl localhost:8998/batches/3   result: {"id": 3, "state": "dead"}

4. Livy configuration changes

(1) The default port is 8998; it can be changed with the livy.server.port config option in livy.conf.
(2) livy.yarn.jar: this config has been replaced by separate configs listing specific archives for different Livy features. Refer to the default livy.conf file shipped with Livy for instructions.

# Use HiveContext by default in the REPL
livy.repl.enableHiveContext = true
# Enable user impersonation (proxy users)
livy.impersonation.enabled = true
# Session idle timeout
livy.server.session.timeout = 1h

Example of a fuller JSON body for a batch submission:

```json
{
  "name": "test",
  "args": ["2016-10-10 22:00:00"],
  "proxyUser": "shilong",
  "className": "com.test.livyJob",
  "file": "/opt/jars/etl-livy.jar",
  "jars": ["/opt/jars/jar/ficus_2.10-1.0.1.jar", "/opt/jars/jar/mysql-connector-java-5.1.39.jar"],
  "conf": {"driverMemory": "1g", "driverCores": 1, "executorCores": 2, "executorMemory": "3g", "numExecutors": 2}
}
```

(Note: JSON does not allow comments; as mentioned in section 2, in YARN cluster mode the dependency jars listed here should be paths on HDFS.)
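
A request body like the one above is POSTed to the same /batches endpoint shown at the start of section 3, e.g. assuming it is saved as batch.json:

```
curl -X POST --data @batch.json -H "Content-Type: application/json" localhost:8998/batches
```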
Keyword parameters accepted by Livy

(16 known properties: "executorCores", "className", "conf", "driverMemory", "name", "driverCores", "pyFiles", "archives", "queue", "executorMemory", "files", "jars", "proxyUser", "numExecutors", "file" [truncated])




 
To submit Spark jobs to Livy from a Java application, use Livy's Java client library. The basic steps are:

1. Add the Livy Java client library to your application. It is published to Maven Central; with Maven, add the following dependency to pom.xml:

```xml
<dependency>
    <groupId>org.apache.livy</groupId>
    <artifactId>livy-client-http</artifactId>
    <version>0.7.1-incubating</version>
</dependency>
```

2. Create a LivyClient instance for talking to the Livy server:

```java
LivyClient client = new LivyClientBuilder()
        .setURI(new URI("http://<livy-server>:8998"))
        .build();
```

where `<livy-server>` is the hostname or IP address of the Livy server.

3. Submit a Spark job through the client. A job is a class that implements the org.apache.livy.Job interface; the jar containing it (and any other dependencies) must first be shipped to the session:

```java
// Upload the jar holding the job class and its dependencies to the session.
client.uploadJar(new File("/path/to/your/dependencies.jar")).get();

// Submit the job; the returned JobHandle is a Future for the job's result.
JobHandle<Double> handle = client.submit(new MySparkJob(100));
```

where MySparkJob is your job class (see the sketch below) and /path/to/your/dependencies.jar is the jar containing it.

4. Finally, track the job through its JobHandle. You can poll its state:

```java
JobHandle.State state = handle.getState();
```

or block until the result is available:

```java
Double result = handle.get();
```

These are the basic steps for submitting Spark jobs through Livy from a Java application. Note that Livy must be compatible with the Spark cluster's network and security settings to work in cluster mode, so make sure those are configured correctly before using Livy.
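
For reference, here is a minimal Job implementation to go with step 3. This is a sketch: the class name MySparkJob and the counting logic are illustrative, but org.apache.livy.Job and JobContext are the actual client interfaces.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.livy.Job;
import org.apache.livy.JobContext;

// A trivial job that runs inside the Livy session and returns a Double.
public class MySparkJob implements Job<Double> {
    private final int samples;

    public MySparkJob(int samples) {
        this.samples = samples;
    }

    @Override
    public Double call(JobContext jc) throws Exception {
        // JobContext.sc() exposes the JavaSparkContext of the
        // long-running session shared by all jobs of this client.
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= samples; i++) {
            data.add(i);
        }
        return (double) jc.sc().parallelize(data).count();
    }
}
```

Because the session (and its SparkContext) outlives individual jobs, later jobs submitted through the same client can reuse RDDs cached by earlier ones, which is exactly the sharing described in section 1.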
