Livy Installation and Usage Guide

bigdataCoding

Article link: https://blog.csdn.net/UnionIBM/article/details/52809141

1. Why use Livy

Have long running SparkContexts that can be used for multiple Spark jobs, by multiple clients
Share cached RDDs or Dataframes across multiple jobs and clients
Multiple SparkContexts can be managed simultaneously, and they run on the cluster (YARN/Mesos) instead of the Livy Server for good fault tolerance and concurrency
Jobs can be submitted as precompiled jars, snippets of code, or via Java/Scala client API
Ensure security via secure authenticated communication
Apache License, 100% open source

2. Livy run modes (local and YARN)

Then we upload the Spark example jar /usr/lib/spark/lib/spark-examples.jar on HDFS and point to it. If you are using Livy in local mode and not YARN mode, just keep the local path /usr/lib/spark/lib/spark-examples.jar.

(In cluster mode Livy reads files from HDFS, so the dependency jars should be uploaded to HDFS in that case.)

It is strongly recommended to configure Spark to submit applications in YARN cluster mode. That makes sure that user sessions have their resources properly accounted for in the YARN cluster, and that the host running the Livy server doesn't become overloaded when multiple user sessions are running. (In short: with many sessions, deploy in YARN mode to take load off the Livy server.)
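The path rule described above can be sketched as a small helper. This is only an illustration: the function name `jar_reference` and the HDFS directory `hdfs:///user/livy/jars` are hypothetical and not Livy defaults, and the helper does not upload anything; it only picks which path to reference.

```python
def jar_reference(mode, local_path, hdfs_dir="hdfs:///user/livy/jars"):
    """Return the jar path to reference for the given run mode.

    hdfs_dir is a hypothetical upload directory, not a Livy default.
    """
    if mode == "local":
        # Local mode: keep the path on the local file system.
        return local_path
    if mode == "yarn":
        # YARN mode: the jar must already have been uploaded to HDFS.
        file_name = local_path.rsplit("/", 1)[-1]
        return hdfs_dir.rstrip("/") + "/" + file_name
    raise ValueError("unknown mode: " + mode)


local_ref = jar_reference("local", "/usr/lib/spark/lib/spark-examples.jar")
yarn_ref = jar_reference("yarn", "/usr/lib/spark/lib/spark-examples.jar")
print(local_ref)
print(yarn_ref)
```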

3. REST API

1. Submit a Spark job

curl -X POST --data '{"file": "/opt/jars/testLivy.jar", "className": "com.testLivy.TestLivyJob"}' -H "Content-Type: application/json" localhost:8998/batches
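The same request can be built from Python with only the standard library. This is a minimal sketch, assuming a Livy server at http://localhost:8998 as in the curl command; the helper names `build_batch_request` and `submit_batch` are made up for illustration. Building and sending are kept separate so the request can be inspected without a running server.

```python
import json
import urllib.request


def build_batch_request(base_url, jar_path, class_name):
    """Build (but do not send) a POST /batches request."""
    payload = {"file": jar_path, "className": class_name}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(request):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_batch_request(
    "http://localhost:8998", "/opt/jars/testLivy.jar", "com.testLivy.TestLivyJob"
)
print(request.get_method(), request.full_url)
```

Calling `submit_batch(request)` against a live server would return the batch description as a dictionary.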

2. Check the status (states include not_started, starting, idle, running, busy, shutting_down, error, dead, success)

Request localhost:8998/batches/3; result: "id": 3, "state": "dead"
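Checking the status is usually done in a loop. The sketch below polls until a final state is reached. Two assumptions are made here: that success, dead and error are final states (the list above does not say which states are terminal), and that the caller supplies a `fetch_state` function, for example a wrapper around a GET on localhost:8998/batches/3. The name `wait_for_batch` is hypothetical.

```python
import time

# Assumption: these states are treated as final; the article lists the
# states but does not say which ones are terminal.
TERMINAL_STATES = {"success", "dead", "error"}


def wait_for_batch(fetch_state, poll_seconds=2.0, max_polls=30, sleep=time.sleep):
    """Poll fetch_state() until a terminal state appears or polls run out.

    fetch_state is any zero-argument callable returning the "state" string.
    Returns the last state seen.
    """
    state = None
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    return state


# Usage with canned replies instead of a live server.
replies = iter(["starting", "running", "running", "dead"])
final_state = wait_for_batch(lambda: next(replies), sleep=lambda _: None)
print(final_state)
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real waiting.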

4. Changing Livy configuration

(1) The default port is 8998; it can be changed with the livy.server.port config option in livy.conf.

(2) livy.yarn.jar : this config has been replaced by separate configs listing specific archives for different Livy features. Refer to the default livy.conf file shipped with Livy for instructions.

# Use HiveContext by default
livy.repl.enableHiveContext = true
# Enable user impersonation (proxy users)
livy.impersonation.enabled = true
# Idle timeout after which a session expires
livy.server.session.timeout = 1h

Example request body (the "jars" entries are dependency jars; in cluster mode they need to be on HDFS, as noted in section 2):

{"name":"test",
"args":["2016-10-10 22:00:00"],
"proxyUser":"shilong",
"className":"com.test.livyJob",
"file":"/opt/jars/etl-livy.jar",
"jars":["/opt/jars/jar/ficus_2.10-1.0.1.jar","/opt/jars/jar/mysql-connector-java-5.1.39.jar"],
"conf":{"driverMemory":"1g","driverCores":1,"executorCores":2,"executorMemory":"3g","numExecutors":2}
}

Request parameters recognized by Livy

(16 known properties: "executorCores", "className", "conf", "driverMemory", "name", "driverCores", "pyFiles", "archives", "queue", "executorMemory", "files", "jars", "proxyUser", "numExecutors", "file" [truncated]]
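A request body can be checked against this list before it is sent. The sketch below is illustrative only: the set contains the 15 names visible in the truncated message plus "args", which the example request in this article uses (the message does not show whether "args" is the truncated entry). The helper name `unknown_keys` is hypothetical.

```python
# The 15 names visible in the truncated message, plus "args" as used in the
# article's example request. The truncated entry itself is not known here.
KNOWN_PROPERTIES = {
    "executorCores", "className", "conf", "driverMemory", "name",
    "driverCores", "pyFiles", "archives", "queue", "executorMemory",
    "files", "jars", "proxyUser", "numExecutors", "file", "args",
}


def unknown_keys(payload):
    """Return the top-level keys of payload that are not in KNOWN_PROPERTIES."""
    return sorted(set(payload) - KNOWN_PROPERTIES)


payload = {
    "file": "/opt/jars/etl-livy.jar",
    "className": "com.test.livyJob",
    "executorMemory": "3g",
    "drivermemory": "1g",  # deliberate typo: wrong letter case
}
print(unknown_keys(payload))
```

Property names are compared case-sensitively, so a misspelling such as `drivermemory` is reported.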