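Installation
Prerequisites: Hadoop (HDFS and YARN) and Spark must already be installed, with their environment variables configured.
1. Download the Livy package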
cd /opt
wget https://dlcdn.apache.org/incubator/livy/0.7.1-incubating/apache-livy-0.7.1-incubating-bin.zip
2. Unpack the archive
unzip apache-livy-0.7.1-incubating-bin.zip
3. Configure
- Edit livy-env.sh in the conf directory (adjust the paths below to your machine)
JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home
HADOOP_CONF_DIR=/Users/xxx/Documents/software/hadoop-3.3.1/etc/hadoop
SPARK_HOME=/Users/xxx/Documents/software/spark-3.2.1
SPARK_CONF_DIR=/Users/xxx/Documents/software/spark-3.2.1/conf
- Configure livy.conf
# Spark master used by Livy sessions
livy.spark.master = yarn
# Spark deploy mode used by Livy sessions
livy.spark.deploy-mode = cluster
# Use HiveContext by default
livy.repl.enable-hive-context = true
# Enable user impersonation
livy.impersonation.enabled = true
# Idle timeout after which a session expires
livy.server.session.timeout = 1h
# Thrift server
livy.server.thrift.enabled = true
livy.server.thrift.port = 10002
# Session recovery
livy.server.recovery.mode = recovery
livy.server.recovery.state-store = filesystem
livy.server.recovery.state-store.url = hdfs://10.253.128.30:9000/livy/
- Configure log4j
cp log4j.properties.template log4j.properties
- Copy jersey-core-1.9.jar into the jars directory
4. Start Livy
# Change to the Livy directory
cd /opt/apache-livy-0.7.1-incubating-bin
bin/livy-server start
Access the Livy UI
curl http://ip:8998/ui
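The server may need a few seconds after `bin/livy-server start` before it accepts requests. The following is a minimal readiness probe, assuming the default port 8998 and relying on `GET /sessions` returning a JSON object once Livy is up. The function names `is_livy_up` and `wait_until_up`, the retry counts, and the injectable `fetch` parameter are illustrative choices, not part of Livy.

```python
import json
import time
import urllib.error
import urllib.request


def is_livy_up(base_url, fetch=None, timeout=5):
    """Return True if GET {base_url}/sessions answers with a JSON object."""
    def default_fetch(url):
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")

    fetch = fetch or default_fetch
    try:
        body = fetch(base_url.rstrip("/") + "/sessions")
        return isinstance(json.loads(body), dict)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: not ready yet.
        return False


def wait_until_up(base_url, attempts=10, delay=2.0, fetch=None, sleep=time.sleep):
    """Poll until the server answers or the attempts run out."""
    for attempt in range(attempts):
        if is_livy_up(base_url, fetch=fetch):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_up("http://ip:8998")` with the real host in place of `ip` returns `True` as soon as the REST endpoint responds, and `False` if it never does within the given attempts.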
Livy configuration options
Option | Default | Description |
---|---|---|
livy.server.spark-home | Spark installation directory | |
livy.spark.master | | |
livy.spark.deploy-mode | | |
livy.spark.scala-version | | |
livy.spark.version | | |
livy.session.staging-dir | | |
livy.file.upload.max.size | | |
livy.file.local-dir-whitelist | | |
livy.repl.enable-hive-context | | |
livy.environment | | |
livy.server.host | | |
livy.server.port | 8998 | |
livy.ui.basePath | | |
livy.ui.enabled | | |
livy.server.request-header.size | 131072 | |
livy.server.response-header.size | 131072 | |
livy.server.csrf-protection.enabled | false | |
livy.impersonation.enabled | false | |
livy.superusers | null | |
livy.server.access-control.enabled | false | |
livy.server.access-control.allowed-users | * | |
livy.server.access-control.modify-users | null | |
livy.server.access-control.view-users | null | |
livy.keystore | | |
livy.keystore.password | | |
livy.key-password | | |
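The option names in the table mix dots and hyphens (for example `livy.spark.deploy-mode`), which makes them easy to mistype in livy.conf. As a minimal sketch, the script below parses `key = value` lines and flags any key that differs from a known key only in `.` versus `-`. The helper names `parse_conf` and `suspicious_keys`, the short `KNOWN_KEYS` list (copied from the table), and the sample text are illustrative assumptions; this is not a Livy tool.

```python
def parse_conf(text):
    """Parse `key = value` lines, skipping blank lines and `#` comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def _normalize(key):
    # Treat '-' and '.' as the same separator for comparison purposes.
    return key.replace("-", ".")


def suspicious_keys(settings, known_keys):
    """Map each key that differs from a known key only in '.' vs '-' to that key."""
    by_shape = {_normalize(k): k for k in known_keys}
    hints = {}
    for key in settings:
        match = by_shape.get(_normalize(key))
        if match is not None and match != key:
            hints[key] = match
    return hints


# A few option names copied from the table above.
KNOWN_KEYS = [
    "livy.spark.master",
    "livy.spark.deploy-mode",
    "livy.repl.enable-hive-context",
    "livy.impersonation.enabled",
]

sample = """
# deploy mode for Livy sessions
livy.spark.master = yarn
livy.spark.deploy.mode = cluster
livy.repl.enable-hive-context = true
"""

print(suspicious_keys(parse_conf(sample), KNOWN_KEYS))
# → {'livy.spark.deploy.mode': 'livy.spark.deploy-mode'}
```

Keys that are absent from `KNOWN_KEYS` are not reported, so the check only catches separator mix-ups rather than validating the whole file.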
Using Livy
livy-session
A Livy session runs spark-shell code through the REST API and is meant for interactive requests.
- Create a session
curl -XPOST 'http://10.253.128.30:8998/sessions' -H 'Content-Type:application/json' --data '{"kind": "spark"}'
- View sessions
http://10.253.128.30:8998/ui
- Use a session
curl -XPOST 'http://10.253.128.30:8998/sessions/2/statements' -H 'Content-Type:application/json' -d '{"code": "sc.textFile(\"\")"}'
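Note: a statement is executed only once the session has reached the idle state. While the statement runs, the session state is busy; when it finishes, the state returns to idle.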
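The create, wait for idle, submit, and wait for result cycle described above can be sketched as a small client. This is a minimal sketch using only the Python standard library and the documented REST endpoints (`POST /sessions`, `GET /sessions/{id}`, `POST /sessions/{id}/statements`, `GET /sessions/{id}/statements/{id}`). The helper names `http_json`, `wait_for`, and `run_statement`, the polling interval, and the attempt limit are assumptions made for illustration.

```python
import json
import time
import urllib.request


def http_json(method, url, payload=None):
    """Send a JSON request with the standard library and decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for(fetch_state, wanted, failed, attempts=60, delay=2.0, sleep=time.sleep):
    """Poll fetch_state() until it returns a state in `wanted`.

    Raises RuntimeError on a state in `failed` or when the attempts run out.
    """
    state = None
    for _ in range(attempts):
        state = fetch_state()
        if state in wanted:
            return state
        if state in failed:
            raise RuntimeError("reached state %r" % state)
        sleep(delay)
    raise RuntimeError("gave up while in state %r" % state)


def run_statement(base_url, code, call=http_json, sleep=time.sleep):
    """Create a spark session, wait for idle, run one statement, return its output."""
    session = call("POST", base_url + "/sessions", {"kind": "spark"})
    session_url = "%s/sessions/%s" % (base_url, session["id"])
    # The session must be idle before a statement will execute.
    wait_for(lambda: call("GET", session_url)["state"],
             wanted={"idle"}, failed={"error", "dead", "killed"}, sleep=sleep)

    stmt = call("POST", session_url + "/statements", {"code": code})
    stmt_url = "%s/statements/%s" % (session_url, stmt["id"])
    # A statement is finished once its own state is "available".
    wait_for(lambda: call("GET", stmt_url)["state"],
             wanted={"available"}, failed={"error", "cancelled"}, sleep=sleep)
    return call("GET", stmt_url)["output"]
```

For example, `run_statement("http://10.253.128.30:8998", "1 + 1")` returns the statement's `output` object. The sketch leaves the session running; sending `DELETE /sessions/{id}` closes it.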
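livy-batch
A Livy batch handles non-interactive requests; it is the equivalent of a spark-submit.
Example (fill in the empty placeholder values before running; executorCores takes an integer):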
curl -XPOST -H 'Content-Type:application/json' http://10.253.128.30:8998/batches --data '{"conf": {"spark.master": "yarn-cluster"}, "file": "hdfs://", "className":"", "name":"", "executorCores": "","executorMemory":"512m", "driverCores": 1, "driverMemory":"512m", "queue":"default","args":["100"] }'
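Hand-written JSON for `POST /batches` is easy to get wrong, for instance by quoting a numeric field or escaping quotes inside a single-quoted shell string. The following is a minimal sketch of building the same request body in Python. The function name `build_batch_payload`, its defaults (including `executor_cores=1`), and the jar path and class name in the example are hypothetical and chosen only for illustration.

```python
import json


def build_batch_payload(file, class_name, args=(), name=None, queue="default",
                        driver_memory="512m", driver_cores=1,
                        executor_memory="512m", executor_cores=1, conf=None):
    """Build the JSON body for POST /batches, keeping numeric fields as ints."""
    if not file:
        raise ValueError("file is required")
    payload = {
        "file": file,
        "className": class_name,
        "args": [str(a) for a in args],  # args is sent as a list of strings
        "queue": queue,
        "driverMemory": driver_memory,
        "driverCores": int(driver_cores),
        "executorMemory": executor_memory,
        "executorCores": int(executor_cores),
    }
    if name:
        payload["name"] = name
    if conf:
        payload["conf"] = dict(conf)
    return payload


# Hypothetical jar path and class name, for illustration only.
body = build_batch_payload(
    file="hdfs:///path/to/app.jar",
    class_name="com.example.Main",
    args=[100],
)
print(json.dumps(body, indent=2))
```

The resulting string can be passed to curl with `--data`, or posted to `/batches` from Python. The reply contains the batch `id`, and `GET /batches/{id}/state` reports its progress.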