Download
Upload and extract
tar -zxf presto-server-0.241.tar.gz -C /opt/software/
cd /opt/software
# The directory name is a bit long, so rename it
mv presto-server-0.241 presto-0.241
Install
Data directory
# Create a data directory; the official docs recommend keeping it outside the installation directory
# Presto needs a data directory for storing logs, etc. We recommend creating a data directory outside of the installation directory, which allows it to be easily preserved when upgrading Presto.
mkdir -p /var/log/presto/data
Configuration directory
# Create the configuration directory
# Create an etc directory inside the installation directory
cd presto-0.241/
mkdir etc
# The configuration directory holds four configuration files and one directory: node.properties, jvm.config, config.properties, log.properties, and catalog. Their roles are:
# Node Properties: environmental configuration specific to each node
# JVM Config: command line options for the Java Virtual Machine
# Config Properties: configuration for the Presto server
# Log Properties: allows setting the minimum log level for named logger hierarchies
# Catalog Properties: configuration for Connectors (data sources)
node.properties
Property | Description |
---|---|
node.environment | The name of the environment. All Presto nodes in a cluster must have the same environment name |
node.id | The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier. |
node.data-dir | The location (filesystem path) of the data directory. Presto will store logs and other data here. |
vim etc/node.properties
node.environment=production
node.id=node1
node.data-dir=/var/log/presto/data
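Because node.id must be unique per node and stay the same across restarts and upgrades, it can help to generate it once and never overwrite it. The following is a minimal sketch of that idea; `write_node_properties` is a hypothetical helper (not part of Presto), and it assumes `uuidgen` or the Linux `/proc` UUID source is available.

```shell
#!/usr/bin/env bash
# Sketch: create etc/node.properties with a generated node.id.
# write_node_properties is a hypothetical helper, not something Presto ships.
write_node_properties() {
  local etc_dir="$1" environment="$2" data_dir="$3"
  local file="$etc_dir/node.properties"
  local node_id

  # Never regenerate: node.id should stay stable across restarts and upgrades.
  if [ -f "$file" ]; then
    echo "$file already exists, leaving it untouched" >&2
    return 0
  fi

  # Pick a unique identifier from whatever source is available.
  if command -v uuidgen >/dev/null 2>&1; then
    node_id="$(uuidgen)"
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    node_id="$(cat /proc/sys/kernel/random/uuid)"
  else
    node_id="$(hostname)-$(date +%s)"
  fi

  mkdir -p "$etc_dir"
  cat > "$file" <<EOF
node.environment=$environment
node.id=$node_id
node.data-dir=$data_dir
EOF
}

# Example call (run from the installation directory):
#   write_node_properties etc production /var/log/presto/data
```

A hand-picked value such as node1 works just as well on a single machine; a generated ID mainly saves you from collisions when the same script is rolled out to many nodes.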
jvm.config
This is the configuration given in the official docs; adjust it to your own environment
vim etc/jvm.config
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
config.properties
Property | Description |
---|---|
coordinator | Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution). |
node-scheduler.include-coordinator | Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution. |
http-server.http.port | Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external. |
query.max-memory | The maximum amount of distributed memory that a query may use. |
query.max-memory-per-node | The maximum amount of user memory that a query may use on any one machine. |
query.max-total-memory-per-node | The maximum amount of user and system memory that a query may use on any one machine, where system memory is the memory used during execution by readers, writers, and network buffers, etc. |
discovery-server.enabled | Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port. |
discovery.uri | The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace example.net:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash. |
jmx.rmiregistry.port | Specifies the port for the JMX RMI registry. JMX clients should connect to this port. |
jmx.rmiserver.port | Specifies the port for the JMX RMI server. Presto exports many metrics that are useful for monitoring via JMX. |
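A node can act as both coordinator and worker, or as only one of the two. For better performance in cluster mode, the official docs recommend deploying the coordinator on its own node.
Minimal coordinator configuration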
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
Minimal worker configuration
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery.uri=http://example.net:8080
Coordinator and worker on the same node
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
Configure this for your own environment; I took the lazy route and put everything on a single node
vim etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
# The default port is 8080
http-server.http.port=11211
query.max-memory=4GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
# Use the same port as above and replace the host with your own IP
discovery.uri=http://example.net:11211
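Two details in this file are easy to get wrong: discovery.uri must not end in a slash, and on a node running the embedded Discovery service its port has to match http-server.http.port. The sketch below checks both before you start the server. `get_prop` and `check_discovery_uri` are hypothetical helper names, and the port comparison only makes sense on the coordinator (a worker's discovery.uri points at the coordinator's port, not its own).

```shell
#!/usr/bin/env bash
# Sketch: sanity-check discovery.uri in a coordinator's config.properties.
# get_prop and check_discovery_uri are hypothetical helpers, not Presto tools.

# Print the value of an exact key from a .properties file (last match wins).
get_prop() {
  awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); value = $0 } END { print value }' "$1"
}

check_discovery_uri() {
  local file="$1" port uri
  port="$(get_prop "$file" http-server.http.port)"
  uri="$(get_prop "$file" discovery.uri)"
  port="${port:-8080}"  # fall back to the default port when none is set

  # The URI must not end in a slash.
  case "$uri" in
    */) echo "discovery.uri must not end in a slash: $uri" >&2; return 1 ;;
  esac

  # With embedded discovery, the URI port must equal the HTTP server port.
  case "$uri" in
    *":$port") return 0 ;;
    *) echo "discovery.uri ($uri) does not use port $port" >&2; return 1 ;;
  esac
}

# Example call (run from the installation directory):
#   check_discovery_uri etc/config.properties && echo "config looks consistent"
```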
log.properties
Sets the log level. Valid levels are DEBUG, INFO, WARN, and ERROR; the default is INFO
vim etc/log.properties
com.facebook.presto=INFO
catalog
Presto accesses data through connectors, and each connector is configured by a properties file in the catalog directory
# Create the catalog directory
mkdir etc/catalog
# Add the Hive connector
vim etc/catalog/hive.properties
connector.name=hive-hadoop2
# Metastore host and port; if you do not know the port, check hive.metastore.port (or the port in hive.metastore.uris) in hive-site.xml
hive.metastore.uri=thrift://example.net:9083
# Point to the Hadoop configuration files
hive.config.resources=/opt/software/hadoop/etc/hadoop/core-site.xml,/opt/software/hadoop/etc/hadoop/hdfs-site.xml
For other connector settings, see the official connector documentation
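A wrong path in hive.config.resources is a common reason the Hive catalog fails to load, so it can be worth checking that every listed file is readable before starting Presto. This is a minimal sketch; `check_hive_resources` is a hypothetical helper, and it assumes the paths are separated by plain commas with no surrounding spaces, as in the example above.

```shell
#!/usr/bin/env bash
# Sketch: verify that every file listed in hive.config.resources is readable.
# check_hive_resources is a hypothetical helper, not part of Presto.
check_hive_resources() {
  local file="$1" resources path missing=0

  # Extract the comma-separated value of hive.config.resources.
  resources="$(awk -F= '$1 == "hive.config.resources" { sub(/^[^=]*=/, ""); v = $0 } END { print v }' "$file")"

  # Split on commas only while iterating over the paths.
  local IFS=','
  for path in $resources; do
    if [ ! -r "$path" ]; then
      echo "missing or unreadable: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (run from the installation directory):
#   check_hive_resources etc/catalog/hive.properties
```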
Start
The exciting moment has arrived; this is the make-or-break step
Start as a daemon
bin/launcher start
Start in the foreground (for debugging)
bin/launcher run
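Once the server has started, open the address you configured as discovery.uri in a browser. If startup fails, look in the log directory under the data directory (typically var/log/server.log) to find the cause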
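Instead of refreshing the browser, you can poll the server from the shell until it reports that it has finished starting. The sketch below assumes the server exposes a `/v1/info` endpoint whose JSON body contains a `starting` field, and that `curl` is installed; `is_ready` and `wait_for_presto` are hypothetical helper names, and the retry count and two-second pause are arbitrary choices.

```shell
#!/usr/bin/env bash
# Sketch: wait until the Presto server reports that startup has finished.
# is_ready and wait_for_presto are hypothetical helpers, not Presto tools.

# Read a /v1/info JSON body on stdin; succeed if it says "starting": false.
is_ready() {
  grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'
}

# Poll BASE_URL/v1/info up to ATTEMPTS times (default 30), 2 seconds apart.
wait_for_presto() {
  local base_url="$1" attempts="${2:-30}" i
  for ((i = 0; i < attempts; i++)); do
    if curl -fsS "$base_url/v1/info" 2>/dev/null | is_ready; then
      return 0
    fi
    sleep 2
  done
  echo "Presto did not become ready at $base_url" >&2
  return 1
}

# Example call, using the same address as discovery.uri:
#   bin/launcher start && wait_for_presto http://example.net:11211
```

If the wait times out, `bin/launcher status` tells you whether the process is still running at all, which narrows down where to look in the logs.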