Deploying Presto on CentOS

脚气水蟑螂药

Article link: https://blog.csdn.net/fan_yi_bo/article/details/108793684

Download

Official Presto download page

Upload and extract

tar -zxf presto-server-0.241.tar.gz -C /opt/software/
cd /opt/software
# The directory name is a bit long, so rename it
mv presto-server-0.241 presto-0.241

Installation

Data directory

# Create a data directory; the official docs recommend placing it outside the installation directory
# Presto needs a data directory for storing logs, etc. We recommend creating a data directory outside of the installation directory, which allows it to be easily preserved when upgrading Presto.
mkdir -p /var/log/presto/data

Configuration directory

# Create the configuration directory
# Create an etc directory inside the installation directory
cd presto-0.241/
mkdir etc
# The etc directory holds four configuration files and one directory: node.properties, jvm.config, config.properties, log.properties, and catalog. Their roles are:
# Node Properties: environmental configuration specific to each node
# JVM Config: command line options for the Java Virtual Machine
# Config Properties: configuration for the Presto server
# Log Properties: allows setting the minimum log level for named logger hierarchies
# Catalog Properties: configuration for Connectors (data sources)
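Before starting the server, it can help to confirm that the etc directory has the layout described above. The following is a minimal sketch, not part of the official tooling: the function name check_presto_etc is made up for this example, and it only checks that the entries exist, not that their contents are valid.

```shell
# Report which of the expected Presto config entries are missing under an
# etc directory. check_presto_etc is a hypothetical helper for illustration.
check_presto_etc() {
    local etc_dir=$1 missing=0 f
    # The four configuration files described above
    for f in node.properties jvm.config config.properties log.properties; do
        if [ ! -f "$etc_dir/$f" ]; then
            echo "missing file: $f"
            missing=1
        fi
    done
    # The catalog directory that holds one properties file per connector
    if [ ! -d "$etc_dir/catalog" ]; then
        echo "missing directory: catalog"
        missing=1
    fi
    return "$missing"
}
```

With the paths used in this article it would be called as check_presto_etc /opt/software/presto-0.241/etc; it prints nothing and returns 0 when everything is in place.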

node.properties

Property	Description
node.environment	The name of the environment. All Presto nodes in a cluster must have the same environment name
node.id	The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
node.data-dir	The location (filesystem path) of the data directory. Presto will store logs and other data here.

vim etc/node.properties

node.environment=production
node.id=node1
node.data-dir=/var/log/presto/data
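When several machines are set up, typing node.id by hand invites duplicates. The sketch below writes node.properties with an id derived from the host name, which stays stable across restarts and upgrades. The function name write_node_properties and the "presto-" prefix are assumptions of this example, and replacing characters other than letters, digits, underscores, and hyphens is only a precaution. It also assumes one Presto installation per machine; several installations on the same host would still need distinct ids.

```shell
# Write node.properties for one node. write_node_properties is a
# hypothetical helper; the "presto-" prefix is an arbitrary choice.
write_node_properties() {
    local etc_dir=$1 environment=$2 data_dir=$3 node_id
    # Derive the id from the host name and replace unusual characters
    # (for example the dots of a fully qualified name) with underscores.
    node_id="presto-$(uname -n | sed 's/[^A-Za-z0-9_-]/_/g')"
    cat > "$etc_dir/node.properties" <<EOF
node.environment=$environment
node.id=$node_id
node.data-dir=$data_dir
EOF
}
```

For the setup in this article, the call would be write_node_properties etc production /var/log/presto/data, run from the installation directory.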

jvm.config

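This is the configuration given in the official documentation; adjust it for your own environment, in particular -Xmx, which must fit within the machine's memory.

vim etc/jvm.config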

-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

config.properties

Property	Description
coordinator	Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).
node-scheduler.include-coordinator	Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.
http-server.http.port	Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.
query.max-memory	The maximum amount of distributed memory that a query may use.
query.max-memory-per-node	The maximum amount of user memory that a query may use on any one machine.
query.max-total-memory-per-node	The maximum amount of user and system memory that a query may use on any one machine, where system memory is the memory used during execution by readers, writers, and network buffers, etc.
discovery-server.enabled	Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.
discovery.uri	The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace example.net:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
jmx.rmiregistry.port	Specifies the port for the JMX RMI registry. JMX clients should connect to this port.
jmx.rmiserver.port	Specifies the port for the JMX RMI server. Presto exports many metrics that are useful for monitoring via JMX.

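A node can act as both coordinator and worker, or as only one of the two. For better performance in a cluster, the official docs recommend running the coordinator on a dedicated node.

Minimal coordinator configuration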

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080

Minimal worker configuration

coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery.uri=http://example.net:8080

Coordinator and worker on the same node

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
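The three templates above differ in only a few lines, so they can be produced by one small function. This is a sketch for illustration only: the function name presto_config and the role names coordinator, worker, and single are invented here, the memory values are simply the ones from the templates, and it assumes every node listens on the same HTTP port as the coordinator.

```shell
# Print a minimal config.properties for a role: coordinator, worker, or
# single (both roles on one node). presto_config is a hypothetical helper.
presto_config() {
    local role=$1 host=$2 port=${3:-8080} max_memory=50GB
    case "$role" in
        coordinator)
            echo "coordinator=true"
            echo "node-scheduler.include-coordinator=false" ;;
        worker)
            echo "coordinator=false" ;;
        single)
            echo "coordinator=true"
            echo "node-scheduler.include-coordinator=true"
            max_memory=5GB ;;
        *)
            echo "unknown role: $role" >&2
            return 1 ;;
    esac
    echo "http-server.http.port=$port"
    echo "query.max-memory=$max_memory"
    echo "query.max-memory-per-node=1GB"
    echo "query.max-total-memory-per-node=2GB"
    # Only the coordinator runs the embedded Discovery service
    if [ "$role" != worker ]; then
        echo "discovery-server.enabled=true"
    fi
    # host is the coordinator's host name or IP
    echo "discovery.uri=http://$host:$port"
}
```

For example, presto_config worker example.net > etc/config.properties writes the worker template, and a third argument overrides the default port 8080.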

Configure this to suit your own environment; to keep things simple, I put both roles on a single node.

vim etc/config.properties

coordinator=true
node-scheduler.include-coordinator=true
# The default port is 8080
http-server.http.port=11211
query.max-memory=4GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
# Use the same port as above, and replace the host with your own IP or hostname
discovery.uri=http://example.net:11211
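Changing http-server.http.port without updating discovery.uri is an easy mistake on a coordinator that runs the embedded Discovery service, since both must then use the same port. The sketch below compares the two values in a config.properties file. The function name check_discovery_port is hypothetical, the parsing is deliberately simple (one key=value per line, no surrounding spaces), and the check does not apply to a worker, whose discovery.uri points at the coordinator's port.

```shell
# Compare http-server.http.port with the port in discovery.uri.
# check_discovery_port is a hypothetical helper for illustration.
check_discovery_port() {
    local file=$1 http_port uri uri_port
    http_port=$(sed -n 's/^http-server\.http\.port=//p' "$file")
    # Fall back to the default port when the property is not set
    http_port=${http_port:-8080}
    uri=$(sed -n 's/^discovery\.uri=//p' "$file")
    # Keep everything after the last colon, i.e. the port
    uri_port=${uri##*:}
    if [ "$http_port" = "$uri_port" ]; then
        echo "ok: both use port $http_port"
    else
        echo "mismatch: http port '$http_port', discovery.uri port '$uri_port'"
        return 1
    fi
}
```

Running check_discovery_port etc/config.properties against the file above should report that both use port 11211.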

log.properties

Sets the log level. Valid values are DEBUG, INFO, WARN, and ERROR; the default is INFO.

vim etc/log.properties

com.facebook.presto=INFO

catalog

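Presto accesses data through connectors, and each connector is configured by a properties file in the catalog directory.

# Create the catalog directory
mkdir etc/catalog
# Add the Hive connector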
vim etc/catalog/hive.properties

connector.name=hive-hadoop2
# Metastore host and port; if you do not know the port, check the value of hive.metastore.port (or hive.metastore.uris) in hive-site.xml
hive.metastore.uri=thrift://example.net:9083
# Hadoop configuration files to load
hive.config.resources=/opt/software/hadoop/etc/hadoop/core-site.xml,/opt/software/hadoop/etc/hadoop/hdfs-site.xml
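A wrong path in hive.config.resources is easy to miss, so it can be worth checking that every listed file exists on the node. The following is a minimal sketch under simple assumptions: the function name check_hive_resources is invented for this example, and it expects the property on a single line with comma-separated paths and no surrounding spaces.

```shell
# Check that every file listed in hive.config.resources exists.
# check_hive_resources is a hypothetical helper for illustration.
check_hive_resources() {
    local file=$1 resources path status=0
    resources=$(sed -n 's/^hive\.config\.resources=//p' "$file")
    # Read the comma-separated list one path per line
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        if [ ! -f "$path" ]; then
            echo "not found: $path"
            status=1
        fi
    done <<EOF
$(printf '%s' "$resources" | tr ',' '\n')
EOF
    return "$status"
}
```

Called as check_hive_resources etc/catalog/hive.properties, it prints one line per missing file and returns a non-zero status if any is missing.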

For connector settings, see the official connector documentation.