DolphinScheduler Installation Notes

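I. DolphinScheduler Deployment Overview

1.1 Hardware and Software Requirements

1.1.1 Operating System Requirements

Operating system             Version
Red Hat Enterprise Linux     7.0 or later
CentOS                       7.0 or later
Oracle Enterprise Linux      7.0 or later
Ubuntu LTS                   16.04 or later

1.1.2 Server Hardware Requirements

CPU          Memory     Network
4 cores+     8 GB+      Gigabit NIC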

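1.2 Deployment Modes

DolphinScheduler supports several deployment modes: standalone, pseudo-cluster, and cluster.

1.2.1 Standalone Mode

In standalone mode, all services run inside a single StandaloneServer process, which also embeds the ZooKeeper registry and an H2 database. Only a JDK needs to be configured; DolphinScheduler can then be started with one command for a quick trial of its features.

1.2.2 Pseudo-Cluster Mode

Pseudo-cluster mode deploys all DolphinScheduler services on a single machine: the master, worker, api server, logger server, and other services all run on the same host. ZooKeeper and the database must be installed and configured separately.

1.2.3 Cluster Mode

Cluster mode differs from pseudo-cluster mode in that the DolphinScheduler services are deployed across multiple machines, and multiple Masters and multiple Workers can be configured.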

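II. DolphinScheduler Cluster Mode Deployment

2.1 Cluster Planning

In cluster mode, multiple Masters and multiple Workers can be configured; a typical setup has 2 to 3 Masters and several Workers. Because resources in this cluster are limited, one Master and three Workers are configured here, laid out as follows.
node1    master, worker
node2    worker
node3    worker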

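2.2 Prerequisites

1) All three nodes need a JDK (1.8+) with the related environment variables configured.
2) A database must be deployed; MySQL (5.7+) or PostgreSQL (8.2.15+) is supported.
3) ZooKeeper (3.4.6+) must be deployed.
4) All three nodes need the process management package psmisc installed.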

[linux@node1 ~]$ sudo yum install -y psmisc
[linux@node2 ~]$ sudo yum install -y psmisc
[linux@node3 ~]$ sudo yum install -y psmisc
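
The minimum versions listed above can be checked with a small comparison helper. The sketch below is only an illustration: `version_ge` is a hypothetical helper name, not part of DolphinScheduler, and it compares dotted version strings numerically, field by field. The "installed" versions passed to it here are example inputs, not values read from a real node.

```shell
# Hypothetical helper (not part of DolphinScheduler): succeeds when the
# dotted version in $1 is greater than or equal to the one in $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0                            # all fields are equal
  }'
}

# Minimum versions come from the list above; the first argument of each
# call is an example input only.
version_ge "1.8.0"  "1.8"   && echo "JDK ok"
version_ge "5.7.30" "5.7"   && echo "MySQL ok"
version_ge "3.4.5"  "3.4.6" || echo "ZooKeeper too old"
```

On a real node the first argument would come from the tool itself, for example from the quoted version string that `java -version` writes to standard error.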

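2.3 Extracting the DolphinScheduler Package

1) Upload the DolphinScheduler package to the /opt/software directory on node1.
2) Extract the package to /opt/server.
Note: the extraction directory is not the final installation directory.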

[linux@node1 software]$ tar -zxvf apache-dolphinscheduler-1.3.9-bin.tar.gz -C /opt/server/

3) Rename the extracted directory

[linux@node1 server]$ mv apache-dolphinscheduler-1.3.9-bin dolphinscheduler-bin

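2.4 Initializing the Database

DolphinScheduler stores its metadata in a relational database, so a database and a user must be created for it.
1) Create the database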

mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

2) Create the user

mysql> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';

Note:
If the following error appears, the new user's password is too simple.

ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

Either choose a more complex password or run the following commands to lower MySQL's password policy level.

mysql> set global validate_password_length=4;
mysql> set global validate_password_policy=0;

3) Grant the user the required privileges

mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%';
mysql> flush privileges;
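
Steps 1) to 3) can also be collected into one script and piped to the mysql client. This is a minimal sketch, assuming the same database name, user, and password as above; `ds_init_sql` is a hypothetical helper name, and the root login shown in the trailing comment is an assumption about the local MySQL setup.

```shell
# Hypothetical helper: prints the SQL from steps 1) to 3) for the given
# database name, user name, and password.
ds_init_sql() {
  db="$1"; user="$2"; pass="$3"
  cat <<EOF
CREATE DATABASE ${db} DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER '${user}'@'%' IDENTIFIED BY '${pass}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%';
FLUSH PRIVILEGES;
EOF
}

# Print the statements for review.
ds_init_sql dolphinscheduler dolphinscheduler dolphinscheduler

# To apply them (assumes a root account that logs in with a password):
#   ds_init_sql dolphinscheduler dolphinscheduler dolphinscheduler | mysql -uroot -p
```

Printing first and piping second keeps the statements reviewable before anything touches the database.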

4) Edit the data source configuration file
Change to the conf directory under the DolphinScheduler extraction directory

[linux@node1 ~]$ cd /opt/server/dolphinscheduler-bin/conf/

Edit the datasource.properties file in the conf directory

[linux@node1 conf]$ vim datasource.properties

Set the following properties

spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://node1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
spring.datasource.username=dolphinscheduler
spring.datasource.password=dolphinscheduler
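
The same four properties can be set non-interactively instead of through vim. The following is a sketch rather than a DolphinScheduler tool: `set_prop` is a hypothetical helper that rewrites a `key=value` line in place, or appends it when the key is absent. The demonstration runs against a temporary file, not the real datasource.properties.

```shell
# Hypothetical helper: set KEY=VALUE in a properties file.
# Usage: set_prop KEY VALUE FILE
set_prop() {
  sp_tmp="$(mktemp)" || return 1
  awk -v k="$1" -v v="$2" '
    index($0, k "=") == 1 { print k "=" v; seen = 1; next }  # replace an existing key
    { print }                                                # keep every other line
    END { if (!seen) print k "=" v }                         # append when the key is missing
  ' "$3" > "$sp_tmp" && cat "$sp_tmp" > "$3"
  sp_rc=$?
  rm -f "$sp_tmp"
  return $sp_rc
}

# Demonstration on a scratch file.
demo="$(mktemp)"
printf '%s\n' 'spring.datasource.username=test' > "$demo"
set_prop spring.datasource.driver-class-name com.mysql.jdbc.Driver "$demo"
set_prop spring.datasource.url 'jdbc:mysql://node1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8' "$demo"
set_prop spring.datasource.username dolphinscheduler "$demo"
set_prop spring.datasource.password dolphinscheduler "$demo"
cat "$demo"
rm -f "$demo"
```

awk is used instead of sed here because the JDBC URL contains `&`, which sed would treat as a special character in a replacement string.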

5) Copy the MySQL driver into the lib directory under the DolphinScheduler extraction directory

[linux@node1 conf]$ cp /opt/software/mysql/mysql-connector-java-5.1.27-bin.jar /opt/server/dolphinscheduler-bin/lib/

6) Run the database initialization script

[linux@node1 dolphinscheduler-bin]$ sh script/create-dolphinscheduler.sh 

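2.5 Configuring the One-Click Deployment Script

Edit the install_config.conf file in the conf/config directory under the extraction directory

[linux@node1 dolphinscheduler-bin]$ cd conf/config/
[linux@node1 config]$ vim install_config.conf

Set its contents as follows

# postgresql or mysql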
dbtype="mysql"

# db config

# db username
username="dolphinscheduler"

# database name
dbname="dolphinscheduler"

# db password
# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
password="dolphinscheduler"

# zk cluster
zkQuorum="node1:2181,node2:2181,node3:2181"

# Note: the target installation path for dolphinscheduler; do not set it to the current path (pwd)
installPath="/opt/server/dolphinscheduler"

# deployment user
deployUser="linux"


# alert config
# mail server host
mailServerHost="smtp.exmail.qq.com"

# mail server port
mailServerPort="25"


# user
mailUser="xxxxxxxxxx"

# sender password
# note: The mail.passwd is email service authorization code, not the email login password.
mailPassword="xxxxxxxxxx"

# TLS mail protocol support
starttlsEnable="true"

# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"


# user data local directory path, please make sure the directory exists and has read and write permissions
dataBasedirPath="/tmp/dolphinscheduler"

# resource storage type: HDFS, S3, NONE
resourceStorageType="HDFS"

resourceUploadPath="/dolphinscheduler"

# if S3, write the S3 address, for example: s3a://dolphinscheduler
# Note: for S3, be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://node1:9820"

# if resourceStorageType is S3, the following three settings are required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# resourcemanager port, the default value is 8088 if not specified
resourceManagerHttpAddressPort="8088"

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarnHaIps=""

singleYarnIp="node2"

# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="linux"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# kerberos expire time, the unit is hour
kerberosExpireTime="2"

# api server port
apiServerPort="12345"


# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="node1,node2,node3"

# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"

# run master machine
# Note: list of hosts hostname for deploying master
masters="node1"

# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="node1:default,node2:default,node3:default"

# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="node1"

# run api machine
# note: list of machine hostnames for deploying api server
apiServers="node1"
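
A quick consistency check on the host lists can catch typos before the installer runs. The sketch below assumes that every host named in masters, workers, alertServer, and apiServers is meant to appear in ips as well; `hosts_in_ips` is a hypothetical helper name, and the values are copied from the configuration above.

```shell
# Hypothetical helper: succeeds when every host in the comma-separated
# list $1 (optionally written as host:group) also appears in list $2.
hosts_in_ips() {
  hi_ips="$2"
  for hi_entry in $(printf '%s' "$1" | tr ',' ' '); do
    hi_host="${hi_entry%%:*}"          # drop the ":group" suffix, if any
    case ",$hi_ips," in
      *",$hi_host,"*) ;;               # host is listed in ips
      *) echo "not in ips: $hi_host" >&2; return 1 ;;
    esac
  done
  return 0
}

# Values copied from the configuration above.
ips="node1,node2,node3"
masters="node1"
workers="node1:default,node2:default,node3:default"

hosts_in_ips "$masters" "$ips" && echo "masters ok"
hosts_in_ips "$workers" "$ips" && echo "workers ok"
```

Since install_config.conf uses plain shell assignments, the variables could instead be loaded from the real file with `. ./install_config.conf` before calling the helper.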

2.6 One-Click Deployment of DolphinScheduler

1) Start the ZooKeeper cluster and the Hadoop cluster

[linux@node1 dolphinscheduler-bin]$ zookeeper.sh start
[linux@node1 dolphinscheduler-bin]$ cluster.sh start

2) Deploy and start DolphinScheduler with one command

[linux@node1 dolphinscheduler-bin]$ ./install.sh 

3) Check the DolphinScheduler processes

==========node1============
3009 NodeManager
3187 JobHistoryServer
2708 DataNode
3750 MasterServer
3798 WorkerServer
3846 LoggerServer
3975 Jps
2344 QuorumPeerMain
2556 NameNode
3900 AlertServer
3949 ApiApplicationServer
==========node2============
2432 NodeManager
3072 LoggerServer
2018 QuorumPeerMain
3028 WorkerServer
2118 DataNode
3110 Jps
2300 ResourceManager
==========node3============
2130 SecondaryNameNode
2227 NodeManager
2643 LoggerServer
2599 WorkerServer
2682 Jps
2027 DataNode
1933 QuorumPeerMain
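
In the listing above, node1 shows MasterServer, WorkerServer, LoggerServer, AlertServer, and ApiApplicationServer, while node2 and node3 show WorkerServer and LoggerServer. A sketch of an automated check follows; `missing_procs` is a hypothetical helper that reads jps-style output and prints any expected process name that is absent.

```shell
# Hypothetical helper: $1 is jps-style output ("PID Name" per line);
# the remaining arguments are process names expected to be present.
# Prints each expected name that does not appear in the output.
missing_procs() {
  mp_out="$1"; shift
  for mp_name in "$@"; do
    printf '%s\n' "$mp_out" |
      awk -v n="$mp_name" '$2 == n { found = 1 } END { exit !found }' ||
      echo "$mp_name"
  done
}

# Sample input: the node2 listing shown above.
node2_jps='2432 NodeManager
3072 LoggerServer
2018 QuorumPeerMain
3028 WorkerServer
2118 DataNode
3110 Jps
2300 ResourceManager'

missing_procs "$node2_jps" WorkerServer LoggerServer
missing_procs "$node2_jps" MasterServer
```

On a live node the first argument would be `"$(jps)"` rather than a pasted sample.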

4) Access the DolphinScheduler UI
The DolphinScheduler UI address is

http://node1:12345/dolphinscheduler

The initial user name is admin and the password is dolphinscheduler123.
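
The api server may take a little while to start answering after the installer finishes. The following sketch polls the UI address until it responds. Both function names are hypothetical, the retry count and delay are arbitrary choices, and `wait_for_ui` needs curl plus network access to node1, so its call is left commented out.

```shell
# Hypothetical helper: builds the UI address from the api server host
# and port (the port defaults to 12345, the apiServerPort set earlier).
ds_ui_url() {
  printf 'http://%s:%s/dolphinscheduler\n' "$1" "${2:-12345}"
}

# Hypothetical helper: polls the UI address until curl receives an HTTP
# response or the attempts run out. Arguments: host, port, attempts.
wait_for_ui() {
  wu_url="$(ds_ui_url "$1" "$2")"
  wu_tries="${3:-30}"
  while [ "$wu_tries" -gt 0 ]; do
    if curl -s -o /dev/null --max-time 5 "$wu_url"; then
      echo "UI is answering at $wu_url"
      return 0
    fi
    wu_tries=$((wu_tries - 1))
    sleep 2
  done
  echo "no response from $wu_url" >&2
  return 1
}

ds_ui_url node1 12345
# wait_for_ui node1 12345    # uncomment on a machine that can reach node1
```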

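2.7 DolphinScheduler Start/Stop Commands

The DolphinScheduler start and stop scripts are all located in the bin directory of its installation directory.

1) Start or stop all services with one command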

./bin/start-all.sh
./bin/stop-all.sh

Take care to distinguish these from Hadoop's start/stop scripts, which have the same names.

2) Start/stop the Master

./bin/dolphinscheduler-daemon.sh start master-server
./bin/dolphinscheduler-daemon.sh stop master-server

3) Start/stop the Worker

./bin/dolphinscheduler-daemon.sh start worker-server
./bin/dolphinscheduler-daemon.sh stop worker-server

4) Start/stop the Api server

./bin/dolphinscheduler-daemon.sh start api-server
./bin/dolphinscheduler-daemon.sh stop api-server

5) Start/stop the Logger

./bin/dolphinscheduler-daemon.sh start logger-server
./bin/dolphinscheduler-daemon.sh stop logger-server

6) Start/stop the Alert server

./bin/dolphinscheduler-daemon.sh start alert-server
./bin/dolphinscheduler-daemon.sh stop alert-server
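
The per-service commands above all follow one pattern, so a thin wrapper can guard against mistyped service names. This is a sketch under stated assumptions: `ds_daemon_cmd` and `ds_restart_cmds` are hypothetical names, the accepted services are only the five listed in this section, and the functions print the command lines instead of running them.

```shell
# Hypothetical wrapper around bin/dolphinscheduler-daemon.sh: validates
# the action and service name, then prints the command line to run.
ds_daemon_cmd() {
  case "$1" in
    start|stop) ;;
    *) echo "unknown action: $1" >&2; return 1 ;;
  esac
  case "$2" in
    master-server|worker-server|api-server|logger-server|alert-server) ;;
    *) echo "unknown service: $2" >&2; return 1 ;;
  esac
  echo "./bin/dolphinscheduler-daemon.sh $1 $2"
}

# Hypothetical helper: prints the stop and start commands for a service.
ds_restart_cmds() {
  ds_daemon_cmd stop "$1" && ds_daemon_cmd start "$1"
}

ds_restart_cmds worker-server
```

The printed lines can be reviewed and then run from the installation directory, for example by piping them to sh.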

2.8 Starting Standalone DolphinScheduler

Stop the cluster-mode DolphinScheduler services and ZooKeeper, then run the following in the installation directory

[linux@node1 dolphinscheduler]$ bin/dolphinscheduler-daemon.sh start standalone-server