1. Download the 3.1.0 binary release from https://dolphinscheduler.apache.org/zh-cn/download/3.1.0
2. Extract
tar -zxvf apache-dolphinscheduler-3.1.0-bin.tar.gz
mv apache-dolphinscheduler-3.1.0-bin ds_3.1.0    # rename the extracted directory, not the tarball
cd ds_3.1.0
chown -R dolphinscheduler:dolphinscheduler /opt/soft/ds/ds_3.1.0
chmod -R 775 /opt/soft/ds/ds_3.1.0
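The chown above assumes a dolphinscheduler user and group already exist on the host. If they do not, a minimal sketch (the user/group name matches the chown; home directory and other options are my assumption, adjust to your conventions):

```shell
# create a dedicated service user/group for DolphinScheduler (hypothetical defaults)
groupadd dolphinscheduler
useradd -m -g dolphinscheduler dolphinscheduler
```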
3. Configuration
Each of the alert-server, api-server, master-server, standalone-server, tools, and worker-server directories has a conf subdirectory containing three files: application.yaml, common.properties, and dolphinscheduler_env.sh. Add the MySQL and HDFS settings to all of them.
Configure the MySQL connection in application.yaml:
spring:
  config:
    activate:
      on-profile: mysql
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://***.***.**.*:3306/dolphinscheduler?useUnicode=true&characterEncoding=utf-8&useSSL=false
    username: root
    password: *********
This MySQL connection stores DolphinScheduler's own metadata: workflow definitions, logs, file and user information, and so on. The schema lives in /opt/soft/ds/ds_3.1.0/tools/sql/sql/dolphinscheduler_mysql.sql. Create the database first:
mysql -h***.***.**.* -P6033 -uroot -p*********
create database dolphinscheduler;
use dolphinscheduler;
Then run all of the table-creation statements in dolphinscheduler_mysql.sql. Note that the on-profile: mysql block above only takes effect when the mysql profile is active (SPRING_PROFILES_ACTIVE=mysql, set in dolphinscheduler_env.sh below).
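Rather than pasting the statements interactively, the whole schema file can be piped in at once (same placeholder host, port, and password as the session above):

```shell
# load the DolphinScheduler metadata schema into the freshly created database
mysql -h***.***.**.* -P6033 -uroot -p********* dolphinscheduler \
  < /opt/soft/ds/ds_3.1.0/tools/sql/sql/dolphinscheduler_mysql.sql
```

The 3.1.0 distribution also ships tools/bin/upgrade-schema.sh, which applies the schema through the datasource configured in dolphinscheduler_env.sh, if you prefer that route.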
Configure the HDFS resource directories in common.properties:
#user data local directory path, please make sure the directory exists and have read write permissions
data.basedir.path=/opt/soft/ds/ds_3.1.0/data
# resource storage type: HDFS, S3, OSS, NONE
resource.storage.type=HDFS
# resource store on HDFS/S3 path, resource file will store to this base path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
resource.storage.upload.base.path=hdfs://nameservice1/tmp/dolphinscheduler
# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
resource.hdfs.root.user=hdfs
# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
resource.hdfs.fs.defaultFS=hdfs://***.***.***.***:8020
The hdfs://nameservice1 prefix must match the nameservice name defined in the hdfs-site.xml / hive-site.xml files you upload later; with NameNode HA enabled, use the nameservice rather than a single NameNode address.
Next, create the resource directory on HDFS and the local data directory:
hdfs dfs -mkdir -p /tmp/dolphinscheduler/test/resource
hdfs dfs -chmod -R 777 /tmp/dolphinscheduler/test/resource
mkdir /opt/soft/ds/ds_3.1.0/data
chmod -R 775 /opt/soft/ds/ds_3.1.0/data
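A quick write/list/delete round-trip on the new HDFS path catches permission problems before the services start (the probe filename is arbitrary; run it as the user DolphinScheduler will run as):

```shell
# write, list, then remove a probe file under the resource path
hdfs dfs -touchz /tmp/dolphinscheduler/test/resource/_perm_check
hdfs dfs -ls /tmp/dolphinscheduler/test/resource
hdfs dfs -rm /tmp/dolphinscheduler/test/resource/_perm_check
```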
Configure environment variables in dolphinscheduler_env.sh:
# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/usr/java/jdk1.8.0_162}
# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-mysql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
# quote the URL so the shell does not treat the & characters as background operators
export SPRING_DATASOURCE_URL="jdbc:mysql://***.***.**.*:3306/dolphinscheduler?serverTimezone=UTC&useUnicode=true&characterEncoding=utf8&useSSL=false"
export SPRING_DATASOURCE_USERNAME=root
export SPRING_DATASOURCE_PASSWORD=**********
# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}
# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-localhost:2181}
# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/cloudera/parcels/CDH/lib/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/opt/cloudera/parcels/CDH/lib/spark}
export SPARK_HOME2=${SPARK_HOME2:-/opt/cloudera/parcels/SPARK2/lib/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python3/Python-3.6.6}
export HIVE_HOME=${HIVE_HOME:-/opt/cloudera/parcels/CDH/lib/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax/bin/datax.py}
export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH
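After editing, it is worth sourcing the file and probing whether the key tool paths actually exist. A sketch, run from the conf directory that holds the edited copy (the list of tools to probe is my choice):

```shell
# source the env file, then check the binaries it points at
source ./dolphinscheduler_env.sh
for tool in "$JAVA_HOME/bin/java" "$HADOOP_HOME/bin/hadoop" "$HIVE_HOME/bin/hive" "$PYTHON_HOME/bin/python3"; do
  if [ -x "$tool" ]; then echo "OK      $tool"; else echo "MISSING $tool"; fi
done
```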
4. Upload the Hadoop cluster configuration files: copy the corresponding core-site.xml, hdfs-site.xml, and hive-site.xml into each service's conf directory.
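Copying three files into six conf directories by hand is error-prone; a small helper sketch (function name and arguments are mine):

```shell
# copy_hadoop_conf SRC_DIR DS_HOME
# copies core-site.xml / hdfs-site.xml / hive-site.xml from SRC_DIR
# into every service's conf directory under DS_HOME
copy_hadoop_conf() {
  local src=$1 ds_home=$2 svc
  for svc in alert-server api-server master-server standalone-server tools worker-server; do
    cp "$src"/core-site.xml "$src"/hdfs-site.xml "$src"/hive-site.xml "$ds_home/$svc/conf/" || return 1
  done
}
```

For example: copy_hadoop_conf /etc/hadoop/conf /opt/soft/ds/ds_3.1.0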
5. Database driver jars: add the MySQL and Oracle JDBC driver jars to the libs directory of each of alert-server, api-server, master-server, standalone-server, tools, and worker-server. In my case:
mysql-connector-java-8.0.18.jar
Oracle_10g_10.2.0.4_JDBC_ojdbc14.jar
I added the Oracle driver because my downstream schedules use Doris and Oracle.
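The same pattern works for the driver jars (again a sketch, with assumed names):

```shell
# copy_jdbc_jars DS_HOME JAR...
# drops each given jar into every service's libs directory under DS_HOME
copy_jdbc_jars() {
  local ds_home=$1 svc jar
  shift
  for svc in alert-server api-server master-server standalone-server tools worker-server; do
    for jar in "$@"; do cp "$jar" "$ds_home/$svc/libs/" || return 1; done
  done
}
```

For example: copy_jdbc_jars /opt/soft/ds/ds_3.1.0 mysql-connector-java-8.0.18.jar Oracle_10g_10.2.0.4_JDBC_ojdbc14.jar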
6. Start/stop the services (standalone mode here)
# start the Standalone Server
bash ./bin/dolphinscheduler-daemon.sh start standalone-server
# stop the Standalone Server
bash ./bin/dolphinscheduler-daemon.sh stop standalone-server
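To confirm the process actually came up (in 3.1.0 the JVM main class is StandaloneServer; exact log file names may vary, so the tail uses a glob):

```shell
# the standalone JVM should show up in jps
jps | grep StandaloneServer
# watch the logs for startup errors
tail -f standalone-server/logs/*.log
```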
7. Log in
http://localhost:12345/dolphinscheduler/ui
Replace localhost with the IP of the host where DolphinScheduler is installed.
The default username and password are admin / dolphinscheduler123.
When configuring tenants, note that a tenant corresponds to a Linux user on the installation host.
If core-site.xml, hdfs-site.xml, and hive-site.xml were not uploaded, resource creation in the UI may fail. If no MySQL connection is configured, DolphinScheduler falls back to its embedded H2 database, which keeps data in memory, so all data is lost on restart.
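If the UI will not log in, the same credentials can be tried against the login endpoint directly (endpoint and parameter names as I understand the 3.1.0 API; replace localhost with your host):

```shell
# a JSON response with code 0 means the API server and the metadata DB are both healthy
curl -s -X POST "http://localhost:12345/dolphinscheduler/login" \
  -d "userName=admin&userPassword=dolphinscheduler123"
```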