Dolphinscheduler单机搭建(从零开始)


前言

DolphinScheduler是一个分布式易用的大数据工作流调度系统,提供了可视化的web操作界面,帮助用户快速、高效地构建和调度大数据任务;支持分布式部署和单机部署两种方式。单机部署适用于小规模使用场景,可以在一台机器上快速搭建并运行。


本文将介绍如何在单机上部署DolphinScheduler

一、准备工作

在开始部署之前,确保你已经安装了以下软件和环境:

  • Java 8或更高版本
  • MySQL数据库
  • Hadoop和Hive(可选,用于执行Hadoop和Hive任务)
  • Centos服务器 (获取方法) (使用的链接工具是XShell)

安装包在下面链接中,这里给出了两个,按方便的来下,也可以自行去官网下载

如果没有安装,就按顺序操作,安装了可以直接到DolphinScheduler

二、安装JDK和Mysql

由于之前已经写过了,这里就直接放链接了

三、安装DolphinScheduler

步骤一:下载和解压DolphinScheduler

  1. DolphinScheduler官网下载或是前面网盘中下载
wget --no-check-certificate https://archive.apache.org/dist/dolphinscheduler/3.1.4/apache-dolphinscheduler-3.1.4-bin.tar.gz 
  1. 上传压缩包到服务器的/tmp/下,如果直接拖动上传不了可以下载lrzsz重试
cd /tmp

在这里插入图片描述

  1. 解压下载的压缩包到指定文件夹
mkdir -p /opt/soft/dolphinscheduler
tar -zxvf /tmp/apache-dolphinscheduler-3.1.4-bin.tar.gz -C /opt/soft/dolphinscheduler

步骤二:创建数据库

  1. 登录到MySQL数据库(这里输入密码不会显示,直接输就行)
msyql -u root -p

在这里插入图片描述
5. 创建一个新的数据库,并指定字符集和校对规则。例如,执行以下命令创建名为dolphinscheduler的数据库:

CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

在这里插入图片描述

6.创建 dolphinscheduler 用户并授予操作 dolphinscheduler 数据库的权限(设置密码为dolphinscheduler)

CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';

授予操作 dolphinscheduler 数据库的权限并刷新权限

GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';
flush privileges;

在这里插入图片描述
7.完成后可以退出Mysql使用dolphinscheduler 用户登录测试

mysql -u dolphinscheduler -p

在这里插入图片描述

步骤三:修改配置文件

  1. 修改 dolphinscheduler_env.sh 文件
vim /opt/soft/dolphinscheduler/apache-dolphinscheduler-3.1.4-bin/bin/env/dolphinscheduler_env.sh

内容如下:

# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/root/jdk1.8.0_11}   # 修改为自己的jdk安装目录

# 修改MySQL配置
# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-mysql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://localhost:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false"
export SPRING_DATASOURCE_USERNAME="dolphinscheduler"
export SPRING_DATASOURCE_PASSWORD="dolphinscheduler"

# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}

# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-localhost:2181}

# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/opt/soft/spark1}
export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH

9.修改 application.yaml 文件

vim /opt/soft/dolphinscheduler/apache-dolphinscheduler-3.1.4-bin/standalone-server/conf/application.yaml

主要修改的位置有两处:前面的datasource和后面datasource,这两处都改成如下即可

datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
    username: dolphinscheduler
    password: dolphinscheduler

可以在vim的命令下输入 /datasource ,回车便会跳转,按 n 可以查看下一个
在这里插入图片描述
在这里插入图片描述
修改后的内容如下所示

spring:
  jackson:
    time-zone: UTC
    date-format: "yyyy-MM-dd HH:mm:ss"
  banner:
    charset: UTF-8
  cache:
    # default enable cache, you can disable by `type: none`
    type: none
    cache-names:
      - tenant
      - user
      - processDefinition
      - processTaskRelation
      - taskDefinition
    caffeine:
      spec: maximumSize=100,expireAfterWrite=300s,recordStats
  sql:
    init:
      schema-locations: classpath:sql/dolphinscheduler_h2.sql
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
    username: dolphinscheduler
    password: dolphinscheduler
  quartz:
    job-store-type: jdbc
    jdbc:
      initialize-schema: never
    properties:
      org.quartz.threadPool:threadPriority: 5
      org.quartz.jobStore.isClustered: true
      org.quartz.jobStore.class: org.springframework.scheduling.quartz.LocalDataSourceJobStore
      org.quartz.scheduler.instanceId: AUTO
      org.quartz.jobStore.tablePrefix: QRTZ_
      org.quartz.jobStore.acquireTriggersWithinLock: true
      org.quartz.scheduler.instanceName: DolphinScheduler
      org.quartz.threadPool.class: org.quartz.simpl.SimpleThreadPool
      org.quartz.jobStore.useProperties: false
      org.quartz.threadPool.makeThreadsDaemons: true
      org.quartz.threadPool.threadCount: 25
      org.quartz.jobStore.misfireThreshold: 60000
      org.quartz.scheduler.makeSchedulerThreadDaemon: true
      org.quartz.jobStore.driverDelegateClass: org.quartz.impl.jdbcjobstore.StdJDBCDelegate
      org.quartz.jobStore.clusterCheckinInterval: 5000
  servlet:
    multipart:
      max-file-size: 1024MB
      max-request-size: 1024MB
  messages:
    basename: i18n/messages
  jpa:
    hibernate:
      ddl-auto: none
  mvc:
    pathmatch:
      matching-strategy: ANT_PATH_MATCHER

registry:
  type: zookeeper
  zookeeper:
    namespace: dolphinscheduler
    connect-string: localhost:2181
    retry-policy:
      base-sleep-time: 60ms
      max-sleep: 300ms
      max-retries: 5
    session-timeout: 30s
    connection-timeout: 9s
    block-until-connected: 600ms
    digest: ~

security:
  authentication:
    # Authentication types (supported types: PASSWORD,LDAP)
    type: PASSWORD
    # IF you set type `LDAP`, below config will be effective
    ldap:
      # ldap server config
      urls: ldap://ldap.forumsys.com:389/
      base-dn: dc=example,dc=com
      username: cn=read-only-admin,dc=example,dc=com
      password: password
      user:
        # admin userId when you use LDAP login
        admin: read-only-admin
        identity-attribute: uid
        email-attribute: mail
        # action when ldap user is not exist (supported types: CREATE,DENY)
        not-exist-action: CREATE

# Traffic control, if you turn on this config, the maximum number of request/s will be limited.
# global max request number per second
# default tenant-level max request number
traffic:
  control:
    global-switch: false
    max-global-qps-rate: 300
    tenant-switch: false
    default-tenant-qps-rate: 10
    #customize-tenant-qps-rate:
      # eg.
      #tenant1: 11
      #tenant2: 20

master:
  listen-port: 5678
  # master fetch command num
  fetch-command-num: 10
  # master prepare execute thread number to limit handle commands in parallel
  pre-exec-threads: 10
  # master execute thread number to limit process instances in parallel
  exec-threads: 10
  # master dispatch task number per batch
  dispatch-task-number: 3
  # master host selector to select a suitable worker, default value: LowerWeight. Optional values include random, round_robin, lower_weight
  host-selector: lower_weight
  # master heartbeat interval
  heartbeat-interval: 10s
  # Master heart beat task error threshold, if the continuous error count exceed this count, the master will close.
  heartbeat-error-threshold: 5
  # master commit task retry times
  task-commit-retry-times: 5
  # master commit task interval
  task-commit-interval: 1s
  state-wheel-interval: 5s
  # master max cpuload avg, only higher than the system cpu load average, master server can schedule. default value -1: the number of cpu cores * 2
  max-cpu-load-avg: -1
  # master reserved memory, only lower than system available memory, master server can schedule. default value 0.3, the unit is G
  reserved-memory: 0.3
  # failover interval
  failover-interval: 10m
  # kill yarn jon when failover taskInstance, default true
  kill-yarn-job-when-task-failover: true
  worker-group-refresh-interval: 10s

worker:
  # worker listener port
  listen-port: 1234
  # worker execute thread number to limit task instances in parallel
  exec-threads: 10
  # worker heartbeat interval
  heartbeat-interval: 10s
  # Worker heart beat task error threshold, if the continuous error count exceed this count, the worker will close.
  heartbeat-error-threshold: 5
  # worker host weight to dispatch tasks, default value 100
  host-weight: 100
  # tenant corresponds to the user of the system, which is used by the worker to submit the job. If system does not have this user, it will be automatically created after the parameter worker.tenant.auto.create is true.
  tenant-auto-create: true
  #Scenes to be used for distributed users.For example,users created by FreeIpa are stored in LDAP.This parameter only applies to Linux, When this parameter is true, worker.tenant.auto.create has no effect and will not automatically create tenants.
  tenant-distributed-user: false
  # worker max cpuload avg, only higher than the system cpu load average, worker server can be dispatched tasks. default value -1: the number of cpu cores * 2
  max-cpu-load-avg: -1
  # worker reserved memory, only lower than system available memory, worker server can be dispatched tasks. default value 0.3, the unit is G
  reserved-memory: 0.3
  # default worker groups separated by comma, like 'worker.groups=default,test'
  groups:
    - default
  # alert server listen host
  alert-listen-host: localhost
  alert-listen-port: 50052
  task-execute-threads-full-policy: REJECT

alert:
  port: 50052
  # Mark each alert of alert server if late after x milliseconds as failed.
  # Define value is (0 = infinite), and alert server would be waiting alert result.
  wait-timeout: 0

python-gateway:
  # Weather enable python gateway server or not. The default value is true.
  enabled: true
  # Authentication token for connection from python api to python gateway server. Should be changed the default value
  # when you deploy in public network.
  auth-token: jwUDzpLsNKEFER4*a8gruBH_GsAurNxU7A@Xc
  # The address of Python gateway server start. Set its value to `0.0.0.0` if your Python API run in different
  # between Python gateway server. It could be be specific to other address like `127.0.0.1` or `localhost`
  gateway-server-address: 0.0.0.0
  # The port of Python gateway server start. Define which port you could connect to Python gateway server from
  # Python API side.
  gateway-server-port: 25333
  # The address of Python callback client.
  python-address: 127.0.0.1
  # The port of Python callback client.
  python-port: 25334
  # Close connection of socket server if no other request accept after x milliseconds. Define value is (0 = infinite),
  # and socket server would never close even though no requests accept
  connect-timeout: 0
  # Close each active connection of socket server if python program not active after x milliseconds. Define value is
  # (0 = infinite), and socket server would never close even though no requests accept
  read-timeout: 0

server:
  port: 12345
  servlet:
    session:
      timeout: 120m
    context-path: /dolphinscheduler/
  compression:
    enabled: true
    mime-types: text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/json,application/xml
  jetty:
    max-http-form-post-size: 5000000

management:
  endpoints:
    web:
      exposure:
        include: '*'
  endpoint:
    health:
      enabled: true
      show-details: always
  health:
    db:
      enabled: true
    defaults:
      enabled: false
  metrics:
    tags:
      application: ${spring.application.name}

audit:
  enabled: true

metrics:
  enabled: true

# Override by profile
---
spring:
  config:
    activate:
      on-profile: postgresql
  quartz:
    properties:
      org.quartz.jobStore.driverDelegateClass: org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
  datasource:
    driver-class-name: org.postgresql.Driver
    url: jdbc:postgresql://127.0.0.1:5432/dolphinscheduler
    username: root
    password: root

---
spring:
  config:
    activate:
      on-profile: mysql
  sql:
     init:
       schema-locations: classpath:sql/dolphinscheduler_mysql.sql
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
    username: dolphinscheduler
    password: dolphinscheduler

步骤四:初始化数据库

10.配置MySQL驱动

cd /opt/soft/dolphinscheduler
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.16/mysql-connector-java-8.0.16.jar -P /tmp
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/worker-server/libs
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/api-server/libs
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/alert-server/libs
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/master-server/libs
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/tools/libs
cp /tmp/mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.4-bin/standalone-server/libs/standalone-server

11.初始化Dolphinscheduler数据库

cd /opt/soft/dolphinscheduler/apache-dolphinscheduler-3.1.4-bin
bash tools/bin/upgrade-schema.sh

出现如下表示成功
在这里插入图片描述

步骤五:启动DolphinScheduler

  1. 进入bin目录中启动
cd /opt/soft/dolphinscheduler/apache-dolphinscheduler-3.1.4-bin/bin
./dolphinscheduler-daemon.sh start standalone-server

在这里插入图片描述
停止dolphinscheduler

./dolphinscheduler-daemon.sh stop standalone-server

步骤六:访问DolphinScheduler

在浏览器中访问http://your_server_ip:12345/(your_server_ip为部署DolphinScheduler的服务器IP地址),使用默认的用户名和密码(admin/dolphinscheduler123)登录到DolphinScheduler的Web UI
在这里插入图片描述
在这里插入图片描述

至此,你已经成功部署并启动了DolphinScheduler的单机版。

参考博客

总结

通过上述步骤,你可以在一台机器上快速部署DolphinScheduler的单机版。如果你需要更高的性能和可靠性,可以考虑使用分布式部署方式。

希望本教程对您有所帮助!如果在部署过程中遇到问题,请随时在评论区留言或参考DolphinScheduler官方文档获取更详细的部署指南。感谢阅读!

  • 11
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
dolphinscheduler是一个分布式任务调度平台,可以在Linux系统上进行单机部署。以下是dolphinscheduler在Linux上的单机部署步骤: 1. 首先,请下载最新版本的dolphinscheduler后端安装包,并将其上传至服务器的目标部署目录,例如创建/opt/dolphinscheduler作为安装部署目录。 2. 解压安装包,执行以下命令: ``` mkdir -p /opt/dolphinscheduler cd /opt/dolphinscheduler tar -zxvf apache-dolphinscheduler-incubating-1.3.1-dolphinscheduler-bin.tar.gz -C /opt/dolphinscheduler mv apache-dolphinscheduler-incubating-1.3.1-dolphinscheduler-bin dolphinscheduler-bin ``` 3. 创建部署用户并赋予目录操作权限。以创建dolphinscheduler用户为例: ``` useradd dolphinscheduler echo "dolphinscheduler" | passwd --stdin dolphinscheduler sed -i '$adolphinscheduler ALL=(ALL) NOPASSWD: NOPASSWD: ALL' /etc/sudoers sed -i 's/Defaults requirett/#Defaults requirett/g' /etc/sudoers chown -R dolphinscheduler:dolphinscheduler dolphinscheduler-bin ``` 4. 如果需要使用资源中心功能,请执行以下命令: ``` sudo mkdir /data/dolphinscheduler sudo chown -R dolphinscheduler:dolphinscheduler /data/dolphinscheduler ``` 5. 切换到部署用户,执行一键部署脚本: ``` sh install.sh ``` 6. 部署成功后,可以通过访问前端页面地址来登录系统。前端页面地址为http://服务器IP地址:12345/dolphinscheduler,默认用户名和密码为admin/dolphinscheduler123。 7. 若要启停服务,可以使用以下命令: ``` # 一键停止集群所有服务 sh ./bin/stop-all.sh # 一键开启集群所有服务 sh ./bin/start-all.sh # 启停Master sh ./bin/dolphinscheduler-daemon.sh start master-server sh ./bin/dolphinscheduler-daemon.sh stop master-server # 启停Worker sh ./bin/dolphinscheduler-daemon.sh start worker-server sh ./bin/dolphinscheduler-daemon.sh stop worker-server # 启停Api sh ./bin/dolphinscheduler-daemon.sh start api-server sh ./bin/dolphinscheduler-daemon.sh stop api-server # 启停Logger sh ./bin/dolphinscheduler-daemon.sh start logger-server sh ./bin/dolphinscheduler-daemon.sh stop logger-server # 启停Alert sh ./bin/dolphinscheduler-daemon.sh start alert-server sh ./bin/dolphinscheduler-daemon.sh stop alert-server ```
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值