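Problem:
A task started in DolphinScheduler stays in the running state indefinitely: it neither succeeds nor fails, and its log cannot be viewed from the task instance page.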
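Diagnosis and cause:
If the Worker host shown on the DolphinScheduler Monitor Center page is not one of the IP addresses defined in bin/env/install_env.sh, the Worker that is actually running is not the machine that was specified. As a result, neither the in-progress status nor the final result of the task is ever received.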
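One way to check this condition by hand is to compare the Worker address displayed in the Monitor Center with the workers value from install_env.sh. The following is a minimal shell sketch: is_configured_worker is a hypothetical helper, not part of DolphinScheduler, and all addresses are placeholder values.

```shell
# Hypothetical helper (not shipped with DolphinScheduler).
# Succeeds if the given address appears as a host in a workers-style list
# such as "192.168.8.1:default,192.168.8.2:default".
# The body runs in a subshell, so the IFS change does not leak to the caller.
is_configured_worker() (
  address="$2"
  IFS=','
  for entry in $1; do
    # Strip the optional ":workerGroup" suffix before comparing.
    [ "${entry%%:*}" = "$address" ] && exit 0
  done
  exit 1
)

workers="192.168.8.1:default,192.168.8.2:default"   # placeholder values

# Address copied by hand from the Monitor Center page (placeholder).
reported="192.168.8.7"

if is_configured_worker "$workers" "$reported"; then
  echo "$reported is a configured worker"
else
  echo "$reported is NOT listed in workers"
fi
```

On a Linux host, `hostname -I` lists the machine's own addresses and `getent hosts <name>` shows what a hostname resolves to, which may help explain why a Worker registered under an unexpected address.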
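Solution:
Edit the ips addresses and the worker node IP addresses in bin/env/install_env.sh. It is best to write the IP addresses directly rather than hostnames mapped through /etc/hosts, because that makes problems easier to spot.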
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
ips=${ips:-"172.xx.xxx.200,172.xx.xxx.201,172.xx.xxx.202,172.xx.xxx.203,172.xx.xxx.204,172.xx.xxx.205"}
# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort=${sshPort:-"22"}
# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters=${masters:-"172.xx.xxx.200,172.xx.xxx.201"}
# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers=${workers:-"172.xx.xxx.200:default,172.xx.xxx.201:default"}
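The comments in the file require masters and workers to be subsets of ips. A quick consistency check can be sketched as below: missing_hosts is a hypothetical helper, not part of DolphinScheduler, and the addresses are placeholder values in the same format as install_env.sh.

```shell
# Hypothetical helper (not shipped with DolphinScheduler).
# Prints every host of the second list that is missing from the first list.
# Usage: missing_hosts "<ips>" "<masters or workers>"
# The body runs in a subshell, so the IFS change does not leak to the caller.
missing_hosts() (
  declared="$1"
  IFS=','
  for entry in $2; do
    host="${entry%%:*}"          # strip an optional ":workerGroup" suffix
    case ",$declared," in
      *",$host,"*) ;;            # host is declared in ips
      *) echo "$host" ;;         # host is not declared in ips
    esac
  done
)

# Placeholder values in the same format as install_env.sh.
ips="192.168.8.1,192.168.8.2,192.168.8.3"
masters="192.168.8.1,192.168.8.2"
workers="192.168.8.1:default,192.168.8.9:default"

missing_hosts "$ips" "$masters"   # prints nothing
missing_hosts "$ips" "$workers"   # prints 192.168.8.9
```

Any output means the corresponding entry should be added to ips or corrected before the cluster is restarted.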
Finally, restart the DolphinScheduler cluster from the Master node to resolve the problem:
dolphinscheduler/bin/status-all.sh
dolphinscheduler/bin/stop-all.sh
dolphinscheduler/bin/start-all.sh
Reference:
https://dolphinscheduler.apache.org/zh-cn/docs/3.2.1/guide/installation/cluster