Dolphin Scheduler is relatively stable, but it still goes down occasionally. Warnings like the following show up in the logs:

[WARN] 2024-08-27 19:42:51.689 +0000 org.apache.dolphinscheduler.server.worker.task.WorkerHeartBeatTask:[101] - [WorkflowInstance-0][TaskInstance-0] - current cpu load average 0.01 is higher than 1.0 or available memory 0.29707310849572754 is lower than 0.3
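
The message reads as an OR of two conditions, and with the logged values only the memory side is true: 0.01 is not above 1.0, but 0.297 is below 0.3. Below is a minimal sketch of that check, assuming the thresholds are exactly the 1.0 and 0.3 printed in the log. The function and parameter names are illustrative and are not taken from the DolphinScheduler source.

```python
def is_overloaded(cpu_load_avg: float,
                  available_memory_ratio: float,
                  max_cpu_load_avg: float = 1.0,
                  reserved_memory_ratio: float = 0.3) -> bool:
    """Return True when either condition named in the warning holds.

    Hypothetical helper: the defaults are the thresholds shown in the log
    line, not values read from any DolphinScheduler configuration.
    """
    cpu_too_high = cpu_load_avg > max_cpu_load_avg
    memory_too_low = available_memory_ratio < reserved_memory_ratio
    return cpu_too_high or memory_too_low


# Values copied from the log line above: CPU is fine, memory is just under 0.3.
print(is_overloaded(0.01, 0.29707310849572754))
```

So in this case the worker was flagged because available memory dipped just below the reserved fraction, even though the CPU was nearly idle.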


[WARN] 2024-04-14 23:59:19.608 +0000 org.apache.dolphinscheduler.server.master.runner.MasterSchedulerBootstrap:[121] - [WorkflowInstance-0][TaskInstance-0] - The current server is overload, cannot consumes commands.

[WARN] 2024-04-14 23:59:20.145 +0000 org.apache.dolphinscheduler.server.master.dispatch.host.LowerWeightHostManager:[139] - [WorkflowInstance-0][TaskInstance-0] - worker 172.xx.xxx.xxxx:1234 current cpu load average 0.0 is too high or available memory 4.55G is too low

[WARN] 2024-04-14 23:59:20.146 +0000 org.apache.dolphinscheduler.server.master.registry.MasterSlotManager:[93][WorkflowInstance-0][TaskInstance-0] - Current master is not in active master list


On another occasion the service went down, and after restarting it I received a strange email alert:

[Image: screenshot of the email alert]


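I have already added more memory and enlarged the swap space, but it appears that jobs that have finished running do not release their memory automatically.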
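
To watch whether memory really stays occupied after jobs finish, the available-memory fraction can be sampled on the host. The sketch below computes MemAvailable / MemTotal from `/proc/meminfo` on Linux. Whether Dolphin Scheduler derives its "available memory" figure the same way is an assumption, and the function name is hypothetical.

```python
import os


def available_memory_ratio(meminfo_text: str) -> float:
    """Compute MemAvailable / MemTotal from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # /proc/meminfo reports kB
    return values["MemAvailable"] / values["MemTotal"]


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only; skipped silently elsewhere
    if os.path.exists(path):
        with open(path) as f:
            ratio = available_memory_ratio(f.read())
        print(f"available memory ratio: {ratio:.3f}")
```

Running this before and after a batch of jobs shows whether the ratio recovers or keeps drifting toward the 0.3 threshold seen in the worker warning.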

[Image: dolphin_02]

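The Zabbix setup currently in use cannot detect these situations, so I am switching to Zabbix agent 2 to add service monitoring and will see whether it helps.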


Ports I am trying to monitor:

Web Page: 12345

Server Port: 5678 

MySQL: 3306

Zookeeper: 2181

Jetty: 8080
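
Independently of Zabbix, the ports listed above can be probed with a plain TCP connect. This is a minimal sketch that assumes all services run on the local host (`127.0.0.1`). The port numbers are copied from the list. The function names are illustrative.

```python
import socket

# Ports copied from the list above; the labels are the ones used there.
PORTS = {
    "Web Page": 12345,
    "Server Port": 5678,
    "MySQL": 3306,
    "Zookeeper": 2181,
    "Jetty": 8080,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_ports(host: str = "127.0.0.1") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in PORTS.items()}


if __name__ == "__main__":
    for name, is_up in check_ports().items():
        print(f"{name}: {'open' if is_up else 'CLOSED'}")
```

An open port only shows that something is listening. A master that is overloaded or missing from the active master list, as in the warnings above, may still accept connections, so a port check complements log or process monitoring but does not replace it.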