[Troubleshooting] DolphinScheduler 2.0.5: workflow instances that can neither be scheduled nor stopped

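Recently a colleague reported that some workflow instances submitted on DolphinScheduler stayed stuck in the "stopping" state. They were never scheduled and produced no task instances, every action button on the workflow instance page was greyed out, and the task instances could not be deleted either.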

Locating the problem in the logs

The master node log contains the following exception:

[ERROR] 2022-06-13 17:59:45.280 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[244] - handler error:
org.mybatis.spring.MyBatisSystemException: nested exception is org.apache.ibatis.exceptions.TooManyResultsException: Expected one result (or null) to be returned by selectOne(), but found: 2
        at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:78)
        at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:440)
        at com.sun.proxy.$Proxy91.selectOne(Unknown Source)
        at org.mybatis.spring.SqlSessionTemplate.selectOne(SqlSessionTemplate.java:159)
        at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:89)
        at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:61)
        at com.sun.proxy.$Proxy115.queryByDefinitionCodeAndVersion(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor124.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.cache.interceptor.CacheInterceptor.lambda$invoke$0(CacheInterceptor.java:54)
        at org.springframework.cache.interceptor.CacheAspectSupport.invokeOperation(CacheAspectSupport.java:366)
        at org.springframework.cache.interceptor.CacheAspectSupport.lambda$handleSynchronizedGet$1(CacheAspectSupport.java:447)
        at org.springframework.cache.support.NoOpCache.get(NoOpCache.java:76)
        at org.springframework.cache.interceptor.CacheAspectSupport.handleSynchronizedGet(CacheAspectSupport.java:442)
        at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:382)
        at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:345)
        at org.springframework.cache.interceptor.CacheInterceptor.invoke(CacheInterceptor.java:64)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
        at com.sun.proxy.$Proxy116.queryByDefinitionCodeAndVersion(Unknown Source)
        at org.apache.dolphinscheduler.service.process.ProcessService.findTaskDefinition(ProcessService.java:2474)
        at org.apache.dolphinscheduler.service.process.ProcessService.lambda$getTaskDefineLogListByRelation$4(ProcessService.java:2465)
        at java.util.HashMap.forEach(HashMap.java:1291)
        at org.apache.dolphinscheduler.service.process.ProcessService.getTaskDefineLogListByRelation(ProcessService.java:2464)
        at org.apache.dolphinscheduler.service.process.ProcessService$$FastClassBySpringCGLIB$$ed138739.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
        at org.apache.dolphinscheduler.service.process.ProcessService$$EnhancerBySpringCGLIB$$ec695045.getTaskDefineLogListByRelation(<generated>)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread.buildFlowDag(WorkflowExecuteThread.java:578)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread.startProcess(WorkflowExecuteThread.java:542)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread.run(WorkflowExecuteThread.java:239)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ibatis.exceptions.TooManyResultsException: Expected one result (or null) to be returned by selectOne(), but found: 2
        at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:80)
        at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426)
        ... 42 common frames omitted

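The key message is: Expected one result (or null) to be returned by selectOne(), but found: 2. In other words, the SQL query is expected to return at most one row, but it returned two.
The frame in the stack trace that points into the DolphinScheduler source is:
at org.apache.dolphinscheduler.service.process.ProcessService.findTaskDefinition(ProcessService.java:2474)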
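
To make the error message concrete, here is a minimal Python sketch of the contract that selectOne() enforces: one row is returned as-is, no rows yields null, and more than one row is an error. The names select_one and TooManyResultsError and the sample rows are made up for illustration; this imitates the behavior described by the message and is not MyBatis code.

```python
class TooManyResultsError(Exception):
    """Stand-in for MyBatis' TooManyResultsException (hypothetical name)."""


def select_one(rows):
    """Return the single row, None when there are no rows, or raise if there are several."""
    if len(rows) == 1:
        return rows[0]
    if len(rows) > 1:
        raise TooManyResultsError(
            "Expected one result (or null) to be returned by selectOne(), "
            f"but found: {len(rows)}"
        )
    return None


# Two rows sharing the same (code, version); the values are invented.
duplicated = [
    {"code": 1001, "version": 1, "name": "task_a"},
    {"code": 1001, "version": 1, "name": "task_a"},
]

try:
    select_one(duplicated)
    error_message = None
except TooManyResultsError as exc:
    error_message = str(exc)

print(error_message)
```

So a lookup that is meant to be unique starts failing as soon as a second matching row exists, which is why the next step is to find out which query is involved.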

Locating the problem in the source

In the source code, the findTaskDefinition method returns a TaskDefinition object.
Next, trace the implementation of the queryByDefinitionCodeAndVersion method that it calls.
The query SQL is:

    <select id="queryByDefinitionCodeAndVersion" resultType="org.apache.dolphinscheduler.dao.entity.TaskDefinitionLog">
        select
        <include refid="baseSql"/>
        from t_ds_task_definition_log
        WHERE code = #{code}
        and version = #{version}
    </select>

In the database that backs DolphinScheduler, query the t_ds_task_definition_log table to count the rows for each code and version:

    SELECT code, version, COUNT(*) AS cnt
    FROM t_ds_task_definition_log
    GROUP BY code, version
    ORDER BY cnt DESC


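This is the log table for task definitions. In principle, a given task code should have at most one row per version. The query showed that several tasks did have duplicate rows, and those tasks all belonged to the workflow that could be neither scheduled nor stopped.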
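
The duplicate check can also be scripted. The sketch below is a rough illustration that runs against an in-memory SQLite table standing in for t_ds_task_definition_log; the reduced column set (id, code, version, name), the sample rows, and the helper name find_duplicate_task_versions are assumptions. It adds a HAVING clause so that only the offending (code, version) pairs come back.

```python
import sqlite3


def find_duplicate_task_versions(conn):
    """Return (code, version, cnt) for every pair that is stored more than once."""
    query = """
        SELECT code, version, COUNT(*) AS cnt
        FROM t_ds_task_definition_log
        GROUP BY code, version
        HAVING COUNT(*) > 1
        ORDER BY cnt DESC, code, version
    """
    return conn.execute(query).fetchall()


# In-memory stand-in holding only the columns this query touches (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_ds_task_definition_log ("
    "id INTEGER PRIMARY KEY, code INTEGER, version INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO t_ds_task_definition_log (code, version, name) VALUES (?, ?, ?)",
    [
        (1001, 1, "task_a"),
        (1001, 1, "task_a"),  # duplicate of the row above
        (1001, 2, "task_a"),
        (2002, 1, "task_b"),
    ],
)

duplicates = find_duplicate_task_versions(conn)
print(duplicates)
```

On a real deployment the same SELECT would be run against the DolphinScheduler database instead of the in-memory table.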

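My fix was to delete the duplicate rows by hand (the t_ds_task_definition table contained the same duplicates, so I deleted those as well) and then restart the master node.
After the restart, the action buttons on the workflow instance page worked again.
Restarting the scheduled workflow at this point also brought scheduling back to normal.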
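
The cleanup here was done by hand. For reference, one way to script it is sketched below; it is not the procedure actually used. It assumes that both tables have an auto-increment id column and that keeping the row with the lowest id for each (code, version) pair is acceptable, neither of which is confirmed above. The helper name delete_duplicate_rows and the sample data are hypothetical, and the sketch runs against in-memory SQLite.

```python
import sqlite3

# Only these two tables are expected, so the name can be formatted into the SQL safely.
ALLOWED_TABLES = {"t_ds_task_definition_log", "t_ds_task_definition"}


def delete_duplicate_rows(conn, table):
    """Delete all but the lowest-id row per (code, version); return the rows removed."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    cursor = conn.execute(
        f"DELETE FROM {table} "
        f"WHERE id NOT IN (SELECT MIN(id) FROM {table} GROUP BY code, version)"
    )
    conn.commit()
    return cursor.rowcount


# In-memory stand-in with an assumed, reduced schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_ds_task_definition_log ("
    "id INTEGER PRIMARY KEY, code INTEGER, version INTEGER)"
)
conn.executemany(
    "INSERT INTO t_ds_task_definition_log (code, version) VALUES (?, ?)",
    [(1001, 1), (1001, 1), (1001, 2), (2002, 1)],
)

removed = delete_duplicate_rows(conn, "t_ds_task_definition_log")
remaining = conn.execute(
    "SELECT code, version FROM t_ds_task_definition_log ORDER BY id"
).fetchall()
print(removed, remaining)
```

If the backing database is MySQL, a DELETE whose subquery reads the same table is normally rejected, so the subquery would need to be wrapped in a derived table. It is also worth trying a script like this on a copy of the tables first.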


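One question remains open: how did the duplicate rows get into the table in the first place? That still needs further investigation, but there is reason to suspect that DolphinScheduler 2.0.5 has a transaction concurrency problem when it writes task information to the database.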
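
As an illustration of the kind of concurrency problem suspected here, the sketch below interleaves two hypothetical writers by hand: each checks whether a (code, version) row exists before inserting, but both checks happen before either insert. This is a generic check-then-insert race on an in-memory SQLite table with an assumed schema. It is not taken from the DolphinScheduler code and does not show that this is what happened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_ds_task_definition_log ("
    "id INTEGER PRIMARY KEY, code INTEGER, version INTEGER)"
)


def row_exists(conn, code, version):
    """Check step: is there already a row for this (code, version)?"""
    row = conn.execute(
        "SELECT 1 FROM t_ds_task_definition_log "
        "WHERE code = ? AND version = ? LIMIT 1",
        (code, version),
    ).fetchone()
    return row is not None


def insert_row(conn, code, version):
    """Insert step: nothing in the schema stops a second identical pair."""
    conn.execute(
        "INSERT INTO t_ds_task_definition_log (code, version) VALUES (?, ?)",
        (code, version),
    )


# Both writers run their check before either one inserts.
writer_a_sees_row = row_exists(conn, 1001, 1)
writer_b_sees_row = row_exists(conn, 1001, 1)

if not writer_a_sees_row:
    insert_row(conn, 1001, 1)
if not writer_b_sees_row:
    insert_row(conn, 1001, 1)

row_count = conn.execute(
    "SELECT COUNT(*) FROM t_ds_task_definition_log WHERE code = 1001 AND version = 1"
).fetchone()[0]
print(row_count)
```

Both writers saw an empty table, so both inserted, and the table ends up with two rows for the same code and version, which is the state the duplicate query found.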
