TX-LCN Distributed Transaction Framework Source Code Analysis (Exception Flow in LCN Mode)

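The previous article covered how the normal flow works in LCN mode. This one looks at how the framework rolls back when an exception occurs, using the same call chain A > B > C.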

In the normal flow, each module's doBusinessCode executes the entire logic of the module after it. We will walk the chain from back to front.

 

All of module C's code runs inside module B's doBusinessCode method, and all of module B's code runs inside module A's doBusinessCode method.
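The nesting described above can be sketched in a few lines of plain Java. This is only an illustration, not TX-LCN code: `ChainModule`, `Business` and `ChainDemo` are hypothetical names, and the "rollback" is just a recorded event. It shows why a failure in C is seen by B and then by A: each module's business callback is the whole of the next module.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one module in the A > B > C chain (not a TX-LCN class). */
class ChainModule {
    /** The module's business logic; may throw anything, like doBusinessCode. */
    interface Business {
        Object run() throws Throwable;
    }

    private final String name;
    private final List<String> events;

    ChainModule(String name, List<String> events) {
        this.name = name;
        this.events = events;
    }

    /** Runs the business code; on any failure records a local rollback and rethrows upstream. */
    Object invoke(Business business) throws Throwable {
        try {
            Object result = business.run();
            events.add(name + ":success");
            return result;
        } catch (Throwable t) {
            events.add(name + ":rollback");
            throw t;
        }
    }
}

class ChainDemo {
    /** Wires A > B > C; C fails when failInC is true. Returns the recorded events in order. */
    static List<String> run(boolean failInC) {
        List<String> events = new ArrayList<>();
        ChainModule a = new ChainModule("A", events);
        ChainModule b = new ChainModule("B", events);
        ChainModule c = new ChainModule("C", events);
        try {
            // A's business code is all of B, and B's business code is all of C.
            a.invoke(() -> b.invoke(() -> c.invoke(() -> {
                if (failInC) {
                    throw new IllegalStateException("db error in C");
                }
                return "ok";
            })));
        } catch (Throwable t) {
            events.add("caller:" + t.getMessage());
        }
        return events;
    }
}
```

With `failInC` set to true the events are recorded from the innermost module outwards, which is the "back to front" order used in the rest of this article.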

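Module C

Module C's business code is wrapped by the method below (module B goes through the same code and the same handler class).

1. The method is declared to throw Throwable.

2. The method catches two kinds of exceptions, TransactionException and Throwable, and rethrows both.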


 
 
    public Object transactionRunning(TxTransactionInfo info) throws Throwable {
        // 1. Get the transaction type
        String transactionType = info.getTransactionType();
        // 2. Resolve the transaction propagation state
        DTXPropagationState propagationState = propagationResolver.resolvePropagationState(info);
        // 2.1 If this unit does not take part in the distributed transaction, stop here
        if (propagationState.isIgnored()) {
            return info.getBusinessCallback().call();
        }
        // 3. Load the local distributed transaction controller
        DTXLocalControl dtxLocalControl = txLcnBeanHelper.loadDTXLocalControl(transactionType, propagationState);
        // 4. Weave in the transaction operations
        try {
            // 4.1 Record the transaction type in the transaction context
            Set<String> transactionTypeSet = globalContext.txContext(info.getGroupId()).getTransactionTypes();
            transactionTypeSet.add(transactionType);
            dtxLocalControl.preBusinessCode(info);
            // 4.2 Before the business code runs
            txLogger.txTrace(
                    info.getGroupId(), info.getUnitId(), "pre business code, unit type: {}", transactionType);
            // 4.3 Run the business code
            Object result = dtxLocalControl.doBusinessCode(info);
            // 4.4 Business code succeeded
            txLogger.txTrace(info.getGroupId(), info.getUnitId(), "business success");
            dtxLocalControl.onBusinessCodeSuccess(info, result);
            return result;
        } catch (TransactionException e) {
            txLogger.error(info.getGroupId(), info.getUnitId(), "before business code error");
            throw e;
        } catch (Throwable e) {
            // 4.5 Business code failed
            txLogger.error(info.getGroupId(), info.getUnitId(), Transactions.TAG_TRANSACTION,
                    "business code error");
            dtxLocalControl.onBusinessCodeError(info, e);
            throw e;
        } finally {
            // 4.6 Business code finished
            dtxLocalControl.postBusinessCode(info);
        }
    }
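To make the ordering of the hooks concrete, here is a minimal sketch of the same try/catch/finally shape. It is not the framework's code: `SketchTransactionException` and `HookOrderSketch` are hypothetical names, and each hook is reduced to a string appended to a list. The point it illustrates is that a TransactionException skips onBusinessCodeError, any other Throwable triggers it, and postBusinessCode runs in every case.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the framework's TransactionException. */
class SketchTransactionException extends Exception {
    SketchTransactionException(String message) {
        super(message);
    }
}

class HookOrderSketch {
    /** mode: "ok", "pre-fails" (TransactionException before business) or "business-fails". */
    static List<String> run(String mode) {
        List<String> calls = new ArrayList<>();
        try {
            try {
                calls.add("preBusinessCode");
                if (mode.equals("pre-fails")) {
                    throw new SketchTransactionException("cannot prepare");
                }
                calls.add("doBusinessCode");
                if (mode.equals("business-fails")) {
                    throw new IllegalStateException("db error");
                }
                calls.add("onBusinessCodeSuccess");
            } catch (SketchTransactionException e) {
                // Only rethrown; onBusinessCodeError is not called on this path.
                throw e;
            } catch (Throwable e) {
                calls.add("onBusinessCodeError");
                throw e;
            } finally {
                // Runs on success and on both failure paths.
                calls.add("postBusinessCode");
            }
        } catch (Throwable propagated) {
            calls.add("rethrown:" + propagated.getClass().getSimpleName());
        }
        return calls;
    }
}
```

Calling `HookOrderSketch.run("business-fails")` yields preBusinessCode, doBusinessCode, onBusinessCodeError, postBusinessCode and then the rethrown exception, which matches the failure path followed in the rest of this section.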

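Module C is the last module and calls no further interfaces, so its doBusinessCode only performs local database operations. doBusinessCode is declared to throw Throwable; if module C's local database operation fails, the exception is caught and the following code runs: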


 
 
    catch (Throwable e) {
        // 4.5 Business code failed
        txLogger.error(info.getGroupId(), info.getUnitId(), Transactions.TAG_TRANSACTION,
                "business code error");
        dtxLocalControl.onBusinessCodeError(info, e);
        throw e;
    }

    public void onBusinessCodeError(TxTransactionInfo info, Throwable throwable) {
        try {
            // Clean up the transaction, i.e. roll back the local database connection
            transactionCleanTemplate.clean(info.getGroupId(), info.getUnitId(), info.getTransactionType(), 0);
        } catch (TransactionClearException e) {
            log.error("{} > clean transaction error.", Transactions.LCN);
        }
    }

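If the local database operation succeeds, module C calls joinGroup to join the transaction group. (The asynchronous check started there is also part of the exception handling; it is covered later.)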


 
 
    public void joinGroup(String groupId, String unitId, String transactionType, TransactionInfo transactionInfo)
            throws TransactionException {
        try {
            txLogger.txTrace(groupId, unitId, "join group > transaction type: {}", transactionType);
            reliableMessenger.joinGroup(groupId, unitId, transactionType, DTXLocalContext.transactionState(globalContext.dtxState(groupId)));
            txLogger.txTrace(groupId, unitId, "join group message over.");
            // Asynchronous check
            dtxChecking.startDelayCheckingAsync(groupId, unitId, transactionType);
            // Cache the participant's aspect information
            aspectLogger.trace(groupId, unitId, transactionInfo);
        } catch (RpcException e) {
            dtxExceptionHandler.handleJoinGroupMessageException(Arrays.asList(groupId, unitId, transactionType), e);
        } catch (LcnBusinessException e) {
            dtxExceptionHandler.handleJoinGroupBusinessException(Arrays.asList(groupId, unitId, transactionType), e);
        }
        txLogger.txTrace(groupId, unitId, "join group logic over");
    }

    public void joinGroup(String groupId, String unitId, String unitType, int transactionState) throws RpcException, LcnBusinessException {
        JoinGroupParams joinGroupParams = new JoinGroupParams();
        joinGroupParams.setGroupId(groupId);
        joinGroupParams.setUnitId(unitId);
        joinGroupParams.setUnitType(unitType);
        joinGroupParams.setTransactionState(transactionState);
        MessageDto messageDto = request(MessageCreator.joinGroup(joinGroupParams));
        // Joining the transaction group failed, so throw
        if (!MessageUtils.statusOk(messageDto)) {
            throw new LcnBusinessException(messageDto.loadBean(Throwable.class));
        }
    }

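Two exceptions are caught here. RpcException means the connection to the server (the TxManager) failed; LcnBusinessException is thrown when joining the transaction group fails.

For RpcException, the framework simply wraps the exception and throws it: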


 
 
    public void handleJoinGroupMessageException(Object params, Throwable ex) throws TransactionException {
        throw new TransactionException(ex);
    }

For LcnBusinessException, the framework first cleans up the local transaction (rolling back the connection) and then throws:


 
 
    public void handleJoinGroupBusinessException(Object params, Throwable ex) throws TransactionException {
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        String unitId = (String) paramList.get(1);
        String unitType = (String) paramList.get(2);
        try {
            transactionCleanTemplate.clean(groupId, unitId, unitType, 0);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "join group", "clean [{}]transaction fail.", unitType);
        }
        throw new TransactionException(ex);
    }

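Summary for module C

1. A local database exception, or a failure to join the transaction group, rolls back the local database connection.

2. A connection or communication failure with the server while joining the transaction group is simply rethrown (this is very unlikely unless every server instance is unavailable).

3. Whenever module C hits an exception, a Throwable is thrown up to module B.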
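The two join-group failure paths summarised above can be sketched as follows. This is a simplified model under stated assumptions, not the framework's implementation: `RpcFailure`, `BusinessFailure` and `TxFailure` are hypothetical stand-ins for RpcException, LcnBusinessException and TransactionException, and the clean-up is represented by a recorded action.

```java
import java.util.ArrayList;
import java.util.List;

class JoinGroupSketch {
    /** Hypothetical stand-in for RpcException: the server could not be reached. */
    static class RpcFailure extends Exception {
        RpcFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for LcnBusinessException: the server rejected the join. */
    static class BusinessFailure extends Exception {
        BusinessFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for TransactionException. */
    static class TxFailure extends Exception {
        TxFailure(Throwable cause) {
            super(cause);
        }
    }

    /** mode: "ok", "rpc" or "business". Returns the actions taken, in order. */
    static List<String> run(String mode) {
        List<String> actions = new ArrayList<>();
        try {
            joinGroup(mode, actions);
        } catch (TxFailure e) {
            actions.add("thrown:" + e.getCause().getClass().getSimpleName());
        }
        return actions;
    }

    private static void joinGroup(String mode, List<String> actions) throws TxFailure {
        try {
            sendJoin(mode);
            actions.add("joined");
            actions.add("startDelayChecking");
        } catch (RpcFailure e) {
            // No local clean-up on this path: the failure is only wrapped and rethrown.
            throw new TxFailure(e);
        } catch (BusinessFailure e) {
            // Roll back the local connection (state 0) before rethrowing.
            actions.add("clean(state=0)");
            throw new TxFailure(e);
        }
    }

    /** Simulates the message to the server. */
    private static void sendJoin(String mode) throws RpcFailure, BusinessFailure {
        if (mode.equals("rpc")) {
            throw new RpcFailure("server unreachable");
        }
        if (mode.equals("business")) {
            throw new BusinessFailure("join rejected");
        }
    }
}
```

In both failure modes the caller ends up with the same wrapper exception type; only the business failure performs a local clean-up first.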

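Module B

Module B runs exactly the same code as module C; the only difference is that module B's doBusinessCode contains the whole of module C's flow plus B's own local operations.

As noted above, any failure in module C, including a failed local database operation, is caught by module B's catch (Throwable) block. The handling is the same as in module C: clean up the local transaction and roll back the connection.

Like module C, module B also starts the asynchronous check, and RpcException and LcnBusinessException are handled the same way as in module C.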

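Module A

Module A first creates the transaction group. Because the business code runs afterwards, a failure while creating the group is simply thrown, and module A does nothing else after catching it.

Module A's exception handling all lives in the postBusinessCode method, which relies on notifyGroup: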


 
 
    public void notifyGroup(String groupId, String unitId, String transactionType, int state) {
        try {
            txLogger.txTrace(
                    groupId, unitId, "notify group > transaction type: {}, state: {}.", transactionType, state);
            if (globalContext.isDTXTimeout()) {
                throw new LcnBusinessException("dtx timeout.");
            }
            state = reliableMessenger.notifyGroup(groupId, state);
            transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.trace(groupId, unitId, Transactions.TE, "clean transaction fail.");
        } catch (RpcException e) {
            dtxExceptionHandler.handleNotifyGroupMessageException(Arrays.asList(groupId, state, unitId, transactionType), e);
        } catch (LcnBusinessException e) {
            // Closing the transaction group failed
            dtxExceptionHandler.handleNotifyGroupBusinessException(Arrays.asList(groupId, state, unitId, transactionType), e.getCause());
        }
        txLogger.txTrace(groupId, unitId, "notify group exception state {}.", state);
    }

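Let's go through the cases.

1. Modules A, B and C all execute correctly, so notifyGroup is called with state 1. If the call that tells the server to clean up the transaction fails, or the network is down (a request exception), reliableMessenger.notifyGroup throws RpcException and the catch branch runs: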


 
 
    catch (RpcException e) {
        dtxExceptionHandler.handleNotifyGroupMessageException(Arrays.asList(groupId, state, unitId, transactionType), e);
    }

    public void handleNotifyGroupMessageException(Object params, Throwable ex) {
        // When state is 0
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        if (state == 0) {
            handleNotifyGroupBusinessException(params, ex);
            return;
        }
        // When state is 1
        String unitId = (String) paramList.get(2);
        String transactionType = (String) paramList.get(3);
        try {
            // Clean up the local transaction
            transactionCleanTemplate.cleanWithoutAspectLog(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "notify group", "{} > cleanWithoutAspectLog transaction error.", transactionType);
        }
        // Report to the Manager, retrying until it succeeds
        tmReporter.reportTransactionState(groupId, null, TxExceptionParams.NOTIFY_GROUP_ERROR, state);
    }

    private MessageDto request(MessageDto messageDto, long timeout, String whenNonManagerMessage) throws RpcException {
        for (int i = 0; i < rpcClient.loadAllRemoteKey().size() + 1; i++) {
            try {
                String remoteKey = rpcClient.loadRemoteKey();
                MessageDto result = rpcClient.request(remoteKey, messageDto, timeout);
                log.debug("request action: {}. TM[{}]", messageDto.getAction(), remoteKey);
                return result;
            } catch (RpcException e) {
                if (e.getCode() == RpcException.NON_TX_MANAGER) {
                    throw new RpcException(e.getCode(), whenNonManagerMessage + ". non tx-manager is alive.");
                }
            }
        }
        throw new RpcException(RpcException.NON_TX_MANAGER, whenNonManagerMessage + ". non tx-manager is alive.");
    }

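The local transaction is committed first (state is 1), and then the client talks to the server to record the transaction state. You might ask: if the server could not be reached a moment ago, how can this call succeed? In practice several server instances are deployed, and a distributed transaction picks only one of them to drive it; if that one stops working, another is chosen. The request method above tries the servers this client is connected to in turn.

When the server receives the message with state 1, it inserts a row into the t_tx_exception table with state 1, meaning the transaction should be committed. But module A has already committed its local transaction while modules B and C have not. How is that resolved?

Remember the asynchronous check mentioned earlier?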
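The failover idea behind the request method can be sketched like this. It is a simplification under assumptions: `FailoverSketch`, `Unreachable` and the `send` function are hypothetical, and the loop simply walks a list of server keys in order, whereas the real method asks rpcClient for the next key on each attempt.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class FailoverSketch {
    /** Hypothetical stand-in for an RpcException raised when one server cannot be reached. */
    static class Unreachable extends RuntimeException {
        Unreachable(String message) {
            super(message);
        }
    }

    /** Tries each server in turn and returns the first reply; fails only if all are down. */
    static String request(List<String> managers, Function<String, String> send) {
        for (String key : managers) {
            try {
                return send.apply(key);
            } catch (Unreachable e) {
                // This instance is down; fall through and try the next one.
            }
        }
        throw new Unreachable("no tx-manager is alive");
    }

    /** Demo helper: every server listed in 'down' refuses the request. */
    static String demo(List<String> managers, Set<String> down) {
        try {
            return request(managers, key -> {
                if (down.contains(key)) {
                    throw new Unreachable(key + " is down");
                }
                return key + ":ack";
            });
        } catch (Unreachable e) {
            return "failed:" + e.getMessage();
        }
    }
}
```

With three servers of which the first two are down, the request is answered by the third; it fails only when every server in the list is unreachable.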


 
 
    // Asynchronous check
    dtxChecking.startDelayCheckingAsync(groupId, unitId, transactionType);

    public void startDelayCheckingAsync(String groupId, String unitId, String transactionType) {
        txLogger.taskTrace(groupId, unitId, "start delay checking task");
        ScheduledFuture scheduledFuture = scheduledExecutorService.schedule(() -> {
            try {
                TxContext txContext = globalContext.txContext(groupId);
                if (Objects.nonNull(txContext)) {
                    synchronized (txContext.getLock()) {
                        txLogger.taskTrace(groupId, unitId, "checking waiting for business code finish.");
                        txContext.getLock().wait();
                    }
                }
                int state = reliableMessenger.askTransactionState(groupId, unitId);
                txLogger.taskTrace(groupId, unitId, "ask transaction state {}", state);
                if (state == -1) {
                    txLogger.error(this.getClass().getSimpleName(), "delay clean transaction error.");
                    onAskTransactionStateException(groupId, unitId, transactionType);
                } else {
                    transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
                    aspectLogger.clearLog(groupId, unitId);
                }
            } catch (RpcException e) {
                onAskTransactionStateException(groupId, unitId, transactionType);
            } catch (TransactionClearException | InterruptedException e) {
                txLogger.error(this.getClass().getSimpleName(), "{} clean transaction error.", transactionType);
            }
        }, clientConfig.getDtxTime(), TimeUnit.MILLISECONDS);
        delayTasks.put(groupId + unitId, scheduledFuture);
    }

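This delayed task asks the server for the state recorded in t_tx_exception and then commits or rolls back the transaction according to it (here it commits). Note that schedule(...) runs the task once, after a delay of clientConfig.getDtxTime() milliseconds, rather than periodically. This relies on MySQL always being available.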
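A stripped-down sketch of such a delayed check is shown below. It is an illustration under assumptions rather than the framework's code: `DelayCheckSketch`, `markBusinessFinished` and the `askState` supplier are hypothetical, the commit and rollback are reduced to returned strings, and a `businessFinished` flag is added so that the waiting thread cannot miss the notification.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

class DelayCheckSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Object lock = new Object();
    private boolean businessFinished = false;

    /** Called by the business thread once its code has finished. */
    void markBusinessFinished() {
        synchronized (lock) {
            businessFinished = true;
            lock.notifyAll();
        }
    }

    /** Schedules a one-shot check; the future completes with the action that was chosen. */
    ScheduledFuture<String> startDelayChecking(long delayMillis, IntSupplier askState) {
        return scheduler.schedule(() -> {
            synchronized (lock) {
                // Wait until the business code is done; the flag guards against a missed notify.
                while (!businessFinished) {
                    lock.wait();
                }
            }
            int state = askState.getAsInt();
            if (state == 1) {
                return "commit";
            }
            if (state == 0) {
                return "rollback";
            }
            // Any other value is treated like the state == -1 branch: the answer is unknown.
            return "ask-state-failed";
        }, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Demo helper: runs one check against a server that always answers with the given state. */
    static String demo(int stateFromServer) {
        DelayCheckSketch sketch = new DelayCheckSketch();
        try {
            ScheduledFuture<String> check = sketch.startDelayChecking(20, () -> stateFromServer);
            sketch.markBusinessFinished();
            return check.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            sketch.scheduler.shutdownNow();
        }
    }
}
```

The sketch uses `schedule` with a `Callable`, so the task runs exactly once after the delay, which mirrors the single delayed execution in startDelayCheckingAsync.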

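If a business exception (LcnBusinessException) occurs instead, it means the server failed to notify clients B and C to commit. The server likewise writes state 1 (commit) to t_tx_exception, and client A then commits its transaction as well.

2. If module C throws, modules C and B have already rolled back. In this case, whatever the exception is, module A only needs to roll back.

Rollback on a request exception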


 
 
    // Rollback on a request exception
    public void handleNotifyGroupMessageException(Object params, Throwable ex) {
        // When state is 0
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        if (state == 0) {
            handleNotifyGroupBusinessException(params, ex);
            return;
        }
        // ... (the state == 1 branch is listed in full earlier)
    }

    public void handleNotifyGroupBusinessException(Object params, Throwable ex) {
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        String unitId = (String) paramList.get(2);
        String transactionType = (String) paramList.get(3);
        // The user forced a rollback
        if (ex instanceof UserRollbackException) {
            state = 0;
        }
        if ((ex.getCause() != null && ex.getCause() instanceof UserRollbackException)) {
            state = 0;
        }
        // Finish the transaction
        try {
            transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "notify group", "{} > clean transaction error.", transactionType);
        }
    }

Rollback on a transaction exception


 
 
    public void handleNotifyGroupBusinessException(Object params, Throwable ex) {
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        String unitId = (String) paramList.get(2);
        String transactionType = (String) paramList.get(3);
        // The user forced a rollback
        if (ex instanceof UserRollbackException) {
            state = 0;
        }
        if ((ex.getCause() != null && ex.getCause() instanceof UserRollbackException)) {
            state = 0;
        }
        // Finish the transaction
        try {
            transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "notify group", "{} > clean transaction error.", transactionType);
        }
    }
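How the final state is chosen in handleNotifyGroupBusinessException can be isolated into a small function. The sketch below is an assumption-labelled illustration: `UserRollback` is a hypothetical stand-in for UserRollbackException, and `resolveState` only models the two instanceof checks, not the clean-up that follows.

```java
/** Hypothetical stand-in for the framework's UserRollbackException. */
class UserRollback extends RuntimeException {
    UserRollback() {
        super("user rollback");
    }
}

class CleanStateSketch {
    /**
     * Returns the state to clean up with: 0 (roll back) when the exception or its
     * direct cause is a user-requested rollback, otherwise the state passed in.
     * Expects a non-null exception.
     */
    static int resolveState(int state, Throwable ex) {
        // instanceof is false for null, so a missing cause needs no separate null check.
        if (ex instanceof UserRollback || ex.getCause() instanceof UserRollback) {
            return 0;
        }
        return state;
    }
}
```

For example, `resolveState(1, new RuntimeException(new UserRollback()))` returns 0, while an unrelated exception leaves the incoming state untouched.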

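3. If module B or C hits an exception, rollback can only happen by notifying B and C; if that notification fails, it simply fails, and client A cannot deal with it on its own.

Note: because the server side is highly available, most of the exceptions described above essentially never occur.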
