A Brief Discussion on Solving Duplicate Message Consumption in RocketMQ

Duplicate consumption in MQ means that multiple instances of the same application receive the same message, or a single instance receives the same message more than once. If the consumer logic is not idempotent, this results in duplicate processing.

At its core, message duplication is the at-least-once versus exactly-once question in MQ design. Consumers naturally want exactly-once, but to guarantee delivery reliability the MQ redelivers any message that has not been acked. The consumer side must therefore guarantee idempotency itself, for example: on receiving a message, extract its unique identifier and record it in Redis or a database; when the same message arrives again, skip it. Besides retries, a large share of duplicate deliveries comes from the rebalancing phase: the previous consumer instance listening on a Queue has not yet acked all the messages it pulled, while the new instance assigned to that Queue pulls the same messages again.
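The consumer-side idempotency check described above can be sketched as follows. This is a minimal illustration, not DeFiBus code: `IdempotentConsumer` is a hypothetical name, and an in-memory `ConcurrentHashMap` stands in for the Redis or database store mentioned above.

```java
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentConsumer {
    // msgId -> first-consume timestamp; in production this would be a Redis
    // SETNX call or a database table with a unique key on the message ID
    private final ConcurrentHashMap<String, Long> consumed = new ConcurrentHashMap<>();

    /** Returns true if the message was processed, false if it was a duplicate and skipped. */
    public boolean consume(String msgId, Runnable businessLogic) {
        // putIfAbsent is atomic: only the first arrival of this msgId wins
        if (consumed.putIfAbsent(msgId, System.currentTimeMillis()) != null) {
            return false; // already seen, skip to stay idempotent
        }
        businessLogic.run();
        return true;
    }

    public static void main(String[] args) {
        IdempotentConsumer c = new IdempotentConsumer();
        System.out.println(c.consume("MSG-001", () -> {})); // true: first delivery
        System.out.println(c.consume("MSG-001", () -> {})); // false: duplicate skipped
    }
}
```

With a shared store such as Redis, the same atomic "insert if absent" pattern also protects against duplicates delivered to different instances of the application.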

To solve the problem of duplicated or missed listening during rebalancing, WeBank added a transition state to the rebalance process. While in the transition state, the Consumer keeps the previous rebalance result until all messages pulled by the original consumer are acked, and only then releases the old result.
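The transition state can be illustrated roughly as follows. This is a hypothetical sketch under the description above, not DeFiBus code; the class and method names are invented. The consumer's effective queue set is the new allocation plus any previously held queue whose pulled messages are not yet fully acked.

```java
import java.util.HashSet;
import java.util.Set;

public class TransitionRebalance {
    // Effective working set during the transition state: the new allocation,
    // plus old queues that are still draining (not all pulled messages acked)
    public static Set<String> effectiveQueues(Set<String> oldAlloc,
                                              Set<String> newAlloc,
                                              Set<String> fullyAckedQueues) {
        Set<String> result = new HashSet<>(newAlloc);
        for (String q : oldAlloc) {
            if (!fullyAckedQueues.contains(q)) {
                result.add(q); // still in transition: hold the old queue until acked
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> old = Set.of("q0", "q1");
        Set<String> fresh = Set.of("q1", "q2");
        // q0's messages are not all acked yet, so it is retained alongside q1 and q2
        System.out.println(effectiveQueues(old, fresh, Set.of("q1")));
    }
}
```

Once every queue in the old allocation is fully acked, the effective set converges to the new allocation and the old result is released.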

The implementation adds a ConsumeQueueAccessLockManager class on the RocketMQ Broker side that locks each Queue. When a new Consumer tries to pull messages, the Broker checks whether the Queue it listens on still has delivered-but-unacked messages that have not timed out; if so, the Consumer is not allowed to acquire the lock. Only after all delivered messages on that Queue are acked, or consumption times out, may the Consumer acquire the lock and pull messages.

The lock-acquisition logic in ConsumeQueueAccessLockManager is as follows:

public synchronized boolean updateAccessControlTable(String group, String topic, String clientId, int queueId) {
	if (group != null && topic != null && clientId != null) {
		ConcurrentHashMap<String/*Topic*/, ConcurrentHashMap<Integer/*queueId*/, AccessLockEntry>> topicTable = accessLockTable.get(group);
		if (topicTable == null) {
			topicTable = new ConcurrentHashMap<>();
			accessLockTable.put(group, topicTable);
			LOG.info("group not exist, put group:{}", group);
		}
		ConcurrentHashMap<Integer/*queueId*/, AccessLockEntry> queueIdTable = topicTable.get(topic);
		if (queueIdTable == null) {
			queueIdTable = new ConcurrentHashMap<>();
			topicTable.put(topic, queueIdTable);
			LOG.info("topic not exist, put topic:{} into group {}", topic, group);
		}

		AccessLockEntry accessEntry = queueIdTable.get(queueId);
		if (accessEntry == null) {
			long deliverOffset = brokerController.getConsumeQueueManager().queryDeliverOffset(group, topic, queueId);
			accessEntry = new AccessLockEntry(clientId, System.currentTimeMillis(), deliverOffset);
			queueIdTable.put(queueId, accessEntry);
			LOG.info("mq is not locked. I got it. group:{}, topic:{}, queueId:{}, newClient:{}",
				group, topic, queueId, clientId);
			return true;
		}

		//Already holds this Queue; refresh the access timestamp
		if (clientId.equals(accessEntry.getClientId())) {
			accessEntry.setLastAccessTimestamp(System.currentTimeMillis());
			accessEntry.setLastDeliverOffset(brokerController.getConsumeQueueManager().queryDeliverOffset(group, topic, queueId));
			return false;
		}
		//Does not hold this Queue and the request is not a wakeup; may contend for the lock
		else {
			long holdTimeThreshold = brokerController.getDeFiBusBrokerConfig().getLockQueueTimeout();
			long realHoldTime = System.currentTimeMillis() - accessEntry.getLastAccessTimestamp();
			boolean holdTimeout = (realHoldTime > holdTimeThreshold);

			long deliverOffset = brokerController.getConsumeQueueManager().queryDeliverOffset(group, topic, queueId);
			long lastDeliverOffset = accessEntry.getLastDeliverOffset();
			if (deliverOffset == lastDeliverOffset) {
				accessEntry.getDeliverOffsetNoChangeTimes().incrementAndGet();
			} else {
				accessEntry.setLastDeliverOffset(deliverOffset);
				accessEntry.setDeliverOffsetNoChangeTimes(0);
			}

			long ackOffset = brokerController.getConsumeQueueManager().queryOffset(group, topic, queueId);
			long diff = deliverOffset - ackOffset;
			boolean offsetEqual = (diff == 0);

			int deliverOffsetNoChangeTimes = accessEntry.getDeliverOffsetNoChangeTimes().get();
			boolean deliverNoChange = (deliverOffsetNoChangeTimes >= brokerController.getDeFiBusBrokerConfig().getMaxDeliverOffsetNoChangeTimes());

			if ((offsetEqual && deliverNoChange) || holdTimeout) {
				LOG.info("tryLock mq, update access lock table. topic:{}, queueId:{}, newClient:{}, oldClient:{}, hold time threshold:{}, real hold time:{}, hold timeout:{}, offset equal:{}, diff:{}, deliverOffset no change:{}, deliverOffset:{}, ackOffset:{}",
					topic,
					queueId,
					clientId,
					accessEntry.getClientId(),
					holdTimeThreshold,
					realHoldTime,
					holdTimeout,
					offsetEqual,
					diff,
					deliverNoChange,
					deliverOffset,
					ackOffset);

				accessEntry.setLastAccessTimestamp(System.currentTimeMillis());
				accessEntry.setLastDeliverOffset(deliverOffset);
				accessEntry.getDeliverOffsetNoChangeTimes().set(0);
				accessEntry.setClientId(clientId);
				return true;
			}
			LOG.info("tryLock mq, but mq locked by other client: {}, group: {}, topic: {}, queueId: {}, nowClient:{}, hold timeout:{}, offset equal:{}, deliverOffset no change times:{}", accessEntry.getClientId(),
				group, topic, queueId, clientId, holdTimeout, offsetEqual, deliverOffsetNoChangeTimes);
			return false;
		}

	}
	return false;
}

The pull-message logic in DeFiPullMessageProcessor is as follows:

@Override
public RemotingCommand processRequest(final ChannelHandlerContext ctx,
	RemotingCommand request) throws RemotingCommandException {
	final PullMessageRequestHeader requestHeader =
		(PullMessageRequestHeader) request.decodeCommandCustomHeader(PullMessageRequestHeader.class);
	ConsumerGroupInfo consumerGroupInfo = deFiBrokerController.getConsumerManager().getConsumerGroupInfo(requestHeader.getConsumerGroup());
	if (deFiBrokerController.getDeFiBusBrokerConfig().getMqAccessControlEnable() == 1) {
		//Access-table control applies only in clustering mode
		if (consumerGroupInfo != null && consumerGroupInfo.getMessageModel() == MessageModel.CLUSTERING) {
			ClientChannelInfo clientChannelInfo = consumerGroupInfo.getChannelInfoTable().get(ctx.channel());
			if (clientChannelInfo != null) {
				String group = consumerGroupInfo.getGroupName();
				String topic = requestHeader.getTopic();
				int queueId = requestHeader.getQueueId();
				String clientId = clientChannelInfo.getClientId();

				boolean acquired = deFiBrokerController.getMqAccessLockManager().updateAccessControlTable(group, topic, clientId, queueId);
				boolean isAllowed = deFiBrokerController.getMqAccessLockManager().isAccessAllowed(group,topic,clientId,queueId);

				//Queue is not assigned to this client; return empty
				if (!isAllowed) {
					RemotingCommand response = RemotingCommand.createResponseCommand(PullMessageResponseHeader.class);
					final PullMessageResponseHeader responseHeader = (PullMessageResponseHeader) response.readCustomHeader();

					LOG.info("pull message rejected. queue is locked by other client. group:{}, topic:{}, queueId:{}, queueOffset:{}, request clientId:{}",
						requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueId(), requestHeader.getQueueOffset(), clientId);

					responseHeader.setMinOffset(deFiBrokerController.getMessageStore().getMinOffsetInQueue(requestHeader.getTopic(), requestHeader.getQueueId()));
					responseHeader.setMaxOffset(deFiBrokerController.getMessageStore().getMaxOffsetInQueue(requestHeader.getTopic(), requestHeader.getQueueId()));
					responseHeader.setNextBeginOffset(requestHeader.getQueueOffset());
					responseHeader.setSuggestWhichBrokerId(MixAll.MASTER_ID);
					response.setCode(ResponseCode.PULL_NOT_FOUND);
					response.setRemark("mq is locked by other client.");
					return response;
				}
				//After being assigned a Queue, update the offset to the latest ackOffset to avoid duplicates
				if (acquired) {
					long nextBeginOffset = correctRequestOffset(group, topic, queueId, requestHeader.getQueueOffset());
					if (nextBeginOffset != requestHeader.getQueueOffset().longValue()) {
						RemotingCommand response = RemotingCommand.createResponseCommand(PullMessageResponseHeader.class);
						final PullMessageResponseHeader responseHeader = (PullMessageResponseHeader) response.readCustomHeader();
						response.setOpaque(request.getOpaque());
						responseHeader.setMinOffset(deFiBrokerController.getMessageStore().getMinOffsetInQueue(requestHeader.getTopic(), requestHeader.getQueueId()));
						responseHeader.setMaxOffset(deFiBrokerController.getMessageStore().getMaxOffsetInQueue(requestHeader.getTopic(), requestHeader.getQueueId()));
						responseHeader.setNextBeginOffset(nextBeginOffset);
						responseHeader.setSuggestWhichBrokerId(MixAll.MASTER_ID);
						response.setCode(ResponseCode.PULL_NOT_FOUND);
						response.setRemark("lock a queue success, update pull offset.");
						LOG.info("update pull offset from [{}] to [{}] after client acquire a queue. clientId:{}, queueId:{}, topic:{}, group:{}",
							requestHeader.getQueueOffset(), nextBeginOffset, clientId, requestHeader.getQueueId(),
							requestHeader.getTopic(), requestHeader.getConsumerGroup());
						return response;
					}
					else {
						LOG.info("no need to update pull offset. clientId:{}, queueId:{}, topic:{}, group:{}, request offset: {}",
							clientId, requestHeader.getQueueId(), requestHeader.getTopic(), requestHeader.getConsumerGroup(), requestHeader.getQueueOffset());
					}
				}
			}
		}
	}

	//...
	return response;
}

Under high message volume, these Broker-side changes effectively reduce meaningless redelivery, which matters a great deal for saving network resources. Even though the change costs a little server-side performance, the benefits far outweigh the costs on balance, and the feature is general enough to apply to other projects. That said, despite the substantial Broker-side work, messages can still be redelivered in scenarios such as retries, so the consumer side must still make its processing idempotent.

-- All source code quoted in this article comes from WeBank's open source project DeFiBus

DeFiBus/3-circuit-break-mechanism.md at develop · WeBankFinTech/DeFiBus · GitHub
https://github.com/WeBankFinTech/DeFiBus/blob/develop/docs/cn/features/3-circuit-break-mechanism.md
