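Project address
Reliable MQ project: reliable-mq
It supports transactional message sending, idempotent consumption, ordered consumption, and reliable consumption.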
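Problem
With automatic ack, the message is acked automatically if the handler method does not throw; if the method throws an exception, the message is put back on the queue.
If the handler does not catch exceptions and the code has a bug, the message is consumed over and over without ever succeeding, exception logs are printed nonstop, and the log files may even fill up the disk.
To avoid this, people often wrap the outermost layer of the MQ-consuming business code in try/catch, which means consumption always "succeeds" (in effect this is like setting defaultRequeueRejected to false: a failed message is never requeued).
With this approach, an exception only produces a log line, so MQ messages are easily lost, which causes business problems.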
spring-rabbit's support for retrying failed consumption
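spring:
  # RabbitMQ's own settings
  rabbitmq:
    virtual-host: BP_VHOST # virtual host
    host: rmq01.nh.fdd # server address
    username: admin # username
    password: admin123 # password
    publisher-confirms: true # enable publisher confirms
    template:
      mandatory: true # unroutable messages are returned to the publisher instead of being dropped silently
    listener:
      type: simple # simple is the default; only simple is used here
      # settings that apply when type is simple
      simple:
        retry:
          enabled: true # enable retry when consumption fails
          max-attempts: 3 # maximum number of attempts (the first delivery counts)
          initialInterval: 1000 # initial delay before a retry, in milliseconds
          multiplier: 2 # factor applied to the delay after each failure
          maxInterval: 10000 # maximum delay, in milliseconds; retries run locally in the consumer, so the total time must not be too long, otherwise the MQ server may consider the message unacknowledged and redeliver it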
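Look at spring.rabbitmq.listener.simple.retry. It uses exponential backoff: the first interval is initialInterval, each further failure multiplies the interval by multiplier, and the interval never exceeds maxInterval. Retrying stops after max-attempts attempts; if the message still fails at that point, the MessageRecoverer is invoked to recover the message.
The retry configuration is done in org.springframework.boot.autoconfigure.amqp.AbstractRabbitListenerContainerFactoryConfigurer.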
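To make the backoff arithmetic concrete, here is a minimal sketch in plain Java that computes the waits implied by the settings above. The class and method names (BackoffSketch, delays) are made up for illustration and are not part of spring-retry; the sketch only assumes that max-attempts counts the first delivery, so 3 attempts means 2 waits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a spring-retry class: it only reproduces the arithmetic.
class BackoffSketch {

    /** Returns the wait in milliseconds before each retry; maxAttempts includes the first delivery. */
    static List<Long> delays(long initialInterval, double multiplier, long maxInterval, int maxAttempts) {
        List<Long> result = new ArrayList<>();
        double interval = initialInterval;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            // Each wait is capped at maxInterval.
            result.add((long) Math.min(interval, maxInterval));
            interval *= multiplier;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(delays(1000, 2.0, 10000, 3)); // [1000, 2000]
        System.out.println(delays(1000, 2.0, 10000, 6)); // [1000, 2000, 4000, 8000, 10000]
    }
}
```

With the configuration shown earlier, the listener thread is blocked for roughly 3 seconds in total before the recoverer is called, which is why max-attempts and maxInterval should stay small.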
public abstract class AbstractRabbitListenerContainerFactoryConfigurer<T extends AbstractRabbitListenerContainerFactory<?>> {
    ......
    protected void configure(T factory, ConnectionFactory connectionFactory,
            RabbitProperties.AmqpContainer configuration) {
        Assert.notNull(factory, "Factory must not be null");
        Assert.notNull(connectionFactory, "ConnectionFactory must not be null");
        Assert.notNull(configuration, "Configuration must not be null");
        factory.setConnectionFactory(connectionFactory);
        if (this.messageConverter != null) {
            factory.setMessageConverter(this.messageConverter);
        }
        factory.setAutoStartup(configuration.isAutoStartup());
        if (configuration.getAcknowledgeMode() != null) {
            factory.setAcknowledgeMode(configuration.getAcknowledgeMode());
        }
        if (configuration.getPrefetch() != null) {
            factory.setPrefetchCount(configuration.getPrefetch());
        }
        if (configuration.getDefaultRequeueRejected() != null) {
            factory.setDefaultRequeueRejected(configuration.getDefaultRequeueRejected());
        }
        if (configuration.getIdleEventInterval() != null) {
            factory.setIdleEventInterval(configuration.getIdleEventInterval().toMillis());
        }
        factory.setMissingQueuesFatal(configuration.isMissingQueuesFatal());
        ListenerRetry retryConfig = configuration.getRetry();
        if (retryConfig.isEnabled()) {
            RetryInterceptorBuilder<?, ?> builder = (retryConfig.isStateless())
                    ? RetryInterceptorBuilder.stateless()
                    : RetryInterceptorBuilder.stateful();
            RetryTemplate retryTemplate = new RetryTemplateFactory(
                    this.retryTemplateCustomizers).createRetryTemplate(retryConfig,
                            RabbitRetryTemplateCustomizer.Target.LISTENER);
            builder.retryOperations(retryTemplate);
            // This is the key part: if no messageRecoverer has been specified, it falls back to
            // RejectAndDontRequeueRecoverer, i.e. the message is discarded once the maximum number of retries is reached
            MessageRecoverer recoverer = (this.messageRecoverer != null)
                    ? this.messageRecoverer : new RejectAndDontRequeueRecoverer();
            builder.recoverer(recoverer);
            factory.setAdviceChain(builder.build());
        }
    }
}
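When messageRecoverer is null, RejectAndDontRequeueRecoverer is used: once the maximum number of retries is reached, the message is rejected without requeueing, which means it is discarded unless the queue has a dead-letter exchange. So we need to specify our own message recoverer, which can do two things: save the message that failed consumption, and raise an alarm about the failure.
The messageRecoverer is set in org.springframework.boot.autoconfigure.amqp.RabbitAnnotationDrivenConfiguration.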
@Configuration
@ConditionalOnClass(EnableRabbit.class)
class RabbitAnnotationDrivenConfiguration {
    private final ObjectProvider<MessageConverter> messageConverter;
    private final ObjectProvider<MessageRecoverer> messageRecoverer;
    private final ObjectProvider<RabbitRetryTemplateCustomizer> retryTemplateCustomizers;
    private final RabbitProperties properties;
    // ObjectProvider is an interesting approach: it suits starters, because it leaves room for extension
    RabbitAnnotationDrivenConfiguration(ObjectProvider<MessageConverter> messageConverter,
            ObjectProvider<MessageRecoverer> messageRecoverer,
            ObjectProvider<RabbitRetryTemplateCustomizer> retryTemplateCustomizers,
            RabbitProperties properties) {
        this.messageConverter = messageConverter;
        this.messageRecoverer = messageRecoverer;
        this.retryTemplateCustomizers = retryTemplateCustomizers;
        this.properties = properties;
    }
    @Bean
    @ConditionalOnMissingBean
    public SimpleRabbitListenerContainerFactoryConfigurer simpleRabbitListenerContainerFactoryConfigurer() {
        SimpleRabbitListenerContainerFactoryConfigurer configurer = new SimpleRabbitListenerContainerFactoryConfigurer();
        configurer.setMessageConverter(this.messageConverter.getIfUnique());
        // ObjectProvider.getIfUnique returns the bean when exactly one MessageRecoverer candidate
        // (or a single primary one) exists in the Spring context; otherwise it returns null
        configurer.setMessageRecoverer(this.messageRecoverer.getIfUnique());
        configurer.setRetryTemplateCustomizers(this.retryTemplateCustomizers
                .orderedStream().collect(Collectors.toList()));
        configurer.setRabbitProperties(this.properties);
        return configurer;
    }
    ......
}
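As the RabbitAnnotationDrivenConfiguration code shows, the messageRecoverer object is obtained through ObjectProvider.getIfUnique, so to use our own MessageRecoverer we only need to define it with the @Bean annotation.
Implementation
Saving MQ messages that failed consumption
The table structure is: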
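As a configuration fragment, registering a custom recoverer could look roughly like the sketch below. The class name CustomRecovererConfiguration and the logging body are placeholders chosen for illustration; only MessageRecoverer, Message and MessageProperties come from spring-amqp. It needs a Spring Boot application with spring-rabbit on the classpath, so it is not runnable on its own.

```java
import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.retry.MessageRecoverer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class: the name and the recovery logic are illustrative only.
@Configuration
public class CustomRecovererConfiguration {

    // As the only MessageRecoverer bean in the context, this is what getIfUnique() returns,
    // so it replaces the default RejectAndDontRequeueRecoverer.
    @Bean
    public MessageRecoverer messageRecoverer() {
        return (Message message, Throwable cause) -> {
            // Placeholder: a real recoverer would persist the message and raise an alarm.
            System.err.println("Retries exhausted for message "
                    + message.getMessageProperties().getMessageId()
                    + " from queue " + message.getMessageProperties().getConsumerQueue()
                    + ": " + cause);
        };
    }
}
```

If the recoverer returns normally, the message counts as handled and is acked, so the recoverer itself is responsible for making sure the message is stored somewhere first.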
CREATE TABLE `consume_fail_record` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
    `queue` VARCHAR(100) NOT NULL COMMENT 'queue name',
    `application` VARCHAR(100) NOT NULL COMMENT 'application name',
    `message_id` VARCHAR(50) NOT NULL COMMENT 'message id',
    `headers` TEXT DEFAULT NULL COMMENT 'message headers',
    `body` TEXT NOT NULL COMMENT 'message body',
    `error_stack` TEXT NOT NULL COMMENT 'error stack trace of the failure',
    `create_time` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
    `update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'last modification time',
    PRIMARY KEY (`id`),
    UNIQUE KEY `idx_message_app` (`message_id`, `application`)
) ENGINE = InnoDB DEFAULT CHARACTER SET = utf8mb4 COMMENT = 'message consumption failure record';
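The exception stack trace is saved to make it easier to track down the bug in the code.
The queue is saved so that, once the bug is fixed, the MQ message can be resent from the admin console using headers and body.
Custom MessageRecoverer
reliable-mq defines
com.zidongxiangxi.reliablemq.consumer.rely.RabbitDatabaseMessageRecover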
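Filling the error_stack column means turning a Throwable into text. The source does not show how reliable-mq does this, so the following is only a minimal sketch in plain Java; ErrorStackSketch, stackTraceOf and the maxChars limit are hypothetical names introduced here. The limit is there because a MySQL TEXT column holds at most 65,535 bytes, so very deep stack traces may need trimming before the insert.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper, not taken from reliable-mq.
class ErrorStackSketch {

    /** Renders the full stack trace (including causes) as text, trimmed to at most maxChars characters. */
    static String stackTraceOf(Throwable cause, int maxChars) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter writer = new PrintWriter(buffer)) {
            cause.printStackTrace(writer);
        }
        String stack = buffer.toString();
        // Note: this counts characters, not bytes; pick a limit with headroom for multi-byte text.
        return stack.length() <= maxChars ? stack : stack.substring(0, maxChars);
    }

    public static void main(String[] args) {
        Throwable failure = new IllegalStateException("order not found", new RuntimeException("db timeout"));
        System.out.println(stackTraceOf(failure, 16000));
    }
}
```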
@Slf4j
public class RabbitDatabaseMessageRecover implements MessageRecoverer {
    private ConsumeFailRecordManager consumeFailRecordManager;
    private Alarm alarm;
    public RabbitDatabaseMessageRecover(ConsumeFailRecordManager consumeFailRecordManager,
            ObjectProvider<Alarm> alarmProvider) {
        this.consumeFailRecordManager = consumeFailRecordManager;
        if (Objects.nonNull(alarmProvider)) {
            try {
                // The alarm is optional: it stays null when no unique Alarm bean is available
                this.alarm = alarmProvider.getIfUnique();
            } catch (Throwable ignore) {}
        }
    }
    @Override
    public void recover(Message message, Throwable cause) {
        ConsumeFailRecord record = RabbitUtils.generateConsumeFailRecord(message, cause);
        if (Objects.isNull(record)) {
            return;
        }
        // If an alarm object is present, trigger the alarm
        if (Objects.nonNull(alarm)) {
            alarm.failWhenConsume(record.getBody(), cause);
        }
        // Save the consumption failure record
        boolean success = consumeFailRecordManager.saveRecord(record);
        if (!success) {
            log.warn("[RabbitDatabaseMessageRecover] fail to save consume fail record:{}", JSON.toJSONString(record));
        }
    }
}
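After consumption fails, RabbitUtils.generateConsumeFailRecord builds a ConsumeFailRecord object, which is saved to the database so that the message can be recovered later.
An alarmProvider parameter is reserved: if a consumption failure should trigger an alarm, the user only needs to implement the com.zidongxiangxi.reliabelmq.api.alarm.Alarm interface, write their own alarm code, and register it as a bean in Spring.
Filling the gaps
At this point the reliable consumption scheme is complete, but it only takes effect for MessageListenerContainer objects created by the starter. For MessageListenerContainer objects that you define yourself, you need to define a RetryOperationsInterceptor and set it on the adviceChain property of each MessageListenerContainer.
reliable-mq defines the RetryOperationsInterceptor object in com.zidongxiangxi.reliablemq.starter.consumer.rabbit.ReliableMqRabbitConsumerConfiguration.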
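The shape of such an alarm implementation might look like the sketch below. The Alarm interface declared here is a local stand-in so the example compiles on its own: its single method is inferred from the call alarm.failWhenConsume(record.getBody(), cause) in the recoverer, and the String type of body is an assumption. LoggingAlarm is a made-up implementation that only logs; a real one would call a chat webhook, SMS gateway, or similar.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Stand-in for reliable-mq's Alarm interface; the signature is inferred, not copied from the project.
interface Alarm {
    void failWhenConsume(String body, Throwable cause);
}

// Hypothetical implementation that writes the alarm to the log and counts how many were raised.
class LoggingAlarm implements Alarm {
    private static final Logger LOG = Logger.getLogger(LoggingAlarm.class.getName());
    private final AtomicInteger alarmCount = new AtomicInteger();

    @Override
    public void failWhenConsume(String body, Throwable cause) {
        alarmCount.incrementAndGet();
        LOG.log(Level.SEVERE, "MQ consumption failed after all retries, body=" + body, cause);
    }

    int alarmCount() {
        return alarmCount.get();
    }
}
```

In the real project the class would implement the library's interface and be registered with @Bean or @Component; because the recoverer looks it up with getIfUnique, exactly one Alarm bean should be defined.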
@Configuration
@ConditionalOnClass(SimpleMessageListenerContainer.class)
@ConditionalOnProperty(prefix = "reliable-mq.consumer.rabbit", name = "enabled", havingValue = "true")
public class ReliableMqRabbitConsumerConfiguration {
    private ReliableMqConsumerRabbit rabbitProperties;
    public ReliableMqRabbitConsumerConfiguration(ReliableMqConsumerRabbit rabbitProperties) {
        this.rabbitProperties = rabbitProperties;
    }
    ......
    /**
     * Defines the interceptor that retries failed consumption
     *
     * @param messageRecovererProvider provider of the message recoverer
     * @return the retry interceptor
     */
    @Bean(BeanNameConstants.INTERNAL_RABBIT_RETRY_OPERATIONS_INTERCEPTOR)
    @ConditionalOnMissingBean(RetryOperationsInterceptor.class)
    public RetryOperationsInterceptor retryOperationsInterceptor(
            ObjectProvider<MessageRecoverer> messageRecovererProvider
    ) {
        ReliableMqConsumerRely properties = rabbitProperties.getRely();
        RetryInterceptorBuilder<RetryInterceptorBuilder.StatelessRetryInterceptorBuilder,
                RetryOperationsInterceptor> builder = RetryInterceptorBuilder.stateless();
        PropertyMapper map = PropertyMapper.get();
        RetryTemplate template = new RetryTemplate();
        SimpleRetryPolicy policy = new SimpleRetryPolicy();
        map.from(properties::getMaxAttempts).to(policy::setMaxAttempts);
        template.setRetryPolicy(policy);
        ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
        map.from(properties::getInitialInterval).whenNonNull().as(Duration::toMillis)
                .to(backOffPolicy::setInitialInterval);
        map.from(properties::getMultiplier).to(backOffPolicy::setMultiplier);
        map.from(properties::getMaxInterval).whenNonNull().as(Duration::toMillis)
                .to(backOffPolicy::setMaxInterval);
        template.setBackOffPolicy(backOffPolicy);
        builder.retryOperations(template);
        // Same fallback as Spring Boot: use the unique MessageRecoverer bean if there is one,
        // otherwise reject without requeueing
        MessageRecoverer recoverer = null;
        try {
            recoverer = messageRecovererProvider.getIfUnique();
        } catch (Throwable e) {}
        recoverer = Objects.nonNull(recoverer) ? recoverer : new RejectAndDontRequeueRecoverer();
        builder.recoverer(recoverer);
        return builder.build();
    }
    ......
}
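This is essentially copied from Spring's auto-configuration code. The only difference is that the RetryOperationsInterceptor object Spring creates is not exposed as a bean, whereas here it is defined as one.
Next, the BeanPostProcessor mechanism can be used to set this bean into the adviceChain. This is again done with RabbitListenerContainerBeanPostProcessor and SimpleRabbitListenerContainerFactoryBeanPostProcessor (see the article "RabbitMQ idempotent consumption scheme", 《rabbitMq幂等消费方案》).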