Spring Cloud Stream RocketMQ error: The producer group[] has been created before, specify another name please

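Environment

Server: Tencent Cloud RocketMQ service
Client: Spring Cloud Alibaba 2021.1

The error

Someone reported errors in the production system. Checking the server turned up a large number of log entries like this: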
The producer group[] has been created before, specify another name please

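Investigation

The error message suggests the group was created twice. Most solutions found online say the same thing:
when calling new DefaultMQProducer, set an instance name and make it unique (that is, use a different name each time an instance is created).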

producer.setInstanceName(RunTimeUtil.getRocketMqUniqeInstanceName());

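However, this treats the symptom rather than the cause, and because we send messages through the framework, the producer object is created automatically.
Intending to apply a temporary fix first and look for the root cause afterwards, I started with the places where DefaultMQProducer is created: RocketMQComponent4BinderAutoConfiguration and RocketMQAutoConfiguration.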

// Creation of the DefaultMQProducer bean
@Bean
	@ConditionalOnMissingBean(DefaultMQProducer.class)
	public DefaultMQProducer defaultMQProducer() {
		DefaultMQProducer producer;
		String configNameServer = environment.resolveRequiredPlaceholders(
				"${spring.cloud.stream.rocketmq.binder.name-server:${rocketmq.producer.name-server:}}");
		String ak = environment.resolveRequiredPlaceholders(
				"${spring.cloud.stream.rocketmq.binder.access-key:${rocketmq.producer.access-key:}}");
		String sk = environment.resolveRequiredPlaceholders(
				"${spring.cloud.stream.rocketmq.binder.secret-key:${rocketmq.producer.secret-key:}}");
		if (!StringUtils.isEmpty(ak) && !StringUtils.isEmpty(sk)) {
			producer = new DefaultMQProducer(RocketMQBinderConstants.DEFAULT_GROUP,
					new AclClientRPCHook(new SessionCredentials(ak, sk)));
			producer.setVipChannelEnabled(false);
		}
		else {
			producer = new DefaultMQProducer(RocketMQBinderConstants.DEFAULT_GROUP);
		}
		if (StringUtils.isEmpty(configNameServer)) {
			configNameServer = RocketMQBinderConstants.DEFAULT_NAME_SERVER;
		}
		producer.setNamesrvAddr(configNameServer);
		return producer;
	}

// Creation of the RocketMQTemplate bean
@Bean(destroyMethod = "destroy")
	@ConditionalOnMissingBean
	public RocketMQTemplate rocketMQTemplate(DefaultMQProducer mqProducer,
			ObjectMapper objectMapper) {
		RocketMQTemplate rocketMQTemplate = new RocketMQTemplate();
		rocketMQTemplate.setProducer(mqProducer);
		rocketMQTemplate.setObjectMapper(objectMapper);
		return rocketMQTemplate;
	}

RocketMQTemplate implements the InitializingBean interface, so I looked at its afterPropertiesSet() method. That method calls producer.start();, which led me to DefaultMQProducer.start():

    /**
     * Start this producer instance.
     */
    @Override
    public void start() throws MQClientException {
        this.setProducerGroup(withNamespace(this.producerGroup));
        this.defaultMQProducerImpl.start();
        if (null != traceDispatcher) {
            try {
                traceDispatcher.start(this.getNamesrvAddr(), this.getAccessChannel());
            } catch (MQClientException e) {
                log.warn("trace dispatcher start failed ", e);
            }
        }
    }

The actual startup logic is in defaultMQProducerImpl.start();, so I kept following it:

// defaultMQProducerImpl.start();

	public void start() throws MQClientException {
        this.start(true);
    }

    public void start(final boolean startFactory) throws MQClientException {
        switch (this.serviceState) {
            case CREATE_JUST:
                this.serviceState = ServiceState.START_FAILED;
                // Validate the configuration
                this.checkConfig();
                // Set the instance name
                if (!this.defaultMQProducer.getProducerGroup().equals(MixAll.CLIENT_INNER_PRODUCER_GROUP)) {
                    this.defaultMQProducer.changeInstanceNameToPID();
                }
                // Get the client factory
                this.mQClientFactory = MQClientManager.getInstance().getOrCreateMQClientInstance(this.defaultMQProducer, rpcHook);
                // Register the producer with the client factory
                boolean registerOK = mQClientFactory.registerProducer(this.defaultMQProducer.getProducerGroup(), this);
                if (!registerOK) {
                    this.serviceState = ServiceState.CREATE_JUST;
                    throw new MQClientException("The producer group[" + this.defaultMQProducer.getProducerGroup()
                        + "] has been created before, specify another name please." + FAQUrl.suggestTodo(FAQUrl.GROUP_NAME_DUPLICATE_URL),
                        null);
                }

                this.topicPublishInfoTable.put(this.defaultMQProducer.getCreateTopicKey(), new TopicPublishInfo());

                if (startFactory) {
                    mQClientFactory.start();
                }

                log.info("the producer [{}] start OK. sendMessageWithVIPChannel={}", this.defaultMQProducer.getProducerGroup(),
                    this.defaultMQProducer.isSendMessageWithVIPChannel());
                this.serviceState = ServiceState.RUNNING;
                break;
            case RUNNING:
            case START_FAILED:
            case SHUTDOWN_ALREADY:
                throw new MQClientException("The producer service state not OK, maybe started once, "
                    + this.serviceState
                    + FAQUrl.suggestTodo(FAQUrl.CLIENT_SERVICE_NOT_OK),
                    null);
            default:
                break;
        }
        ...
    }

At this point it turns out that the instance name can hardly be changed here: within an application, the instance name defaults to the process PID.
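
To make the registerOK check above concrete, here is a minimal sketch of a registry that accepts only one producer per group name. It assumes the client factory keeps its producers in a map keyed by group; the class ProducerRegistrySketch and its methods are made up for illustration and are not RocketMQ's actual code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified, hypothetical model of a per-client producer registry (not RocketMQ source).
class ProducerRegistrySketch {
    // Assumption: one producer object per group name inside a single client instance.
    private final ConcurrentMap<String, Object> producerTable = new ConcurrentHashMap<>();

    // Returns false when the group name is already taken in this client instance.
    boolean registerProducer(String group, Object producer) {
        if (group == null || producer == null) {
            return false;
        }
        return producerTable.putIfAbsent(group, producer) == null;
    }

    // Frees the group name so it can be registered again.
    void unregisterProducer(String group) {
        producerTable.remove(group);
    }

    public static void main(String[] args) {
        ProducerRegistrySketch registry = new ProducerRegistrySketch();
        System.out.println(registry.registerProducer("demo-group", new Object())); // true
        System.out.println(registry.registerProducer("demo-group", new Object())); // false
    }
}
```

In this model the second registration fails simply because the first producer was never unregistered, which is the condition under which start() throws the error shown above.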

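Finding the problem

Following the flow above, this error should not occur at runtime. Since it was message sending that failed, I went back to the source. Messages are sent with StreamBridge.send(), so I looked at how it is implemented. The method has several overloads; the one that is ultimately called looks like this: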

	public boolean send(String bindingName, @Nullable String binderName, Object data, MimeType outputContentType) {
		if (!(data instanceof Message)) {
			data = MessageBuilder.withPayload(data).build();
		}
		// Get the producer properties for this binding
		ProducerProperties producerProperties = this.bindingServiceProperties.getProducerProperties(bindingName);
		// Resolve the channel bound to the producer
		SubscribableChannel messageChannel = this.resolveDestination(bindingName, producerProperties, binderName);
		...
		// Send the message
		return messageChannel.send(resultMessage);
	}

	synchronized SubscribableChannel resolveDestination(String destinationName, ProducerProperties producerProperties, String binderName) {
	    // Look in the cache first
		SubscribableChannel messageChannel = this.channelCache.get(destinationName);
		// Not cached: look it up in the Spring context
		if (messageChannel == null && this.applicationContext.containsBean(destinationName)) {
			messageChannel = this.applicationContext.getBean(destinationName, SubscribableChannel.class);
			this.addInterceptors((AbstractMessageChannel) messageChannel);
		}
		// Still not found: create a new channel
		if (messageChannel == null) {
			messageChannel = new DirectWithAttributesChannel();
			if (this.destinationBindingCallback != null) {
				Object extendedProducerProperties = this.bindingService
						.getExtendedProducerProperties(messageChannel, destinationName);
				this.destinationBindingCallback.configure(destinationName, messageChannel,
						producerProperties, extendedProducerProperties);
			}

			Binder binder = null;
			if (StringUtils.hasText(binderName)) {
				BinderFactory binderFactory = this.applicationContext.getBean(BinderFactory.class);
				binder = binderFactory.getBinder(binderName, messageChannel.getClass());
			}

			// Bind the channel to the actual producer
			this.bindingService.bindProducer(messageChannel, destinationName, false, binder);
			// Add it to the cache
			this.channelCache.put(destinationName, messageChannel);
			this.addInterceptors((AbstractMessageChannel) messageChannel);
		}

		return messageChannel;
	}

Next, see how bindingService.bindProducer() binds the channel to the actual producer:

	public <T> Binding<T> bindProducer(T output, String outputName, boolean cache, @Nullable Binder<T, ?, ProducerProperties> binder) {
		...
		// Perform the binding
		Binding<T> binding = doBindProducer(output, bindingTarget, binder,
				producerProperties);
		...
		return binding;
	}

	public <T> Binding<T> doBindProducer(T output, String bindingTarget,
			Binder<T, ?, ProducerProperties> binder,
			ProducerProperties producerProperties) {
		// With no task scheduler, or no retry interval configured, bind directly (any exception propagates)
		if (this.taskScheduler == null
				|| this.bindingServiceProperties.getBindingRetryInterval() <= 0) {
			return binder.bindProducer(bindingTarget, output, producerProperties);
		}
		// Otherwise try to bind; on an exception, return a late binding and let the scheduler retry automatically
		else {
			try {
				return binder.bindProducer(bindingTarget, output, producerProperties);
			}
			catch (RuntimeException e) {
				LateBinding<T> late = new LateBinding<T>(bindingTarget,
						e.getCause() == null ? e.toString() : e.getCause().getMessage(), producerProperties, false);
				rescheduleProducerBinding(output, bindingTarget, binder,
						producerProperties, late, e);
				return late;
			}
		}
	}

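In the code above, binder.bindProducer() eventually calls AbstractMessageChannelBinder.createProducerMessageHandler(), which RocketMQMessageChannelBinder overrides. That method creates the producer again, that is, a new RocketMQTemplate and a new DefaultMQProducer. When the DefaultMQProducer is created there, the topic name is used as the instance name [producer.setInstanceName(RocketMQUtil.getInstanceName(rpcHook, destination.getName() + "|" + UtilAll.getPid()));], so the same topic always gets the same instance name.
So the error can only occur when bindingService.bindProducer() is called more than once for the same topic, which creates several identical instances under one instance name. A repeated call happens only when the destination is found neither in channelCache nor in the Spring context.
Because we pass the topic name to StreamBridge.send(), the Spring context certainly has no bean of that name, so I looked at how channelCache is implemented: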

StreamBridge(FunctionCatalog functionCatalog, FunctionRegistry functionRegistry,
			BindingServiceProperties bindingServiceProperties, ConfigurableApplicationContext applicationContext,
			@Nullable NewDestinationBindingCallback destinationBindingCallback) {
		this.bindingService = applicationContext.getBean(BindingService.class);
		this.functionCatalog = functionCatalog;
		this.functionRegistry = functionRegistry;
		this.applicationContext = applicationContext;
		this.bindingServiceProperties = bindingServiceProperties;
		this.destinationBindingCallback = destinationBindingCallback;
		// A LinkedHashMap is used here, with removeEldestEntry() overridden.
		this.channelCache = new LinkedHashMap<String, SubscribableChannel>() {
			@Override
			protected boolean removeEldestEntry(Map.Entry<String, SubscribableChannel> eldest) {
				// bindingServiceProperties.getDynamicDestinationCacheSize() defaults to 10
				boolean remove = size() > bindingServiceProperties.getDynamicDestinationCacheSize();
				if (remove && logger.isDebugEnabled()) {
					logger.debug("Removing message channel from cache " + eldest.getKey());
				}
				return remove;
			}
		};
	}

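channelCache turns out to be a LinkedHashMap with removeEldestEntry() overridden (see the LinkedHashMap documentation if you are interested). When the number of entries in channelCache exceeds bindingServiceProperties.getDynamicDestinationCacheSize() (10 by default), the oldest entry is removed. That pointed to the likely cause, so I logged in to the server and checked the RocketMQ client logs, which are under {user.home}/logs/rocketmqlogs by default, searching for clients whose instance name matches the pattern RocketMQUtil.getInstanceName(rpcHook, destination.getName() + "|" + UtilAll.getPid()). Sure enough, there were more than 10 of them, which settled the question.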
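
The eviction behaviour can be reproduced with the JDK alone. The sketch below builds a map the same way the constructor above does (default LinkedHashMap constructor plus a removeEldestEntry() override); BoundedCacheSketch, the topic names and the channel strings are placeholders of mine, not part of Spring Cloud Stream.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone illustration of a size-bounded LinkedHashMap; names are hypothetical.
class BoundedCacheSketch {
    // Builds an insertion-ordered map that drops its oldest entry
    // as soon as it holds more than maxSize entries.
    static <V> Map<String, V> newCache(final int maxSize) {
        return new LinkedHashMap<String, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> cache = newCache(10);
        for (int i = 1; i <= 10; i++) {
            cache.put("topic-" + i, "channel-" + i);
        }
        cache.get("topic-1");                  // a read does not move the entry in insertion order
        cache.put("topic-11", "channel-11");   // the 11th entry pushes out the oldest one

        System.out.println(cache.size());                  // 10
        System.out.println(cache.containsKey("topic-1"));  // false
        System.out.println(cache.containsKey("topic-2"));  // true
    }
}
```

Because the default constructor keeps insertion order rather than access order, even a topic that is used constantly is evicted once enough newer topics have been added after it.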

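Root cause

A RocketMQ client instance is created only when a message is first sent to a topic with StreamBridge.send(). Once it is created, the topic and its SubscribableChannel are stored in channelCache. When the application sends to more than 10 topics, the oldest topic entry is evicted from channelCache. The next send to that topic recreates the SubscribableChannel, which in turn recreates the RocketMQ client instance, so mQClientFactory ends up with a duplicate producer group and the error is thrown.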
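
The whole chain can be simulated in a few lines. This is only a rough model under stated assumptions: the producer registry is keyed by topic, and an evicted channel never unregisters its producer. EvictionRebindSketch and everything inside it are hypothetical names, not framework code.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the send path described above.
class EvictionRebindSketch {
    // Assumption: producers stay registered even after their channel leaves the cache.
    private final Set<String> registeredProducers = new HashSet<>();
    private final Map<String, String> channelCache;

    EvictionRebindSketch(final int cacheSize) {
        this.channelCache = new LinkedHashMap<String, String>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > cacheSize;
            }
        };
    }

    // Sends to a topic; on a cache miss it binds again, which registers a new producer.
    void send(String topic) {
        if (!channelCache.containsKey(topic)) {
            if (!registeredProducers.add(topic)) {
                throw new IllegalStateException(
                        "producer for " + topic + " has been created before");
            }
            channelCache.put(topic, "channel:" + topic);
        }
        // The actual message delivery is left out of this sketch.
    }

    public static void main(String[] args) {
        EvictionRebindSketch bridge = new EvictionRebindSketch(10);
        for (int i = 1; i <= 11; i++) {
            bridge.send("topic-" + i);   // the 11th topic evicts topic-1 from the cache
        }
        try {
            bridge.send("topic-1");      // cache miss, so the binding is attempted again
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a cache size of 20 the same sequence of sends completes without an exception, which matches the effect of raising the cache size setting.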

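Solutions

  1. Increase the capacity of channelCache by changing the setting behind bindingServiceProperties.getDynamicDestinationCacheSize():
# Set according to actual needs
spring.cloud.stream.dynamicDestinationCacheSize=20
  2. When sending with StreamBridge.send(), pass a Spring bean name as the first argument instead of the topic name. This requires configuring the producer's SubscribableChannel in the configuration file.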

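Summary

Out of laziness we skipped configuring a producer SubscribableChannel in each project, which left a hidden bug. Fortunately the problem was tracked down and fixed, but the data issues caused by the messages lost in the meantime still have to be repaired.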
