org.eclipse.paho.client.mqttv3 synchronous client blocks forever when publishing and subscribing run on the same thread
MQTT client version used:
<dependency>
    <groupId>org.eclipse.paho</groupId>
    <artifactId>org.eclipse.paho.client.mqttv3</artifactId>
    <version>1.2.5</version>
</dependency>
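Reproducing the problem
While reworking the MQTT client with Flux, we had the following business behavior: subscribe to topic A, and when a message arrives on topic A, publish a message to topic B based on the content of that message.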
getClient().subscribeTopic(MqttTopic.newTopic(topic, Qos._0, Direction.Sub))
        .doOnNext(mqttMessage -> System.out.println("thread in sub: " + Thread.currentThread().getName()))
        // .publishOn(publishSchedule)
        .map(requestToResponse)
        // publish(...) is the blocking call; here it runs on the thread that delivered the message
        .map(responseMessage -> Tuples.of(responseMessage, singletonClient.publish(responseMessage)))
        .subscribe(tuple -> {
            System.out.println("--------------------------------------------------");
            System.out.println("Reply sent [" + (tuple.getT2() ? "success" : "failure") + "]");
            System.out.println("Message body: " + tuple.getT1().getMessage().toString());
            System.out.println("--------------------------------------------------");
        });
At this point the client's publish call blocks forever.
Troubleshooting
jps
Lists the pid of every running Java program.
jstack -l 1736
Prints the thread dump of the given pid.
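When attaching jstack is inconvenient, a similar snapshot can be taken from inside the JVM. The following is only a sketch using the standard `Thread.getAllStackTraces()` call; the class `ThreadStateProbe` and its methods are hypothetical names introduced for illustration, and the output is less detailed than jstack (it does not show monitors, for example).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// In-process counterpart to reading jstack output; class and method names are hypothetical.
final class ThreadStateProbe {

    // Returns one "name [STATE] top frame" line per live thread whose name starts with namePrefix.
    static List<String> describeThreads(String namePrefix) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix)) {
                continue;
            }
            StackTraceElement[] stack = entry.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            lines.add(thread.getName() + " [" + thread.getState() + "] " + top);
        }
        return lines;
    }

    // Starts a daemon thread that parks in Object.wait(), like the threads in a dump.
    static Thread startWaitingThread(String name) {
        Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, name);
        waiter.setDaemon(true);
        waiter.start();
        // Poll until the thread has actually reached the wait.
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.yield();
        }
        return waiter;
    }

    public static void main(String[] args) {
        startWaitingThread("MQTT Call: demo");
        describeThreads("MQTT Call").forEach(System.out::println);
    }
}
```

In a real application one would call `describeThreads("MQTT Call")` once before and once after the hang and compare the top frames, much as with two jstack runs.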
The jstack output is as follows. Before the problem occurs, the thread state is:
"MQTT Call: sceneServer-test-client-78afe" #16 prio=5 os_prio=0 tid=0x000000002989c000 nid=0x5b20 in Object.wait() [0x000000002b30f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000071a1bd478> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:502)
        at org.eclipse.paho.client.mqttv3.internal.CommsCallback.run(CommsCallback.java:181)
        - locked <0x000000071a1bd478> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
        - None
After the problem occurs, the thread state is:
"MQTT Call: sceneServer-test-client-78afe" #16 prio=5 os_prio=0 tid=0x000000002989c000 nid=0x5b20 in Object.wait() [0x000000002b30d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000071d139730> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:502)
        at org.eclipse.paho.client.mqttv3.internal.Token.waitForResponse(Token.java:143)
        - locked <0x000000071d139730> (a java.lang.Object)
        at org.eclipse.paho.client.mqttv3.internal.Token.waitForCompletion(Token.java:108)
        at org.eclipse.paho.client.mqttv3.MqttToken.waitForCompletion(MqttToken.java:67)
        at org.eclipse.paho.client.mqttv3.MqttClient.publish(MqttClient.java:570)
        at org.eclipse.paho.client.mqttv3.MqttClient.publish(MqttClient.java:562)
        at com.whb.util.mqtt.MqttClient.publish(MqttClient.java:141)
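This shows that the thread is blocked in org.eclipse.paho.client.mqttv3.internal.Token.waitForResponse(Token.java:143). Stepping through that method in a debugger shows the following:
Inside the method, responseLock.wait(); puts the calling thread into a wait, and the thread is woken up again through protected void notifyComplete() in the same class.
Breakpoint debugging shows that the thread calling protected void notifyComplete() is the same thread that delivers subscribed messages, the "MQTT Call" thread seen in the dumps above. So if the client's subscribing thread is A and the publishing thread is also A, the thread waits for a notification that only it could send, and it blocks forever.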
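The mechanism described above can be modeled without Paho at all. The following is a minimal sketch, not the library's code: `callbackThread` stands in for the "MQTT Call" thread, a `CountDownLatch` stands in for `responseLock`, and `publishThread` is an assumed second thread playing the role that the commented-out `.publishOn(publishSchedule)` line would play. All class and method names are invented for illustration, and a timeout replaces the indefinite wait so that the demo terminates.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Stand-alone model of the hang; no Paho classes are used and all names are hypothetical.
final class CallbackThreadBlockDemo {

    // Models the single "MQTT Call" thread: it delivers incoming messages
    // and is also the only thread that signals publish completion.
    private final ExecutorService callbackThread = Executors.newSingleThreadExecutor();

    // Assumed separate thread for publishing.
    private final ExecutorService publishThread = Executors.newSingleThreadExecutor();

    // Models a blocking publish: queue the completion signal on the callback
    // thread, then wait for it. The timeout exists only so the demo terminates.
    private boolean blockingPublish(long timeoutMillis) throws InterruptedException {
        CountDownLatch completed = new CountDownLatch(1);
        callbackThread.execute(completed::countDown);
        return completed.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Publishes from inside the callback thread: the completion signal is
    // queued behind the very task that is waiting for it.
    boolean publishOnCallbackThread(long timeoutMillis) {
        Callable<Boolean> publish = () -> blockingPublish(timeoutMillis);
        return join(callbackThread.submit(publish));
    }

    // The callback only hands the publish off, so the callback thread
    // stays free to deliver the completion signal.
    boolean publishOnSeparateThread(long timeoutMillis) {
        Callable<Boolean> publish = () -> blockingPublish(timeoutMillis);
        Callable<Future<Boolean>> handOff = () -> publishThread.submit(publish);
        return join(join(callbackThread.submit(handOff)));
    }

    void shutdown() {
        callbackThread.shutdownNow();
        publishThread.shutdownNow();
    }

    // Waits for a future and wraps checked exceptions to keep the demo short.
    private static <T> T join(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CallbackThreadBlockDemo demo = new CallbackThreadBlockDemo();
        try {
            System.out.println("same thread completed: " + demo.publishOnCallbackThread(300));
            System.out.println("separate thread completed: " + demo.publishOnSeparateThread(300));
        } finally {
            demo.shutdown();
        }
    }
}
```

In this model the same-thread publish times out while the handed-off publish completes. Under these assumptions, moving the blocking publish off the callback thread (for example by re-enabling publishOn with a dedicated scheduler) should avoid the hang, although that has not been verified here against Paho itself.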