硬件: ESP32C3
AT 版本号:
AT version:2.4.2.0 和 3.2.0.0
一、问题描述:分别测试 AT 2.4.2.0 和 3.2.0.0 版本的 AT+MQTTCONNCFG 指令, 通过 AT+MQTTCONNCFG 指令设置 keep alive 时间为 60s 时, 不同版本实际测试下来 PING request 时间间隔表现行为不一样, 主要表现如下:
1. AT 2.4.2.0 版本,在没有交互包的时候,PING req 时间间隔是 30s, 如果有交互包, 会根据最后一次交互包的时间刷新, 以最后一次交互时间作为起始时间, 间隔 30s 之后发送 ping req 。
2. 3.2.0.0 版本, 没有交互包的时候,PING req 时间间隔是 30s, 如果有交互包,发现并不会根据最后一次交互包的时间刷新, 而是以上一次 ping req 时间作为起始时间, 中间无论交互多少次,时间都不会被刷新, 这个与我们期待的行为不太一致。
二、问题复现
AT 2.4.2.0
AT+CWMODE=1
OK
AT+CWJAP="espressif","12345678"
WIFI CONNECTED
WIFI GOT IP
OK
AT+SYSLOG=1
OK
AT+MQTTUSERCFG=0,1,"J1A0801F0002D&1697685358&853807230","take_order","dnVENLAAFKFzubcxfGvLy8x9hRg=",0,0,""
OK
AT+MQTTCONNCFG=0,60,1,"","",0,0
OK
AT+MQTTCONN=0,"192.168.137.1",1883,1
+MQTTCONNECTED:0,1,"192.168.137.1","1883","",1
OK
AT+MQTTPUB=0,"interact","\"{\"count\":\"1\"}\"",0,0
OK
AT+MQTTPUB=0,"interact","\"{\"count\":\"2\"}\"",0,0
OK
AT+MQTTPUB=0,"interact","\"{\"count\":\"3\"}\"",0,0
OK
wifi 抓包如下:
AT 3.2.0.0
AT+SYSLOG=1
OK
AT+CWMODE=1
OK
AT+CWJAP="espressif","12345678"
WIFI CONNECTED
WIFI GOT IP
OK
AT+MQTTUSERCFG=0,1,"J1A0801F0002D&1697685358&853807230","take_order","dnVENLAAFKFzubcxfGvLy8x9hRg=",0,0,""
OK
AT+MQTTCONNCFG=0,60,1,"","",0,0
OK
AT+MQTTCONN=0,"192.168.137.1",1883,1
+MQTTCONNECTED:0,1,"192.168.137.1","1883","",1
OK
AT+MQTTPUB=0,"topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
AT+MQTTSUB=0,"Topic",0
OK
AT+MQTTPUB=0,"topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
+MQTTSUBRECV:0,"Topic",32,"{"timestamp":"20201121085253"}"
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
+MQTTSUBRECV:0,"Topic",32,"{"timestamp":"20201121085253"}"
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
+MQTTSUBRECV:0,"Topic",32,"{"timestamp":"20201121085253"}"
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
+MQTTSUBRECV:0,"Topic",32,"{"timestamp":"20201121085253"}"
AT+MQTTPUB=0,"Topic","\"{\"timestamp\":\"20201121085253\"}\"",0,0
OK
+MQTTSUBRECV:0,"Topic",32,"{"timestamp":"20201121085253"}"
wifi 抓包如下:
三、问题分析
在分析这条指令之前, 我们先来看下这个参数是用来做什么的, 文档中对于这条指令的定义如下:
从上面的描述中, 我们可以看出 keep alive 的参数描述还是比较简单的, 只是说用来设置 mqtt ping 超时时间的, keep alive 如果不设置的话,那么默认是 120s , ping request 的时间间隔是 60s ,关于这个时间间隔, 我和chenwu 进行了沟通, 他补充了一些关于 mqtt ping 超时的说明:
1、每发送任意 MQTT 层包,将刷新 PING 发送时间。
2. 当无报文交互时, 默认到达 60s 时, 会发送 mqtt ping , 再过 60s , 如果接收不到 PING response, 将触发重连/断开。
3.如果接收到 PING 包, 将重新计时,也就是默认情况下, 60s 发送一次 PING 包。如果连续 120s 接收不到 PING response, 将触发重连/断开。
按照上面抓包来看, 2.4.2.0 确实符合要求, 但是 3.2.0.0 不会根据交互包更新时间, 它是根据上一次 ping req 的时间作为起始时间的。
再进一步排查, 发现了问题原因:
是因为 esp-mqtt 行为变更导致,在这个 commit 之后:Changes the moment we update keepalive_tick. · espressif/esp-mqtt@2c2e6f3 · GitHub。
对应的 esp-idf commit 信息如下:
https://github.com/espressif/esp-idf/commit/58cf25193dc45f379f5a513e9f8dda92882ebe87
在 release5.0 notes 中我也找到了相关的说明, 如下:
- 之前 ESP-IDF release/v4.3 上的 esp-mqtt 在每次 mqtt write data 之后,会更新当前的 keep-alive 计数器:https://github.com/espressif/esp-mqtt/blob/idf_v4.x/mqtt_client.c#L966 (AT v2.4.2.0 基于 release/v4.3 开发,所以每次发送数据后,会刷新 keep-alive 时间)
- 现在 ESP-IDF release/v5.0 上的 esp-mqtt 在每次 mqtt write data 之后,不会更新当前的 keep-alive 计数器:https://github.com/espressif/esp-mqtt/blob/master/mqtt_client.c#L666 (AT v3.2.0.0 基于 release/v5.0 开发,所以每次收发数据,不会刷新 keep-alive 时间)
- 和 David 确认了下为什么在 5.0 中更改了保持存活间隔的处理方式。以前,每次发送控制信息包都会重置超时计数器,有时会导致后来发现断开连接。现在我们每次与 PING 控制信息包交互时都会重置计数器,这样即使有其他控制信息也能持续发送。