Skip to content
This repository has been archived by the owner on Oct 4, 2021. It is now read-only.

MQTT gets disconnected after short time and don't reconnect #673

Closed
websoftix opened this issue Dec 28, 2020 · 12 comments
Closed

MQTT gets disconnected after short time and don't reconnect #673

websoftix opened this issue Dec 28, 2020 · 12 comments
Labels
bug Something isn't working

Comments

@websoftix
Copy link

Bug description
I have a Gateway Premium II connected to a Buderus Logamax Plus GB172i-30 K. I use a Raspbery Pi 4 with Mosquito to receive the information on MQTT and also logs in syslog and the problem is that the ESP gets disconnected from MQTT after about a minute and isn't connecting back.
The behavior was present on version 2.1 and is also present in 2.2.

I need a little help to try solve this, thank you in advance.

All the best,
Virgil

Steps to reproduce
In my setup it's very easy to reproduce, I just restart the ESP, it connects to Mqtt works well for about one minute and then it gets disconnected and (almost) never succed to reconnect back.
In just few ocasions, after restarting the ESP, the mqtt re-connection worked for a number of times, maybe a few minutes but finally it hanged.

I downloaded the code and tried to find the where/how the re-connection should occur and why its not working.
The only place where I can find a call to mqtt connect() is in MqttSettingsService::configureMqtt() which is called in
MqttSettingsService::loop().

In my case it seem that both _mqttClient.disconnect() and _mqttClient.connect() calls are reurning immediately (maybe because bcause of _lockMutiConnections ).
I tried a few things but nothing worked so far.

I keep seeing the following line in the logs, I don't know if it's related:
[emsesp] WiFi Connected with IP=192.168.100.130, hostname=emsesp

The log line is written in WebStatusService::onStationModeGotIP() but I don't understand who is calling this handler because I don't see any Wifi Disconect ...
Also, the ESP continue to send logs to rsyslog on my RaspberyPi, so the Wifi connection even if id drops it gets back.

Expected behavior
The expected behavior is that MQTT connection is re-established.

Device information
{
"System": {
"version": "2.2.0",
"uptime": "000+00:07:50.553",
"freemem": 44,
"fragmem": 40
},
"Settings": {
"enabled": "on",
"publish_time_boiler": 10,
"publish_time_thermostat": 0,
"publish_time_solar": 0,
"publish_time_mixer": 0,
"publish_time_other": 0,
"publish_time_sensor": 0,
"mqtt_format": 1,
"mqtt_qos": 0,
"mqtt_retain": "off",
"tx_mode": 1,
"ems_bus_id": 11,
"master_thermostat": 0,
"rx_gpio": 13,
"tx_gpio": 15,
"dallas_gpio": 0,
"dallas_parasite": "off",
"led_gpio": 2,
"hide_led": "off",
"api_enabled": "off",
"bool_format": 1,
"analog_enabled": "off"
},
"Status": {
"bus": "connected",
"bus protocol": "Buderus",
"#telegrams received": 265,
"#read requests sent": 58,
"#write requests sent": 0,
"#incomplete telegrams": 0,
"#tx fails": 3,
"rx line quality": 100,
"tx line quality": 100,
"#MQTT publish fails": 0,
"#dallas sensors": 0
},
"Devices": [
{
"type": "Boiler",
"name": "Buderus GBx72/Trendline/Cerapur/Greenstar Si/27i (DeviceID:0x08 ProductID:123, Version:06.08)",
"handlers": "0x10 0x11 0x14 0x15 0x16 0x18 0x19 0x1A 0x1C 0x2A 0x33 0x34 0x35 0xD1 0xE3 0xE4 0xE5 0xE6 0xE9 0xEA 0x494 0x495"
},
{
"type": "Controller",
"name": "BC25 (DeviceID:0x09, ProductID:125, Version:03.03)",
"handlers": ""
}
]
}

Additional context
Bellow I paste the logs from the start to the disconect moment:

Dec 28 17:46:40 emsesp - 000+00:00:07.967 I 99: [mqtt] MQTT connected
Dec 28 17:46:40 emsesp - 000+00:00:07.967 D 100: [mqtt] Subscribing to topic: emsesp/system
Dec 28 17:46:40 emsesp - 000+00:00:08.160 D 101: [telegram] Sending read Tx [#12], telegram: 0B 88 E6 00 20 07
Dec 28 17:46:40 emsesp - 000+00:00:08.183 D 102: [mqtt] Subscribing to topic: emsesp/boiler
Dec 28 17:46:40 emsesp - 000+00:00:08.212 D 103: [emsesp] Last Tx read successful
Dec 28 17:46:40 emsesp - 000+00:00:08.212 T 104: [emsesp] Boiler(0x08) -> Me(0x0B), UBAParametersPlus(0xE6), data:
Dec 28 17:46:40 emsesp - 000+00:00:08.212 D 105: [emsesp] This telegram (UBAParametersPlus) is not recognized by the EMS bus
Dec 28 17:46:40 emsesp - 000+00:00:08.212 D 106: [emsesp] Toggling fetch for device ID 0x08, telegram ID 0xE6 to 0
Dec 28 17:46:40 emsesp - 000+00:00:08.212 D 107: [emsesp] No telegram type handler found for ID 0xE6 (src 0x08)
Dec 28 17:46:40 emsesp - 000+00:00:08.384 D 108: [mqtt] Publishing topic emsesp/info (#2, retain=0, try#1, size 58, pid 1)
Dec 28 17:46:40 emsesp - 000+00:00:08.482 D 109: [telegram] Sending read Tx [#13], telegram: 0B 88 EA 00 20 37
Dec 28 17:46:40 emsesp - 000+00:00:08.512 D 110: [emsesp] Last Tx read successful
Dec 28 17:46:40 emsesp - 000+00:00:08.512 T 111: [emsesp] Boiler(0x08) -> Me(0x0B), UBAParameterWWPlus(0xEA), data:
Dec 28 17:46:40 emsesp - 000+00:00:08.512 D 112: [emsesp] This telegram (UBAParameterWWPlus) is not recognized by the EMS bus
Dec 28 17:46:40 emsesp - 000+00:00:08.512 D 113: [emsesp] Toggling fetch for device ID 0x08, telegram ID 0xEA to 0
Dec 28 17:46:40 emsesp - 000+00:00:08.512 D 114: [emsesp] No telegram type handler found for ID 0xEA (src 0x08)
Dec 28 17:46:40 emsesp - 000+00:00:08.586 D 115: [mqtt] Publishing topic emsesp/status (#3, retain=1, try#1, size 6, pid 1)
Dec 28 17:46:41 emsesp - 000+00:00:10.198 D 116: [mqtt] Publishing topic emsesp/boiler_data (#4, retain=0, try#1, size 290, pid 1)
Dec 28 17:46:41 emsesp - 000+00:00:10.399 D 117: [mqtt] Publishing topic emsesp/boiler_data_ww (#5, retain=0, try#1, size 151, pid 1)
Dec 28 17:46:41 emsesp - 000+00:00:10.738 T 118: [emsesp] Boiler(0x08) -> All(0x00), UBADevices(0x07), data: 0B 00 00 00 00 00 00 00 00 00 00 00 00
Dec 28 17:46:42 emsesp - 000+00:00:10.973 T 119: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 8A 64 0F 09 02 25 C0 01 38 80 00 80 00 00 71 FF 2D 48 00 C8 00 02 00
Dec 28 17:46:42 emsesp - 000+00:00:11.002 D 120: [mqtt] Publishing topic emsesp/heating_active (#6, retain=0, try#1, size 2, pid 1)
Dec 28 17:46:42 emsesp - 000+00:00:11.203 D 121: [mqtt] Publishing topic emsesp/tapwater_active (#7, retain=0, try#1, size 3, pid 1)
Dec 28 17:46:42 emsesp - 000+00:00:11.265 T 122: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorSlow(0x19), data: 80 00 80 00 80 00 00 00 00 40 00 E1 FF 04 C6 ED 00 00 00 04 48 45 00 2C 8C 80 00
Dec 28 17:46:42 emsesp - 000+00:00:11.546 T 123: [emsesp] Boiler(0x08) -> All(0x00), UBAMaintenanceStatus(0x1C), data: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 28 17:46:42 emsesp - 000+00:00:11.749 T 124: [emsesp] Boiler(0x08) -> All(0x00), MC10Status(0x2A), data: 00 00 00 00 00 00 00 00 A7 00 A8 80 00 00 80 00 80 00 80 00 00
Dec 28 17:46:43 emsesp - 000+00:00:12.040 T 125: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorWW(0x34), data: 3C 01 38 80 00 21 00 00 01 00 00 7E A8 00 B5 73 00 80 00
Dec 28 17:46:51 emsesp - 000+00:00:20.051 D 126: [mqtt] Publishing topic emsesp/boiler_data (#8, retain=0, try#1, size 550, pid 1)
Dec 28 17:46:51 emsesp - 000+00:00:20.252 D 127: [mqtt] Publishing topic emsesp/boiler_data_ww (#9, retain=0, try#1, size 412, pid 1)
Dec 28 17:46:51 emsesp - 000+00:00:20.857 T 128: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorWW(0x34), data: 3C 01 37 80 00 21 00 00 01 00 00 7E A8 00 B5 73 00 80 00
Dec 28 17:46:52 emsesp - 000+00:00:21.081 T 129: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 8E 64 0F 09 02 25 C0 01 37 80 00 80 00 00 71 FF 2D 48 00 C8 00 02 00
Dec 28 17:46:52 emsesp - 000+00:00:21.355 T 130: [emsesp] Boiler(0x08) -> All(0x00), MC10Status(0x2A), data: 00 00 00 00 00 00 00 00 A8 00 A8 80 00 00 80 00 80 00 80 00 00
Dec 28 17:47:01 emsesp - 000+00:00:30.107 D 131: [mqtt] Publishing topic emsesp/boiler_data (#10, retain=0, try#1, size 550, pid 1)
Dec 28 17:47:01 emsesp - 000+00:00:30.308 D 132: [mqtt] Publishing topic emsesp/boiler_data_ww (#11, retain=0, try#1, size 412, pid 1)
Dec 28 17:47:01 emsesp - 000+00:00:30.748 T 133: [emsesp] Boiler(0x08) -> All(0x00), MC10Status(0x2A), data: 00 00 00 00 00 00 00 00 A8 00 A8 80 00 00 80 00 80 00 80 00 00
Dec 28 17:47:02 emsesp - 000+00:00:30.995 T 134: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorWW(0x34), data: 3C 01 37 80 00 21 00 00 01 00 00 7E A8 00 B5 73 00 80 00
Dec 28 17:47:02 emsesp - 000+00:00:31.213 T 135: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 8E 64 0F 09 02 25 C0 01 36 80 00 80 00 00 72 FF 2D 48 00 C8 00 02 00
Dec 28 17:47:11 emsesp - 000+00:00:40.163 D 136: [mqtt] Publishing topic emsesp/boiler_data (#12, retain=0, try#1, size 550, pid 1)
Dec 28 17:47:11 emsesp - 000+00:00:40.364 D 137: [mqtt] Publishing topic emsesp/boiler_data_ww (#13, retain=0, try#1, size 410, pid 1)
Dec 28 17:47:11 emsesp - 000+00:00:40.672 T 138: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 8C 64 0F 09 02 25 C0 01 36 80 00 80 00 00 73 FF 2D 48 00 C8 00 02 00
Dec 28 17:47:12 emsesp - 000+00:00:40.942 T 139: [emsesp] Boiler(0x08) -> All(0x00), MC10Status(0x2A), data: 00 00 00 00 00 00 00 00 A8 00 A8 80 00 00 80 00 80 00 80 00 00
Dec 28 17:47:12 emsesp - 000+00:00:41.160 T 140: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorWW(0x34), data: 3C 01 36 80 00 21 00 00 01 00 00 7E A8 00 B5 73 00 80 00
Dec 28 17:47:21 emsesp - 000+00:00:50.016 D 141: [mqtt] Publishing topic emsesp/boiler_data (#14, retain=0, try#1, size 550, pid 1)
Dec 28 17:47:21 emsesp - 000+00:00:50.217 D 142: [mqtt] Publishing topic emsesp/boiler_data_ww (#15, retain=0, try#1, size 408, pid 1)
Dec 28 17:47:21 emsesp - 000+00:00:50.677 T 143: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorWW(0x34), data: 3C 01 36 80 00 21 00 00 01 00 00 7E A8 00 B5 73 00 80 00
Dec 28 17:47:22 emsesp - 000+00:00:50.937 T 144: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 8F 64 0F 09 02 25 C0 01 35 80 00 80 00 00 71 FF 2D 48 00 C8 00 02 00
Dec 28 17:47:22 emsesp - 000+00:00:51.139 T 145: [emsesp] Boiler(0x08) -> All(0x00), MC10Status(0x2A), data: 00 00 00 00 00 00 00 00 A8 00 A8 80 00 00 80 00 80 00 80 00 00
Dec 28 17:47:29 emsesp - 000+00:00:58.661 T 146: [emsesp] Boiler(0x08) -> All(0x00), UBAMonitorFast(0x18), data: 28 01 90 64 0F 09 02 25 C0 01 35 80 00 80 00 00 73 FF 2D 48 00 C8 00 02 00
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 147: [emsesp] Fetching values for device ID 0x08
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 148: [telegram] Tx read request to device 0x08 for type ID 0x14
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 149: [telegram] Tx read request to device 0x08 for type ID 0x16
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 150: [telegram] Tx read request to device 0x08 for type ID 0x19
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 151: [telegram] Tx read request to device 0x08 for type ID 0x33
Dec 28 17:47:31 emsesp - 000+00:01:00.001 D 152: [emsesp] Fetching values for device ID 0x09
Dec 28 17:47:33 emsesp - 000+00:01:00.058 I 153: [mqtt] MQTT disconnected: TCP
Dec 28 17:47:33 emsesp - 000+00:01:00.131 D 154: [telegram] Sending read Tx [#16], telegram: 0B 88 14 00 20 E4
Dec 28 17:47:33 emsesp - 000+00:01:00.165 D 155: [emsesp] Last Tx read successful
Dec 28 17:47:33 emsesp - 000+00:01:00.165 T 156: [emsesp] Boiler(0x08) -> Me(0x0B), UBATotalUptime(0x14), data: 17 69 90
Dec 28 17:47:33 emsesp - 000+00:01:00.429 D 157: [telegram] Sending read Tx [#17], telegram: 0B 88 16 00 20 EC

@websoftix websoftix added the bug Something isn't working label Dec 28, 2020
@websoftix
Copy link
Author

I found a solution, I'm not sure this have any implications that I don't realise but now Mqtt re-connects.
For reasons I wasn't able to comprehend the _lockMutiConnections is not reset to false when the connection drops and because of this no reconnection is possible.

The solution I found was to modify AsyncMqttClient and add a AsyncMqttClient::resetLockMutiConnections() that sets _lockMutiConnections to false.
I then call this in MqttSettingsService::configureMqtt() just before _mqttClient.connect();

I hope this helps somehow to find the real problem which seems to be that AsyncMqttClient::_onDisconnect is not called, as there is the place where _lockMutiConnections is set to false.

Thank you,
Virgil

@proddy
Copy link
Collaborator

proddy commented Dec 28, 2020

Glad you've found a workaround. It looks like it may be the WiFi cutting out if you're seeing messages like [emsesp] WiFi Connected with IP=192.168.100.130, hostname=emsesp in the logs. Is your wifi signal as reported by EMS-ESP strong enough? I may need to add that to the info command (http://ems-esp/api?device=system&cmd=info) in the next version.

In any case, I should be able to reproduce this too if I disable one of my access points and see if MQTT is able to reconnect to the broker.

@websoftix
Copy link
Author

Is your wifi signal as reported by EMS-ESP strong enough?

The wifi_quality() function reports 100 and I think it's correct. The router and the EMS are on the same room, let's say 3-4m distance and the only thing getting in the way are the furniture dors which are not very tick.

I noticed letting it run (and having some more info added to the log) that when the connectcount_ is increasing over 50 the web interface became unresponsive (while mqtt and remote logging still worked) and at some point in my insistences of loading the web ui the ESP restarted.

There is one thing that intrigues me: the disconects happend very rhythmical, like programmed :), one per minute, see beelow the log filtered:
Dec 28 22:00:00 ems-esp - 000+00:13:00.047 I 1155: [mqtt] MQTT disconnected: TCP
Dec 28 22:01:00 ems-esp - 000+00:14:00.068 I 1229: [mqtt] MQTT disconnected: TCP
Dec 28 22:02:00 ems-esp - 000+00:15:00.065 I 1303: [mqtt] MQTT disconnected: TCP
Dec 28 22:03:00 ems-esp - 000+00:16:00.045 I 1383: [mqtt] MQTT disconnected: TCP
Dec 28 22:04:00 ems-esp - 000+00:17:00.057 I 1458: [mqtt] MQTT disconnected: TCP
Dec 28 22:05:00 ems-esp - 000+00:18:00.054 I 1533: [mqtt] MQTT disconnected: TCP
Dec 28 22:06:00 ems-esp - 000+00:19:00.048 I 1608: [mqtt] MQTT disconnected: TCP
Dec 28 22:07:00 ems-esp - 000+00:20:00.045 I 1681: [mqtt] MQTT disconnected: TCP
Dec 28 22:08:00 ems-esp - 000+00:21:00.066 I 1755: [mqtt] MQTT disconnected: TCP
Dec 28 22:09:00 ems-esp - 000+00:22:00.063 I 1829: [mqtt] MQTT disconnected: TCP
Dec 28 22:10:00 ems-esp - 000+00:23:00.059 I 1903: [mqtt] MQTT disconnected: TCP
Dec 28 22:11:00 ems-esp - 000+00:24:00.055 I 1984: [mqtt] MQTT disconnected: TCP
Dec 28 22:12:02 ems-esp - 000+00:25:00.051 I 2058: [mqtt] MQTT disconnected: TCP
Dec 28 22:13:02 ems-esp - 000+00:26:00.040 I 2137: [mqtt] MQTT disconnected: TCP
Dec 28 22:14:02 ems-esp - 000+00:27:00.069 I 2214: [mqtt] MQTT disconnected: TCP
Dec 28 22:15:02 ems-esp - 000+00:28:00.065 I 2292: [mqtt] MQTT disconnected: TCP
Dec 28 22:16:02 ems-esp - 000+00:29:00.061 I 2370: [mqtt] MQTT disconnected: TCP
Dec 28 22:17:02 ems-esp - 000+00:30:00.057 I 2448: [mqtt] MQTT disconnected: TCP
Dec 28 22:18:02 ems-esp - 000+00:31:00.053 I 2526: [mqtt] MQTT disconnected: TCP
Dec 28 22:19:02 ems-esp - 000+00:32:00.049 I 2609: [mqtt] MQTT disconnected: TCP
Dec 28 22:20:02 ems-esp - 000+00:33:00.045 I 2688: [mqtt] MQTT disconnected: TCP
Dec 28 22:21:02 ems-esp - 000+00:34:00.066 I 2766: [mqtt] MQTT disconnected: TCP
Dec 28 22:22:02 ems-esp - 000+00:35:00.063 I 2845: [mqtt] MQTT disconnected: TCP

I'm think I will play with the SYSTEM_HEARTBEAT_INTERVAL and EMS_FETCH_FREQUENCY which are both one minute maybe I can find something.

@proddy
Copy link
Collaborator

proddy commented Dec 28, 2020

/are you getting the "Wifi Connected" messages too every 1-2 minutes?

@websoftix
Copy link
Author

/are you getting the "Wifi Connected" messages too every 1-2 minutes?

Yes, it seems the two coincide:

Dec 28 21:58:00 ems-esp - 000+00:11:00.104 I 1008: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 21:59:00 ems-esp - 000+00:12:00.103 I 1082: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:00:00 ems-esp - 000+00:13:00.096 I 1158: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:01:00 ems-esp - 000+00:14:00.116 I 1230: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:02:00 ems-esp - 000+00:15:00.114 I 1304: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:03:00 ems-esp - 000+00:16:00.113 I 1385: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:04:00 ems-esp - 000+00:17:00.125 I 1459: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:05:00 ems-esp - 000+00:18:00.102 I 1536: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:06:00 ems-esp - 000+00:19:00.097 I 1610: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:07:00 ems-esp - 000+00:20:00.096 I 1682: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:08:00 ems-esp - 000+00:21:00.115 I 1756: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:09:00 ems-esp - 000+00:22:00.114 I 1830: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:10:00 ems-esp - 000+00:23:00.109 I 1904: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:11:00 ems-esp - 000+00:24:00.108 I 1985: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:12:02 ems-esp - 000+00:25:02.102 I 2083: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:13:02 ems-esp - 000+00:26:02.109 I 2161: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:14:02 ems-esp - 000+00:27:02.117 I 2239: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:15:02 ems-esp - 000+00:28:02.115 I 2317: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:16:02 ems-esp - 000+00:29:02.113 I 2395: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:17:02 ems-esp - 000+00:30:02.111 I 2473: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:18:02 ems-esp - 000+00:31:02.108 I 2551: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:19:02 ems-esp - 000+00:32:02.098 I 2634: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:20:02 ems-esp - 000+00:33:02.095 I 2713: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:21:02 ems-esp - 000+00:34:02.121 I 2791: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp
Dec 28 22:22:02 ems-esp - 000+00:35:02.119 I 2870: [emsesp] WiFi Connected with IP=192.168.100.130, hostname=ems-esp

@proddy
Copy link
Collaborator

proddy commented Dec 28, 2020

Right. To rule out any 3rd party library issues, do you get the same behaviour when using the prebuilt binaries?

@websoftix
Copy link
Author

I just downloaded the prebuilt 2.2 image and I can confirm it's the same behavior, it disconects.

@websoftix
Copy link
Author

Changing SYSTEM_HEARTBEAT_INTERVAL to 70000 and SYSTEM_HEARTBEAT_INTERVAL to 50000 (at the same time) doesn't change the situation too much even though it seems to add a little more variation to the timing of the disconects:

Dec 28 22:42:12 ems-esp - 000+00:01:00.053 I 236: [mqtt] MQTT disconnected: TCP
Dec 28 22:43:12 ems-esp - 000+00:02:00.050 I 312: [mqtt] MQTT disconnected: TCP
Dec 28 22:44:12 ems-esp - 000+00:03:00.071 I 388: [mqtt] MQTT disconnected: TCP
Dec 28 22:45:12 ems-esp - 000+00:04:00.067 I 468: [mqtt] MQTT disconnected: TCP
Dec 28 22:47:20 ems-esp - 000+00:01:00.054 I 253: [mqtt] MQTT disconnected: TCP
Dec 28 22:48:52 ems-esp - 000+00:01:11.239 I 247: [mqtt] MQTT disconnected: TCP
Dec 28 22:52:30 ems-esp - 000+00:01:00.055 I 252: [mqtt] MQTT disconnected: TCP
Dec 28 22:53:30 ems-esp - 000+00:02:00.051 I 331: [mqtt] MQTT disconnected: TCP
Dec 28 22:54:30 ems-esp - 000+00:03:00.073 I 408: [mqtt] MQTT disconnected: TCP
Dec 28 22:55:30 ems-esp - 000+00:04:00.069 I 496: [mqtt] MQTT disconnected: TCP
Dec 28 22:56:28 ems-esp - 000+00:05:00.064 I 579: [mqtt] MQTT disconnected: TCP
Dec 28 22:57:28 ems-esp - 000+00:06:00.061 I 669: [mqtt] MQTT disconnected: TCP
Dec 28 22:58:28 ems-esp - 000+00:07:00.057 I 740: [mqtt] MQTT disconnected: TCP

@websoftix
Copy link
Author

I found the cause of the rithmical disconects (every minute): the DHCP server in my router have the lease time set to 1 minute (I needed it at some point and forgot it that way).

I set it to 2 minutes for the sake of testing and the problem disapeared. I watched the "remaining lease time" reported by the router for the ESP and it seems that ESP is renewing the WiFi connection as the lease time never got to zero so I don't get the WiFi disconect and subsequently the Mqtt disconect.

I will wach it for few days and see if the problem occur anymore and I will get back.

@proddy
Copy link
Collaborator

proddy commented Dec 28, 2020

I haven't been able to test yet so just shouting out ideas for now so we can home in on the root cause.

  • if you open a telnet session, does the connection also drop after 1-2 minutes? If so then it's the WiFi that's reconnecting and causing the mqtt to also reconnect
  • if you take the wemos esp8266 out of the gateway, plug it into a usb charger\laptop/pc do you see the same drop outs?
  • check the WiFi strength in telnet with 'system' then 'show'

@websoftix
Copy link
Author

Thank you for help!

if you open a telnet session, does the connection also drop after 1-2 minutes? If so then it's the WiFi that's reconnecting and causing the mqtt to also reconnect

Yes, the telnet connection dropped also very quick and now it stays connected.

@websoftix
Copy link
Author

I think we can close this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants