Temperature sensors loose custom names in HA on NetModule boot #108
Comments
I'm trying to remember our previous dialog on this when we developed that part of the code. Perhaps we can refresh each other's memory without having to dig back through all our gitter communications and email. |
NetModule does not need to store any names. It is not its job. Jevgeni |
I went back and reviewed the code that handles this. There can always be an improvement (if it will fit). I wish I had made more notes on the "theory of operation" for this function. Here are the basics, and why I did what I did, even if in hindsight there is a better way:
|
I should also add that this "rearrangement" can also happen during runtime if a sensor is intermittent. Let's say you have 1, 2, 3, 4, 5, but 3 is intermittent. Serial numbers 4 and 5 will shift positions during runtime if 3 disappears, then will shift back when 3 reappears. I might be able to do something about that as I haven't lost power and still know what was there, but I don't have it coded that way right now. |
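The shifting described above can be illustrated with a minimal sketch. It is not the firmware's code: the serial numbers, names, and both helper functions are hypothetical, and it only contrasts attaching data by scan position with attaching it by serial number.

```python
def names_by_position(detected, names):
    """Attach names by scan position: fragile when a sensor drops out."""
    return {serial: names[i] for i, serial in enumerate(detected)}


def names_by_serial(detected, names):
    """Attach names by serial number: unaffected by scan order or gaps."""
    return {serial: names[serial] for serial in detected if serial in names}


# Hypothetical serial numbers for sensors 1..5.
all_sensors = ["s1", "s2", "s3", "s4", "s5"]
position_names = ["Attic", "Boiler", "Cellar", "Den", "Entry"]
serial_names = dict(zip(all_sensors, position_names))

# Sensor 3 is intermittent and is missing from this scan.
scan_without_3 = ["s1", "s2", "s4", "s5"]

shifted = names_by_position(scan_without_3, position_names)
stable = names_by_serial(scan_without_3, serial_names)

print(shifted["s4"])  # s4 has slid into slot 3 and picked up that slot's name
print(stable["s4"])   # s4 keeps its own name
```

With positional indexing, sensors 4 and 5 inherit the slots of 3 and 4 while sensor 3 is absent; keyed by serial number, nothing moves.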
Order of reporting is irrelevant. Serial number is key here. I do not know for what reason you store serial numbers in RAM. You can rescan sensors for every temperature read as well. No need to keep them. I've found some time to debug this further and found multiple issues:
Found problems:
I need to validate switch and binary sensor topics as well. I suspect there will be similar problems. If I find something I will create a separate thread for that. |
When NetModule publishes MQTT config topics to the broker without retain, a HomeAssistant instance started a bit later will not be able to set up these sensors. The broker will not resend these messages to HA, and HA will know nothing about them. |
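As a rough sketch of what a retained config message could look like, the snippet below builds (but does not send) one discovery message for a temperature sensor. The topic layout and default name follow the ones quoted in this issue; the `MqttMessage` type, the `temp_sensor_config` helper, the `unique_id` scheme, and the `example/...` state topic are assumptions made for illustration, not what the firmware actually emits.

```python
import json
from typing import NamedTuple


class MqttMessage(NamedTuple):
    topic: str
    payload: str
    retain: bool


def temp_sensor_config(device_id, unit_name, sensor_id):
    """Build a Home Assistant discovery message for one temperature sensor."""
    topic = f"homeassistant/sensor/{device_id}/{sensor_id}/config"
    payload = json.dumps({
        "name": f"{unit_name} temp {sensor_id}",
        "unique_id": f"{device_id}_{sensor_id}",  # hypothetical scheme
        "state_topic": f"example/{device_id}/temp/{sensor_id}",  # hypothetical topic
        "unit_of_measurement": "°C",
        "device_class": "temperature",
    })
    # retain=True asks the broker to keep the message, so a HomeAssistant
    # instance that subscribes later still receives it.
    return MqttMessage(topic, payload, retain=True)


msg = temp_sensor_config("aaaaa", "unit", "bbbbb")
print(msg.topic, msg.retain)
```

The tuple can be handed to whatever MQTT client is in use; the point is only that the retain flag travels with every config publish.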
Checked |
Thanks for digging into this. These issues must have been present for a very long time. The last time this area of code was touched was the May 9, 2021 release when Issues #55 #57 #58 and #66 were addressed ... probably creating all the problems you are describing. I'll get to work on this shortly. |
I think I've found the "fails to send retain" problem. Not sure what I was thinking when I broke that so I'm looking more closely to get it right. I can also filter out the temperature sensors with "0" as an ID. I'm still looking at why it happens twice on boot. Just to be sure, you are saying I should never send an empty payload for a temperature sensor, and instead leave it up to the user to remove sensors in HA (or whatever they are using) once they are gone, correct? |
Correct |
Well, it appears I did NOT find the "fails to send retain" problem. The code I thought I may have broken always sets the Retain bit, so it is being lost further into the MQTT packing code. I will have to go through some detailed debugging to find where this is being disrupted. I'm going to release what I have now for Issues #94 and #100, then will work through this problem. This may take some time but I will get it done. |
Thank you @nielsonm236 for looking into it. |
@yozik04 I applied debug in my code all the way down to the point that messages are packed for sending on MQTT. Retain was set all the way down to that level. Attaching a log file from my Mosquitto server. I started it in verbose mode within a PowerShell window so that I could see all messaging. It appears to show that all messages sent from the Network Module have Retain set to 1 (as shown in the log entries by "r1"). I am not sure what to do next to determine the difference between your setup and mine. Thinking ... but could also use some suggestions. |
I'm also not seeing the multiple delete messages or the Temp Sensor ID 000000000000 message. I'm easily confused so it could still be me, but is it possible you inadvertently loaded a very old code load? I have to ask, although I consider that unlikely since you said you tried the latest code. If you look at the bottom of the Configuration page it should show the code revision level. I'm still looking at code. |
Here is how I grabbed the log: I did that from Raspberry Pi where mosquitto-clients is installed. |
I think I have part of the mystery solved. When running mosquitto_sub we've actually started a client that, with your command above, subscribes as a client and is displaying messages sent to it by the broker (so, not really a log). Watching the traffic under PowerShell with "mosquitto -v" I see the following with each publish of a temperature sensor from the Network Module: |
As an aside it looks like the very first time the broker sends a Publish to mosquitto_sub it sets Retain to 1. Thereafter it is always 0. I'm not sure why, and I don't always see that happen. |
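One plausible explanation for the Retain observation above comes from the MQTT 3.1.1 rules: a broker sets Retain to 1 on a message it delivers because of a new subscription, and sets it to 0 when forwarding a live publish to an already established subscription. The toy model below sketches that rule. `ToyBroker` is a hypothetical in-memory stand-in, not Mosquitto, and it matches topics exactly (no wildcards).

```python
class ToyBroker:
    """In-memory model of retained-message handling (exact topic match only)."""

    def __init__(self):
        self.retained = {}     # topic -> last retained payload
        self.subscribers = {}  # topic -> list of subscriber inboxes

    def publish(self, topic, payload, retain=False):
        if retain:
            if payload == "":
                # A retained empty payload clears the stored message.
                self.retained.pop(topic, None)
            else:
                self.retained[topic] = payload
        # Live forwarding to established subscriptions: retain flag is 0.
        for inbox in self.subscribers.get(topic, []):
            inbox.append((topic, payload, False))

    def subscribe(self, topic):
        inbox = []
        self.subscribers.setdefault(topic, []).append(inbox)
        # Delivery caused by the new subscription: retain flag is 1.
        if topic in self.retained:
            inbox.append((topic, self.retained[topic], True))
        return inbox


broker = ToyBroker()
broker.publish("demo/config", "first", retain=True)   # published before subscribing
inbox = broker.subscribe("demo/config")               # receives "first" with retain=1
broker.publish("demo/config", "second", retain=True)  # forwarded live with retain=0
print(inbox)
```

Under this model a subscriber sees Retain set only on the stored message it receives at subscribe time, even though every message was published with Retain set.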
I changed my testing to look at the homeassistant HA messages instead of the NetworkModule messages (as you said above ... but I overlooked it). Once doing that I could see the duplication and ID 000000000000 problems. I've tracked that down in firmware, applied fixes, and I'm testing. |
Ahh. You are right... I did not think about that. Sorry for the confusion I caused. |
@yozik04 I've tested the code changes for this Issue as follows:
During test I noticed that the broker sent an empty payload message for a sensor to mosquitto_sub. It was not sent by the Network Module to the broker. I only saw that one time and I was not able to reproduce it. I have no clue why/how that happened, but it occurred in the very first test, and not again over about 25 test attempts. I will continue to watch for that. All that sounds like what you requested. But it raised an additional question: During boot I send empty payload messages for "switch" and "binary_sensor" topics followed by their Config messages because I can't be sure if someone has changed a pin to Input, Output, or Disabled from some previous setting. Is this a problem? I did a few short tests with rebooting a module (when the deletes occur) and HA didn't lose the names I had given pins within HA. Having said that, I don't know if there is some power loss scenario for the modules and HA host that might cause a problem as all my testing left equipment powered up. Mike |
I was just re-reading your original statement, and you indicate that switch/binary_sensor doesn't issue empty payloads. I'm pretty sure it does, for every switch/binary_sensor, at every boot ... the code reads that way. I will look at this further to be sure of my statement. |
Another question: On my test module I removed two temperature sensors so they would stop sending Config messages to HA. Now hours later I'm trying to remove the created Entities in HA and it won't let me. HA suggested I restart HA and try again ... I did that and it still won't let me remove them. Since this is the method we were hoping to use this seems like a problem. Any ideas? |
Well, I "disabled" them rather than delete. They still appeared but in a "disabled" list. Then after a few minutes they magically disappeared from HA ... at least I couldn't find them anymore. Who knows. |
My test showed that empty payloads were not sent for switches and binary_sensors. Maybe you send these only on configuration save? I will retest if you give me a new version to flash. |
Maybe it is easier to remove the whole device then? I was using MqttExplorer to remove topics under homeassistant/<device_id> |
I haven't been able to do more testing since our messages above. I'm going to do a little during the next two hours. re: "Maybe it is easier to remove whole device then?" I found that disabling the entity in HA (rather than deleting) appears to have caused it to be deleted, although it took a few minutes. I'm still not sure about this so I need to see if it reliably deletes the missing sensors. Also, deleting the whole device defeats our purpose here, doesn't it? Won't that cause loss of all unique names entered in HA for that device? As FYI, I have found that I often have to delete a "card", then create a new one in order to see changes I've made in HA. Attached is the code I'm testing with. This is the version that requires programming with the SWIM interface. All of my testing is done with the programming-over-ethernet version of the code ... but that makes no difference to what we are testing now. |
I was able to set up my test fairly quickly, and I verified that the "empty payload" messages are indeed being sent. I can see them in PowerShell running Mosquitto with "./mosquitto -v", and I see them being forwarded to mosquitto_sub. I think you might not be seeing them depending on how you subscribed with mosquitto_sub. I used this statement:
What I'm seeing is that at boot if I find a pin defined as an Output I will first send a blank payload for that pin # defined as a "binary_sensor" (thus deleting any previous Input definition for the pin). Then I will send the "switch" Config for that pin. Likewise if I find a pin defined as an Input I will first send a blank payload for that pin # defined as a "switch" (thus deleting any previous Output definition for the pin). Then I will send the "binary_sensor" Config for that pin. If I find a pin defined as Disabled I will send a "binary_sensor" message with a blank payload, followed by a "switch" message with a blank payload, thus deleting ANY definition for the pin in HA.
This means that if a pin definition did not change (stayed as a switch or stayed as a binary_sensor) HA probably ignores the delete message I send. This is a little different than what I expected (although I think it is right) so I will look at my code a little more closely to make sure I have the correct comments describing operation.
So the way I'm handling "switch" and "binary_sensor" entities is different than the way I was handling "sensors" (i.e., Temperature Sensors) in that I would actually delete the sensors in the previous code. As you point out that was leading to problems. OK, now I'm off to look at the code again. Mike |
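The boot-time rule described above can be sketched as follows. This is an illustration, not the firmware: the `boot_messages` helper and the placeholder config payloads are hypothetical, and the switch/binary_sensor topic layout is assumed to mirror the sensor config topic quoted in this issue.

```python
def boot_messages(device_id, pin, mode):
    """Return the (topic, payload) pairs published at boot for one pin.

    An empty payload on a config topic asks HA to drop that definition.
    """
    def topic(component):
        # Assumed layout, modeled on homeassistant/sensor/<device>/<id>/config.
        return f"homeassistant/{component}/{device_id}/{pin}/config"

    def delete(component):
        return (topic(component), "")

    def config(component):
        # Placeholder standing in for the real JSON config payload.
        return (topic(component), f"<{component} config for pin {pin}>")

    if mode == "output":
        # Remove any stale Input definition, then announce the switch.
        return [delete("binary_sensor"), config("switch")]
    if mode == "input":
        # Remove any stale Output definition, then announce the binary_sensor.
        return [delete("switch"), config("binary_sensor")]
    if mode == "disabled":
        # Remove every definition HA might hold for this pin.
        return [delete("binary_sensor"), delete("switch")]
    raise ValueError(f"unknown pin mode: {mode!r}")


for message in boot_messages("aaaaa", 1, "output"):
    print(message)
```

Note that the delete always targets the component the pin is not currently configured as, so an unchanged pin never has its own config topic cleared.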
That didn't take long. I looked at the comments in the code and they describe exactly what I said above with my test results. As I said before, I'm easily confused. |
HA can sure be confusing. I re-attached the two Temperature Sensors that I had previously "Disabled" in HA. They showed up again as "Disabled entities". So HA never really forgot about them, it just wasn't showing them until they physically reappeared. Then I couldn't get them re-enabled ... or I thought I couldn't. Per a pop-up in HA I had to reboot the HA host (supervisor / system / reboot host) and even then they showed as Disabled. But after about 2 minutes they changed to "Enabled". So I think the key here when "removing" Temperature Sensor configs is to "Disable" them, not "Delete" them. Then perform a "reboot host". Then give HA a couple of minutes to update its configs. There might be some quicker way by editing yaml, and if editing is done there might also be a way to delete any knowledge HA has about disabled configs. I don't want to go there. I'll wait to see if you have feedback. Otherwise I think we've resolved this Issue. I will have to add some of what I've learned to the manual. |
I will retest this week and get back to you. Thank you! |
Testing switches and binary_sensors:
Works as expected. I do not see any empty payloads that would cleanup existing entities in HA. Testing passed. Testing temperature sensors:
Also works as expected. No empty payloads are sent. I can confirm that HomeAssistant now does not lose any custom data of temperature sensors. You have resolved all the problems. Thank you! |
Addressed in Release 20220921 0500 |
o Fixed Issue #94: “ENC28J60 Revision not always reported in Link Error Statistics”
o Added Enhancement Issue #100 “Remove Configuration Button”. This is an option to disable the Configuration button on the IOControl page.
o Fixed Issue #108: “Temperature sensors loose custom names in HA on NetModule boot”.
Recently I had a power outage for some hours on my street. After that I noticed that all Temperature sensor names in Home Assistant were reset to the default ones: <unit_name> temp <temp_sensor_id>. I had better names defined before the outage. This is a problem, as I expect more frequent power outages in Europe in the upcoming winter as the whole of Europe decided to drown in its energy crisis. But this is a story for another day. Let's return to the problem.
I have flashed the latest release, Code Revision 20220831 2011 MQTT, to all 3 of my modules today and the issue persists there as well.
In MQTT Explorer I see that NetModule sends an empty MQTT message to the temperature sensor configuration topics homeassistant/sensor/aaaaa/bbbbb/config on boot. This is wrong and should be avoided. I see that switches do not do that, which is the right behavior.