Z2M crash while running OTA update of Philips Hue bulb #22463
Comments
Could you provide the debug log of this? See this on how to enable debug logging. |
@Koenkk I am attaching a debug log containing a first attempt at updating the Philips Hue, which failed almost immediately, and a second attempt which is running (not failed yet) but very, very slowly. Note that this bulb is no more than 2 m from the coordinator and its LQI is 216, so it should be well within range for flawless communication. |
@Koenkk In the end I don't know whether the update failed or not, because z2m crashed with these final messages (not logged in log.log) before it died (I'm attaching the full debug log, which by the way also contains the broadcast errors that @Nerivec asked for): [2024-05-06 20:01:49] debug: zh:ember:queue: Status queue=0 priorityQueue=0. |
With crash, do you mean that z2m completely stops after this? @Nerivec could it be that we are missing an await somewhere? This is a command response |
Yes, with crash I mean z2m completely stops after this. |
With the latest Hue FW updates becoming available, I have about 50 devices to update. Each is super slow and often times out after an hour or so, so you need to start the update again. Unfortunately, there is also no way to schedule multiple updates in series (marking several devices to update eventually brings down the network due to congestion). |
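A minimal sketch of what running updates in series could look like from an external script. The topic and the `{"id": ...}` payload shape are taken from the debug log quoted later in this thread; `publish` and `wait_until_done` are hypothetical callbacks the caller would have to supply (for example by wrapping an MQTT client), and they are not part of Zigbee2MQTT.

```python
import json
from typing import Callable, Iterable

# Topic as it appears in the Zigbee2MQTT debug log in this thread.
OTA_UPDATE_TOPIC = "zigbee2mqtt/bridge/request/device/ota_update/update"


def update_sequentially(
    device_ids: Iterable[str],
    publish: Callable[[str, str], None],
    wait_until_done: Callable[[str], bool],
) -> dict[str, bool]:
    """Request OTA updates one device at a time.

    `publish(topic, payload)` and `wait_until_done(device_id)` are
    hypothetical, caller-supplied callbacks (assumptions for this sketch).
    """
    results: dict[str, bool] = {}
    for device_id in device_ids:
        publish(OTA_UPDATE_TOPIC, json.dumps({"id": device_id}))
        # Block until this device finishes (or fails) before starting the next,
        # so only one OTA transfer is on the network at any time.
        results[device_id] = wait_until_done(device_id)
    return results
```

How `wait_until_done` detects completion (and how long it waits before giving up) is left open here; with updates taking an hour or more, a generous timeout would probably be needed.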
It must be something specific to Hue that is causing trouble. I had a report, not too long ago, of almost 20 Inovelli devices being updated (sequentially of course 😁) without issue on an ember network with over 300 devices; that would indicate the underlying system (OTA & driver) is more than fine. |
To me it looks like the stability issue has something to with With only one update at a time (if that's a hard requirement, Z2M should enforce it), updating the Hue lights seems to be stable for me today, but still quite slow (about an hour for one light, will have to check the logs for better statistics). What I did notice with a quick look at the debug logs is a consistent discrepancy between the requested and sent chunk size:
|
I also get this a lot in the logs, not sure if this is expected:
|
Sorry @Nerivec, I may have muddied the waters here - I am using zStack, not Ember. Should I open a separate issue? |
No on the contrary, it confirms this is a more "generic" issue. The first At ~50 bytes per message, and the average firmware being 250KB (some Hue appear to be double that), that's a lot of messages, plus it is throttled to avoid crashes, so timeframe for update is unfortunately pretty long; best scenario is unlikely to be less than 20 minutes. |
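To put rough numbers on this, the arithmetic can be sketched as follows. It assumes "250KB" means 250 × 1024 bytes and that every block carries the full chunk size; the 20-minute figure is the best-case duration mentioned above, used only to derive an average time budget per block.

```python
import math


def block_count(image_bytes: int, chunk_bytes: int) -> int:
    """Number of image blocks needed to transfer the whole firmware."""
    return math.ceil(image_bytes / chunk_bytes)


IMAGE_BYTES = 250 * 1024  # "250KB" read as KiB (assumption)

for chunk in (50, 64):
    print(f"{chunk} bytes/block -> {block_count(IMAGE_BYTES, chunk)} blocks")

# Average time available per block if the whole transfer took 20 minutes
# at 50 bytes per block.
per_block_s = 20 * 60 / block_count(IMAGE_BYTES, 50)
print(f"{per_block_s:.3f} s per block")
```

Under these assumptions a 250KB image needs 5120 blocks at 50 bytes, so a 20-minute update leaves a bit under a quarter of a second per request/response round trip.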
Would it be possible to increase the chunk size to the requested 64 bytes? I know there was a similar discussion in Koenkk/zigbee-herdsman-converters#6193 and it would appear that the 50 bytes is more or less an arbitrary limit (i.e. it probably should not be significantly higher due to network congestion, but a power-of-two value like 64 would at least seem more natural to me). It's been some time since I had a Hue Bridge in use, but FW updates always seemed much quicker to me then. (Though it is possible that this only seemed that way because upgrading was "fire and forget": it would update all devices automatically once you gave it the go.) |
From the comments in code, it seems the 50 was chosen because higher values often result in instabilities. Also have to consider deeply nested routes and the cost involved. |
I have 170 Hue Devices (FML), I would probably be willing to kill for the option to update all (even if it did them sequentially). As it is, I often have 10-20 on the go at a time, and they usually take 2-3 hours each. I've not seen any real problems (though Z2m can respond slowly to clicks on the UI). |
@mundschenk-at Any chance you can run custom code to test an OTA refactor? If yes, find me on zigbee2mqtt Discord. |
@Nerivec Sure, I've pinged you on Discord. |
I know @mundschenk-at had some pretty good results with the new OTA refactoring. Had some good feedback on large |
@Nerivec can't easily test the dev branch as I'm on HAOS. Will report back in July with the next release. |
@Ricc68 You could switch to the Zigbee2MQTT Edge add-on. |
@Nerivec I have installed the Edge branch (latest dev) but the OTA update keeps failing. The good news is that z2m is not crashing anymore. [2024-06-15 18:18:42] debug: z2m:mqtt: Received MQTT message on 'zigbee2mqtt/bridge/request/device/ota_update/update' with data '{"id":"Lampada del soggiorno","transaction":"rpjz7-1"}' |
@Ricc68 Looks like your issue lies elsewhere (not OTA). Try bringing the device in question close to the coordinator, and re-pair the device to it directly. Then see if the device behaves better; you may have a router in between the two that's causing trouble. As long as you can't send simple commands to the device without encountering problems (i.e. the network isn't stable), there's no point even trying OTA: if it doesn't fail at first, it will likely fail at some point (it takes many, many messages to OTA successfully). |
@Nerivec I understand your point, but the bulb is 1.5m away from the dongle-e and additionally the bulb is the only router in my small ZigBee network. It's true that there's a lot of WiFi pollution from neighbours and that the dongle is near my WiFi router, but it was working fairly well before installing the dev branch. I will try to move the antenna further away. |
@Nerivec I have moved the dongle 1m further away from the WiFi router and things greatly improved. In 30 mins it updated the bulb and I don't see errors anymore. The good thing, other than having found the way to make my network more reliable, is that with the dev build a failed update is not crashing z2m anymore which I think is the true goal of this topic. |
What happened?
During the OTA update of a Philips Hue bulb z2m crashed.
One thing to note is how slow the OTA update is: it estimates more than 6 hours to update the light bulb firmware. I wonder if this is normal (I have no previous experience with ZigBee, so I don't know).
What did you expect to happen?
The OTA update completes successfully, and ideally faster, as I have never seen a device require so much time to update. BTW, this slow update also happens with the other 2 Sonoff TRVZB valves, so the cause does not seem to be a problematic device.
How to reproduce it (minimal and precise)
Run the OTA update of the Philips Hue bulb.
Zigbee2MQTT version
1.37.0
Adapter firmware version
7.4.2 [GA]
Adapter
Sonoff ZBDongle-E
Setup
Add-on on Home Assistant OS, host is a VM on x86-64, dongle uses ember driver
Debug log
[2024-05-05 12:32:04] info: z2m: Update available for 'Lampada del soggiorno'
[2024-05-05 12:32:17] info: z2m: Updating 'Lampada del soggiorno' to latest firmware
[2024-05-05 12:32:18] info: z2m: Update of 'Lampada del soggiorno' at 0.00%
[2024-05-05 12:32:52] info: z2m: Update of 'Lampada del soggiorno' at 0.14%, ≈ 401 minutes remaining
[2024-05-05 12:33:25] info: z2m: Update of 'Lampada del soggiorno' at 0.27%, ≈ 411 minutes remaining
[2024-05-05 12:34:01] info: z2m: Update of 'Lampada del soggiorno' at 0.51%, ≈ 339 minutes remaining
[2024-05-05 12:34:36] info: z2m: Update of 'Lampada del soggiorno' at 0.68%, ≈ 337 minutes remaining
[2024-05-05 12:35:09] info: z2m: Update of 'Lampada del soggiorno' at 0.82%, ≈ 346 minutes remaining
[2024-05-05 12:35:11] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":54,"sourceEndpoint":0,"destinationEndpoint":0,"options":256,"groupId":0,"sequence":154} messageTag=70]
[2024-05-05 12:35:11] error: zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":41440,"clusterId":33,"sourceEndpoint":242,"destinationEndpoint":242,"options":256,"groupId":0,"sequence":155} messageTag=48]
[2024-05-05 12:35:43] info: z2m: Update of 'Lampada del soggiorno' at 0.95%, ≈ 356 minutes remaining
[2024-05-05 12:36:18] info: z2m: Update of 'Lampada del soggiorno' at 1.17%, ≈ 339 minutes remaining
[2024-05-05 12:36:57] info: z2m: Update of 'Lampada del soggiorno' at 1.49%, ≈ 307 minutes remaining
[2024-05-05 12:37:31] info: z2m: Update of 'Lampada del soggiorno' at 1.66%, ≈ 309 minutes remaining
Error: Delivery failed for {"profileId":260,"clusterId":25,"sourceEndpoint":1,"destinationEndpoint":11,"options":4352,"groupId":0,"sequence":0}
at EmberOneWaitress.deliveryFailedFor (/app/node_modules/zigbee-herdsman/src/adapter/ember/adapter/oneWaitress.ts:96:31)
at EmberAdapter.onMessageSentDeliveryFailed (/app/node_modules/zigbee-herdsman/src/adapter/ember/adapter/emberAdapter.ts:558:30)
at Ezsp.emit (node:events:517:28)
at Ezsp.ezspMessageSentHandler (/app/node_modules/zigbee-herdsman/src/adapter/ember/ezsp/ezsp.ts:3957:18)
at Ezsp.callbackDispatch (/app/node_modules/zigbee-herdsman/src/adapter/ember/ezsp/ezsp.ts:794:18)
at Ezsp.tick (/app/node_modules/zigbee-herdsman/src/adapter/ember/ezsp/ezsp.ts:448:22)
at listOnTimeout (node:internal/timers:569:17)
at processTimers (node:internal/timers:512:7)
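The "minutes remaining" figures in the log above are consistent with a plain linear extrapolation from the progress so far. The sketch below parses two of the progress lines and reproduces the last estimate; it is an illustration of that arithmetic under the assumption of a constant transfer rate, not necessarily how Zigbee2MQTT computes the value internally.

```python
import re
from datetime import datetime

# Matches progress lines in the format shown in the log above.
PROGRESS_RE = re.compile(
    r"\[(?P<ts>[\d\- :]+)\] info: z2m: Update of '.*' at (?P<pct>[\d.]+)%"
)


def parse_progress(lines):
    """Return a list of (timestamp, percent) tuples from progress log lines."""
    points = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            points.append((ts, float(match.group("pct"))))
    return points


def remaining_minutes(points):
    """Linearly extrapolate the time left from the first and last data points."""
    (t0, p0), (t1, p1) = points[0], points[-1]
    rate = (p1 - p0) / (t1 - t0).total_seconds()  # percent per second
    return (100.0 - p1) / rate / 60.0


log = [
    "[2024-05-05 12:32:18] info: z2m: Update of 'Lampada del soggiorno' at 0.00%",
    "[2024-05-05 12:37:31] info: z2m: Update of 'Lampada del soggiorno' at 1.66%, ≈ 309 minutes remaining",
]
print(round(remaining_minutes(parse_progress(log))))  # prints 309
```

At 1.66% after 313 seconds, the transfer is progressing at roughly 0.3% per minute, which is where the five-to-six-hour estimate comes from.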