[quest] Running multiple OTA updates concurrently? #2093

Closed
LordMike opened this issue Dec 23, 2021 · 13 comments
@LordMike
Contributor

Hey,

I couldn't find it described anywhere, but I'm currently trying to update two (of 11) devices with new firmware through zwavejs2mqtt, concurrently. It seems to work, in that the logs show the two devices at different stages of their respective updates, but I wanted to make sure it'll work. Is this safe?

I'm worried about state issues within this project, where it might serve the same (the last) firmware to all devices asking for firmware. Or maybe, when device 2 asks for "part 1", it actually gets "part 335", which device 1 has reached.

Hoping this can be cleared up :)

Also, it would be really cool to be able to start updates on N devices, perhaps in sequence (as the network would probably get exponentially slower for each concurrent update). :)

Mike.

@LordMike added the "question" (Further information is requested) label Dec 23, 2021
@LordMike
Contributor Author

LordMike commented Dec 23, 2021

Other questions I have on updates:

  • Can I start an update for a battery-powered device, and then wait a week for it to complete? I guess the device will eventually wake up and notice the update.
  • In this case, can I somehow see that an update is pending for a device, and how far it has come?
  • What happens if I start two firmware updates for one device, maybe by mistake?

@robertsLando
Member

I could make some UI improvements for the update status. I'll let @AlCalzone answer the other questions.

@AlCalzone
Member

> Or maybe, when device 2 asks for "part 1", it actually gets "part 335" which device 1 has reached..

The firmware update status is tracked on each node instance, so confusing the packets isn't possible. Also, if it were possible, it would cause the checksum validation to fail.

> Can I start an update for a battery powered device, and then wait a week for it to complete?

No. After starting the update, the device must request each packet within a certain time (a minute, if I'm not mistaken).

> What happens if I start two firmware updates for one device - maybe by mistake?

The second API call will throw an error.
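
To illustrate the two answers so far (per-node tracking, and an error on a second start), here is a minimal sketch. It is not the actual zwave-js code: `FirmwareSession`, `UpdateManager` and all method names are hypothetical, and fragments are plain strings for readability.

```typescript
// Hypothetical sketch; none of these names come from zwave-js.
class FirmwareSession {
  constructor(private readonly fragments: string[]) {}

  // Fragment numbers are 1-based in this sketch.
  getFragment(n: number): string {
    const fragment = this.fragments[n - 1];
    if (fragment === undefined) throw new RangeError(`no fragment ${n}`);
    return fragment;
  }
}

class UpdateManager {
  // One session per node ID instead of a single shared "current firmware".
  private readonly sessions = new Map<number, FirmwareSession>();

  begin(nodeId: number, fragments: string[]): void {
    if (this.sessions.has(nodeId)) {
      throw new Error(`node ${nodeId} already has an update in progress`);
    }
    // Copy the array so later changes by the caller cannot affect this session.
    this.sessions.set(nodeId, new FirmwareSession(fragments.slice()));
  }

  handleFragmentRequest(nodeId: number, n: number): string {
    const session = this.sessions.get(nodeId);
    if (!session) throw new Error(`node ${nodeId} has no update in progress`);
    return session.getFragment(n);
  }

  finish(nodeId: number): void {
    this.sessions.delete(nodeId);
  }
}

const manager = new UpdateManager();
manager.begin(1, ["A1", "A2", "A3"]);
manager.begin(2, ["B1", "B2"]);
```

With this layout, node 2 asking for fragment 1 is answered from its own session no matter how far node 1 has progressed, and calling `begin` again for node 1 throws until `finish(1)` has been called.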

> In this case, can I see, somehow, that an update is pending for a device / how far it has come?

On every status change, the driver emits an event telling the application the progress. It is up to the application to display that.
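
As a rough idea of what "up to the application" could mean, the sketch below keeps the latest percentage per node from progress notifications. The `ProgressSource` class and the `UpdateProgress` shape are assumptions made for this example. The driver's real event names and payloads may differ.

```typescript
// Hypothetical shapes; the driver's real event names and payloads may differ.
interface UpdateProgress {
  nodeId: number;
  sentFragments: number;
  totalFragments: number;
}

type ProgressListener = (progress: UpdateProgress) => void;

// Stand-in for whatever object emits progress notifications.
class ProgressSource {
  private readonly listeners: ProgressListener[] = [];

  onProgress(listener: ProgressListener): void {
    this.listeners.push(listener);
  }

  report(progress: UpdateProgress): void {
    for (const listener of this.listeners) listener(progress);
  }
}

// Application side: remember the latest percentage per node for display.
const percentByNode = new Map<number, number>();
const source = new ProgressSource();
source.onProgress(({ nodeId, sentFragments, totalFragments }) => {
  percentByNode.set(nodeId, Math.round((sentFragments / totalFragments) * 100));
});

source.report({ nodeId: 5, sentFragments: 335, totalFragments: 1340 });
source.report({ nodeId: 7, sentFragments: 3, totalFragments: 4 });
```

A UI could read `percentByNode` to show a progress bar or a percentage next to each node that is currently updating.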

Regarding the original question:
It currently is possible to start multiple updates at the same time. Because of the load it puts on the network it is not a good idea though.
The driver is currently lacking the ability to sequence tasks on a higher level which would be necessary to automatically take care of this.
See zwave-js/node-zwave-js#3707
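
Until something like that exists in the driver, an application could sequence updates itself. The following is only a sketch under assumptions: `UpdateSequencer` is a hypothetical helper, `startUpdate` stands in for whatever call begins an OTA update, and the application is assumed to call `onUpdateFinished` when it learns that an update has ended.

```typescript
// Hypothetical helper; not part of zwave-js or zwavejs2mqtt.
class UpdateSequencer {
  private readonly pending: number[] = [];
  private active: number | undefined;

  // startUpdate stands in for whatever call actually begins an OTA update.
  constructor(private readonly startUpdate: (nodeId: number) => void) {}

  enqueue(nodeId: number): void {
    this.pending.push(nodeId);
    this.startNextIfIdle();
  }

  // Call this when the update for nodeId has ended, successfully or not.
  onUpdateFinished(nodeId: number): void {
    if (this.active !== nodeId) return;
    this.active = undefined;
    this.startNextIfIdle();
  }

  private startNextIfIdle(): void {
    if (this.active !== undefined) return;
    const next = this.pending.shift();
    if (next === undefined) return;
    this.active = next;
    this.startUpdate(next);
  }
}

const started: number[] = [];
const sequencer = new UpdateSequencer((nodeId) => {
  started.push(nodeId);
});

sequencer.enqueue(3);
sequencer.enqueue(4);
sequencer.enqueue(9);
sequencer.onUpdateFinished(3);
```

Only one update runs at a time here: node 4 starts once node 3 has finished, and node 9 stays queued until node 4 is done.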

@LordMike
Contributor Author

Great. It did seem like it worked, and I hoped any checksums would catch errors, so I started 8 updates when I went to bed. I knew it would take a long time (longer than 8 sequential updates), but I imagined it would be done by the morning anyways.

I think it managed to update 4 devices by the morning. I thought the rest had given up, as I saw no status updates - but suddenly some appeared - so I rebooted everything.

I know OTAs are rare, but more visibility and management capabilities would be awesome. If nothing else, to give me a sense that it will work - like, hide the “begin” button on devices already in progress (replace it with a progress bar).

This way I know that I can’t start two concurrent updates on the same device, and I know how many are running.

If possible, add a percentage to the firmware column in the devices list for those in progress. :)

Mike.

@LordMike
Contributor Author

LordMike commented Dec 25, 2021

> > Or maybe, when device 2 asks for "part 1", it actually gets "part 335" which device 1 has reached..
>
> The firmware update status is tracked on each node instance, so confusing the packets isn't possible. Also if it were possible that would cause the checksum validation to fail.

What I was really worried about is a classic coding mistake: storing the new firmware in a global static list, from which the devices then fetch parts. When a new update is started, this list changes, and the currently running updates get incorrect packets. Ideally this breaks the checksum later on, as you point out, but the risk is high :)

> > Can I start an update for a battery powered device, and then wait a week for it to complete?
>
> No. After starting the update, the device must request each packet in a certain time (a minute if I'm not mistaken)

Is there any way around this, or is it entirely intentional for the update to be a user-driven process?

(Of course zwavejs2mqtt could "queue" the update, so that it begins the desired update when the device wakes up. So the question is really: is the process meant to be manual?)

@robertsLando
Member

I think this could be fixed with a new column showing the update status of that node. Could you open a feature request about this?

@LordMike
Contributor Author

Well, at least the usability part - I'll make an FR for it..

As for the "updating all battery devices" part - is a queued update viable, where zjs2mqtt can wait N days until the device comes back online?

@AlCalzone
Member

> is a queued update viable, where zjs2mqtt can wait N days until the device comes back online?

No, because you have to initiate the update manually on the device anyway.

@LordMike
Contributor Author

Wait, what? I've updated some powered devices remotely.

I assumed that whenever a battery-powered device does its wakeup cycle, it becomes available to the controller and will receive any queued messages. At this point, the controller could send a new Update_md command, after which the device stays awake to get the new firmware.

Pressing the button on the device will mostly put it in the awake state, right?

@AlCalzone
Member

Well... in this case this might already work as-is. The firmware update process starts by requesting metadata from the device. This message should be queued until the device wakes up. The next message instructs the device to start the update process, requesting frames from the controller.

Note that this won't work if a device requires manual activation, like my Aeotec multisensor.
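
The queue-until-wake-up behaviour described above can be pictured with the sketch below. It is a simplified model, not the driver's send queue: `SleepingNodeQueue` and the message strings are hypothetical, and it does not cover devices that need manual activation.

```typescript
// Hypothetical model of a per-node queue for a sleeping device.
class SleepingNodeQueue {
  private readonly queued: string[] = [];
  private awake = false;
  // Messages that have actually gone out to the device, in order.
  readonly sent: string[] = [];

  send(message: string): void {
    if (this.awake) {
      this.sent.push(message);
    } else {
      // The device is asleep: hold the message until it wakes up.
      this.queued.push(message);
    }
  }

  onWakeUp(): void {
    this.awake = true;
    while (this.queued.length > 0) {
      this.sent.push(this.queued.shift() as string);
    }
  }

  onSleep(): void {
    this.awake = false;
  }
}

const sensor = new SleepingNodeQueue();
sensor.send("request firmware metadata"); // held back, device is asleep
const sentBeforeWakeUp = sensor.sent.length;
sensor.onWakeUp(); // the queued metadata request goes out now
sensor.send("start firmware update"); // device is awake, sent immediately
```

In this model the metadata request waits for the wake-up, and the start command follows while the device is still awake.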

@LordMike
Contributor Author

LordMike commented Dec 27, 2021

Won't the button push just put the device in the awake state?

In the guide for the multisensor, it says:

  11. Select FIRMWARE UPDATE and then click START. The over-the-air firmware upgrade of your MultiSensor 6 will begin.
  12. If the Multisensor 6 is battery powered, the firmware update may not initiate right away. Just tap the button on the Multisensor 6 then the update should begin.

From that, I think the device just needs to be awake (point 11 says it will start immediately). If it's not awake (point 12), you wake it.

@AlCalzone
Member

> Won't the buttonpush just put the device in the awake state?

This might be the case here. But, to quote the specs:

> It is RECOMMENDED that firmware update is enabled by out-of-band authentication (e.g. physical activation of a pushbutton) prior to the transmission of this command.

This is not limited to battery-powered devices, meaning that even if the device wakes up, it might not start requesting the firmware fragments until you push the button. In the case of an automatic update, this might mean that the update is aborted with a timeout before you get to doing that.
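
A sketch of how such a timeout could abort an update is shown below. It rests on assumptions: `FragmentWatchdog` is a hypothetical name, the one-minute value is only the figure recalled earlier in this thread, and timestamps are passed in explicitly so the logic stays independent of real timers.

```typescript
// Assumed value: "a minute" is only the figure recalled earlier in this thread.
const FRAGMENT_TIMEOUT_MS = 60 * 1000;

// Hypothetical helper that tracks how long a device has been silent.
class FragmentWatchdog {
  private lastActivityMs: number;

  constructor(startedAtMs: number, private readonly timeoutMs: number) {
    this.lastActivityMs = startedAtMs;
  }

  // Call whenever the device requests a firmware fragment.
  onFragmentRequested(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // True once the device has been silent for longer than the timeout.
  hasTimedOut(nowMs: number): boolean {
    return nowMs - this.lastActivityMs > this.timeoutMs;
  }
}

// Update started at t = 0, but nobody pushes the button on the device.
const unattended = new FragmentWatchdog(0, FRAGMENT_TIMEOUT_MS);
const abortedAfter61s = unattended.hasTimedOut(61 * 1000);

// Button pushed after 30 s, so the device starts requesting fragments.
const attended = new FragmentWatchdog(0, FRAGMENT_TIMEOUT_MS);
attended.onFragmentRequested(30 * 1000);
const stillRunningAt80s = !attended.hasTimedOut(80 * 1000);
```

In the first case the update would be aborted because no fragment request arrived within the window. In the second case the request at 30 s resets the window.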

@LordMike
Contributor Author

Oh, I see. Thanks :)

I think I noticed some failed updates, which would be consistent with what you wrote about a 1 minute timeout - but I could be thinking of something else.

In any case - thanks for all the answers. It's been very helpful. :)
