Ota commands in flash #6538

davisonja · 2019-09-20T04:49:42Z

Currently OTA updates involve the use of (volatile) RTC memory to keep track of update commands across a chip reset. If there is a power loss part-way through the flashing process, the result is likely to be corrupt flash. Ideally these commands would be persisted in something less volatile - like flash.

This PR adds that functionality, and resolves #905.

Update commands are stored in a reserved area of flash that has been inserted just before the start of the FS in flash. This ensures that other things (like sdkwifi pages) don't end up moving and that we don't lose FS space, but it does cost 2 pages of potential sketch space. This may be an issue on smaller devices, and it is possible to make the reservation optional - though the last consensus was that it was a small enough loss not to be an issue.

A pointer to the location of the flash store has been added to the start of the flash, along with a magic number that the bootloader can use to recognise whether or not flash-storage should be checked. In this way it's possible to use the updated bootloader with older sketches without any issues (and means that control over using flash storage should be kept within the sketch).
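To illustrate the opt-in check described above, here is a minimal sketch run against an in-memory copy of flash. The magic value, the header layout, the header offset and the name `find_cmd_store` are all assumptions made for this example; the thread does not give the PR's real values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values: the PR's real magic number and header offset
   are not stated in this thread. */
#define FLASH_CMD_MAGIC   0x0EB00701u
#define FLASH_CMD_HDR_OFF 0u

/* Hypothetical header: a magic number followed by a pointer to the store. */
typedef struct {
    uint32_t magic;          /* present only in sketches that opt in */
    uint32_t cmd_store_addr; /* flash address of the reserved command area */
} flash_cmd_header_t;

/* Returns 1 and fills *addr when the image advertises a flash command
   store. Returns 0 otherwise, so an updated bootloader can fall back to
   its previous behaviour for older sketches. */
int find_cmd_store(const uint8_t *flash, size_t flash_size, uint32_t *addr)
{
    flash_cmd_header_t hdr;

    assert(flash != NULL && addr != NULL);
    if (flash_size < FLASH_CMD_HDR_OFF + sizeof hdr) {
        return 0;
    }
    memcpy(&hdr, flash + FLASH_CMD_HDR_OFF, sizeof hdr);
    if (hdr.magic != FLASH_CMD_MAGIC) {
        return 0; /* older sketch: no flash storage to check */
    }
    if (hdr.cmd_store_addr >= flash_size) {
        return 0; /* pointer is outside the flash: treat as absent */
    }
    *addr = hdr.cmd_store_addr;
    return 1;
}
```

Because the check fails closed, an image without the magic number is handled exactly as before, which is the compatibility property the description relies on.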

As it stands the location of the update data hasn't been changed, and so is still in the first 1MB of flash, meaning you have a maximum sketch size half of that (less reserved areas). Once this has received some meaningful testing beyond my boards I anticipate updating the system to allow update data to be stored further on in flash, thus allowing larger OTA updates - assuming it's not already been done; I've not been through the current state beyond what was required to merge in the flash update solution.

boards.txt.py now generates .ld files that should include all the necessary information, and in the right places.

There is now a README.md file in the eboot folder as the start of some descriptions of how it works. It still needs some details added, but realistically all that info is already in the source itself.

I've now tested it on real hardware, which works, after a few fixes. 😄


davisonja · 2019-09-20T05:03:03Z

@d-a-v @Androbin a draft PR if you're interested.

OpenUAS · 2019-09-20T11:27:30Z

Good useful PR, so a CI that does not fail and the PR goes through would be great...

davisonja · 2019-09-20T23:28:15Z

Travis is next on the list, along with some tests on a couple of different ESPs that I've got. The other thing I need to check, since I'm listing to-dos, is that I finished the eboot doc changes.

Androbin · 2019-09-26T23:46:48Z

What would happen if I deployed the current bootloader and had it update itself to the new version later? Would it brick itself because the running code would be overwritten?

davisonja · 2019-09-27T00:13:45Z

> What would happen if I deployed the current bootloader and had it update itself to the new version later? Would it brick itself because the running code would be overwritten?

While I don't have my notes handy at the moment, I'm fairly sure that self-updating was supported; if not already (I forget whether the writing in eboot is already totally run-from-RAM), then certainly as an achievable improvement. Switching to the flash-using bootloader entirely OTA is an intended use case.


…ault off. Readded the start of the eboot docs in README.md (still WIP, but it's a start) Adjusted the default number of commands that are cycled through for wear levelling to 32 Fixed a 🤦 bug in the loop to find the first available flash-based command block
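As a rough illustration of the slot search mentioned in that commit message, the sketch below scans a fixed ring of command slots for the first one that is still erased. The slot layout, the field names and `find_free_slot` are hypothetical; only the count of 32 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_SLOT_COUNT  32u         /* from the commit message above */
#define CMD_SLOT_ERASED 0xFFFFFFFFu /* erased NOR flash reads as all ones */

/* Hypothetical slot layout: only the first word matters for the search. */
typedef struct {
    uint32_t magic;   /* stays erased until a command is stored here */
    uint32_t args[7]; /* illustrative payload */
} cmd_slot_t;

/* Returns the index of the first unused slot, or -1 when every slot has
   been used and the reserved area must be erased before reuse. */
int find_free_slot(const cmd_slot_t *slots, size_t count)
{
    assert(slots != NULL);
    for (size_t i = 0; i < count; i++) {
        if (slots[i].magic == CMD_SLOT_ERASED) {
            return (int)i;
        }
    }
    return -1;
}
```

Spreading commands over several slots means the reserved area only needs an erase once per full cycle rather than once per command, which is the wear-levelling effect the commit refers to.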

lrodorigo · 2020-10-16T17:20:42Z

Sorry for jumping in... I just want to report that I am using flash-stored OTA commands (a fork of the PR) on my devices. I have updated more than 200 devices tens of times (without compression) without a single issue.

On Fri, 16 Oct 2020 at 11:27, Julian Davison wrote:

> Oh, potentially getting rid of the existing RTC stuff will help. The original approach left that intact to provide a path of least change, but if we were to opt to go with just flash storage, it might all fit. Will explore that next.

-- Luigi R.

Instead of using either a series of ets_putc calls or setting a series of bytes one by one, use a simple macro to define 32b constants to build up strings. Saves ~30 bytes of program code in eboot for #6538 to work with.
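A minimal sketch of that idea, assuming a little-endian target (as the ESP8266 is): four characters are packed into one 32-bit constant so a short string can be stored as a couple of word writes. The macro name `S4`, the function `build_msg` and the message text are made up for this example and are not taken from eboot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack four characters into one 32-bit constant, first character in the
   lowest byte. On a little-endian CPU the word then lies in memory in
   string order. The name S4 is hypothetical. */
#define S4(a, b, c, d)                        \
    ((uint32_t)(uint8_t)(a)                 | \
     ((uint32_t)(uint8_t)(b) << 8)          | \
     ((uint32_t)(uint8_t)(c) << 16)         | \
     ((uint32_t)(uint8_t)(d) << 24))

/* Build the 8-byte string "boot ok" (7 characters plus the terminator)
   from two word constants instead of eight separate byte stores. */
void build_msg(char out[8])
{
    const uint32_t words[2] = {
        S4('b', 'o', 'o', 't'),
        S4(' ', 'o', 'k', '\0'),
    };

    assert(out != NULL);
    memcpy(out, words, sizeof words); /* little-endian layout assumed */
}
```

Whether this actually saves code size depends on the compiler and target; the ~30 byte figure above is the author's measurement for eboot, not something this sketch reproduces.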

devyte · 2020-10-17T01:28:56Z

@lrodorigo it is always a good thing to have other testers confirming that something works!

devyte · 2020-10-17T01:32:34Z

@davisonja that's only 152 bytes over, not a lot. I don't think it makes sense to keep both rtc and flash commands, so I'd say you're on the right path.

drzony · 2020-10-17T08:53:37Z

@devyte @davisonja This code exploits the same write-without-erase "feature" that SPI does, so it may stop working at some point with new flashes. From what I can see it "should" work on PUYA because of read-then-write, but we need to be careful here.


davisonja · 2020-12-28T09:26:49Z

> @devyte @davisonja This code exploits the same write without erase "feature" that SPI does, it may stop working at some point with new flashes. from what I can see it "should" work on PUYA because of read-then-write, but we need to be careful here.

Are there flash systems where this doesn't apply, @drzony ? If I follow your comment it's the behaviour where we're modifying already written blocks to mark commands as needing action?

davisonja · 2020-12-28T10:08:23Z

> @davisonja that's only 152 bytes over, not a lot. I don't think it makes sense to keep both rtc and flash commands, so I'd say you're on the right path.

Finally had a first run at removing the RTC stuff, and it looks like that does the trick: compressed images and flash storage can both run without the RTC support (and also without serial debugging on, but with #7545 applied more liberally through the other debug code it might yet be possible to get that in too).

I've yet to verify the code actually works after all these random code removals, but the fact that it fits is a good start :)

drzony · 2020-12-28T10:21:45Z

@davisonja Currently I don't think there are any flashes that we don't handle. But the workaround is based on flash ID. If another flash comes out that works in the same manner as PUYA ones, then it will break silently (without read-then-write PUYA flashes corrupt the bytes written).
Yes, I mean writing to the same bytes of flash twice without an erase for command storage.

earlephilhower · 2020-12-28T16:40:02Z

Your CI failures are related to you having an ESP8266SDFat submodule in your commit. The easiest thing at this point would be to just update to the proper ESP8266SDFat commit (ESP8266SdFat @ 0a46e4e) and git add that submodule again.

davisonja · 2020-12-29T09:29:23Z

@drzony I might need to find the PUYA info on a computer - I had a quick look last night from my phone.

My impression of flash systems was that they (all) require an erase cycle (usually over a sequence of consecutive addresses - a page or block) which resets the storage to a known value (typically 1's IIRC); from there you can write actual values, which is achieved by updating bits from the erase-value. The write process can be repeated as many times as you like, but you can only ever toggle a bit from the erase-value to the other one. So in the case of an erase-value of 1, you can only ever clear bits, but you can do so at any time - there's no concept of write-count, so 'twice' doesn't really apply.

From a (fairly quick) look around this seems to be fairly standard, though the exact amount you have to 'write' varies (as in, a byte, or several bytes) the ability to update any 1 to a 0 holds. It's definitely a common 'trick'. Worth a note in the info on the eboot docs, tho.
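The behaviour described in these two paragraphs can be modelled in a few lines. This is a simulation in RAM of the general "program can only clear bits" rule, assuming an erase-value of all ones; the function names are hypothetical and nothing here talks to real flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_ERASED 0xFFu /* erase-value assumed to be all ones */

/* Simulated program operation: a write can clear bits but never set
   them, so the stored result is the AND of old and new values. */
void flash_program(uint8_t *cell, uint8_t value)
{
    assert(cell != NULL);
    *cell = (uint8_t)(*cell & value);
}

/* Returns 1 when `wanted` can be reached from `current` using only
   1->0 transitions, i.e. without an erase; 0 otherwise. */
int reachable_without_erase(uint8_t current, uint8_t wanted)
{
    return (current & wanted) == wanted;
}
```

For example, programming `0xF0` and then `0x3C` into an erased cell leaves `0x30`: the second write can only clear more bits, and getting back to `0xF0` would need an erase.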

drzony · 2020-12-29T12:06:53Z

@davisonja
Yes, this is standard, but spi_flash_write does not erase anything.
So in eboot_command_write_to_flash there is a second write to the same address without an erase. This works because most SPI flashes have an undocumented feature: you can write to the same address again as long as the write only makes 1->0 transitions. (FYI, spi_flash_write always writes in 4-byte increments.) There is no write-count, but PUYA flashes corrupt data on a second write if the address was not read before the write. So on "standard" flashes you can write bit-by-bit 1->0 transitions, but on PUYA flashes (and probably others in the future) you can write each 4 bytes only once (after which an erase is required) unless you use the read-then-write trick.

> From a (fairly quick) look around this seems to be fairly standard, though the exact amount you have to 'write' varies (as in, a byte, or several bytes) the ability to update any 1 to a 0 holds. It's definitely a common 'trick'. Worth a note in the info on the eboot docs, tho.

That's the problem I'm highlighting, that this is a 'common trick', but it's not really documented by any manufacturer and may not work on all flashes.

Some of it is already described in #7644
There was a lengthy discussion in #7514

davisonja · 2020-12-30T04:44:50Z

@drzony that's an interesting collection of reading, ta.

Do you think the current scheme (as in this incomplete PR) is going to be ok to proceed with given that it looks as though it's compatible with the flash implementations we've dealt with so far? I'm not clear whether 'be careful here' is aimed at that being kept in mind, or in changing the process the code is using to mark commands...

drzony · 2020-12-31T16:26:28Z

@davisonja I would change the way of writing the commands. The simplest idea would be to have each flag as a separate uint32_t (this will waste some space, but will make sure that the bytes are written only once) and do not write them when saving the command (only when marking). Then 0xFFFFFFFF would mean flag not set and 0x00000000 flag set.
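A rough sketch of that suggestion, simulated in RAM: the command payload and the flag live in separate 32-bit words, the flag word is left erased when the command is saved, and it is written exactly once when the command is marked. The record layout and all names here are hypothetical, not the PR's actual structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_ERASED  0xFFFFFFFFu
#define FLAG_NOT_SET 0xFFFFFFFFu /* flag word still erased */
#define FLAG_SET     0x00000000u /* flag word written once */

/* Hypothetical command record: every field is its own 32-bit word. */
typedef struct {
    uint32_t action;
    uint32_t args[3];
    uint32_t done_flag; /* not touched when the command is saved */
} flash_cmd_t;

/* Save a command into an erased slot. AND models flash programming.
   The flag word is deliberately left alone. */
void cmd_save(flash_cmd_t *slot, uint32_t action, const uint32_t args[3])
{
    assert(slot != NULL && args != NULL);
    assert(slot->done_flag == FLAG_NOT_SET);
    slot->action &= action;
    for (size_t i = 0; i < 3; i++) {
        slot->args[i] &= args[i];
    }
}

/* Mark the command as handled: the single write to the flag word. */
void cmd_mark_done(flash_cmd_t *slot)
{
    assert(slot != NULL);
    slot->done_flag &= FLAG_SET;
}

/* A command is pending when it has been saved but not yet marked. */
int cmd_is_pending(const flash_cmd_t *slot)
{
    assert(slot != NULL);
    return slot->action != WORD_ERASED && slot->done_flag == FLAG_NOT_SET;
}
```

Each word is written at most once between erases, so the scheme does not depend on a second write to an already programmed word. The cost is one extra 32-bit word per flag.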

davisonja · 2021-01-01T07:26:53Z

@drzony so the considered wisdom is that a full 32-bits is safest - is that for flash in general, or the current scheme of writing used in this project?

drzony · 2021-01-01T10:14:33Z

@davisonja It's the limitation of spi_flash_write from Espressif SDK:

Memory passed to spi_flash_write must be 4-byte aligned i.e. (uint32_t)buffer % 4 == 0
It always writes 4 bytes even if passed size is less than 4 bytes
If starting flash offset is not 4-byte aligned, you cannot pass page (256 bytes) boundary
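The three constraints listed above can be expressed as a small pre-flight check. This is only a sketch of the rules as stated in this comment, not the workaround code referred to below; `write_request_ok` and `effective_write_size` are hypothetical helpers and do not call spi_flash_write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 256u

/* Bytes actually written: the requested size rounded up to a multiple
   of 4, since a write always covers whole 4-byte units. */
size_t effective_write_size(size_t size)
{
    assert(size <= SIZE_MAX - 3u); /* guard the rounding against overflow */
    return (size + 3u) & ~(size_t)3u;
}

/* Returns 1 when a write request respects the constraints, else 0. */
int write_request_ok(uint32_t flash_offset, const void *buffer, size_t size)
{
    /* The source buffer must be 4-byte aligned. */
    if (((uintptr_t)buffer % 4u) != 0u) {
        return 0;
    }
    if (size == 0u) {
        return 1;
    }
    /* An unaligned flash offset must not cross a 256-byte page boundary. */
    if ((flash_offset % 4u) != 0u) {
        size_t first_page = flash_offset / FLASH_PAGE_SIZE;
        size_t last_page = ((size_t)flash_offset + size - 1u) / FLASH_PAGE_SIZE;
        if (first_page != last_page) {
            return 0;
        }
    }
    return 1;
}
```

The page check here uses the requested size; a stricter variant could use `effective_write_size(size)` instead, since the rounded-up write is what reaches the flash.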

I wrote workarounds for this, you can see them here

zhfei1979 · 2022-04-09T00:29:24Z

Poor bug, still not fixed

davisonja · 2022-04-09T00:36:26Z

It is still on my list, there's been too little time to redo it. I have a working solid solution, that needs slight tweaking to account for some specific flash behaviour mentioned in the discussion. Unfortunately the code base has moved significantly since it was first built and we're at the point where it's simpler to re-apply the concept to the current code, and sort out the specifics of fitting everything in (or a migration path).


d-a-v · 2022-04-09T22:22:12Z

> Unfortunately the code base has moved significantly since it was first built

We can help with this if needed

davisonja added 6 commits September 20, 2019 11:01

WIP: Flash-stored OTA commands

91c95a5

Basic merge and update of boards.txt.py, untested!!

e97e1e7

Missed some brackets and forgot to finish a line!

9d598f2

Tweak values and move OTA commands to the end of sketch space (which …

9733d42

…should mean nothing else moves)

Result of running python3 tools/boards.txt.py --allgen

c850515

Cleaned up some things that had leaked in from testing

f3d01a2

mcspr mentioned this pull request Sep 20, 2019

Restore EEPROM address to prior released location #6537

Merged

devyte changed the title ~~Ota commands in flash~~ WIP - Ota commands in flash Sep 20, 2019

Androbin mentioned this pull request Sep 20, 2019

Merge eboot changes from davisonja #6533

Closed

davisonja mentioned this pull request Sep 21, 2019

OTA which survives power failure #905

Open

davisonja added 5 commits September 21, 2019 22:45

Added constant APP_START_OFFSET which was missed out of the transfer

8f1a74c

Add includes

32f1539

CI Cleanup

923b2d7

CI Cleanups

8205f17

CI Cleanups

16127c2

davisonja added 5 commits September 29, 2019 16:42

Merge remote-tracking branch 'upstream/master' into ota_commands_in_f…

ce6a824

…lash boards.txt et al need to be regenerated

Regenerated the files

a931150

switched out the cast

1339018

Adjusted values to be addresses not offsets and fixed up references

68245ac

Tweak tests

fa00e1e

davisonja mentioned this pull request Oct 2, 2019

Travis config relies on (untested) arduino nightly build #6582

Closed

davisonja added 2 commits October 2, 2019 19:11

Merge remote-tracking branch 'upstream/master' into ota_commands_in_f…

abe0f39

…lash

davisonja changed the title ~~WIP - Ota commands in flash~~ Ota commands in flash Oct 3, 2019

Merge branch 'master' into ota_commands_in_flash

217a9c5

earlephilhower mentioned this pull request Jan 27, 2021

eboot: .RODATA, upstream uzlib, move CRC, save 112 bytes #7844

Merged

devyte mentioned this pull request Mar 30, 2021

Running out of heap (and thus crashing) during filesystem update will corrupt the filesystem #7950

Closed

d-a-v modified the milestones: 3.0.0, 3.0.1 Mar 31, 2021

devyte mentioned this pull request Apr 4, 2021

PoC for handling Erase WiFi Setting after OTA #6965

Draft

d-a-v modified the milestones: 3.0.1, 3.1 Jun 16, 2021

d-a-v mentioned this pull request Jun 17, 2021

Power off at the ending of ESPhttpUpdate.update, exception occurs for later power on #8121

Closed

earlephilhower mentioned this pull request Jun 23, 2021

suggest master & slave sketches support as espressif's non-OS SDK archieved #8133

Closed

d-a-v modified the milestones: 3.1, 4.0.0 Dec 16, 2022
