Espressif HAL5.1 - Core panic'ed on ESP32S3 with hello_world sample #71397

Closed
Piziwate opened this issue Apr 11, 2024 · 24 comments
Labels: bug, platform: ESP32, priority: low, Stale

Comments

@Piziwate
Contributor

Piziwate commented Apr 11, 2024

Describe the bug
When I build the hello_world sample on Zephyr's main branch and flash it, the MCU crashes at boot. The problem appeared a few days ago, around the transition to HAL 5.1.

I'm using a custom board based on the ESP32-S3 SoC (ESP32-S3-WROOM-2-N32R8V). As a check, I built for the esp32s3_devkitc/esp32s3/procpu board that ships with Zephyr, since hello_world doesn't use any external hardware or drivers.

Compiled with:
west build -b esp32s3_devkitc/esp32s3/procpu .\samples\hello_world --pristine

To Reproduce
Steps to reproduce the behavior:

  1. west build -b esp32s3_devkitc/esp32s3/procpu .\samples\hello_world --pristine
  2. west flash

Impact
Zephyr fails to boot (Guru Meditation Error: Core 0 panic'ed (IllegalInstruction)).

Logs and console output

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0xc (SPI_FAST_FLASH_BOOT)
Saved PC:0x40050f3f
SPIWP:0xee
Octal Flash Mode Enabled
For OPI Flash, Use Default Flash Boot Mode
mode:SLOW_RD, clock div:2
load:0x3fc8d1d8,len:0x186c
load:0x40374000,len:0x91b8
SHA-256 comparison failed:
Calculated: 4a27f71faf34de04864aae173eae34d25fba7a4d62a02981b4992fec4ed976a9
Expected: 00000000a0550000000000000000000000000000000000000000000000000000
Attempting to boot anyway...
entry 0x40377784
I (89) boot: ESP Simple boot
I (89) boot: compile time Apr 11 2024 15:56:11
W (89) boot: Unicore bootloader
W (90) spi_flash: Octal flash chip is using but dio mode is selected, will automatically swich to Octal mode
I (97) spi_flash: detected chip: mxic (opi)
I (101) spi_flash: flash io: opi_str
W (104) spi_flash: Detected size(32768k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (117) boot: chip revision: v0.1
Guru Meditation Error: Core 0 panic'ed (IllegalInstruction)
Core 0 register dump:
PC      : 0x42001f50  PS      : 0x00060b30  A0      : 0x8037caf4  A1      : 0x3fceb490
A2      : 0x00000001  A3      : 0x3c010d83  A4      : 0x3c010d8d  A5      : 0x00000078
A6      : 0x3c010d83  A7      : 0x00000009  A8      : 0x803796d4  A9      : 0x3fceb4a0
A10     : 0x00000001  A11     : 0x3c010d83  A12     : 0x3c010d8d  A13     : 0x3fceb4c0
A14     : 0x3fceb4a0  A15     : 0x0000000c  SAR     : 0x00000004  EXCCAUSE: 0x00000000
EXCVADDR: 0x00000000  LBEG    : 0x40056f5c  LEND    : 0x40056f72  LCOUNT  : 0x00000000

Backtrace: 0x42001f50:0x3fceb490 0x4037caf1:0x3fceb4e0 0x4037c64f:0x3fceb510 0x4037c7a8:0x3fceb530 0x40377787:0x3fceb550 0x40045c01:0x3fceb570 0x40043ab6:0x3fceb6f0 0x40034c45:0x3fceb710

The board then reboots in a loop.
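When comparing panics like the one above across builds, it can help to pull just the program-counter values out of the Backtrace line so they can be handed to a symbolizer (for example an addr2line built for the Xtensa target, run against the ELF from the build directory). The following is a minimal sketch in C, assuming the `PC:SP` pair format shown in the log; the helper name `backtrace_pcs` is hypothetical and not part of Zephyr or the Espressif HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the PC half of each "PC:SP" pair from an
 * ESP32-style backtrace line. Returns the number of PCs written to out
 * (at most max_out). */
static size_t backtrace_pcs(const char *line, uint32_t *out, size_t max_out)
{
    assert(line != NULL && out != NULL);

    size_t n = 0;
    const char *p = line;

    while (n < max_out && (p = strstr(p, "0x")) != NULL) {
        char *end;
        unsigned long pc = strtoul(p, &end, 16);

        if (*end != ':') {
            /* A hex value that is not followed by ':' is not a PC:SP pair. */
            p = end;
            continue;
        }
        out[n++] = (uint32_t)pc;

        /* Skip over the stack pointer value after the colon. */
        (void)strtoul(end + 1, &end, 16);
        p = end;
    }
    return n;
}
```

Called on the Backtrace line from this report, the first value returned is 0x42001f50, which matches the PC in the register dump.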

Environment (please complete the following information):

  • OS: Windows 11 Pro
  • Zephyr SDK zephyr-sdk-0.16.5
  • Commit SHA or version used: f021236
@Piziwate added the bug label Apr 11, 2024

@sylvioalves
Collaborator

sylvioalves commented Apr 11, 2024

@Piziwate would you please sync your main to the latest commit, run west update and then west blobs fetch hal_espressif? That at least makes sure we have the same environment. It works here.

@Piziwate
Contributor Author

Piziwate commented Apr 12, 2024

@sylvioalves I reinstalled everything at the latest version, but the problem persists: I get exactly the same error. Which ESP32 did you test on? In my case, hello_world works fine on a regular ESP32, but not on the ESP32S3.

Could it be related to the fact that my module has 32 MB of flash and 8 MB of PSRAM?
Could you share your boot log for comparison?

@sylvioalves
Collaborator

@Piziwate I used an N16R8 here, not an N32R8. Let me check again; the issue you describe sounds related to octal flash.

@sylvioalves
Collaborator

@Piziwate you mentioned this is a custom board. Does it use the USB interface, or is there a USB-serial converter? That would explain things.

@Piziwate
Contributor Author

@sylvioalves Yes, I'm using USB (uart0 is available on a debug header, but currently not used)

@sylvioalves
Collaborator

@Piziwate I was able to reproduce it now on a module with octal SPI flash (same as yours). I will look into the issue.

@sam131208

@sylvioalves Has the octal SPIRAM issue been resolved? If not, could we work on it together?

@nashif added the priority: low label Apr 23, 2024
@Piziwate
Contributor Author

@sylvioalves Do you have any news regarding this bug? I'm really stuck because of it! I'm willing to help but I admit I don't really know where to start looking!

@celinakalus
Contributor

I am currently stuck on a similar-looking issue with my esp32s3_devkitc: the core panics on the hello_world sample. I can confirm it works on the commit where initial board support was added. git bisect tells me:

a54f3832f5fc8fae55198045949e1991b3261450 is the first bad commit
commit a54f3832f5fc8fae55198045949e1991b3261450
Author: Marek Matej <marek.matej@espressif.com>
Date:   Wed Jan 31 20:18:02 2024 +0100

    kconfig.zephyr: Remove ESP_IDF bootloader option

    Remove ESP_IDF bootloader option.
    Default boot method is simple boot.

    Signed-off-by: Marek Matej <marek.matej@espressif.com>

 Kconfig.zephyr | 9 ---------
 1 file changed, 9 deletions(-)

So maybe it's not the transition to HAL 5.1 that is causing the problem?

I see two kinds of log output on problematic commits:

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x3 (RTC_SW_SYS_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x403cdd3d
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3818,len:0x1664
load:0x403c9700,len:0x4
load:0x403c9704,len:0xb74
load:0x403cc700,len:0x2e8c
entry 0x403c98fc
I (31) boot: ESP-IDF v5.1-dev-3972-g1559b6309f 2nd stage bootloader
I (31) boot: compile time Mar 15 2023 12:14:10
I (32) boot: chip revision: v0.2
I (36) boot.esp32s3: Boot SPI Speed : 80MHz
I (41) boot.esp32s3: SPI Mode       : DIO
I (45) boot.esp32s3: SPI Flash Size : 4MB
I (50) boot: Enabling RNG early entropy source...
I (56) boot: Partition Table:
I (59) boot: ## Label            Usage          Type ST Offset   Length
I (66) boot:  0 factory          factory app      00 00 00010000 00100000
I (74) boot: End of partition table
E (78) esp_image: image at 0x10000 has invalid magic byte (nothing flashed here?)
E (86) boot: Factory app partition is not bootable
E (92) boot: No bootable app partitions in the partition table

and on later commits (e.g. 7477636f0f9):

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x40050f3f
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fc8d1f0,len:0x186c
load:0x40374000,len:0x91d0
SHA-256 comparison failed:
Calculated: f95d0f487ccb304b1dd846f910286b316bbdea6a20fd5784ce8741de861e8558
Expected: 0000000080550000000000000000000000000000000000000000000000000000
Attempting to boot anyway...
entry 0x4037779c
I (68) boot: ESP Simple boot
I (68) boot: compile time Jun 28 2024 17:49:44
W (68) boot: Unicore bootloader
I (69) spi_flash: detected chip: generic
I (70) spi_flash: flash io: dio
W (73) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I bootloader_flash: XM25QHxxC startup flow
E bootloader_flash: XMC flash startup fail
E (93) boot.esp32s3: failed when running XMC startup flow, reboot!
[esp32s3] [ERR] HW init failed, aborting
Guru Meditation Error: Core 0 panic'ed (IllegalInstruction)
Core 0 register dump:
PC      : 0x4200202c  PS      : 0x00060730  A0      : 0x80045c04  A1      : 0x3fceb550
A2      : 0x00000011  A3      : 0x3ff1b1c2  A4      : 0x00000020  A5      : 0x00010000
A6      : 0x00010000  A7      : 0x0000aa90  A8      : 0x803777b4  A9      : 0x3fceb500
A10     : 0x0000002a  A11     : 0x3fc8d481  A12     : 0x3fc8ea5c  A13     : 0x600080b8
A14     : 0x00000008  A15     : 0xffffffff  SAR     : 0x00000004  EXCCAUSE: 0x00000000
EXCVADDR: 0x00000000  LBEG    : 0x4037833d  LEND    : 0x40378341  LCOUNT  : 0x00000000

Backtrace: 0x4200202c:0x3fceb550 0x40045c01:0x3fceb570 0x40043ab6:0x3fceb6f0 0x40034c45:0x3fceb710

I hope this helps. Any news regarding fixes? @sylvioalves

@sylvioalves
Collaborator

sylvioalves commented Jun 28, 2024

Edit: Can you fetch latest main and re-check it?

@celinakalus
Contributor

celinakalus commented Jun 28, 2024

Latest main (1159c2a) does not work; see the second variant of log output above.

@sylvioalves
Collaborator

Would you also test this change? zephyrproject-rtos/hal_espressif#297
I don't have a board with that particular XMC type right now to test it.

@celinakalus
Contributor

Still getting a boot loop, unfortunately:

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x40050f3f
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fc8d338,len:0x181c
load:0x40374000,len:0x9318
SHA-256 comparison failed:
Calculated: fa8b442bfad2529b395d9cf63152ed402cecaf4f5b0f559058e9962b92d550e9
Expected: 0000000090540000000000000000000000000000000000000000000000000000
Attempting to boot anyway...
entry 0x40377864
I (68) boot: ESP Simple boot
I (68) boot: compile time Jun 28 2024 18:20:31
W (69) boot: Unicore bootloader
I (69) spi_flash: detected chip: generic
I (70) spi_flash: flash io: dio
W (73) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I bootloader_flash: XM25QHxxC startup flow
E bootloader_flash: XMC flash startup fail
E (93) boot.esp32s3: failed when running XMC startup flow, reboot!
[esp32s3] [ERR] HW init failed, aborting
Guru Meditation Error: Core 0 panic'ed (IllegalInstruction)
Core 0 register dump:
PC      : 0x42001ec8  PS      : 0x00060730  A0      : 0x80045c04  A1      : 0x3fceb550
A2      : 0x00000011  A3      : 0x3ff1b1c2  A4      : 0x00000020  A5      : 0x00010000
A6      : 0x00010000  A7      : 0x0000ab80  A8      : 0x8037787c  A9      : 0x3fceb500
A10     : 0x0000002a  A11     : 0x3fc8d5c9  A12     : 0x3fc8eb54  A13     : 0x600080b8
A14     : 0x00000008  A15     : 0xffffffff  SAR     : 0x00000004  EXCCAUSE: 0x00000000
EXCVADDR: 0x00000000  LBEG    : 0x40378405  LEND    : 0x40378409  LCOUNT  : 0x00000000

Backtrace: 0x42001ec8:0x3fceb550 0x40045c01:0x3fceb570 0x40043ab6:0x3fceb6f0 0x40034c45:0x3fceb710

Tested with main (1159c2a) plus modified west.yml.

=== updating hal_espressif (modules/hal/espressif):
HEAD is now at 7cdf0b4a8e test: flash: add XMC flash ID
WARNING: left behind hal_espressif branch "bugfix/xmc_flash"; to switch back to it (fast forward):
  git -C ../modules/hal/espressif checkout bugfix/xmc_flash

@sylvioalves
Collaborator

Which board is that?

@celinakalus
Contributor

A supposedly ESP32-S3-DevKitC-1 compatible board off of Amazon, branded DollaTek. Supposedly contains an ESP32-S3-WROOM-1-N16R8, according to the engraving on the ESP module.

@Piziwate
Contributor Author

Piziwate commented Jul 1, 2024

@sylvioalves, I don't know whether your modifications were also meant to address my issue. I ran a test again this morning and I'm still seeing continuous restarts with the same message!

@LeoBriandFiveO
Contributor

I get the same issue on latest main. Is there any news?

@cburlacu
Contributor

I also have the same issue on a genuine ESP32-S3-DevKitC-1 v1.1. According to the module marking, it carries an ESP32-S3-WROOM-2-N32R8V module.

@brandon-exact
Contributor

I am also eager for a fix. If anyone has a workaround please share!

@cburlacu
Contributor

FWIW it appears to be fixed now (there are a few recent commits in Espressif's HAL):

*** Booting Zephyr OS build v3.7.0-2963-g418b1e0e2146 ***
Hello World! esp32s3_devkitc/esp32s3/procpu

@celinakalus
Contributor

Very glad to hear that. Indeed, the current main works for me, too. The first working commit for me is 795ac34f291c, directly after a HAL update (the commit of the HAL change itself did not compile).

In the ESP HAL, bisecting the range of commits covered by that HAL update shows that commit 30ae474a7f55 (hal: gcc: Compiler flag for strict volatile bitfields) is the one that fixes the problem.

This commit is not currently on v3.7-branch, and indeed, the boot loop occurs there. Cherry-picking the commit mentioned above restores functionality.
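
For readers unfamiliar with the flag named in that commit title: GCC's -fstrict-volatile-bitfields asks the compiler to access a volatile bit-field with a single access of the width of the field's declared type, whereas without it the compiler is free to pick whatever access width it finds most efficient. The sketch below is only an illustration of that idea. The register layout `demo_reg_t` and both helper functions are hypothetical, invented for this example, and are not taken from the ESP32-S3 register map or from hal_espressif. It contrasts a plain bit-field write with an explicit 32-bit read-modify-write that does not depend on compiler flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit peripheral register. Field names and widths are
 * illustrative only, not a real ESP32-S3 register. */
typedef union {
    struct {
        uint32_t enable   : 1;
        uint32_t mode     : 3;
        uint32_t divider  : 8;
        uint32_t reserved : 20;
    } f;
    uint32_t val;
} demo_reg_t;

_Static_assert(sizeof(demo_reg_t) == sizeof(uint32_t),
               "register must occupy exactly one 32-bit word");

/* Bit-field write: the load/store width is chosen by the compiler.
 * With -fstrict-volatile-bitfields GCC uses the width of the declared
 * field type (32 bits here); otherwise a narrower access may be used. */
static inline void set_mode_bitfield(volatile demo_reg_t *reg, uint32_t mode)
{
    assert(reg != NULL);
    reg->f.mode = mode & 0x7u;
}

/* Explicit read-modify-write: one 32-bit load and one 32-bit store
 * through the word-sized member, regardless of compiler flags. */
static inline void set_mode_word(volatile demo_reg_t *reg, uint32_t mode)
{
    assert(reg != NULL);
    demo_reg_t tmp;
    tmp.val = reg->val;         /* single 32-bit volatile read */
    tmp.f.mode = mode & 0x7u;   /* modify a local, non-volatile copy */
    reg->val = tmp.val;         /* single 32-bit volatile write */
}
```

On ordinary memory both helpers store the same value; the difference only matters for memory-mapped registers that must be accessed as full 32-bit words.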

@sylvioalves any chance to get this change backported? I would love to be able to use the LTS release on my ESP32S3.

@sylvioalves
Collaborator

sylvioalves commented Sep 19, 2024

@celinakalus Sounds like we should. Although that change fixed the issue, I still need to verify it properly.

@sylvioalves added this to the v3.7.1 milestone Sep 27, 2024
sylvioalves added a commit to sylvioalves/zephyr that referenced this issue Oct 1, 2024
ESP32-S3 initialization code should apply the errata
after cache initialization. This fixes it making sure
data and cache instruction are properly
handled and let following calls to work as needed.

This also update hal_espressif to force gcc to treat
register bitfield structs declared as volatile to
ensure writes on 32 bit peripheral registers.

Fixes zephyrproject-rtos#71397
Fixes zephyrproject-rtos#76325

Signed-off-by: Sylvio Alves <sylvio.alves@espressif.com>
nashif pushed a commit that referenced this issue Oct 5, 2024
ESP32-S3 initialization code should apply the errata
after cache initialization. This fixes it making sure
data and cache instruction are properly
handled and let following calls to work as needed.

This also update hal_espressif to force gcc to treat
register bitfield structs declared as volatile to
ensure writes on 32 bit peripheral registers.

Fixes #71397
Fixes #76325

Signed-off-by: Sylvio Alves <sylvio.alves@espressif.com>

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions bot added the Stale label Nov 27, 2024
@github-actions bot closed this as not planned Dec 11, 2024
9 participants