-
Notifications
You must be signed in to change notification settings - Fork 830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: ed25519_test fails on ESP32 #5948
Comments
Hi @PaulMartinsen , Please try defining |
|
Hi @PaulMartinsen I noticed you are getting a bunch of watchdog timeouts during your benchmark tests. Note that in a recent I've confirmed a fresh install of wolfssl 5.5.4 (specifically commit If you are certain you are already doing this, then perhaps try deleting these directories and building fresh:
as well as remove your If you've done all these things and still are having problems, check that the wolfssl in your target components directory is actually v5.5.4. I have several different versions of ESP-IDF installed for two different toolchains, Each of course needs to have its own copy of wolfSSL installed. Here's my Also: check for a duplicate wolfssl in you local project directory. I do have a new Here's my benchmark output:
oddly, I was unable to find for completeness, here's my
Are you using an ESP32-S2, -S3, C3? Try to turn off hardware acceleration to see if that makes a difference. |
HI @gojimmypi , I discovered this while working on an ESP32-S3 port. When I found the test failed for both the ESP32-S3 and the ESP32 I raised the issue. I'm not sure if I need ED25519, but others might. For this issue, we could consider only the ESP32 target as WolfSSL doesn't support hardware acceleration on the ESP32-s3 yet. Also when I tried to run without hardware acceleration I got unhandled exceptions. I didn't look too closely as it wasn't important to me at this point. I'm not using the benchmark example. I'm calling
I compared the I loaded the bench mark program into VisualGDB, built against IDF-v5 and programmed it onto an ESP32. The output I got was:
This didn't report any errors for the ED25519 benchmark. However, as I mentioned, I am calling |
@PaulMartinsen ah yes, my mistake: If using instead WSL to build & flash:
Here's my most recent change to the end of
As you noted, I'm also calling wolf_test_task() in main.c. It is not super fast at this time without hardware acceleration, but the
Please confirm once you have the benchmark and tests working on the ESP32 (without any changes). Then let's update the GitHub subject, as it's not really a bug regarding the I'm looking forward to helping you get full ESP32-S3 support! Cheers |
Hi @PaulMartinsen if you are interested in TLS 1.3 examples for the ESP32 IDF v5, my wolfSSL/wolfssl-examples#360 was merged yesterday. Those examples should, in theory also work with my Espressif_Multichip branch, but I've not actually test them on the ESP32-S3 yet. |
Hi @gojimmypi , I tested on a new computer, with a fresh install tonight and an ESP32 (not ESP32s3) and found
I started going through differences in the |
Hi @PaulMartinsen that's certainly curious. The first thing that came to mind was the version of VisualGDB and more importantly the version of the ESP-IDF used between a new project and the existing one. Is the Exactly which version of VisualGDB are you using? I had some difficulty getting my ESP-IDF v5 going, but at the end of the year there was another sysprogs release and everything seemed to come together after that: https://twitter.com/gojimmypi/status/1608950750486597638 Is there any chance that the config used for a new project points to either a different toolchain and/or a different ESP-IDF version or directory with a different version of wolfSSL? Could there be a stray If it is failing as noted above (watchdog timeout) then perhaps all the I'm wondering what else might be different in your new program based on the example. Is it a project that you can share? |
@PaulMartinsen one more idea: you mentioned the failure in your new program based on the example. I'm wondering if you are calling |
Hi @PaulMartinsen I'm revisiting your example code, above. Is that the exact code you are using in VisualGDB in your I've been recently spending so much time with the Linux tools that when I first read this issue, I didn't even notice the
I can't even get that code to compile, even when removing the
I was however, able to get this to work:
This is essentially the same as your code, including the I'd really like to be able to see your entire test project, if possible. Thank you. |
Hi @gojimmypi , I attached a copy of the project I was building with VisualGDB (#1 above) ED25519Test.zip. I used version 5.6.109.4769 of VisualGDB and IDF-v5 when building this. The #2 project was build directly in the example folder using idf.py (no VisualGDB). This machine has only a single version of VisualGDB, IDFv5 and WolfSSL to avoid any confusion. I don't get the watchdog timeouts if I disable them. Sorry I wasn't clear on that; was getting late! They should be a red-herring though. My understanding is I think your native c code in the previous post should be functionally identical. I'd renamed Does the attached test project work for you? Thanks for your help with this, |
Hi @PaulMartinsen , The attached zip file was tremendously helpful! Thank you for providing that. I learned several things from your project. But more importantly, I found a way to fix the ed25519 test error, although not the root cause. Oddly, making a single change to Using winmerge to compare the effects of that setting: and Now, I realize that is not really very intuitive and there may be some underlying issue that needs to be addressed in either wolfSSL or the Espressif toolchain.... but at least it is a start. Here's the full output from your app with no other change other than the stack setting, setting my serial port to COM20, and using the Tigard JTAG to debug:
I realize this is already Friday for you... but if you could kindly test and confirm this, I'll take a closer look over the weekend. |
Great work @gojimmypi!. When I made that change to the stack smashing protection mode in the solution where I was working on the ESP32s3 hardware support I found the ED25519 test passes now. On both the ESP32 & the ESP32s3. Those flags seem to be putting guard bytes around the stack. Normally such things show up faults, not make the program work! A weird one for sure. Hopefully you can track down the root-cause.
|
The problem here appears to be byte alignment of the WC_ESP32SHA typedef struct in wolfcrypt/port/Espressif/esp32-crypt.h used exclusively by the Espressif port and only when Stack Smashing is set to This is counter-intuitive as noted above: as one would expect the problem to be visible when turning on the stack monitoring feature. The problem is not observed when setting Stack Smashing to To illustrate the byte-alignment cause, here's a code compare of passing and non-passing test results: Simply moving the position of the Note that some individually movable and unused guard bytes are included in this example and have been used to detect struct memory corruption to the Moving all of the largest types to the beginning of the Stackoverflow has a variety of results regarding typedef struct byte alignment memory allocation which are interesting, but none definitive on this particular issue. Here's an example of the Here's the The error occurs in ed25519_verify_msg_final_with_sha(): This is in the call to The actual error is triggered in the ed25519 ConstantCompare() but of course by the time we get there, the data is already bad. The compare fails on the 3rd call; two previous compares were successful. I'll be soon creating a PR that at least fixes the problem. However, I plan to leave this ticket open to continue investigating this odd situation as it is not at all intuitive and may be indicative of some other underlying problems. On the topic of other problems, one thing considered was the enum ESP32_SHA_FAIL_NEED_UNROLL = -1 , similar to what Espressif does in sha.h. Due to the type ambiguity of the |
Hi Jim, Great work tracking that one down. A tricky one for sure! It sounds like setting the stack smashing mode shifted the memory around so it was better aligned (coincidentally?). Is that right? I wonder if it is affected by optimization level too? I was using Debug (-Og). It would be interesting to see what assembly is being generated. I'm wondering if it is working with the reordered fields only because it is still overwriting a byte but in a situation that doesn't matter. I don't see any alignment modification on your typedef. I thought these structures are usually padded to a multiple of 2 or 4, etc. Seems like it might be incorrect translation of the source code. All in all, it quite alarming! |
Thanks, @PaulMartinsen! Indeed this has been an interesting problem. I agree and believe there are likely peripheral effects when turning on Espressif Stack Smashing, and object alignment appears to be one of them. I've tried a variety of Note PR #5997 is not my final solution. I've already had a wolfSSL staff engineer contact me stating "that PR looks fishy". I agree. That change does however significantly improve things to at least pass the test when Espressif Stack Smashing is turned off. More work is needed. I still believe there's some other underlying potential problem. I've been looking at a variety of Regarding unexpected overwriting of memory: in my dev branch I have a variety of guard bytes that I've experimented with at the beginning and/or the end of the struct. I've not yet seen one with unexpected contents. In any case, the good thing here is that there are some fairly aggressive and comprehensive tests, both in the test application as well as the Jenkins build. |
I'm close to getting a PR together to fully resolve this issue. This was never a byte alignment problem, despite the compelling evidence to the contrary. The root cause was the use of copied SHA I've updated #5989 with the details of the root cause of this ED25519 failure. Working code is on my ED25519_SHA2_fix branch. In particular the new esp_sha_init() that attempts to detect if an instance is a copy. HW is demoted to SW if a copy is detected. More details are on my Twitter microblog and will be included in PR. |
Hi @PaulMartinsen this issue of the ED25519 test failure on the ESP32 should now be fixed in the Although there were a variety of theories noted in above referenced issues and pull requests for things such as byte-alignment, stack issues, and initialization, note that the actual root cause was related to the copy of a ctx object with a SHA calculation in progress. There's now a new initializer attribute that keeps track of which See the Espressif wolfssl_test and the various VisualGDB Project Files. Note that for the ESP32-S3, the AES192 HW is not supported. These lines will need to be added to the default
For the problematic AES192 on the ESP32-S3, I have an updated CMakeFiles.txt in the works that will allow for project-specific user settings in a future PR. I also plan to make this fall back to SW more gracefully. Can you please check the ED25519 on your device and close this issue upon confirmation that it is working as desired? Thank you.
|
Hi @gojimmypi , sorry for the delay on this. Super busy atm. I ran into a few problems testing this on my system (Windows 11).
THe error I get is
The problem is we have spaces on Windows, so that the SRC_DIRS entry should have been The VisualGDB debugger has a problem with spaces too, which I resolve with I modified the After copying to a temp folder, it builds, uploads and.... all tests pass :). I'd say its worth fixing the space problem and path length shouldn't be a problem in modern file systems either. spaces would be a showstopper for me; the long paths are more of a side effect of the deep nesting you folks have that (probably ) doesn't affect projects. I didn't try very hard with
|
For the long paths, it may actually be a Windows defaults issue. I've had to set a registry value to allow long paths on Windows. Put the following in a file called something.reg:
Hopefully that fixes it. |
Hi @PaulMartinsen ! Thank you for the feedback.
Indeed there was a time when I had 8.4.0/8.1.0/r9 configured with ESP-IDF v5. I thought I had cleaned them all up to use 11.2.0/9.2.90/r2, instead. The wolfssl_test_IDF_v5_ESP32S3 appears to have the GCC 11.2.0 setting. Can you please confirm which wolfSSL version you are using? I think this is all cleaned up on the current I definitely want robust project files ready to go. I have a Sysprogs support request: Difficulties switching Espressif ESP-IDF version or Chipset. The response indicates that there's not a graceful solution to changing chipsets coming anytime soon. So it's even more important that project files are properly configured.
I have a considerably more robust wolfSSL component CMakeFiles.txt that I think is using your suggestion on quoted paths, although modified a bit. This version also allows for a local project Regardless of we might do here, please note the official Espressif Standard Setup of Toolchain for Windows indicates that the installation path must not be longer than 90 chars and contain no spaces:
@bandi13 has a good suggestion of using a registry setting. I think the error is specific to Windows and unlikely that a CMake setting will fix that. (although I do not know for certain). I probably will not apply that registry setting to my machine, as I prefer to not have any unusual settings that a customer typically will not have. But then, I do all my work in a I'll do my best to help you with the path issue, but I think the official wolfSSL support response will be to follow the Espressif guidance. I'm certain you are not the only one with spaces in the path. I definitely would like things to work for the maximum number of users without needing to fuss. btw - I know I've been referring you to my You may be interested in my new wolfSSL example coming soon, based exactly on your signing example: currently called wolfssl_server_6205 for the Thanks again for your patience and for taking the time to write up feedback. It's super helpful. I hope you are making good progress on your project. Please let me know if there's anything else I can do to help. Cheers |
Hi @gojimmypi , I took another look after merging all the changes in from the wolfssl master branch. Version problem was in I pulled a fresh copy of the wolfssl repository. And now the
Bit short of time this morning to figure out what went wrong. As it has been a problem in the past, I randomly tried adding On the topic of spacy paths, I would think of WolfSSL as a library and not part of the tool-chain. Espressif's problem comes from relying on a 3rd party debugger, which I imagine they have little control over. I imagine some annoying person must have complained too much so the gcc compilers and linkers are happy with spaces now. And now you've got that nice build-in-place make file going I think WolfSSL can expect users to want to reference the WolfSSL library as a submodule within their projects (well, that's what I do anyway :) ). I didn't get a chance this morning to check the new make file you linked to, but will report back when I get a chance. |
Hi there @PaulMartinsen !
Ah, yes. I'm glad you were able to find that. As you saw I renamed the project files to be chip and IDF-version specific. That's odd git did not indicate the file deletion. In any case, thanks for confirming it was not a toolchain version mismatch.
Thanks for the reminder on that. Indeed the S3 does not support AES-192 HW acceleration. See #6375 and page 834 of the ESP32-S3 Technical Reference Manual:
Please note the interim fix, and for all future versions, the
See if that works for you. If not, I'll dig a little deeper and create an interim PR to fix that.
I see your point there and agree that the scope of "what is the toolchain" could be somewhat ambiguous. As it is not entirely clear, I'd say that the official response is to honor the ESP-IDF guidelines with no spaces, and not use long paths. That said, I also would like to it still work with spaces, if possible. In the case of paths in the project, I took my WIP wolfssl_test and copied it to a directory with paces in the path. The respective wolfSSL component CMakeFiles.txt seemed to work. If you take that CMake file for a test drive and confirm it works in your environment, I'll create a PR and apply this to existing samples. I appreciate your patience as we work though these issues. Looking forward to your response. Cheers |
Hi @gojimmypi , Quick update....after adding to
The (enabled) tests all passed. I'm a little unclear why adding only Also, the file I changed ( Justthespacestocheckoutnow. ;-)
|
Hi @gojimmypi , this morning I noticed |
Hi @PaulMartinsen -
I understand. Between the current master wolfSSL, and the code on my WIP ED25519_SHA2_fix branch that I've referred to, there's been a change from using a shared There should be some detection logic if a My goal is to completely remove the functionality where a single Note that I am also working on an alternative solution that will introduce a new mechanism for including wolfSSL into a project. This week I've successfully configured wolfSSL at components.espressif.com. I will be creating a PR for a demo example in the near future. I do not expect to combine these two options. The above-mentioned project-specific In contrast, the components at the ESP Registry are extremely easy for the new user to integrate into a project. The drawback (or benefit, depending on your perspective) is that the component source files are not easily edited. In fact, there's a hash file that even detects changes and complains at build time! Given the install mechanism, there's certainly no easy way to push any local wolfSSL changes back to GitHub.
As for the AES192 - I should have clarified the interim workaround to turn off AES192 for only the ESP32-S3 requires the
Without the
I look forward to your report on my latest CMake file and if it works in your environment with spaces in the path. Note that I plan to have a single, universal CMake file that will be used in all projects, both from the ESP Registry and from the GitHub repo. There are relatively small changes needed to detect if the CMake file is part of an Espressif component or a local project. See #6234 for a roadmap of Espressif updates. Thanks again for all your patience and cooperation. Cheers |
Hi @gojimmypi , I had another look building. This time with the new CMake file, which I downloaded and placed into Running A question: does this new approach to Now I noticed line 30 of the new make file has
If I un-comment that and rebuild, the build is successful when the path to the WolfSSL library includes a space. :) The universal make file sounds very helpful. Generally I would put a library, such as WolfSSL, into a vendor folder, as a submodule in my project repository. This makes it easier to share the project with others. In theory they just check out the main project and get everything they need. Assuming they have the toolchain. Ideally, I would link the submodule to the vendors library but this only works if there is no need to edit any configuration file in the repository (because I wouldn't be able to push those changes to the vendor so a person checking out the project won't get them). So this means it would be much better if the |
Apologies for the delayed response. I got a bit nerd-sniped and chased this bunny down the rabbit hole to get a native Xtensa cross-compiled Linux kernel to run on the ESP32-S3 over the weekend. Oh, that's in addition to now having wolfSSL as a managed component that I also published just last week. Beware though, as a fellow VisualGDB user, I encountered an unexpected issue where the managed components are ignored in Visual Studio. I need to follow-up with the sysprogs response and see if I can get that working. Compiles just fine from commandline.
ah, that's excellent! I'm glad it worked as desired. Having the project-specific config and settings files is really the best approach. I'm glad my latest
That's an excellent question! No. At this time, the However, check out this blog Using user_settings.h with wolfSSL.
Probably, if that Note that when using managed components, the wolfSSL code is downloaded automatically at build time from the ESP Component Registry. This of course has pros and cons. The code is specific to the project, but contributing to the upstream GitHub repo would be rather awkward. Heads up I have an untested feature to specify wolfSSL source code location in an environment variable. (for Espressif)
Ah yes, I pointed you at the benchmark version of my newest
All excellent comments and observations. I'm interested in your feedback with regards to the new information on ESP Registry capabilities.
Yes, I'd like to see a universal makefile that detects, or can be configured, where it is and how to act accordingly. Plus: Customers can request just about any feature. :) |
Hi @gojimmypi , Sorry for the tardy reply. I've been busing getting ws-eventing going and building support to maintain multiple simultaneous sockets so we only need to the TLS handshake once, or at least not for every message to every peer. Keeping me busy, but nothing like running linux on an ESP32s3! I'm impressed! On the ESP registry capabilities: I think this is a nice, low effort way to explore/test libraries. Similar concept to nuget and the Arduino library manager, which works well. And I used this approach initially with an Espressif neo-pixel library. In the long run, I would include the library as a git submodule, either referencing the vendor library or a fork of it. That's the approach I used with WolfSSL. I suppose vendor registries might be less stable over than major sources like github. On build settings: for me, targeting windows and esp32s3 has turned out to be a good decision, partly because the Espressif support is still evolving in the WolfSSL library and partly because development is just easier in a desktop environment. Portable code is always a bonus. For that reason, I think flagging A better solution overall, for me at least, would be to setup the Windows build to use a shared items project. The WolfSSL repository's example lib project would reference that to create Shared item projects is the approach I'm currently using to target Windows and Espressif (with |
I'm so glad things are going well for your project!
Thanks! Yes, that's absolutely the objective of wolfSSL in the ESP Registry: something super easy to get started.
I'm standing on the shoulders of giants, but yes!! Linux on the ESP32-S3 is really cool. Did you see my initial performance benchmark comparison?
Interesting; can you provide more details, perhaps provide an example solution / project set? Is this regarding the my cmake warning when more than one I do have an update - recall the local I've reached out to the internal support to see if there are other ideas on sharing wolfSSL between projects on different platforms, such as your Windows & ESP32.
I'll take a look at that. Thanks for the suggestion. :) As for this issue, I'm fairly certain that the ED25519 test on the ESP32 for both hardware and software was fixed some time ago. I think it is a bit misleading to leave this issue open. If you don't mind, I'm going to go ahead and close it. Please feel free to open any additional issues as appropriate. I do appreciate all your feedback and ideas. We can certainly continue the discussion here, or perhaps over on my Espressif Improvements Roadmap Issue #6234. |
Contact Details
pmartinsen@gmail.com
Version
d686f0a (last commit on 2022/12/23)
Description
Calling
wolf_test_task()
fails in theed25519_test()
function when running on ESP32, IDF5.0 with error code -229.Failure point is in
ed25519_verify_msg_final_with_sha(...)
(ed25519.c) where
ret = ConstantCompare(rcheck, sig, ED25519_SIG_SIZE/2);` returns 255 instead of the expected 0.Note, it takes about 2 minutes between completing the AES-GCM test and reaching this fault, so the log reports several watchdog events warning the idle task is not running (because the tests are using up all the processor time).
A work around is to comment out
HAVE_ED25519
inuser_settings.h
if ED255519 isn't needed.Reproduction steps
Created new project, based on hello example, in VisualGDB. Using IDFv5, recent commit from WolfSSL master branch and also v5.5.4.
Calling
wolf_test_task()
from main.Wait patiently.
Relevant log output
The text was updated successfully, but these errors were encountered: