-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: test with multiple Nim version #429
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but strange that windows 1.2 / 1.4 segfaults?
You're probably aware of this but the linux-amd64 failure is a sporadic failure we see everywhere: #148
Plenty of failures, including this test case crashing at run time, on Windows only, after unrelated changes: nim-eth/tests/p2p/test_discoveryv5.nim Line 413 in ce296ff
or this check failing with Nim-1.6:
All this on top of the usual 1.6 partial GC-safety analysis that stops halfway, asking me to add {.gcsafe.} pragmas so it can continue. I'm trying to fix all I can, before giving up on some Nim versions and moving them in a daily workflow. |
Yeah, I was referring to the 1.2 crash on Windows. That is strange as I've never seen it before (including Windows). I can have a look at that if you aren't already. When adding multiple Nim versions being tested in CI I don't expect necessarily for all of the new versions also to work and/or be fixed in the same PR. But if so, even better, sure. |
Please do. I don't know what triggers it. Feel free to push to this branch, while testing. I'll refrain from rebasing it. |
It happens in here: nim-eth/tests/rlp/test_api_usage.nim Line 24 in ce296ff
Got a GDB backtrace:
|
A memory debugger called "Dr. Memory" has this to say about it:
|
A newer Dr. Memory version doesn't crash, but is more terse:
|
You got this locally? Looks like it crashes when it should catch the AssertionError with the unittest And the CI crash also happens in a test that does an expect on a defect: https://github.com/status-im/nim-eth/blob/master/tests/p2p/test_discoveryv5.nim#L435 |
Yes, locally. The segfault doesn't appear with ARC or ORC and it seems to be triggered by attempting to unwind a corrupt call stack while raising that exception. That first "unaddressable access" reported by Dr. Memory is a false positive. That's something |
Ok, I'll see if I can also reproduce in a VM. I quickly tried without the QUICK_AND_DIRTY flags as I noticed that as one clear difference in the CI compared to before, but it still failed. I've removed that commit again. |
Those are just to avoid building tools we don't need and to skip a final compiler bootstrap step that is known to produce a binary identical to the previous one. |
Commenting out these two lines towards the end of the test suite makes it pass: nim-eth/tests/rlp/test_api_usage.nim Lines 197 to 198 in ce296ff
Adding a simple |
It appears that you can work around the crash in the discoveryv5 tests on Windows by replacing |
You know what's weird? I cannot replicate this new crash locally, in a Windows 10 VM. |
Did a bit more testing to see why this works, turns out that just putting the test code in a proc is not enough. I needed to add Adding |
Just for reference: in my Windows 10 VM I only see the crash in |
Which puts an env struct pointer on the stack, right? Still feels like a random stack change. |
I don't know what |
I managed to replicate the crash locally by adding
|
Believe it or not, As to how it works, I don't know yet. Maybe it changes the stack alignment, which MinGW is notoriously bad with, creating problems when DLLs with one stack alignment are used from a binary with another one. |
d52aa28
to
a3c7fb3
Compare
|
and clean up the testing tree a little