Segmentation fault on Apple Silicon / arm64 #343
Comments
Out of curiosity, would the |
Fails with exactly the same issue:
|
weird because the assembly is supposed to be different. |
As far as I see in the diff to master, some of the |
Right ... what happens if you use this change :
Just theoretical; I wish I had an M1 to test for real :-) |
Seems to pass and now faults in |
Yes, it would be a similar-ish solution if it works like : Thanks for your patience :-) |
With With
|
Too bad. The changes work on Linux arm64, so maybe the M1 gear needs a specific code path in this case. I just wanted to see, since mimalloc uses the thread slot, whether that's maybe not such a good thing in the M1 case. Maybe I'm wrong though; lots of maybes at the moment :-) |
When I comment line 298, the
|
Right, so maybe the M1 needs a specific code path (a specific register, like the x86 case?). I see M1 issues popping up here and there in other projects; it seems to be a really particular animal. |
Hi @xhochy , lucky you with an M1 :-) Thanks @devnexen for looking into this. If you get to it, can you submit the assembly changes together with the correct pre-processor test for the M1? Hope we can get this working! |
@xhochy, I just pushed an update for macOS ; can you try again with the latest |
Latest dev didn't make a difference. Exactly the same behaviour as before. |
Also didn't make any difference, with or without the suggestions by @devnexen. In general this looks like the same issue as people are facing on iOS (see #262), and I tried to use the fixes suggested there. Thus I defined This then leads to a slightly different error:
I guess that the fix for the M1 and iOS will be the exact same as they both use similar execution models and the same chip design ;) |
Yes, that also works for me. Only |
I was wondering whether, only in the case of Darwin ARM devices, we should use the tpidrro_el0 TID register instead? tpidr_el0 always returns 0 otherwise. |
Hmm, I wish I had an M1 to test with; It is always tricky on the mac as the loader calls |
|
They're completely unrelated registers that are both available. Linux and FreeBSD leave it entirely up to userspace to manage TLS how it sees fit, so userspace needs a writable register, not a read-only one. On macOS/Mach-O, TLS used to be more bolted on the side (and came extremely late), unlike the implementation for ELF, and it appears it still is. Aside from that, my guess is that for dubious "security" reasons they decided to have the kernel play a larger part in controlling userspace's TLS: whilst userspace can read it quickly, writing it needs to go through the kernel (which isn't exactly a hot path, but there's also not much good reason for it; it's just unnecessary overhead, especially since you still need to context switch). So both choose which register to use based on their requirements and design. Neither is right or wrong, nor is either a special case; they're just different. Having said that, macOS is the odd one out among the Unixes. |
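To make the register discussion above concrete, here is a minimal sketch of a per-platform thread-pointer read. It is not mimalloc's actual code: the helper name `tls_base` is hypothetical, and masking the low three bits on Darwin is an assumption based on the understanding that macOS stores extra data there. On non-arm64 targets the sketch falls back to the address of a thread-local variable so that it still compiles and runs.

```c
#include <stdint.h>

/* Hypothetical helper (not part of mimalloc): returns a per-thread base
 * address, choosing the register according to the platform. */
static inline uintptr_t tls_base(void) {
#if defined(__aarch64__) && defined(__APPLE__)
  /* Darwin arm64: the thread pointer lives in the read-only register
   * tpidrro_el0; tpidr_el0 reads as 0 here, as reported in this thread. */
  uintptr_t tcb;
  __asm__ volatile("mrs %0, tpidrro_el0" : "=r"(tcb));
  /* Assumption: the low 3 bits carry non-pointer data, so clear them. */
  return tcb & ~((uintptr_t)0x7);
#elif defined(__aarch64__)
  /* Linux/FreeBSD arm64: userspace manages TLS through the writable
   * register tpidr_el0. */
  uintptr_t tcb;
  __asm__ volatile("mrs %0, tpidr_el0" : "=r"(tcb));
  return tcb;
#else
  /* Portable fallback for other architectures: the address of a
   * thread-local variable is unique per thread and never null. */
  static __thread int anchor;
  return (uintptr_t)&anchor;
#endif
}
```

Calling `tls_base()` twice on the same thread should give the same non-zero value. Reading `tpidr_el0` on an M1 instead would give 0, which matches the null-dereference style crash described in this issue.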
I believe we figured out a way to make this work -- thanks so much! Really nice to see how this could be patched and merged without me having an M1 available -- the power of the internet :-) Thanks again! |
I bought an M1 for testing because I contribute to Open Source projects. You are welcome to use it. Send over your The GCC Compile Farm also has an M1 for testing. You can get an account at https://cfarm.tetaneutral.net/; see the "Request and Account" link at the top of the page. Through the Compile Farm you get access to a lot of machines including ARM64, POWER8, POWER9 and Sparc. OSes include AIX, Linux, OS X, the BSDs and Solaris. |
Hi @xhochy -- this has taken a while, but I think I fixed the issue in the latest releases (v1.7.2, v2.0.2). I did some quick testing and mimalloc is often twice as fast as the system allocator (up to 10x in xmalloc-test):
|
I tested the latest Environment: Apple M1 (MacBook Air), macOS v11.4 |
any progress on it? |
Building mimalloc fails in the TLS initialization with a segmentation fault: