Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statepoints instead of the shadow stack #16847

Closed
wants to merge 13 commits into from
Closed

statepoints instead of the shadow stack #16847

wants to merge 13 commits into from

Conversation

carnaval
Copy link
Contributor

@carnaval carnaval commented Jun 9, 2016

this is finally passing tests. It still uses our "one big alloca" frame since the statepoint lowering does a terrible job at spilling individual roots. Instead we get an address relative to the frame and generate no additional code.

caveats :

  • only works on linux, x86-64, glibc
  • only works with llvm-svn
  • needs a few patches

patches to llvm are IMO uncontroversial bug fixes
the patch to libunwind is unfortunate. It is there because the gc needs to unwind tasks that are not the current task which is a pain with COPY_STACKS.
The easiest way I found to do so is to use the "remote" api of libunwind to pretend that the task's stack is switched. However the remote part does not support our dynamic table format.
The "proper" fix would be to implement it but that felt a little silly: we know that the dwarf info is never on the stack and we would be potentially leaving performance on the table by indirecting all the memory accesses through a function call. I just forced the resolution to use the local api instead, meh.

On the bright side this saves us the TLS access on function entry, the push/pop stores and the size/link stores.

I would be surprised if we didn't find other bugs in llvm's statepoint handling so this will probably stay optional for a long time :-/

TODOs

  • move the jmpbuf glibc specific stuff out and factor it from where it was taken in task.c
  • stress test for a day or two with aggressive gc
  • make it work with multithreading (not too hard I think, just need to unwind different contexts)

@@ -49,7 +49,7 @@ using namespace llvm;
#include "julia_internal.h"
#include "codegen_internal.h"
#ifdef _OS_LINUX_
# define UNW_LOCAL_ONLY
//# define UNW_LOCAL_ONLY
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note this is probably not needed here, I'm not sure anymore I'll have to check)

@carnaval
Copy link
Contributor Author

carnaval commented Jun 9, 2016

Ah, also I think it'd be better to intercept the stackmaps in the section allocator instead of the debuginfo handler. That way we can parse them ahead of time and throw away most of the info, keeping only what's useful to us. Will wait for Yichao's PR to be merged to integrate that.

@Keno
Copy link
Member

Keno commented Jun 9, 2016

patches to llvm are IMO uncontroversial bug fixes

Did you submit them upstream?

@JeffBezanson
Copy link
Member

JeffBezanson commented Jun 9, 2016

Amazing! Any performance (or code size) numbers?

@carnaval
Copy link
Contributor Author

carnaval commented Jun 9, 2016

Did you submit them upstream?

not yet. I was not entirely sure about one of them so I asked on llvm-dev and got no answer yet.

Amazing! Any performance numbers?

Not yet :-) I wouldn't be surprised if this slowed down the gc a lot when there are a lot of tasks. @yuyichao I remember you quoting numbers for the price of the TLS access in the prologue, what did you use as a bench ?

@yuyichao yuyichao added compiler:codegen Generation of LLVM IR and native code GC Garbage collector labels Jun 9, 2016
@yuyichao
Copy link
Contributor

yuyichao commented Jun 9, 2016

#14083 (comment)
$(Expr(:the_exception)) (or whatever it is) also used to generate a TLS reference without a GC frame before #15369 (comment)

@tkelman
Copy link
Contributor

tkelman commented Jun 9, 2016

I was not entirely sure about one of them so I asked on llvm-dev and got no answer yet.

Try IRC. llvm-dev can be very hit or miss with actually getting responses.

@@ -178,6 +178,7 @@ $(build_sysconfdir)/julia/juliarc.jl: $(JULIAHOME)/contrib/windows/juliarc.jl
endif

$(build_private_libdir)/%.$(SHLIB_EXT): $(build_private_libdir)/%.o
objcopy --globalize-symbol __LLVM_StackMaps $<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

osx binutils doesn't have objcopy, and for cross-compile it would need the cross prefix

@ararslan
Copy link
Member

ararslan commented Jun 9, 2016

Your commit messages read like poetry

@StefanKarpinski
Copy link
Member

"statepoints instead of the shadow stack" by O. Blumberg:

aaa
wip
close
fix silly llvm bug
almost there
dum dum
squeak squeak
fix llvm memory dep w/ statepoints
remove size & link from the frame
fix
ups
--_

@JeffBezanson
Copy link
Member

If it's passing at least some tests, make test-perf is likely to work and give us some idea of where we are.

@carnaval
Copy link
Contributor Author

ref http://reviews.llvm.org/D21259

@Keno
Copy link
Member

Keno commented Jan 17, 2020

Our GC lowering has changed quite a bit, so this is quite outdated. Additionally, we've found that relying on libunwind can be quite problematic, since that unwind information is often wrong. We can keep this branch around for reference if somebody ever wants to try again in the future, but I think this PR can be closed.

@Keno Keno closed this Jan 17, 2020
@DilumAluthge DilumAluthge deleted the ob/stp branch March 25, 2021 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code GC Garbage collector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants