follow-up: longjmp annotations and optimization #30

Open
nikomatsakis opened this issue Jun 11, 2020 · 9 comments

Comments

@nikomatsakis
Contributor

As discussed in this week's meeting, we realized that permitting longjmp out of a "C" function will lose optimization potential in a case like this:

fn foo(x: &mut u32) {
    *x += 1;
    bar();
}

At least at present, we should be able to move the *x += 1 down below the call to bar(), so long as we also perform it on the unwinding path. However, longjmp would make that reordering observable.
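
To make the reordering concrete, here is a minimal sketch that models it by hand. Everything beyond foo and bar is an assumption for illustration: the bool parameter, the WriteOnExit guard, and the run helper are hypothetical, and a panic stands in for unwinding. The guard performs the write when the frame is left, whether by normal return or by unwinding, which is roughly the transformation described above. A longjmp that discards the frame without running this cleanup would skip the write, and that is what would make the reordering observable.

```rust
use std::panic::{self, AssertUnwindSafe};

// Stand-in for the callee; a panic models unwinding out of bar().
fn bar(should_unwind: bool) {
    if should_unwind {
        panic!("unwinding out of bar");
    }
}

// The function as written in the source.
fn foo_original(x: &mut u32, should_unwind: bool) {
    *x += 1;
    bar(should_unwind);
}

// Hypothetical guard: performs the write when the frame is torn down.
struct WriteOnExit<'a>(&'a mut u32);

impl Drop for WriteOnExit<'_> {
    fn drop(&mut self) {
        *self.0 += 1;
    }
}

// Hand-written model of the reordered version: the write happens after
// bar(), on both the normal-return path and the unwinding path.
fn foo_reordered(x: &mut u32, should_unwind: bool) {
    let _guard = WriteOnExit(x);
    bar(should_unwind);
}

// Runs one variant and reports the value of x seen by the caller afterwards.
fn run(f: fn(&mut u32, bool), should_unwind: bool) -> u32 {
    let mut x = 0;
    let _ = panic::catch_unwind(AssertUnwindSafe(|| f(&mut x, should_unwind)));
    x
}
```

With ordinary return or unwinding, run(foo_original, ..) and run(foo_reordered, ..) report the same value, so the caller cannot tell the two versions apart.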

We discussed an idea where people annotate functions that "may longjmp" in some way -- two ideas were #[longjmp] and #[pof], though neither is ideal. It would be UB for a function's frame to be deallocated by a longjmp unless the function carries this annotation. Further:

  • The compiler can warn if a #[longjmp] function calls another longjmp fn (or a fn that may longjmp, e.g., one called by fn pointer) with pending destructors in scope. This carries a risk of false warnings since the fn may in fact not longjmp.
  • The compiler can warn if a non-longjmp fn calls a longjmp one. This carries a risk of missed warnings since calls by pointer could target a longjmp fn.

We would also suppress reordering optimizations around function calls in longjmp-functions.
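
The two warnings above can be sketched as a check over a toy model of a call site. This is only an illustration of the rules as stated, not a compiler implementation: CalleeKind, Call, Warning, and check_call are all hypothetical names, and no #[longjmp] attribute exists in Rust today.

```rust
// Hypothetical classification of what a call site targets.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CalleeKind {
    Plain,     // a fn known not to carry the annotation
    Longjmp,   // a fn known to carry the annotation
    FnPointer, // unknown target; it may or may not longjmp
}

// Hypothetical description of a single call site.
struct Call {
    callee: CalleeKind,
    pending_destructors: bool, // are any destructors in scope at the call?
}

#[derive(Debug, PartialEq)]
enum Warning {
    DestructorsAcrossLongjmp,
    LongjmpFromUnannotatedFn,
}

fn check_call(caller_is_longjmp: bool, call: &Call) -> Option<Warning> {
    match (caller_is_longjmp, call.callee) {
        // First rule: an annotated caller has destructors pending across a
        // call that may longjmp. Fn pointers are included, hence the risk of
        // false warnings.
        (true, CalleeKind::Longjmp | CalleeKind::FnPointer) if call.pending_destructors => {
            Some(Warning::DestructorsAcrossLongjmp)
        }
        // Second rule: an unannotated caller calls a known longjmp fn. Calls
        // through fn pointers are not flagged, hence the risk of missed warnings.
        (false, CalleeKind::Longjmp) => Some(Warning::LongjmpFromUnannotatedFn),
        _ => None,
    }
}
```

The asymmetry in how fn pointers are treated is the source of the false positives in the first rule and the false negatives in the second.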

@Amanieu
Member

Amanieu commented Jun 11, 2020

It occurs to me that this optimization might be unsound even without longjmp. Consider the case where x points to a memory-mapped file and bar calls exit(). You would expect the write to x to be reflected in the file on exit, but that won't happen if the write is moved after the call.

@nikomatsakis
Contributor Author

@Amanieu Yes, so I brought up the idea that &mut references into shared memory were simply not compatible with this optimization, and I thought the conclusion might be that you should not create &mut references into shared memory (you should instead prefer raw pointers or &Cell). But memory-mapped files are a good example of shared memory in practice, and I'm not sure whether avoiding &mut in such cases is really practical -- maybe using &mut there is widespread practice?
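
As a rough sketch of the alternatives mentioned here, the same function can be written against &Cell or a raw pointer. The names foo_cell and foo_raw and the closure parameter standing in for bar are assumptions made for illustration. Because neither type promises exclusive access, the callee may legitimately observe the memory, so the write has to be visible before the call. Whether this is a practical answer for memory-mapped files is left open in the discussion.

```rust
use std::cell::Cell;

// Shared, interior-mutable access: the callee may hold another reference to
// the same cell, so it is allowed to observe the increment.
fn foo_cell(x: &Cell<u32>, bar: impl FnOnce()) {
    x.set(x.get() + 1);
    bar();
}

// Raw-pointer access: no exclusivity is asserted about the pointee.
//
// Safety: `x` must be valid for reads and writes of a u32.
unsafe fn foo_raw(x: *mut u32, bar: impl FnOnce()) {
    unsafe {
        *x += 1;
    }
    bar();
}
```

In both versions a callee that reads the same location sees the incremented value, which is the behavior one would expect for shared memory.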

@BatmanAoD
Member

I also suggested #[cancelable]. #[cancel-safe] may be more descriptive. The Linux pthreads man page uses the term "async-cancel-safe" for some functions, but I'm not sure what the "async" part means.

We discussed whether an annotation is sufficient and agreed that it probably is. This prevents annotating function pointers as #[cancelable], but that is acceptable (and in any case, annotations for function pointers may eventually be added to the language).

We also agreed that when the annotation is introduced, we can specify that using longjmp or pthread_exit to skip destructors in functions that are not annotated with #[cancelable] is always UB.

@petrochenkov

petrochenkov commented Jun 12, 2020

One common case for longjumps is jumping from a code cache (jitted code produced by any system that does interpretation and needs to speed it up in common cases, e.g. a simulator) to regular code in exceptional situations (e.g. some access violation in the simulated system).

The handler for exceptional situations normally resides somewhere at the top of the code tree, so if functions that can terminate with longjump need to be annotated explicitly, then pretty much the whole codebase will have to be annotated.

EDIT: Unless some inference is done for code that we see and know, and annotations are needed only for the cases of "external" code, which includes function jumping into the code cache.

EDIT2: The longjump here is required to be a "teleportation" rather than unwinding, since there's no common stack between the code cache and regular code, so you have to tweak the longjump behavior on platforms where it unwinds by default.
However the requirement that *x += 1 must not be moved over bar() still holds.

@BatmanAoD
Member

BatmanAoD commented Jun 12, 2020 via email

@bjorn3
Member

bjorn3 commented Jun 12, 2020

That won't make dependencies longjmp-safe, while making it much easier to forget about longjmp-safety when you use it.

@BatmanAoD
Member

Per Niko's suggested warning scheme, it would emit a warning for every single call into a non-longjmp-safe dependency. So in practice, it would only be convenient to use in isolation or with other dependencies designed with longjmp-safety in mind.

@nikomatsakis
Contributor Author

> The Linux pthreads man page uses the term "async-cancel-safe" for some functions, but I'm not sure what the "async" part means.

I imagine the "async" in "async cancel safe" refers to whether an asynchronous signal could cancel the function at any point (versus saying that it can be canceled at each point where it invokes another function). The lint we were proposing (check that no dtors are in scope at each function call) would therefore make things "cancel safe" but not "async cancel safe".
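
A small sketch of what "no dtors in scope at each function call" could mean in practice. The function names and the closure parameter standing in for a possibly-longjmp-ing callee are hypothetical. The first function keeps a Vec alive across the call, so the proposed lint would presumably flag it; the second drops the Vec before the call and keeps only a plain integer. std::mem::needs_drop is used only as a rough indicator of which types have destructors.

```rust
use std::mem::needs_drop;

// `buf` has a destructor that is still pending when `callee` runs. If the
// callee skipped this frame, the Vec would never be freed.
fn holds_destructor_across_call(callee: impl FnOnce()) -> usize {
    let buf = vec![1u8, 2, 3];
    callee();
    buf.len()
}

// `buf` is dropped at the end of the inner block, before the call; only a
// plain integer lives across it.
fn no_destructor_across_call(callee: impl FnOnce()) -> usize {
    let len = {
        let buf = vec![1u8, 2, 3];
        buf.len()
    };
    callee();
    len
}

// Rough indicator of whether a value of type T has drop glue.
fn has_drop_glue<T>() -> bool {
    needs_drop::<T>()
}
```

Both functions return the same result under normal control flow; they differ only in what cleanup is outstanding at the call, which is the property the lint would look at. As noted above, this addresses cancellation at call sites only, not asynchronous cancellation at arbitrary points.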

@nikomatsakis
Contributor Author

I feel like the "annotate a ton of code" use case might also be handled by procedural macros at the module level, though I don't know how well that works. It feels like a bit of an edge case to me I guess. Still, we could permit it at the module level for sure.
