-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix BTreeMap
UB caused by using pointer identity on a static
#63338
Conversation
In rust-lang#50352, `EMPTY_ROOT_NODE` was introduced to avoid allocating when creating an empty `BTreeMap`. `EMPTY_ROOT_NODE` was distinguished from regular nodes using pointer identity. However, `EMPTY_ROOT_NODE` can have multiple addresses if multiple copies of `liballoc` are used together, i.e. when interoperating with a dynamic library. In that situation, all sorts of scary things will happen, since it's possible for `EMPTY_ROOT_NODE` to be treated as a regular node. To fix that, this PR adds a flag to every node, and simply uses that flag to distinguish `EMPTY_ROOT_NODE` from regular nodes. This only increases the node size if `keys` previously used every byte of padding after `len`, or if every byte of remaining padding was previously used to also fit `vals`. In those cases, every node will grow by 1 word. This playground link can be used to easily test the space impact of this PR: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
r? @gankro |
cc @RalfJung |
Do we... actually support such a cursed linkage? I guess we have to..? Maybe..? |
I think we only need to support it if LLVM is permitted to generate multiple copies with e.g. codegen units set to non-1 value; there's no way to pass a BTreeMap across different copies of liballoc since we don't have a stable ABI, if I understand our guarantees correctly. Cc @alexcrichton who knows more about linkage I think |
consts can freely diverge with codegen units but iirc we guarantee that statics have a single symbol (should be using something like linkonce/linkonce_odr). |
Regardless of the answer we get here, here's a suggested Cute Hack to avoid bloating up the map: Just have two null parent pointer values, 0 and 1. e.g.
Then just change But of course I'd rather not do this if we explicitly don't support this linkage scenario. |
AFAIK the entire point of statics is to guarantee having a unique address. If you have multiple liballoc in one binary you must make sure objects from one liballoc don't "leak" to the other; these two instances are basically ABI-incompatible. |
also to be clear this question needs answering because the stdlib is not unique in using this pattern (see: thin_vec) |
Cc @rust-lang/wg-unsafe-code-guidelines |
Statics guarantee a unique address, yes, and passing something like |
It sounds like we should warn/error if a non-FFI-safe type appears in the public API of something compiled as a dynamic library..? I know the linker is an infinite source of terrors that we can't hope to tame, but it feels like this is a case where something better could be done. |
See #19834, though that won't help if someone exposes Rust-ABI function symbols. |
Ping from triage. @Gankra any updates on this? Thanks. |
I'm pretty sure the conclusion of this discussion is that the author is trying to "fix" a fundamentally unsupported way of using Rust, which doesn't make sense to do. In addition, I also reviewed the patch and provided an alternative implementation that would be acceptable if for some reason we concluded this configuration was worth supporting. I think we're all implicitly waiting for the author to respond and try to provide a compelling argument for their use case, but if they don't do that we should just close this as WONTFIX. |
@francesca64 any updates? |
I concede defeat! My use-case is hot reloading dylibs, which I admit is a mine field of unavoidable UB. It seems reasonable not to support this. |
In #50352,
EMPTY_ROOT_NODE
was introduced to avoidallocating when creating an empty
BTreeMap
.EMPTY_ROOT_NODE
wasdistinguished from regular nodes using pointer identity. However,
EMPTY_ROOT_NODE
can have multiple addresses if multiple copiesof
liballoc
are used together, i.e. when interoperating with adynamic library. In that situation, all sorts of scary things will
happen, since it's possible for
EMPTY_ROOT_NODE
to be treated asa regular node.
To fix that, this PR adds a flag to every node, and simply uses
that flag to distinguish
EMPTY_ROOT_NODE
from regular nodes.This only increases the node size if
keys
previously used everybyte of padding after
len
, or if every byte of remaining paddingwas previously used to also fit
vals
. In those cases, every nodewill grow by 1 word. This playground link can be used to easily
test the space impact of this PR:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6