Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BTreeMap UB caused by using pointer identity on a static #63338

Closed
wants to merge 1 commit into from

Conversation

francesca64
Copy link
Contributor

In #50352, EMPTY_ROOT_NODE was introduced to avoid
allocating when creating an empty BTreeMap. EMPTY_ROOT_NODE was
distinguished from regular nodes using pointer identity. However,
EMPTY_ROOT_NODE can have multiple addresses if multiple copies
of liballoc are used together, i.e. when interoperating with a
dynamic library. In that situation, all sorts of scary things will
happen, since it's possible for EMPTY_ROOT_NODE to be treated as
a regular node.

To fix that, this PR adds a flag to every node, and simply uses
that flag to distinguish EMPTY_ROOT_NODE from regular nodes.
This only increases the node size if keys previously used every
byte of padding after len, or if every byte of remaining padding
was previously used to also fit vals. In those cases, every node
will grow by 1 word. This playground link can be used to easily
test the space impact of this PR:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6

In rust-lang#50352, `EMPTY_ROOT_NODE` was introduced to avoid
allocating when creating an empty `BTreeMap`. `EMPTY_ROOT_NODE` was
distinguished from regular nodes using pointer identity. However,
`EMPTY_ROOT_NODE` can have multiple addresses if multiple copies
of `liballoc` are used together, i.e. when interoperating with a
dynamic library. In that situation, all sorts of scary things will
happen, since it's possible for `EMPTY_ROOT_NODE` to be treated as
a regular node.

To fix that, this PR adds a flag to every node, and simply uses
that flag to distinguish `EMPTY_ROOT_NODE` from regular nodes.
This only increases the node size if `keys` previously used every
byte of padding after `len`, or if every byte of remaining padding
was previously used to also fit `vals`. In those cases, every node
will grow by 1 word. This playground link can be used to easily
test the space impact of this PR:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6
@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 6, 2019
@cramertj
Copy link
Member

cramertj commented Aug 6, 2019

r? @gankro

@rust-highfive rust-highfive assigned Gankra and unassigned cramertj Aug 6, 2019
@Centril
Copy link
Contributor

Centril commented Aug 7, 2019

cc @RalfJung

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2019

Do we... actually support such a cursed linkage? I guess we have to..? Maybe..?

@Mark-Simulacrum
Copy link
Member

I think we only need to support it if LLVM is permitted to generate multiple copies with e.g. codegen units set to non-1 value; there's no way to pass a BTreeMap across different copies of liballoc since we don't have a stable ABI, if I understand our guarantees correctly.

Cc @alexcrichton who knows more about linkage I think

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2019

consts can freely diverge with codegen units but iirc we guarantee that statics have a single symbol (should be using something like linkonce/linkonce_odr).

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2019

Regardless of the answer we get here, here's a suggested Cute Hack to avoid bloating up the map: Just have two null parent pointer values, 0 and 1. e.g.

const IS_EMPTY_SINGLETON = 0
const HAS_NO_PARENT = 1

Then just change ascend (the only function that reads parent) to do <= HAS_NO_PARENT (<= 1) instead of == 0. Only overhead would be an extra load to check if we're the empty singleton.

But of course I'd rather not do this if we explicitly don't support this linkage scenario.

@RalfJung
Copy link
Member

RalfJung commented Aug 7, 2019

AFAIK the entire point of statics is to guarantee having a unique address. If you have multiple liballoc in one binary you must make sure objects from one liballoc don't "leak" to the other; these two instances are basically ABI-incompatible.

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2019

also to be clear this question needs answering because the stdlib is not unique in using this pattern (see: thin_vec)

@RalfJung
Copy link
Member

RalfJung commented Aug 7, 2019

Cc @rust-lang/wg-unsafe-code-guidelines

@alexcrichton
Copy link
Member

Statics guarantee a unique address, yes, and passing something like BTreeMap across dynamic library boundaries is not supported. Almost no types in the standard library are supported to cross dynamic library boundaries.

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2019

It sounds like we should warn/error if a non-FFI-safe type appears in the public API of something compiled as a dynamic library..? I know the linker is an infinite source of terrors that we can't hope to tame, but it feels like this is a case where something better could be done.

@hanna-kruppe
Copy link
Contributor

It sounds like we should warn/error if a non-FFI-safe type appears in the public API of something compiled as a dynamic library..?

See #19834, though that won't help if someone exposes Rust-ABI function symbols.

@wirelessringo
Copy link

Ping from triage. @Gankra any updates on this? Thanks.

@Gankra
Copy link
Contributor

Gankra commented Aug 16, 2019

I'm pretty sure the conclusion of this discussion is that the author is trying to "fix" a fundamentally unsupported way of using Rust, which doesn't make sense to do. In addition, I also reviewed the patch and provided an alternative implementation that would be acceptable if for some reason we concluded this configuration was worth supporting.

I think we're all implicitly waiting for the author to respond and try to provide a compelling argument for their use case, but if they don't do that we should just close this as WONTFIX.

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 16, 2019
@Dylan-DPC-zz
Copy link

@francesca64 any updates?

@francesca64
Copy link
Contributor Author

I concede defeat! My use-case is hot reloading dylibs, which I admit is a mine field of unavoidable UB. It seems reasonable not to support this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.