
Using a const fn constructor instead of a struct expression with a large number of constants causes compilation to fail because of a large memory allocation. #68957

Closed
CDirkx opened this issue Feb 8, 2020 · 7 comments · Fixed by #69072

CDirkx (Contributor) commented Feb 8, 2020

I have:

  • A struct CodePoint with a const fn from(...) -> Self constructor
  • A large number of CodePoint constants (42,000+) using the constructor
#[repr(transparent)]
pub struct CodePoint {
    raw: u32
}

impl CodePoint {
    pub const fn from(raw: u32) -> Self {
        // debug_assert!(raw <= 0x10FFFF);
        CodePoint { raw }
    }
}

// include!(...)
// 42508 constants, 3.28MB file
impl CodePoint {
    pub const SPACE: CodePoint = CodePoint::from(32u32);
    ...
}

Building this leads to:

  • Compiler busy for 2min+
  • Compiler error trying to allocate 1.2GB:
    memory allocation of 1207959552 bytes failed
    even though I have 12GB of unused RAM

Using struct expressions instead of a const fn constructor:

impl CodePoint {
    pub const SPACE: CodePoint = CodePoint { raw: 32u32 };
    ...
}
  • Compiler finishes within 5s.

Tested on nightly and stable, with include!(...) and copy-pasting: no differences.

Source file to reproduce: reproduce.zip

@jonas-schievink jonas-schievink added A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 8, 2020
CDirkx (Contributor, Author) commented Feb 8, 2020

I know 42,000+ constants is probably not the best idea, but since the struct expression case finishes within 5s I would expect the compiler to be able to handle them, as the const constructor does exactly the same thing.

ecstatic-morse (Contributor) commented Feb 8, 2020

The hot code here is rustc_typeck::check::method::<impl rustc_typeck::check::FnCtxt>::associated_item (80% CPU), which does a linear scan to find an associated item with a particular name. Seems like each associated const needs to do method lookup to find the constructor, which then needs to scan over all other associated consts before it finds CodePoint::from.
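As a rough illustration of the quadratic behavior described above, here is a simplified sketch (not actual rustc code; the function name, item table, and counts are made up for demonstration): each of the N associated constants triggers a name lookup, and each lookup linearly scans the type's whole list of associated items before it reaches from.

fn associated_item_linear<'a>(items: &'a [(String, u32)], name: &str) -> Option<&'a (String, u32)> {
    // Analogous to the linear scan described above: walk every item until the name matches.
    items.iter().find(|(n, _)| n.as_str() == name)
}

fn main() {
    // Hypothetical item table: `from` is stored after all the constants,
    // so every lookup walks the entire list before finding it.
    let n: u32 = 10_000; // smaller than the 42,508 constants in the issue, but the same shape
    let mut items: Vec<(String, u32)> = (0..n).map(|i| (format!("CONST_{i}"), i)).collect();
    items.push(("from".to_string(), 0));

    // One lookup per constant: roughly n * n comparisons in total.
    let resolved = (0..n)
        .filter(|_| associated_item_linear(&items, "from").is_some())
        .count();
    println!("{resolved} lookups, each scanning {} items", items.len());
}

Doubling the number of constants roughly quadruples the total work, which matches the blow-up seen with the 42,508-constant reproduction.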

@ecstatic-morse ecstatic-morse added I-compiletime Issue: Problems and improvements with respect to compile times. A-associated-items Area: Associated items (types, constants & functions) and removed A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) labels Feb 8, 2020
ecstatic-morse (Contributor) commented Feb 10, 2020

@rustbot claim

ecstatic-morse (Contributor) commented

@CDirkx compile time has improved a lot in the current nightly. const_constructor now takes ~30s with a local, non-LTO master build compared to ~15s for struct_literal. #69072 will improve this further: const_constructor takes about the same amount of time as struct_literal with that PR applied.

CDirkx (Contributor, Author) commented Feb 15, 2020

Yes, I can confirm that on the current nightly rustc's memory usage is much lower (it no longer claims multiple GBs of RAM), so it no longer runs out of memory and compilation succeeds at the speeds you mention.

I also tried with all named Unicode code points included (including names like CJK_UNIFIED_IDEOGRAPH_2D6CE, 135,000+ total; reproduce_all.zip). This also compiled just fine, taking around 1.5 minutes, which your PR will improve further.

Thanks for taking the time to make Rust work with this admittedly non-standard code.

ecstatic-morse (Contributor) commented Feb 16, 2020

@CDirkx if the issue has been resolved to your satisfaction, feel free to close it. #68837 was the PR that made the big difference here. There's no rush, however.

I have another fix in mind in addition to #69072 to make compile time linear (or linearithmic) in the number of associated items, so your compile times should continue to improve regardless of the status of this issue. I'm also looking to add a variation of your MCVE to the rustc-perf benchmarks so we don't regress this in the future.

Thank you for posting this in the first place. It made finding these rough spots very easy!

@CDirkx CDirkx closed this as completed Feb 18, 2020
bors added a commit that referenced this issue Feb 20, 2020
O(log n) lookup of associated items by name

Resolves #68957, in which compile time is quadratic in the number of associated items. This PR makes name lookup use binary search instead of a linear scan to improve its asymptotic performance. As a result, the pathological case from that issue now runs in 8 seconds on my local machine, as opposed to many minutes on the current stable.

Currently, method resolution must do a linear scan through all associated items of a type to find one with a certain name. This PR changes the result of the `associated_items` query to a data structure that preserves the definition order of associated items (which is used, e.g., for the layout of trait object vtables) while adding an index of those items sorted by (unhygienic) name. When doing name lookup, we first find all items with the same `Symbol` using binary search, then run hygienic comparison to find the one we are looking for. Ideally, this would be implemented using an insertion-order preserving, hash-based multi-map, but one is not readily available.

Someone who is more familiar with identifier hygiene could probably make this better by auditing the uses of the `AssociatedItems` interface. My goal was to preserve the current behavior exactly, even if it seemed strange (I left at least one FIXME to this effect). For example, some places use comparison with `ident.modern()` and some places use `tcx.hygienic_eq` which requires the `DefId` of the containing `impl`. I don't know whether those approaches are equivalent or which one should be preferred.
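A minimal sketch of the approach the PR describes (simplified; the type and field names here are illustrative, not the actual rustc API, and hygiene is ignored): items stay in definition order, and a separate index of positions sorted by name makes lookup a binary search.

struct AssocItems {
    // Items in definition order (rustc needs this order, e.g. for vtable layout).
    in_definition_order: Vec<(String, u32)>,
    // Indices into `in_definition_order`, sorted by item name.
    sorted_by_name: Vec<usize>,
}

impl AssocItems {
    fn new(items: Vec<(String, u32)>) -> Self {
        let mut sorted_by_name: Vec<usize> = (0..items.len()).collect();
        sorted_by_name.sort_by(|&a, &b| items[a].0.cmp(&items[b].0));
        AssocItems { in_definition_order: items, sorted_by_name }
    }

    // O(log n) lookup by name; in rustc a further hygienic comparison would
    // pick the right item among several sharing the same unhygienic Symbol.
    fn find_by_name(&self, name: &str) -> Option<&(String, u32)> {
        let pos = self
            .sorted_by_name
            .binary_search_by(|&i| self.in_definition_order[i].0.as_str().cmp(name))
            .ok()?;
        Some(&self.in_definition_order[self.sorted_by_name[pos]])
    }
}

fn main() {
    let items = AssocItems::new(vec![
        ("SPACE".to_string(), 32),
        ("NULL".to_string(), 0),
        ("from".to_string(), 0),
    ]);
    assert!(items.find_by_name("from").is_some());
    assert!(items.find_by_name("missing").is_none());
}

An insertion-order-preserving hash multi-map would give O(1) lookups instead, but as the PR notes, one is not readily available, and binary search already removes the quadratic behavior.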