
Remove precise box allocation #36

Merged: 5 commits, May 10, 2021

Conversation

jacob-hughes (Collaborator)

This removes precise box allocation, as it can't currently be implemented correctly due to limitations with Boehm. However, rustgc will now default to trying to allocate things as untraceable to reduce mark time wherever possible. In longer-running benchmarks on yksom, this gives us some modest performance improvement:

----------------------------------------------------------------------------------------
  Benchmark     Executor      Suite   Extra   Core   Size   Var   #Samples   Mean (ms)
----------------------------------------------------------------------------------------
  Bounce        baseline      micro       2      1                     100   91
  Bounce        untraceable   micro       2      1                     100   105
  BubbleSort    baseline      micro       3      1                     100   146
  BubbleSort    untraceable   micro       3      1                     100   142
  DeltaBlue     baseline      macro      50      1                      10   91
  DeltaBlue     untraceable   macro      50      1                      10   105
  Dispatch      baseline      micro       2      1                     100   71
  Dispatch      untraceable   micro       2      1                     100   75
  Fannkuch      baseline      micro       6      1                     100   64
  Fannkuch      untraceable   micro       6      1                     100   61
  Fibonacci     baseline      micro       3      1                     100   270
  Fibonacci     untraceable   micro       3      1                     100   266
  FieldLoop     baseline      micro       1      1                     100   51
  FieldLoop     untraceable   micro       1      1                     100   43
  IntegerLoop   baseline      micro       2      1                     100   135
  IntegerLoop   untraceable   micro       2      1                     100   130
  JsonSmall     baseline      macro      10      1                      10   1316
  JsonSmall     untraceable   macro      10      1                      10   1261
  List          baseline      micro       2      1                     100   131
  List          untraceable   micro       2      1                     100   125
  Loop          baseline      micro       5      1                     100   188
  Loop          untraceable   micro       5      1                     100   188
  Mandelbrot    baseline      micro      30      1                     100   146
  Mandelbrot    untraceable   micro      30      1                     100   138
  NBody         baseline      macro     500      1                      10   72
  NBody         untraceable   macro     500      1                      10   59
  PageRank      baseline      macro      40      1                      10   157
  PageRank      untraceable   macro      40      1                      10   124
  Permute       baseline      micro       3      1                     100   112
  Permute       untraceable   micro       3      1                     100   108
  Queens        baseline      micro       2      1                     100   84
  Queens        untraceable   micro       2      1                     100   84
  QuickSort     baseline      micro       1      1                     100   71
  QuickSort     untraceable   micro       1      1                     100   67
  Recurse       baseline      micro       3      1                     100   97
  Recurse       untraceable   micro       3      1                     100   94
  Sieve         baseline      micro       4      1                     100   358
  Sieve         untraceable   micro       4      1                     100   326
  Storage       baseline      micro       1      1                     100   70
  Storage       untraceable   micro       1      1                     100   62
  Sum           baseline      micro       2      1                     100   70
  Sum           untraceable   micro       2      1                     100   67
  Towers        baseline      micro       2      1                     100   129
  Towers        untraceable   micro       2      1                     100   131
  TreeSort      baseline      micro       1      1                     100   182
  TreeSort      untraceable   micro       1      1                     100   189
  WhileLoop     baseline      micro      10      1                     100   171
  WhileLoop     untraceable   micro      10      1                     100   168

Unfortunately, yksom makes use of the Gc::new_from_layout pattern quite a bit, which currently still allocates everything conservatively. We might see even better results once that is also changed to use this approach.
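
For illustration, the allocation decision described above might look something like the following sketch. The function names, the is_no_trace flag, and the use of std's allocator as a stand-in are all hypothetical; in rustgc the query comes from the NoTrace auto trait, and the real entry points are Boehm's GC_malloc_atomic (memory the collector never scans) and GC_malloc (conservatively scanned memory):

use std::alloc::{alloc, Layout};

// Hypothetical stand-ins for the GC's allocator entry points.
unsafe fn gc_alloc_untraceable(layout: Layout) -> *mut u8 {
    alloc(layout) // placeholder for Boehm's GC_malloc_atomic
}

unsafe fn gc_alloc_conservative(layout: Layout) -> *mut u8 {
    alloc(layout) // placeholder for Boehm's GC_malloc
}

// If `T` holds no words the collector must treat as pointers, allocate it
// untraceable so it contributes nothing to mark time. (Assumes `T` is not
// zero-sized; rustgc derives `is_no_trace` from the NoTrace auto trait.)
unsafe fn alloc_for<T>(is_no_trace: bool) -> *mut T {
    let layout = Layout::new::<T>();
    if is_no_trace {
        gc_alloc_untraceable(layout) as *mut T
    } else {
        gc_alloc_conservative(layout) as *mut T
    }
}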

) -> Result<NonNull<[u8]>, AllocError> {
    match layout.size() {
        0 => Ok(NonNull::slice_from_raw_parts(layout.dangling(), 0)),
        // SAFETY: `layout` is non-zero in size,
Member

I'm not sure what this comment is supposed to be attached to (presumably the second clause?).

Collaborator Author

rustc has a lint which won't allow an unsafe block without a comment explaining why the unsafe block is ok to use.

Member

Oh! Would it be clearer to write this as:

x => y {
    // comment
    unsafe { ... }
}

?
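
Applied to the hunk above, the suggestion would look something like this sketch (the arm body is paraphrased from the shape of std's Global::alloc_impl and may not match rustgc's code exactly):

size => {
    // SAFETY: `layout` is non-zero in size, so the raw allocator may
    // be called with it.
    unsafe {
        let raw_ptr = alloc(layout);
        let ptr = NonNull::new(raw_ptr).ok_or(AllocError)?;
        Ok(NonNull::slice_from_raw_parts(ptr, size))
    }
}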

Collaborator Author

I agree this is better and I went to change it, but then x.py fmt reverted my changes. I think we should leave it because fighting with the formatter is just going to cause pain.

Member

If it reverts it, I don't think we have much choice :)

ltratt (Member) commented May 7, 2021

However, rustgc will now default to trying to allocate things as untraceable to reduce mark-time wherever possible.

I'm not sure I understand the implications of this: as written, it sounds like it might cause incorrectness? I'm sure I'm wrong, but I'd appreciate a deeper explanation :)

jacob-hughes (Collaborator, Author)

I'm sure I'm wrong, but I'd appreciate a deeper explanation :)

This will make use of the NoTrace auto trait to work out whether something should be allocated untraceable or not. If a type T implements NoTrace, then it will be allocated untraceable. This mechanism already existed (and used to be enabled), but I temporarily disabled everything apart from conservative allocation while I tried to fix issues with the precise allocation support.

In yksom, this change alone would lead to incorrectness, because yksom sometimes treats usizes as pointers. However, it is safe now thanks to a corresponding change in yksom. From now on, this will only be incorrect for applications which type-pun pointers in some way. Such applications need to inform the GC by negatively implementing NoTrace for those types (or by disabling this optimisation entirely with -C gc_precise_marking=false).
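
To make the opt-out concrete, here is a minimal sketch. TaggedPtr is a made-up example of a type-punning type; the negative impl assumes rustgc exposes NoTrace as described above:

// Hypothetical: a type which puns a pointer as a usize. Allocated
// untraceable, the block it (secretly) points to would never be marked
// and could be collected while still in use.
struct TaggedPtr {
    bits: usize, // low bits are a tag; the rest is really a pointer
}

// A negative impl of the NoTrace auto trait tells rustgc that values of
// this type must be allocated traceable, so the collector scans `bits`
// as a potential pointer.
impl !NoTrace for TaggedPtr {}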

ltratt (Member) commented May 7, 2021

OK, I'm with you. So, to be clear, the default is now "usizes are not treated as potential pointers unless you explicitly tell us that they are"?

jacob-hughes (Collaborator, Author) commented May 7, 2021

That's correct. However, *mut T and *const T are still treated as potential pointers to GC values even if T impls NoTrace. For example:

struct S {
    inner: *mut u8,
}

S's inner field would be scanned by the GC. Personally, I think this is a bit too conservative: it should be the responsibility of the user to tell the GC that this is, in fact, a pointer to a Gc value. But I'm happy to leave this as-is for now.

ltratt (Member) commented May 7, 2021

I agree that if a struct is declared NoTrace that should mean "don't even scan pointers in here".

I'm fine with the PR, modulo the comment about possibly sort-of reformatting the unsafe block.

ltratt (Member) commented May 7, 2021

bors r+

bors bot commented May 7, 2021

Build failed:

jacob-hughes (Collaborator, Author)

bors try

bors bot commented May 10, 2021

try

Build failed:

jacob-hughes (Collaborator, Author)

bors try

bors bot commented May 10, 2021

try

Build failed:

jacob-hughes (Collaborator, Author)

bors try

bors bot commented May 10, 2021

try

Build succeeded:

jacob-hughes (Collaborator, Author)

This can be squashed now when you're ready.

ltratt (Member) commented May 10, 2021

Please squash.

Commit messages:

There is currently no support for non-GC'd precise heap allocation in Boehm. Until support for this is added, it is not correct to allocate non-garbage-collectable memory this way.

This is a hot path for all of Rust's box allocation. For some reason, I implemented this with enum pattern matching, which caused unnecessary branching in such a performance-critical part of the allocator.
jacob-hughes (Collaborator, Author)

Squashed

ltratt (Member) commented May 10, 2021

bors r+

bors bot commented May 10, 2021

Build succeeded:

bors bot merged commit e234c1c into softdevteam:master on May 10, 2021