interpret: refactor projection handling code #99101

RalfJung · 2022-07-09T23:51:39Z

Moves our projection handling code into a common file, and avoids the use of a
general mplace-based fallback function by having more specialized implementations.

mplace_index (and the other slice-related functions) could be more efficient by
copy-pasting the body of operand_index. Or we could do some trait magic to share
the code between them. But for now this is probably fine.

This is the common part of #99013 and #99097. I am seeing some strange perf results, so this should probably be its own change; that way we know which diff caused which perf change.

r? @oli-obk

rustbot · 2022-07-09T23:51:41Z

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

RalfJung · 2022-07-10T00:15:29Z

First testing the version that I think is the faster of the two, since it avoids introducing new assert_mem_place calls.
If this is still too slow, my guess is that the loop in operand_array_fields is slightly slower than the old one in mplace_array_fields, and I have some ideas for that.

@bors try @rust-timer queue

rust-timer · 2022-07-10T00:15:31Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-07-10T00:15:36Z

⌛ Trying commit af4c939cd1568eb13c4d1581664fd8840a90eaaf with merge b9fe2470ecb8d858b78e60c1ab3143012cdef548...

bors · 2022-07-10T02:06:01Z

☀️ Try build successful - checks-actions
Build commit: b9fe2470ecb8d858b78e60c1ab3143012cdef548

rust-timer · 2022-07-10T02:06:03Z

Queued b9fe2470ecb8d858b78e60c1ab3143012cdef548 with parent 6dba4ed, future comparison URL.

rust-timer · 2022-07-10T04:46:19Z

Finished benchmarking commit (b9fe2470ecb8d858b78e60c1ab3143012cdef548): comparison url.

Instruction count

Primary benchmarks: no relevant changes found
Secondary benchmarks: 😿 relevant regressions found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	5.1%	6.9%	8
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: no relevant changes found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-5.0%	-5.0%	1
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-5.0%	-5.0%	1

Cycles

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: 😿 relevant regressions found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	8.3%	10.5%	6
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

RalfJung · 2022-07-11T01:55:04Z

> I have some ideas for that.

Ah, of course that doesn't work, since impl Iterator always needs to be the same type. Even if we wrap this in an Either, that will just re-introduce the runtime check that I was trying to avoid (replacing the try_as_mplace we currently have in OpTy::offset).

I think I really need some valgrind profiles here to even know if operand_array_fields truly is the problem. It seems hard to believe.
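To illustrate the point about impl Iterator, here is a minimal, self-contained sketch. The `Either` enum, the `fields` function, and the offsets it yields are hypothetical stand-ins and are not taken from rustc. It shows that a function returning `impl Iterator` must pick a single concrete type, and that wrapping the two alternatives in an enum compiles but moves the branch into every `next` call.

```rust
// Hypothetical two-variant wrapper, similar in spirit to `either::Either`.
enum Either<L, R> {
    Left(L),
    Right(R),
}

impl<L, R> Iterator for Either<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // This match runs on every call: the runtime check has moved here,
        // it has not disappeared.
        match self {
            Either::Left(l) => l.next(),
            Either::Right(r) => r.next(),
        }
    }
}

// Returning the two `map` adapters directly from the `if`/`else` branches
// would be rejected, because each closure has its own distinct type and
// `impl Iterator` has to resolve to exactly one concrete type.
fn fields(in_memory: bool, len: u64) -> impl Iterator<Item = u64> {
    if in_memory {
        Either::Left((0..len).map(|i| i * 4))
    } else {
        Either::Right((0..len).map(|i| i + 100))
    }
}

fn main() {
    let a: Vec<u64> = fields(true, 3).collect();
    let b: Vec<u64> = fields(false, 3).collect();
    println!("{a:?} {b:?}");
}
```

Whether the per-item match is measurable in practice depends on the optimizer, which is why a profile is the safer guide here.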

oli-obk · 2022-07-11T09:44:56Z

valgrind diff

-2,582,601,110  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::run
 1,416,740,602  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::place_projection
   780,827,916  ???:<core::iter::adapters::copied::Copied<core::slice::iter::Iter<rustc_middle::mir::syntax::ProjectionElem<rustc_middle::mir::Local, rustc_middle::ty::Ty>>> as core::iter::traits::iterator::Iterator>::try_fold::<rustc_const_eval::interpret::place::PlaceTy, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place::{closure
   517,636,783  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::operand_projection
   422,177,426  ???:<rustc_middle::ty::context::TyCtxt>::try_subst_and_normalize_erasing_regions::<rustc_middle::mir::ConstantKind>
  -234,321,940  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place
   231,210,917  ???:<hashbrown::map::RawEntryBuilder<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId>, (core::result::Result<rustc_middle::mir::interpret::value::ConstAlloc, rustc_middle::mir::interpret::error::ErrorHandled>, rustc_query_system::dep_graph::graph::DepNodeIndex), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::from_key_hashed_nocheck::<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId>>
   193,633,898  ???:<core::iter::adapters::copied::Copied<core::slice::iter::Iter<rustc_middle::mir::syntax::ProjectionElem<rustc_middle::mir::Local, rustc_middle::ty::Ty>>> as core::iter::traits::iterator::Iterator>::try_fold::<rustc_const_eval::interpret::operand::OpTy, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place_to_op::{closure
   193,462,272  ???:<rustc_middle::ty::context::TyCtxt>::def_kind::<rustc_span::def_id::DefId>
    82,575,360  ???:<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId> as core::hash::Hash>::hash::<rustc_hash::FxHasher>
    77,440,963  ???:core::iter::adapters::try_process::<core::iter::adapters::map::Map<core::slice::iter::Iter<rustc_middle::mir::syntax::Operand>, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_operands::{closure
    68,975,205  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_fn_call
    56,819,838  ???:<rustc_middle::mir::interpret::value::Scalar>::to_bool
    52,187,019  ???:<rustc_middle::ty::context::TyCtxt>::normalize_erasing_late_bound_regions::<rustc_middle::ty::sty::FnSig>
   -50,331,648  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::mplace_projection
    39,303,720  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::copy_op_no_validate
    35,599,183  ???:<hashbrown::map::RawEntryBuilder<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)>, (core::result::Result<&rustc_target::abi::call::FnAbi<rustc_middle::ty::Ty>, rustc_middle::ty::layout::FnAbiError>, rustc_query_system::dep_graph::graph::DepNodeIndex), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::from_key_hashed_nocheck::<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)>>
     9,909,729  ???:<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)> as core::hash::Hash>::hash::<rustc_hash::FxHasher>
     8,441,621  ???:<rustc_const_eval::const_eval::machine::CompileTimeInterpreter as rustc_const_eval::interpret::machine::Machine>::after_stack_pop
     6,606,486  ???:<rustc_middle::ty::context::TyCtxt>::mk_type_list::<core::iter::adapters::map::Map<core::slice::iter::Iter<rustc_const_eval::interpret::operand::OpTy>, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_terminator::{closure
     2,657,099  ???:<rustc_middle::ty::context::TyCtxt>::intern_type_list
    -2,359,340  ???:<rustc_middle::ty::normalize_erasing_regions::TryNormalizeAfterErasingRegionsFolder as rustc_middle::ty::fold::FallibleTypeFolder>::try_fold_mir_const
     1,974,134  ???:<rustc_middle::ty::instance::Instance>::resolve_opt_const_arg

most likely an inlining issue, slapping #[inline(never)] on run should do the trick

RalfJung · 2022-07-11T11:32:32Z

Thanks!

How do I read that diff? run is taking 2,582,601,110 instructions fewer than it used to...? And place_projection is taking a lot more? But eval_place actually got cheaper? confused

oli-obk · 2022-07-11T11:34:12Z

Yes, that's exactly how to read that diff. I'll try to produce percentages, too; anything that goes to 0% has been inlined away.
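As a small aid for reading such a diff, the sketch below parses lines of the shape shown above (a signed, comma-grouped instruction delta followed by a symbol) and reports whether each symbol executed more or fewer instructions. The helper names `parse_delta` and `parse_line` are hypothetical, the line layout is an assumption based on the output pasted in this thread, and the symbol names are abbreviated for readability.

```rust
/// Parses a signed, comma-grouped delta such as "-2,582,601,110".
fn parse_delta(field: &str) -> Option<i64> {
    let digits: String = field.trim().chars().filter(|c| *c != ',').collect();
    digits.parse().ok()
}

/// Splits one diff line into (delta, symbol).
/// Assumes the layout "<delta><whitespace><file>:<symbol>".
fn parse_line(line: &str) -> Option<(i64, &str)> {
    let (count, rest) = line.trim_start().split_once(char::is_whitespace)?;
    Some((parse_delta(count)?, rest.trim_start()))
}

fn main() {
    // Deltas copied from the diff above; symbol names are abbreviated.
    let lines = [
        "-2,582,601,110  ???:InterpCx::run",
        " 1,416,740,602  ???:InterpCx::place_projection",
        "  -234,321,940  ???:InterpCx::eval_place",
    ];
    for line in lines {
        if let Some((delta, symbol)) = parse_line(line) {
            // A negative delta means fewer instructions were attributed to
            // this symbol after the change; a positive one means more.
            let direction = if delta < 0 { "fewer" } else { "more" };
            println!("{symbol}: {} {direction} instructions", delta.abs());
        }
    }
}
```

A large negative delta on one symbol paired with positive deltas on its callers or callees is consistent with code moving between symbols through inlining, rather than with work disappearing.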

RalfJung · 2022-07-11T13:32:08Z

But then why should inline(never) on run help...? run got cheaper! We are spending fewer instructions on the entire CTFE loop. How can that be a slowdown?!?

place_projection getting more expensive is odd, it does basically the same thing as before...

RalfJung · 2022-07-11T13:33:46Z

@bors try @rust-timer queue

rust-timer · 2022-07-11T13:33:48Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-07-11T13:33:54Z

⌛ Trying commit 55bab620a371ab69ac1d8eb76d2a88cb8e04bcc3 with merge 88bfc4522b4c58ab7e03dfeace62a6a358f8837c...

oli-obk · 2022-07-11T14:25:56Z

> run got cheaper! We are spending fewer instructions on the entire CTFE loop. How can that be a slowdown?!?

it may just have gotten inlined into all callers, and thus disappeared from cachegrind.
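A minimal sketch of the #[inline(never)] suggestion follows. The `run` and `step` functions here are toy stand-ins that only borrow the names from this discussion; they are not the interpreter's code. The attribute is a hint asking the compiler to keep `run` as its own function, so a profiler can attribute instructions to it instead of to whichever caller it would otherwise be inlined into.

```rust
// Toy stand-in for one interpreter step: consumes one unit of work and
// reports whether there was anything left to do.
fn step(remaining: &mut u32) -> bool {
    if *remaining == 0 {
        return false;
    }
    *remaining -= 1;
    true
}

// Without this attribute the optimizer may inline `run` into its callers,
// in which case its instructions show up under the callers' symbols in a
// cachegrind profile and `run` itself appears to get cheaper or vanish.
#[inline(never)]
fn run(mut remaining: u32) -> u32 {
    let mut steps = 0;
    while step(&mut remaining) {
        steps += 1;
    }
    steps
}

fn main() {
    println!("executed {} steps", run(5));
}
```

Pinning the function like this does not make the code faster by itself; it mainly makes two profiles comparable symbol by symbol.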

bors · 2022-07-11T15:27:24Z

☀️ Try build successful - checks-actions
Build commit: 88bfc4522b4c58ab7e03dfeace62a6a358f8837c

rust-timer · 2022-07-11T15:27:26Z

Queued 88bfc4522b4c58ab7e03dfeace62a6a358f8837c with parent 7d1f57a, future comparison URL.

bors · 2022-07-11T23:07:43Z

☀️ Try build successful - checks-actions
Build commit: cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3

rust-timer · 2022-07-11T23:07:45Z

Queued cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3 with parent 38b7215, future comparison URL.

rust-timer · 2022-07-12T01:36:33Z

Finished benchmarking commit (cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	3.1%	3.1%	1
Improvements 🎉 (primary)	-2.4%	-2.4%	1
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-2.4%	-2.4%	1

Cycles

Results

Primary benchmarks: 😿 relevant regressions found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	2.7%	5.3%	7
Regressions 😿 (secondary)	3.4%	5.5%	5
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-8.3%	-8.3%	1
All 😿🎉 (primary)	2.7%	5.3%	7

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

RalfJung · 2022-07-12T02:53:06Z

Looks like fold was the problem. If perf still looks good, this is ready for review.

@bors try @rust-timer queue
@rustbot ready
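For readers unfamiliar with the change named in the commit "use a loop rather than try_fold", here is a hedged sketch of the two equivalent shapes. `Place`, `Elem`, and `project` are hypothetical simplifications and do not mirror the real interpreter types; the sketch only shows that a fallible fold over projection elements can be written either way with the same result.

```rust
// Hypothetical stand-ins for a place and a projection element.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Place {
    offset: u64,
}

#[derive(Debug, Clone, Copy)]
enum Elem {
    Field(u64),
    Deref,
}

// Applies one projection; `Deref` fails only to exercise the error path.
fn project(base: Place, elem: Elem) -> Result<Place, String> {
    match elem {
        Elem::Field(off) => Ok(Place { offset: base.offset + off }),
        Elem::Deref => Err("deref is not supported in this sketch".to_string()),
    }
}

// Iterator-adapter version: stops at the first error.
fn eval_with_try_fold(base: Place, elems: &[Elem]) -> Result<Place, String> {
    elems
        .iter()
        .copied()
        .try_fold(base, |place, elem| project(place, elem))
}

// Explicit loop version: same semantics, written with `?`.
fn eval_with_loop(base: Place, elems: &[Elem]) -> Result<Place, String> {
    let mut place = base;
    for elem in elems.iter().copied() {
        place = project(place, elem)?;
    }
    Ok(place)
}

fn main() {
    let elems = [Elem::Field(4), Elem::Field(8)];
    let base = Place { offset: 0 };
    println!("{:?}", eval_with_try_fold(base, &elems));
    println!("{:?}", eval_with_loop(base, &elems));
}
```

The two functions are semantically interchangeable; any performance difference comes from how the compiler inlines and optimizes the closure-based version, which is what the perf runs in this thread were measuring.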

rust-timer · 2022-07-12T02:53:09Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-07-12T02:53:15Z

⌛ Trying commit 04b3cd9 with merge da916a909deef1bbb257db0f54cb5347ebdc86b0...

bors · 2022-07-12T04:46:30Z

☀️ Try build successful - checks-actions
Build commit: da916a909deef1bbb257db0f54cb5347ebdc86b0

rust-timer · 2022-07-12T04:46:32Z

Queued da916a909deef1bbb257db0f54cb5347ebdc86b0 with parent 8a33254, future comparison URL.

rust-timer · 2022-07-12T06:43:38Z

Finished benchmarking commit (da916a909deef1bbb257db0f54cb5347ebdc86b0): comparison url.

Instruction count

Primary benchmarks: no relevant changes found
Secondary benchmarks: 🎉 relevant improvements found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-1.9%	-2.2%	6
All 😿🎉 (primary)	N/A	N/A	0

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	6.4%	6.4%	1
Improvements 🎉 (primary)	-2.9%	-2.9%	1
Improvements 🎉 (secondary)	-2.1%	-2.3%	2
All 😿🎉 (primary)	-2.9%	-2.9%	1

Cycles

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: no relevant changes found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-2.3%	-2.3%	1
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-2.3%	-2.3%	1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

oli-obk · 2022-07-12T07:19:50Z

@bors r+

bors · 2022-07-12T07:19:52Z

📌 Commit 04b3cd9 has been approved by oli-obk

It is now in the queue for this repository.

bors · 2022-07-13T02:43:28Z

⌛ Testing commit 04b3cd9 with merge 7b57152...

bors · 2022-07-13T05:24:13Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 7b57152 to master...

rust-timer · 2022-07-13T06:40:55Z

Finished benchmarking commit (7b57152): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

Primary benchmarks: 😿 relevant regression found
Secondary benchmarks: 🎉 relevant improvements found

	mean¹	max	count²
Regressions 😿 (primary)	3.7%	3.7%	1
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-3.9%	-5.0%	2
All 😿🎉 (primary)	3.7%	3.7%	1

Cycles

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: no relevant changes found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-2.3%	-2.3%	1
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-2.3%	-2.3%	1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

rustbot added the T-compiler label Jul 9, 2022

rust-highfive assigned oli-obk Jul 9, 2022

rust-highfive added the S-waiting-on-review label Jul 9, 2022

This was referenced Jul 9, 2022

interpret: get rid of MemPlaceMeta::Poison #99013

Merged

shrink OpTy back down to 80 bytes #99097

Closed

rustbot added the S-waiting-on-perf label Jul 10, 2022

rustbot added the perf-regression label and removed the S-waiting-on-perf label Jul 10, 2022

RalfJung force-pushed the interpret-projections branch from af4c939 to 55bab62 Compare July 11, 2022 13:32

rustbot added the S-waiting-on-perf label Jul 11, 2022

rustbot removed the S-waiting-on-perf and perf-regression labels Jul 12, 2022

RalfJung added 2 commits July 11, 2022 22:50

use a loop rather than try_fold

04b3cd9

RalfJung force-pushed the interpret-projections branch from dc767b7 to 04b3cd9 Compare July 12, 2022 02:51

rustbot added the S-waiting-on-perf label Jul 12, 2022

rustbot removed the S-waiting-on-perf label Jul 12, 2022

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Jul 12, 2022

bors added the merged-by-bors label Jul 13, 2022

bors merged commit 7b57152 into rust-lang:master Jul 13, 2022

rustbot added this to the 1.64.0 milestone Jul 13, 2022

bors mentioned this pull request Jul 13, 2022

Allow destructuring opaque types in their defining scopes #98582

Merged

RalfJung deleted the interpret-projections branch July 13, 2022 22:27
