Remove `ConstValue::Slice` #105653

oli-obk · 2022-12-13T09:31:43Z

This only removes Slice, we can do ZST and Scalar individually to judge their performance effect

rustbot · 2022-12-13T09:31:51Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @RalfJung (or someone else) soon.

Please see the contribution instructions for more information.

rustbot · 2022-12-13T09:31:54Z

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

oli-obk · 2022-12-13T09:34:23Z

@bors try @rust-timer queue

bors · 2022-12-13T09:34:33Z

⌛ Trying commit d9b98ef0f3dd6dc20cf835a154c5bf3bee839767 with merge 45be547054cbb06ab494b34fe16bb68a559385b2...

RalfJung · 2022-12-13T10:45:49Z

compiler/rustc_const_eval/src/const_eval/eval_queries.rs

+            Immediate::ScalarPair(a, b) => ConstValue::ByRef {
+                alloc: ecx.tcx.intern_const_alloc(


This is silly, we start with an MPlace, then get the immediate, and then intern again?

We should just make sure that try_as_immediate is false for ScalarPair, then this match arm here is unreachable.

we... don't always start with an MPlace. I tried this originally.

Oh wait no there are more hacks here than I remember. Just saw this comment

// It is guaranteed that any non-slice scalar pair is actually ByRef here. // When we come back from raw const eval, we are always by-ref. The only way our op here is // by-val is if we are in destructure_mir_constant, i.e., if this is (a field of) something that we // "tried to make immediate" before. We wouldn't do that for non-slice scalar pairs or // structs containing such.

But, still, the ScalarPair arm should then be unreachable, right?

How far away are we from being able to remove destructure_mir_constant entirely, porting all its users to valtrees?

I guess this shows that we have an impedence mismatch if Immediate can contain ScalarPair but we still need to be able to convert arbitrary OpTy into ConstValue. But it's still a plus if we can do the ScalarPair/Slice handling once centrally in op_to_const rather than each codegen backend having to do this.

FWIW to fix #105536 it would probably be enough to change only try_as_immediate... then maybe the former intern_const_alloc arm here becomes unreachable?

we... don't always start with an MPlace. I tried this originally.

Ah, I think ConstProp will also go through this and it can produce pairs. I hope the new ConstProp will use valtrees instead...

compiler/rustc_middle/src/mir/interpret/allocation.rs

RalfJung · 2022-12-13T11:04:18Z

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

oli-obk · 2022-12-13T11:51:44Z

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

yea, this is because the ByRef contains an &'tcx Allocation instead of an AllocId, causing that allocation to not get picked up

oli-obk · 2022-12-13T11:52:38Z

@rust-timer build 45be547054cbb06ab494b34fe16bb68a559385b2

RalfJung · 2022-12-13T12:42:44Z

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

yea, this is because the ByRef contains an &'tcx Allocation instead of an AllocId, causing that allocation to not get picked up

That's a bug though (and doesn't seem to be fixed by the last commit) -- from the current rendering it is impossible to tell which constant in the source refers to which of the allocations at the bottom.

oli-obk · 2022-12-13T12:43:38Z

That's a bug though (and doesn't seem to be fixed by the last commit)

it's preexisting. I'm not fixing everything that's wrong about constants in this PR, just one "small" thing 😆

oli-obk · 2022-12-13T12:44:55Z

from the current rendering it is impossible to tell which constant in the source refers to which of the allocations at the bottom.

that's only relevant for when the source above actually mentions alloc ids. It's a bit too much info printed at the bottom now, I can filter that out if you want

RalfJung · 2022-12-13T12:47:46Z

it's preexisting. I'm not fixing everything that's wrong about constants in this PR, just one "small" thing laughing

Hm, fair. Can you open an issue to track this?

RalfJung · 2022-12-13T12:48:38Z

that's only relevant for when the source above actually mentions alloc ids. It's a bit too much info printed at the bottom now, I can filter that out if you want

Yeah, if we don't show the allocID there's little point in printing the corresponding allocation.

compiler/rustc_const_eval/src/const_eval/eval_queries.rs

compiler/rustc_middle/src/mir/interpret/allocation.rs

compiler/rustc_const_eval/src/const_eval/mod.rs

rust-timer · 2022-12-13T14:28:17Z

Finished benchmarking commit (45be547054cbb06ab494b34fe16bb68a559385b2): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.2%, 1.5%]	21
Regressions ❌ (secondary)	7.1%	[0.3%, 22.8%]	19
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.4%]	1
Improvements ✅ (secondary)	-0.7%	[-0.7%, -0.7%]	1
All ❌✅ (primary)	0.6%	[-0.4%, 1.5%]	22

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[0.6%, 4.1%]	5
Regressions ❌ (secondary)	8.0%	[1.1%, 19.7%]	30
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.0%	[-2.5%, -1.5%]	2
All ❌✅ (primary)	1.8%	[0.6%, 4.1%]	5

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[1.2%, 2.8%]	5
Regressions ❌ (secondary)	16.6%	[1.7%, 37.1%]	16
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.8%	[1.2%, 2.8%]	5

oli-obk · 2023-05-22T16:36:37Z

paired with #111768 the LLVM regressions go away entirely.

There is still a major regression in the coercions benchmark that I don't think is resolvable. creating a single Allocation is simply more expensive than creating a value of ConstValue::Slice.

We could explore using some SmallVec or similar for all the small Allocations that are created everywhere.

RalfJung · 2023-05-22T16:39:40Z

These are small allocations for string literals in the source, I assume? Can those generate a valtree instead of an Allocation?

oli-obk · 2023-05-23T09:31:15Z

These are small allocations for string literals in the source, I assume? Can those generate a valtree instead of an Allocation?

yea, that's the plan, getting rid of ConstValue in const patterns entirely and only having ValTrees there. I think I wanted to remove ConstValue::Slice first, as I thought it was annoying to move to valtrees otherwise, but I can't remember why I thought that. I'll do the valtree stuff first.

Optimize scalar and scalar pair representations loaded from ByRef in llvm in rust-lang#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.

Optimize scalar and scalar pair representations loaded from ByRef in llvm in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.

oli-obk · 2023-06-20T09:14:53Z

@bors try @rust-timer queue

bors · 2023-06-20T09:15:06Z

⌛ Trying commit 246d190 with merge d870714f42cfdb6491316e5c358010b2268d6e93...

bors · 2023-06-20T10:49:30Z

☀️ Try build successful - checks-actions
Build commit: d870714f42cfdb6491316e5c358010b2268d6e93 (d870714f42cfdb6491316e5c358010b2268d6e93)

bors · 2023-06-20T10:49:30Z

☀️ Try build successful - checks-actions
Build commit: d870714f42cfdb6491316e5c358010b2268d6e93 (d870714f42cfdb6491316e5c358010b2268d6e93)

rust-timer · 2023-06-20T12:04:07Z

Finished benchmarking commit (d870714f42cfdb6491316e5c358010b2268d6e93): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.4%	[0.2%, 0.9%]	17
Regressions ❌ (secondary)	13.3%	[0.2%, 27.6%]	19
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.4%	[0.2%, 0.9%]	17

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.3%	[0.5%, 2.7%]	3
Regressions ❌ (secondary)	8.7%	[2.2%, 22.2%]	18
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.2%	[-4.4%, -2.5%]	4
All ❌✅ (primary)	1.3%	[0.5%, 2.7%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	27.4%	[4.7%, 46.3%]	15
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.0%, 1.2%]	74
Regressions ❌ (secondary)	0.2%	[0.0%, 0.4%]	15
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[0.0%, 1.2%]	74

Bootstrap: 658.605s -> 656.253s (-0.36%)

oli-obk · 2023-06-21T08:25:52Z

The coercions regression is almost entirely due to the allocation and deallocation of allocations for slice constants:

51,379,440  ???:<rustc_const_eval::interpret::memory::AllocRef<rustc_middle::mir::interpret::AllocId, ()>>::read_scalar
44,257,269  ???:rustc_const_eval::interpret::intern::intern_const_alloc_recursive::<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>
33,685,073  ???:<rustc_const_eval::interpret::memory::AllocRefMut<rustc_middle::mir::interpret::AllocId, ()>>::write_scalar
24,221,336  ???:free
19,857,105  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::read_immediate_raw
19,683,306  ???:rustc_const_eval::const_eval::slice_for_alloc
18,808,545  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::write_immediate_no_validate
16,252,930  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::allocate
15,352,449  ???:malloc
15,109,539  ???:<rustc_middle::ty::context::TyCtxt>::mk_const_alloc
14,941,453  ???:<rustc_const_eval::interpret::intern::InternVisitor<rustc_const_eval::const_eval::machine::CompileTimeInterpreter> as rustc_const_eval::interpret::visitor::ValueVisitor<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::visit_value
13,762,350  ???:rustc_const_eval::const_eval::eval_queries::op_to_const
12,668,915  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::statement
12,121,936  ???:_rjem_je_tcache_bin_flush_small
11,646,331  ???:<rustc_middle::ty::context::CtxtInterners>::intern_ty
11,490,183  ???:<hashbrown::raw::RawTable<(rustc_middle::ty::context::InternedInSet<rustc_middle::mir::interpret::allocation::Allocation>, ())>>::reserve_rehash::<hashbrown::map::make_hasher<rustc_middle::ty::context::InternedInSet<rustc_middle::mir::interpret::allocation::Allocation>, (), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>::{closure
10,616,661  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::ref_to_mplace

Maybe removing this for mir constants isn't actually the right way forward. We could start mirroring LLVM and have a full on ConstValue::ScalarPair?

RalfJung · 2023-06-22T11:45:30Z

Hm, I'm trying to understand/remember what this is actually all about. What is ConstValue::Slice used for? I doubt "consts that evaluate to a slice" can be very relevant; also those already have allocations created for them anyway (for the return place) so that can't be the perf hit I assume.

So is this about slice literals that appear in the source, and having an efficient representation for those? That should be valtrees though? coercions has tons of slice literals so I guess this is the right trace. But why is there a ton of mk_const_alloc calls in that benchmark?

oli-obk · 2023-09-14T09:31:58Z

closing in favor of #115764 preserving the AllocId

Optimize scalar and scalar pair representations loaded from ByRef in llvm in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.

rustbot assigned RalfJung Dec 13, 2022

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 13, 2022

This comment has been minimized.

Sign in to view

oli-obk force-pushed the op_to_const branch from d9b98ef to f4d5a78 Compare December 13, 2022 09:35