
Rework inc_rc, dec_rc and how arrays are handled in brillig in general #7287

Open
jfecher opened this issue Feb 5, 2025 · 10 comments
Labels: enhancement (New feature or request)

jfecher (Contributor) commented Feb 5, 2025

Problem

inc_rc and dec_rc are a common source of bugs in SSA code generated by the compiler. They require special handling since they are effectively implicit variables attached to each array in unconstrained code.

Happy Case

Now that runtimes are separated before SSA, we could represent arrays differently in ACIR and Brillig. In ACIR they can continue to be represented as normal arrays, while in Brillig we could lower them to a tuple of (&mut Field, [T; N]), where the first element of the tuple is the reference count represented as a mutable field value, and the second is the array itself. If we do this, reference counts need much less special handling and fewer special optimizations, since they'll be handled by mem2reg along with other references.

This could make it more difficult for optimizations like deduplication to insert extra inc_rc equivalents since they'd have to find the integer that represents the array's reference count instead of it always being attached to the array.

I'm uncertain how much code improvement this will bring or whether it will improve optimizations, so I labeled this as an "experiment with" rather than a "let's definitely do this."

EDIT: See comments below for some alternative approaches.
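As a rough illustration of the proposed (&mut Field, [T; N]) lowering, here's a Rust sketch where the reference count is an ordinary mutable value carried alongside the array, so RC updates become plain loads and stores. `RcArray`, `inc_rc`, and `array_set` are hypothetical stand-ins for the SSA operations, and the actual payload duplication on copy-on-write is elided since Rust ownership already gives the sketch a private copy:

```rust
// Sketch only: Rust stand-in for the proposed Brillig lowering, where an
// array value becomes a pair of (reference count, data).
struct RcArray<const N: usize> {
    rc: Box<usize>, // stands in for the `&mut Field` reference count
    data: [i64; N], // the array payload
}

impl<const N: usize> RcArray<N> {
    fn new(data: [i64; N]) -> Self {
        Self { rc: Box::new(1), data }
    }

    // `inc_rc` becomes an ordinary store through the refcount reference,
    // which mem2reg could then optimize like any other load/store.
    fn inc_rc(&mut self) {
        *self.rc += 1;
    }

    // Copy-on-write `array_set`: duplicate the payload only when shared.
    // (The duplication itself is elided here; only the counts are shown.)
    fn array_set(&mut self, index: usize, value: i64) {
        if *self.rc > 1 {
            *self.rc -= 1;         // release our claim on the shared data
            self.rc = Box::new(1); // the fresh copy starts at count 1
        }
        self.data[index] = value;
    }
}

fn main() {
    let mut array = RcArray::new([0, 1, 2]);
    array.inc_rc(); // a second conceptual owner appears
    assert_eq!(*array.rc, 2);
    array.array_set(0, 4); // shared, so this "copies" first
    assert_eq!(*array.rc, 1);
    assert_eq!(array.data, [4, 1, 2]);
}
```

Because the count is just another mutable reference in this representation, no instruction needs to locate a count hidden inside the array value.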

Workaround

None

Workaround Description

No response

Additional Context

No response

Project Impact

None

Blocker Context

No response

Would you like to submit a PR for this Issue?

None

Support Needs

No response

aakoshh (Contributor) commented Feb 6, 2025

Just to make sure I understand: you're not suggesting that inc_rc will disappear from the code. Rather, instead of the instruction having to figure out which array it needs to act on, with the ref count being an internal part of the array in the Brillig VM, the count would be like the length of a BoundedVec: a number passed around in the SSA together with the array used for storage. But where the inc_rc and dec_rc are inserted into the SSA will remain a source of speculation about how the array is going to be used, as it is now.

I'm asking because I have difficulty understanding some of the SSA generated from the test that fails in #7297

For example copy_mut seems to insert an inc_rc because it receives a mut array, but it doesn't seem to account for the possibility that the array might not be mutated, and the count stays higher after the call no matter what.

Take this example:

maybe_mut
use std::mem::array_refcount;

fn main() {
    let mut array = [0, 1, 2];
    maybe_mut(array, false);
    maybe_mut(array, true);
    println(array[0]);
    println(array_refcount(array));
}

fn maybe_mut(mut array: [Field; 3], should_mutate: bool) {
    if should_mutate {
        array[0] = 4;
    }
    println(array[0]);
}

It prints 3 as the final refcount in main because of what maybe_mut looks like:

brillig(inline) fn maybe_mut f1 {
  b0(v0: [Field; 3], v1: u1):
    inc_rc v0
    v2 = allocate -> &mut [Field; 3]
    store v0 at v2
    jmpif v1 then: b1, else: b2
  b1():
    v3 = load v2 -> [Field; 3]
    v6 = array_set v3, index u32 0, value Field 4
    store v6 at v2
    jmp b2()
  b2():
    v7 = load v2 -> [Field; 3]
    v8 = array_get v7, index u32 0 -> Field
    call f2(v8)
    return
}

Does your proposal address the speculative nature of inc_rc? As it stands, its primary goal is to protect against accidental modification, at the cost of doing more copies.

I wonder if we could have a concept of drop: that is, to understand when a reference goes out of scope and decrement the RC at that point, and increment at the point where a borrow is made. For example, I see maybe_mut emits an inc_rc because there is a possibility of mutation, while borrow in the test does not. Now, if I change borrow to return the array it borrowed, it still doesn't increase the count, but the caller does on the returned value. Then, if I change it to return a struct wrapping the array, both borrow and the caller increase the RC, with the two consecutive increments later being deduplicated.

By contrast, we could increment the RC in the caller when it passes an array to its callee, and decrement it if/when that reference is finally dropped, whether at the end of the called function or when the returned structure wrapping it goes out of scope. So we would have simpler-to-understand rules about inserting them, and perform the optimisation to remove them later, once we have the full picture. I can imagine that identifying when a value is dropped is not trivial, though.

(The deduplication of increments is a counterexample where my assumption that observable side effects - the printing of the ref count in this instance - should be the same between the initial and final SSA breaks down.)
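Rust's Rc happens to model the drop-based scheme described above: the count is incremented at the borrow site (Rc::clone) and decremented when a reference goes out of scope, so a copy-on-write version of maybe_mut leaves the caller's count unchanged. This is a sketch of the idea, not Noir's semantics:

```rust
use std::rc::Rc;

// Sketch (not Noir): increment at the borrow site, decrement at drop.
// Mutation clones the payload instead of touching the caller's array.
fn maybe_mut(array: Rc<[i64; 3]>, should_mutate: bool) -> Rc<[i64; 3]> {
    if should_mutate {
        let mut copy = *array; // copy-on-write: duplicate only when mutating
        copy[0] = 4;
        Rc::new(copy)
    } else {
        array
    }
} // the borrowed Rc is dropped here, decrementing the count

fn main() {
    let array = Rc::new([0i64, 1, 2]);
    let _ = maybe_mut(Rc::clone(&array), false); // inc at call, dec at drop
    let _ = maybe_mut(Rc::clone(&array), true);
    // Unlike the SSA printed above, the count returns to 1 after the calls.
    assert_eq!(Rc::strong_count(&array), 1);
    assert_eq!(array[0], 0); // the original is unchanged
}
```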

jfecher (Contributor, Author) commented Feb 6, 2025

But where the inc_rc and dec_rc are inserted into the SSA will remain a source of speculation about how the array is going to be used, as it is now.

No, this proposal is to remove inc_rc and dec_rc

I wonder if we could have a concept of drop,

I've considered trying to track when the arrays may be dropped in the past, but as soon as the array gets stored in a mutable reference - which is the case with most arrays being mutated anyway - we lose almost all ability to track them, since we'd have to delegate to mem2reg. You could also say we do have a drop of sorts, since we do issue some dec_rc instructions, just only where we "know" a value won't be used afterward, e.g. an array parameter at the end of a function where no values containing arrays are returned. ("Know" is in quotes due to the negative RC issue - clearly some assumptions are incorrect.)

(The deduplication of increments is a counter example for where my assumption that observable side effects - the printing of the ref count in this instance - should be the same between the initial and final SSA breaks down).

can_be_deduplicated should be false for both inc_rc and dec_rc so we shouldn't be deduplicating these instructions. We may deduplicate a constant make_array though - in which case we should add an extra inc_rc.

aakoshh (Contributor) commented Feb 6, 2025

we shouldn't be deduplicating these instructions.

The deduplication of consecutive RC instructions happens here, can_be_deduplicated is not involved, it's a special case.

we do issue some dec_rc instructions, just only where we "know" a value won't be used afterward, e.g. an array parameter at the end of a function where no values containing arrays are returned.

As another example of why it's difficult to handle this in SSA when the RC is also in the hands of the Brillig VM, here's the SSA of such a function from my experiment:

    v11 = load v4 -> [Field; 3]
    inc_rc v11
    ...
    v18 = load v4 -> [Field; 3]
    v20 = array_set v18, index u32 0, value Field 6
    store v20 at v4
    v21 = load v4 -> [Field; 3]
    dec_rc v21

So it originally inserted an inc_rc and a dec_rc acting on the input v4 (one of the array: &mut [Field; 3] parameters). What it can't know is that v20 may be a new array (I suppose it could, because after an inc_rc it will be, but in general I assume it doesn't), and that after store v20 at v4 overwrites v4 with this new array, dec_rc v21, acting on a freshly loaded reference from v4, will potentially act on an array different from the one the original inc_rc was issued for.

No, this proposal is to remove inc_rc and dec_rc

Thanks for making that clear. I feared that, for the same reason as above, if dec_rc were acting on a tuple member instead of the array, it would be susceptible to the same gotchas with forwarding rules and would act on the values made by the copy-on-write procedure rather than the original.

aakoshh (Contributor) commented Feb 6, 2025

For the record, the RC rules that would make more sense to me would be along the following lines:

  • Generally the caller should make the increment when it passes a borrowed value to a function, rather than the callee guessing whether to increment or not
  • fn borrow(array: [Field; 3]): the caller should increment the RC before passing array, and at the end of the method it can be dropped, because it goes out of scope and doesn't get stored
  • fn borrow_mut(array: &mut [Field; 3]): unlike borrow, I would not consider this an additional reference; just use write semantics and modify the original without increasing the RC
  • fn copy_mut(mut array: [Field; 3]): the caller should increment the RC, because it's a borrow and copy-on-write should be used
  • fn borrow_mut_two(array1: &mut [Field; 3], array2: &mut [Field; 3]): like borrow_mut, I would argue that the RC should not be increased at all (currently it is increased once for array2). That's because we want write semantics, and it should be the user's responsibility whether they pass references to the same array or not. I don't think copies should be made unless there are other borrows, as borrow and copy_mut imply copy-on-write. The effect of writing to the original seems to be the same anyway.

Then we could let the SSA figure out which inc/dec pairs it can get rid of for optimisation, but it would make it clear when the RC is increased and why.

Returning the array or storing it in a data structure should cancel the decrement.

Is this 🌈 and 🦄 thinking?
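As a sketch of the copy_mut rule above (the caller increments, and the callee copies only if the array is actually shared), Rust's Rc::make_mut performs exactly that shared-then-clone check. This is an analogy, not the proposed implementation:

```rust
use std::rc::Rc;

// Sketch of the `copy_mut` rule: the caller increments before passing,
// and the callee copies on write only when the value is actually shared.
fn copy_mut(mut array: Rc<Vec<i64>>) {
    // Rc::make_mut clones the Vec only if the count is > 1 (true CoW).
    Rc::make_mut(&mut array)[0] = 4;
    assert_eq!(array[0], 4); // the callee sees its own mutated copy
} // the callee's reference is dropped here, restoring the caller's count

fn main() {
    let array = Rc::new(vec![0i64, 1, 2]);
    copy_mut(Rc::clone(&array)); // caller-side increment at the call site
    assert_eq!(array[0], 0);     // original untouched: a copy was made
    assert_eq!(Rc::strong_count(&array), 1);
}
```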

aakoshh (Contributor) commented Feb 6, 2025

One more observation about placing the dec_rc at the end is that the Brillig VM also modifies the RC of the original array during the copy procedure here, so these two alone can end up bringing the RC to zero:

  • if the final dec_rc acts on the original array after a copy was made, it's double decrementing
  • if the final dec_rc acts on the copy of the array, it's bringing it from 1 to 0 immediately
  • the final dec_rc should act on the original array, and only if no copy ended up being made

This might be fixed by #7297 (comment)

jfecher (Contributor, Author) commented Feb 7, 2025

Is this 🌈 and 🦄 thinking?

Not necessarily, but there are a couple of issues / practical concerns. The scheme you gave is more or less how they were intended to behave, but they have been changed over time as we were pressured to create smaller Brillig bytecode sizes. Increasing the ref counts on each immutable function call was leading to a bytecode explosion, since it happens much more often and we cannot usually optimize out those cases. A related issue is that Noir doesn't have the concept of an immutable borrow, which would normally be the case where you'd keep the ref count the same; with a conceptual clone, like we'd have when you pass by value, you normally do increase it.

So we wanted to change where ref counts are placed, to find spots where fewer of them were needed and where they were easier to optimize out. That's why we have both the inc and dec within a function: it is then easier to identify when we can remove them as a pair. Whereas if we had the inc outside the function, we couldn't do that unless the function was inlined.

Returning the array or storing it in a data structure should cancel the decrement.

How do you know if the array was returned or stored in a data structure (and returned)? Usually references are involved, so you won't know whether the array behind a reference is the same one or not. Our current answer is that if the function returns any type containing an array of the same type, we have to pessimistically assume the array in question may be returned. This should lead to counts drifting up, not down, but I point it out as something to keep in mind: we probably can't be perfect here.

Anyways - this was mostly to explain why we have our current system and how we got here, since I think once you know that, it's easier to make changes to fix it or tear it down if necessary.

EDIT: A clarification on why we increment on mutable values currently. In the current scheme, to minimize unnecessary inc/decs, we:

  • Assume all array values are passed as immutable references by default.
  • Only when the array is taken by mutable value, copied into a let binding, or stored into multiple &mut references do we increase the ref count. This is so that if you have a deeply nested series of functions with one mutation deep in the call stack, we don't need to inc/dec the RC for every function in between.

jfecher (Contributor, Author) commented Feb 7, 2025

Ah, some details I left out from our current scheme:

  • Noir has no concept of moves, so we approximate by incrementing the ref count wherever an array may be copied. This is wherever a new variable binding may introduce an alias to the array - so let bindings and function parameters.

jfecher (Contributor, Author) commented Feb 7, 2025

The deduplication of consecutive RC instructions happens here, can_be_deduplicated is not involved, it's a special case.

Honestly, I don't see the correctness argument for that optimization - I think we should remove it entirely, especially since counts are underflowing. One possible reason is that those deduplicated extra inc_rcs were actually required.

As another example of why it's difficult to handle this in SSA when the RC is also in the hands of the Brillig VM, here's the SSA of such a function from my experiment:

v11 = load v4 -> [Field; 3]
inc_rc v11
...
v18 = load v4 -> [Field; 3]
v20 = array_set v18, index u32 0, value Field 6
store v20 at v4
v21 = load v4 -> [Field; 3]
dec_rc v21

Yeah, this seems bad. It makes me think that we need a larger redesign as well, since right now trying to fix it up seems to me like it'd just be "garbage in, garbage out."

Move & Copy Elision Optimization scheme

I think going forward for more confidence in correctness & still having some optimizations we could:

  • Start with a scheme we have more confidence to be correct e.g. the one you described where we don't increment rc for references but do for passing by value.
  • Remove existing optimizations on ref counts. We won't need them.
  • Add an analysis to determine where an array is moved versus where it needs to be copied. This will need to be done before we issue inc/decs so before ssa-gen. We should issue explicit clone(array) IR nodes at this point (which probably just inc_rc internally).
  • Add an analysis to try to figure out which functions can immutably borrow an array (or just add immutable borrows to the language) so that those functions will not need to conceptually clone the array (inc_rc) either. This could be part of the previous pass; otherwise their ordering would matter, since the move pass may issue a copy for a function that later turns out to be able to immutably borrow the array. Note that an immutable borrow analysis will have to be conservative: if we have a function like fn foo(a: [Field; 1], b: &mut [Field; 1]), we have to assume a and b could refer to the same array, so a still can't be borrowed even if it is never mutated.
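A toy sketch of the move-versus-copy decision from the analysis step above: every use of an array before its last needs an explicit clone, while the last use can move it. The function name and representation here are hypothetical, not the actual compiler pass:

```rust
// Toy sketch: given the instruction indices where an array is used, a use
// can be a move only if it is the array's final use; all earlier uses
// would get an explicit clone(array) IR node (which may just inc_rc).
fn needs_clone(uses: &[usize], this_use: usize) -> bool {
    Some(&this_use) != uses.last()
}

fn main() {
    let uses = [3usize, 7, 12]; // hypothetical use sites of one array
    assert!(needs_clone(&uses, 3));   // earlier use: insert clone(array)
    assert!(needs_clone(&uses, 7));
    assert!(!needs_clone(&uses, 12)); // last use: move, no clone needed
}
```

A real pass would have to run before ssa-gen, as the comment notes, and be conservative in the presence of references and aliasing.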

aakoshh (Contributor) commented Feb 7, 2025

Sounds good!

since right now trying to fix it up seems to me like it'd just be "garbage in, garbage out."

I hope that this particular issue will be fixed by #7297

jfecher (Contributor, Author) commented Feb 7, 2025

Expose immutable references directly scheme

Spoke with @michaeljklein and @aakoshh and another option we're more strongly considering now is to add immutable references to Noir. If we do that:

  • Any by-value use of an array type would incur a copy
  • We would need to add a cloning operation to the language which would clone the array and unwrap it from the reference
  • Where arrays are cloned becomes a user's issue for optimizing their own code.
  • The compiler is simpler, we need fewer optimizations on RCs and testing on arrays becomes easier.
  • We could remove RCs entirely, copying every time a full array value is used as mentioned above, or keep RCs and just increase the reference count on a clone. The latter case may be an easier transition, but the compiler could still be copying unnecessarily, since we'd still not be able to determine in all cases when to dec_rc if references are involved. For the always-copy case, you could argue users may still copy unnecessarily, e.g. if using arrays in generics instead of array references in generics, although that could also be considered a user problem in that case.

This is something I'd support experimenting with. We would have to update libraries to take references where possible but could compare performance afterward.
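A Rust sketch of what the immutable-references scheme would look like from the user's side: reads and writes go through references with no copies or reference counts, and any by-value use is a visible full copy. This is an analogy for the proposed surface semantics, not a design for Noir:

```rust
// Sketch: arrays are passed by reference by default; every by-value use
// is an explicit, user-visible copy. No reference counts are needed.
fn borrow(array: &[i64; 3]) -> i64 {
    array[0] // read through the reference: no copy, no RC bookkeeping
}

fn mutate(array: &mut [i64; 3]) {
    array[0] = 4; // write through the reference: still no copy
}

fn main() {
    let mut array = [0i64, 1, 2];
    assert_eq!(borrow(&array), 0);
    let copy = array; // by-value use = full copy ([i64; 3] is Copy)
    mutate(&mut array);
    assert_eq!(array[0], 4);
    assert_eq!(copy[0], 0); // where copies happen is visible in the source
}
```

The cost of cloning then lives in the user's code rather than in compiler-inserted inc_rc/dec_rc pairs, which is exactly the trade-off the bullet points describe.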

@jfecher jfecher changed the title Experiment replacing inc_rc & dec_rc with normal operations on integer references Rework inc_rc, dec_rc and how arrays are handled in brillig in general Feb 7, 2025
Projects: Status: 📋 Backlog
Development: No branches or pull requests
2 participants