Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(map): Allow chaining lists before mapping #136

Merged
merged 1 commit into from
Sep 20, 2024
Merged

Conversation

jan-ferdinand
Copy link
Member

Previously, the higher-order assembly function map took all elements of an input list, mapped them according to some supplied function, and generated an output list. Equivalently, written in rust, this was:list.into_iter().map(f).collect_vec().

Now, map is generalized over the number of input lists, enabling chaining multiple input lists while still producing only one output list. As rust: list_0.into_iter().chain(list_1).map(f).collect_vec().

Between 0 and 15 (both inclusive) input lists are supported.

Note: the ABI for map has changed. Before, an inner function could access runtime parameters on the stack at a distance of 3. Now, this distance has increased to (3 + num_input_lists). For a “traditional” map, this is now 4.

@Sword-Smith
Copy link
Contributor

Looks very good. Two observations:

  1. There are some benchmark results that show surprising changes.
  2. Instead of restricting values generated for environment_args in pseudorandom_initial_state, would it make sense to rewrite the test containing an assert that there was no overflow (on a u128 addition) not use the .test() method but let it set up its environment in a more manual way?

@jan-ferdinand
Copy link
Member Author

There are some benchmark results that show surprising changes.

True. I guess that pertains mainly to tasmlib_list_higher_order_u32_map_test_hash_xfield_element. Could it be explained by different test input being generated? Since the input state generator has to accomodate multiple lists, I rewrote it, re-ordering the operations and arguments to the various rng::gen() calls.

[W]ould it make sense to rewrite the test [to] not use the .test() method but let it set up its environment in a more manual way?

Potentially yes. The one thing I tried was accessing ShadowedFunction::test_initial_state(), but that's a private method (and I'm fine with that). What would be a good way to set up a test in this way?

tasm-lib/src/list/higher_order/map.rs Outdated Show resolved Hide resolved
tasm-lib/src/list/higher_order/map.rs Outdated Show resolved Hide resolved
tasm-lib/src/list/higher_order/map.rs Outdated Show resolved Hide resolved
tasm-lib/src/list/higher_order/map.rs Outdated Show resolved Hide resolved
@Sword-Smith
Copy link
Contributor

Please note that I rebased this branch against master in order to be able to call the test_rust_equivalence_given_execution_state test helper function such that I could rewrite the test case that forced you to do shady things in the pseudorandom_initial_state trait function.

@jan-ferdinand
Copy link
Member Author

Apart from your suggestions, I also made one more addition that might be worth reviewing: the type ChainMap now declares an associated constant NUM_INTERNAL_REGISTERS, with which the required offset for any mapping function requiring runtime parameters from deeper on the stack can be accessed programatically. The intention is to

  1. centralize knowledge: other snippets do not need to be aware of chain_map internals, and
  2. future-proofing our code: should we choose to optimize ChainMap::<1>, we can now do so without having to touch all snippets that use Map.

Let me know if you have any suggestions to improve this design.

@jan-ferdinand
Copy link
Member Author

some benchmark results that show surprising changes

Is the change to the initial state generator a likely explanation? If so, are we fine with the benchmark changing more dramatically than the changes to the assembly alone would suggest?

@Sword-Smith
Copy link
Contributor

Sword-Smith commented Sep 20, 2024

some benchmark results that show surprising changes

Is the change to the initial state generator a likely explanation? If so, are we fine with the benchmark changing more dramatically than the changes to the assembly alone would suggest?

Yes, we are fine with that. But with the suggested change of using the bench input parameter to set the list length (a suggestion you already implemented), we can trust the benchmarks of Map going forward. That's enough for me.

Other benchmarks reveal the performance changes to Map resulting from this PR, and those changes look positive (positive in the value sense, i.e. a decrese in the row counts).

@Sword-Smith Sword-Smith self-requested a review September 20, 2024 12:13
Comment on lines 91 to +95
assert_eq!(1, sn.input_types().len(), "{INNER_FN_INCORRECT_NUM_INPUTS}");
let fn_body = sn.function_code(library);
let (_, instructions) = tokenize(&fn_body).unwrap();
let labelled_instructions = isa::parser::to_labelled_instructions(&instructions);
let snippet_name =
library.explicit_import(&sn.entrypoint_name(), &labelled_instructions);
(triton_asm!(call { snippet_name }), String::default())
let label = library.explicit_import(&sn.entrypoint_name(), &labelled_instructions);
Copy link
Contributor

@Sword-Smith Sword-Smith Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, we are using explicit_import here. I'm not 100 % sure that's necessary. Maybe we can just use the import method directly on the snippet instead? Or maybe there's some borrowing check that fails if you do that? Unsure. Feel free to investigate, or to just ignore this comment, as the current approach works just fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but could not get it to work. The main culprit seems to be the trait object snippet, which is of type &Box<dyn BasicSnippet>. Library::import() requires a Box<dyn BasicSnippet>. Dereferencing snippet does not work because that would constitute a move, but the method only takes &self. Similar problems arise when changing the match statement to match self.f instead of &self.f. Cloning the snippet also does not work, because the Snippet trait does not have a trait bound on Clone. When trying to introduce such a trait bound, all hell breaks lose.

Long story short: it might be possible, but I decided to give up after a few minutes of trying.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my comment is about the match arm one further down. The place where you put the comment has the additional complication that the type of sn is &Box<dyn DeprecatedSnippet>, which would have to be transformed into a Box<dyn BasicSnippet> first.

Previously, the higher-order assembly function `map` took all elements
of an input list, mapped them according to some supplied function, and
generated an output list. Equivalently, written in rust, this
was:`list.into_iter().map(f).collect_vec()`.

Now, `map` is generalized over the number of input lists, enabling
chaining multiple input lists while still producing only one output
list. As rust: `list_0.into_iter().chain(list_1).map(f).collect_vec()`.

Between 0 and 15 (both inclusive) input lists are supported.

Note: the ABI for map has changed. Before, an inner function could
access runtime parameters on the stack at a distance of 3. Now, this
distance has increased to (3 + num_input_lists). For a “traditional”
map, this is now 4.
@jan-ferdinand jan-ferdinand merged commit 031927a into master Sep 20, 2024
3 checks passed
@jan-ferdinand jan-ferdinand deleted the chain_map branch September 20, 2024 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants