-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor Hasher lookup handling in the chiplets bus #360
Comments
Regarding computing lookups as the requests are made by the decoder, this is being worked on in PR #436 (the code is WIP and not very readable right now). The issue with this approach is that the context about the blocks is lost for intermediate requests while respan is called for span blocks, and when
However, all of this requires the decoder to keep track of details that it ideally shouldn't. The approach we already use is much cleaner in comparison. @bobbinth What do you think? Should we close this issue without this change or is there a better way to handle this? |
Currently, the first and third items from the issue description have been completed and we've gotten improvements from these. The final change (how lookups are requested) wouldn't reduce the amount of data we're storing, it would just change the interface so we wouldn't manage these intermediary |
Sorry for the delay on this (I've been trying to think it through)! First, I should say that I'm very happy with the results we've gotten so far, and I'm fine with closing this issue as remaining work will give us only marginal benefit. So, we can drop the work done in #436. Having said that, I think we should keep an open issue for the remaining work (i.e., figuring out how to get rid of I also want to clarify that I don't think we need to change how the hasher actually builds the execution trace - only how lookups are computed. Specifically:
Unless I'm missing something, I don't think computing lookup values is related to setting selectors. The entire trace for a span block would still be computed as a part of the There is a bit more complexity here as
Again, I think memoization code could stay as it is now. The only thing that needs to change (in my opinion) is that For control blocks, the solution should be relatively simple:
So unless I missed something (which is very possible), as far as control blocks go, the changes should be pretty light. |
If I understand correctly, you're suggesting adding new methods in the chiplets to add the hasher lookups rather than having them be part of the methods control block and span block methods in the I agree that this should be simpler than the current approach. The only complexity is in providing the needed address for lookups during span blocks. Other control blocks are straightforward, as you said. I think it makes sense to open a new issue for getting rid of the |
@tohrnii is there anything either of us are missing here? |
Sorry for a late response.
You are right, it was required with earlier structure as we were building the trace and computing the lookups together as the decoder sends the request.
Thanks. Yeah makes sense. I made a mistake in thinking through this. I'll close this issue and the draft PR for now and add this as a separate issue for later. |
As discussed in #348, the first version of the hasher handling in the chiplets bus was straightforward and left room for several future optimizations.
In particular, the following should be refactored and optimized in the future:
Each item is described below. In all cases, there are also inline TODOs in the code with comments.
how hasher lookup values are computed
The hasher state is currently stored in the
HasherLookup
struct, which makes it heavy and means that other things which contain it become correspondingly much heavier (e.g.ChipletsLookupRow
). Similarly, the "next" hasher state is held in theAbsorb
variant of theHasherLookupContext
enum, which has the same knock-on effects. The aforementioned struct & enum are here.Instead, we could get the state from the trace when the lookup values are included in the$b_{chip}$ column. However, because the trace is in column-major form, this might not be more performant, so any change should be benchmarked.
Edit (tohrnii):
Once we remove the hasher state from
HasherLookup
, we should refactor the hash_span_block method to avoid doingis_memoized
checks multiple times.how lookups are requested from the decoder
Currently, the entire hash computation is done when the decoder makes its initialization request and all intermediate lookups required for the correctness of$b_{chip}$ are queued. When the decoder needs subsequent lookups (e.g. as it absorbs new operations during $b_{chip}$ bus.
RESPAN
or when the completes code blocks and needs the return hash), it sends a request, and they are dequeued and sent to theInstead, it might be better to compute the lookups at the time they are needed. This requires refactoring the decoder and the functions in the hasher more extensively.
See the relevant discussions in the previous PR:
how lookups are stored for "providing" them to the bus from the hash chiplet module.
Currently, all lookups are saved as they are computed during hash computations. At the end, during$b_{chip}$ one by one.
fill_trace
, the hash chiplet iterates through and provides each lookup to the busThere are a few different options here:
fill_trace
function, add the lookup "responses" to the chiplet bus in bulk instead of individually.There are a few relevant comments about this:
The text was updated successfully, but these errors were encountered: