Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoDiff] Automatically determine AdStack's size #2438
[AutoDiff] Automatically determine AdStack's size #2438
Changes from 12 commits
2b92337
65e8ec4
1cda1cd
78e2486
6020799
8625399
244b249
1ae751f
d01283c
b2f31b2
411c332
9b8cfc5
e04dfe7
005a560
6d86c71
243764f
838bf5d
6652b8d
39c8d0d
838e969
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How serious it is if the necessary AD stack size overflows
max_ad_stack_size
? If this would directly result in wrong results, we should better report an error and stop immediately?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not that serious -- IMHO in most cases, it's the control-flow graph that unable to determine the max capacity. For example:
The control-flow graph does not have the range
10
, and thus cannot determine the capacity.From another perspective, the current codebase uses a fixed capacity of 16, and it works fine.
On the other hand, I don't see any warnings in the current tests, so maybe even the above case (a loop with #pushes > #pops) doesn't exist, and the control-flow graph is able to determine all maximum capacities. Whether the above case exists depends on the usage of AD-stack in auto_diff.cpp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the code implementation looks like it is able to figure out the necessary stack size, which overflows
max_ad_stack_size
. But the interpretation is "unable to determine the max capacity". Is it possible to distinguish these two cases, or maybe I'm misunderstanding something ? 🤣There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I was using the terms "size" and "capacity" interchangeably... It should be the necessary stack size, not the max capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If auto_diff.cpp is bug-free, then #pushes should be always equal to #pops?
I still don't see why the fact that an undetermined stack would not lead to a bad result. IIUC, this is serious, but there are cases where this stack size just cannot be determined statically? (Like the
range(10)
example you gave).Were you referring to the python tests, or the new CPP tests? If the former, i think that's because the output are not shown in
pytest
by default..There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. Let's add this comment to the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I think maybe it's better to use the Bellman-Ford algorithm -- when we are "able to figure out all stack sizes statically", we are guaranteed to figure them out even if a stack needs a large size (and the algorithm's running time is approximately the same); when we are unable to figure out at least one stack size, the algorithm will run slower but we will be sure that we are unable to figure out the stack size statically (instead of a warning about possible overflow) (So, in this case, I think the message should not be a
TI_WARN
-- maybe aTI_INFO
orTI_DEBUG
). WDYT?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great!
IIUC, we cannot determine the stack size as soon as the kernel has a loop in it? If so, yeah, maybe it's better to make this TI_DEBUG to reduce the noise..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ljcc0930 will help review Bellman-Ford XD