Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoDiff] Automatically determine AdStack's size #2438

Merged
merged 20 commits into from
Jun 29, 2021

Conversation

xumingkuan
Copy link
Contributor

@xumingkuan xumingkuan commented Jun 18, 2021

Related issue = #656

This PR uses the control-flow graph to compute the necessary size for each Autodiff stack.

In case there may be a loop in which there is a stack push, a limit constexpr int kMaxAdStackSize = 32; is set to prevent infinite loops.

We use the Bellman-Ford algorithm to compute the sizes. When there is a positive loop (#pushes > #pops in a loop) for an Autodiff stack, we cannot determine the size, and a default value CompileConfig::default_ad_stack_size = 32 is used.

@xumingkuan xumingkuan mentioned this pull request Jun 18, 2021
18 tasks
@xumingkuan
Copy link
Contributor Author

/format

@xumingkuan xumingkuan requested review from k-ye and yuanming-hu June 18, 2021 13:39
Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't carefully checked ControlFlowGraph::determine_ad_stack_size, but I believe this is great progress and a very useful feature in AutoDiff :-)

At a high level, I think we should add a few test cases using the CHI IR builder. A subtle error in this inference pass may result in undefined behavior (since stack overflows) instead of a laud error message. It would also make sense to add a runtime check of stack overflowing in debug mode.

taichi/transforms/determine_ad_stack_size.cpp Outdated Show resolved Hide resolved
taichi/backends/cc/codegen_cc.cpp Show resolved Hide resolved
taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
taichi/ir/statements.h Outdated Show resolved Hide resolved
taichi/ir/transforms.h Show resolved Hide resolved
@xumingkuan xumingkuan marked this pull request as draft June 19, 2021 16:12
@xumingkuan
Copy link
Contributor Author

xumingkuan commented Jun 19, 2021

TODOs:

  • Add C++ tests
  • Add documentation in transforms.h
  • Final verify pass?

xumingkuan and others added 5 commits June 22, 2021 13:32
Co-authored-by: Ye Kuang <k-ye@users.noreply.github.com>
Co-authored-by: Yuanming Hu <yuanming-hu@users.noreply.github.com>
@xumingkuan
Copy link
Contributor Author

verify -> verify_ir_structure
verify_before_codegen

taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
@xumingkuan
Copy link
Contributor Author

verify -> verify_ir_structure
verify_before_codegen

I think it's better to put this refactoring in another PR. WDYT?

@xumingkuan xumingkuan marked this pull request as ready for review June 22, 2021 08:46
@xumingkuan xumingkuan requested a review from k-ye June 22, 2021 08:46
@yuanming-hu
Copy link
Member

verify -> verify_ir_structure
verify_before_codegen

I think it's better to put this refactoring in another PR. WDYT?

Sounds good!

taichi/backends/cc/codegen_cc.cpp Show resolved Hide resolved
for (auto &stack : oversized_stacks) {
oversized_stacks_name.push_back(stack->name());
}
TI_WARN(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How serious it is if the necessary AD stack size overflows max_ad_stack_size? If this would directly result in wrong results, we should better report an error and stop immediately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not that serious -- IMHO in most cases, it's the control-flow graph that unable to determine the max capacity. For example:

s = stack()
for i in range(10):
  s.push(i)

The control-flow graph does not have the range 10, and thus cannot determine the capacity.

From another perspective, the current codebase uses a fixed capacity of 16, and it works fine.

On the other hand, I don't see any warnings in the current tests, so maybe even the above case (a loop with #pushes > #pops) doesn't exist, and the control-flow graph is able to determine all maximum capacities. Whether the above case exists depends on the usage of AD-stack in auto_diff.cpp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the code implementation looks like it is able to figure out the necessary stack size, which overflows max_ad_stack_size. But the interpretation is "unable to determine the max capacity". Is it possible to distinguish these two cases, or maybe I'm misunderstanding something ? 🤣

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I was using the terms "size" and "capacity" interchangeably... It should be the necessary stack size, not the max capacity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If auto_diff.cpp is bug-free, then #pushes should be always equal to #pops?

I still don't see why the fact that an undetermined stack would not lead to a bad result. IIUC, this is serious, but there are cases where this stack size just cannot be determined statically? (Like the range(10) example you gave).

I don't see any warnings in the current tests

Were you referring to the python tests, or the new CPP tests? If the former, i think that's because the output are not shown in pytest by default..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Let's add this comment to the code?

Copy link
Contributor Author

@xumingkuan xumingkuan Jun 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think maybe it's better to use the Bellman-Ford algorithm -- when we are "able to figure out all stack sizes statically", we are guaranteed to figure them out even if a stack needs a large size (and the algorithm's running time is approximately the same); when we are unable to figure out at least one stack size, the algorithm will run slower but we will be sure that we are unable to figure out the stack size statically (instead of a warning about possible overflow) (So, in this case, I think the message should not be a TI_WARN -- maybe a TI_INFO or TI_DEBUG). WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think maybe it's better to use the Bellman-Ford algorithm

Sounds great!

So, in this case, I think the message should not be a TI_WARN -- maybe a TI_INFO or TI_DEBUG. WDYT?

IIUC, we cannot determine the stack size as soon as the kernel has a loop in it? If so, yeah, maybe it's better to make this TI_DEBUG to reduce the noise..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we cannot determine the stack size as soon as the kernel has a loop in it? If so, yeah, maybe it's better to make this TI_DEBUG to reduce the noise..

Right...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ljcc0930 will help review Bellman-Ford XD

taichi/ir/control_flow_graph.cpp Outdated Show resolved Hide resolved
tests/cpp/transforms/determine_ad_stack_size_test.cpp Outdated Show resolved Hide resolved
tests/cpp/transforms/determine_ad_stack_size_test.cpp Outdated Show resolved Hide resolved
for (auto &stack : oversized_stacks) {
oversized_stacks_name.push_back(stack->name());
}
TI_WARN(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the code implementation looks like it is able to figure out the necessary stack size, which overflows max_ad_stack_size. But the interpretation is "unable to determine the max capacity". Is it possible to distinguish these two cases, or maybe I'm misunderstanding something ? 🤣

@xumingkuan xumingkuan requested a review from k-ye June 26, 2021 09:14
tests/cpp/transforms/determine_ad_stack_size_test.cpp Outdated Show resolved Hide resolved
for (auto &stack : oversized_stacks) {
oversized_stacks_name.push_back(stack->name());
}
TI_WARN(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (auto &stack : oversized_stacks) {
oversized_stacks_name.push_back(stack->name());
}
TI_WARN(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see, but I thought oversized_stacks means a different thing? I.e., "I am able to figure out the stack size statically, and it definitely overflows the configured max". So capping it to max_ad_stack_size will lead to wrong result?

Copy link
Member

@k-ye k-ye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks!

@k-ye k-ye requested a review from ljcc0930 June 28, 2021 06:54
Copy link
Contributor

@ljcc0930 ljcc0930 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@xumingkuan xumingkuan requested a review from ljcc0930 June 29, 2021 05:30
@xumingkuan
Copy link
Contributor Author

Hi @ljcc0930 , could you please re-review 6652b8d ? I wasn't really using Bellman-Ford before this commit. Thanks in advance!

Copy link
Contributor

@ljcc0930 ljcc0930 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@k-ye k-ye merged commit 7fe1cf2 into taichi-dev:master Jun 29, 2021
@squarefk squarefk mentioned this pull request Jul 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants