-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat[dace]: Updated DaCe Transformations #1639
base: main
Are you sure you want to change the base?
feat[dace]: Updated DaCe Transformations #1639
Conversation
No real improvement though but it is now a bit clearer and does not use the DaCe properties for storing the intermediate sets it computes.
This is that we can continue until the transformations are fully merged.
..._tests/program_processor_tests/runners_tests/dace/transformation_tests/test_loop_blocking.py
Outdated
Show resolved
Hide resolved
..._tests/program_processor_tests/runners_tests/dace/transformation_tests/test_loop_blocking.py
Outdated
Show resolved
Hide resolved
..._tests/program_processor_tests/runners_tests/dace/transformation_tests/test_loop_blocking.py
Outdated
Show resolved
Hide resolved
gtx_transformations.LoopBlocking(blocking_size=10, blocking_parameter="j"), | ||
validate=True, | ||
validate_all=True, | ||
) | ||
assert count > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we would not like to assert on the exact number we expect? I mean count ==
1 or 2 (applies also below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is actually no particular reason for it.
However you are right using the exact number is a bit better, because it ensures that the transformation is not applied twice in the same place.
# this preconditions on SDFG should ensure that, but there are a few | ||
# super hard edge cases. | ||
# TODO(phimuell): Add an intermediate here in this case | ||
assert isinstance(independent_node, dace_nodes.AccessNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether we should have an assert or an exception. Probably a runtime exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right a NotImplementedError
is better.
Before the transformation only did the promotion, but it now also does the fusion in one go.
It is now possible to specify values for maps fo different dimensions. The `gt_set_gpu_blocksize()` function still has the old interface, if the values are used then they are used as default and can be overwritten.
I slightly refactored the function.
…`wcr_nonatomic` properties of edges. The code that does this was now relocated to the auto optimizer directly.
This also applies to some other transformations.
It was not checked that the whole array was translated.
The case of sequential trivial map was not handled. This lead to on infinite loop since a trivial map was promoted to a trivial map. The fix is that such promition are rejected and handled later when the promottion was propagated.
There was a forgotten `not`.
However, It seems that it is not needed.
In some cases we might generate dead dataflow. This is an effect of how fusion currently and probably also in the future works.
The initial version of the optimization pipeline only contained a rough draft.
Currently this PR contains a copy of the map fusion transformations from DaCe that are currently under review. As soon as that PR is merged and DaCe was updated in GT4Py these files will be deleted.
This PR collects some general improvements:
LoopBlocking
transformation (with tests).MapFusionParallel
Todo:
can_be_applied_to()
as soon as DaCe is updated (TrivialGPUMapElimination
,SerialMapPromoter
).BaseMapPromoter
; as probably nobody needs that