Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat[dace]: Updated DaCe Transformations #1639

Draft
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

philip-paul-mueller
Copy link
Contributor

@philip-paul-mueller philip-paul-mueller commented Sep 11, 2024

The initial version of the optimization pipeline only contained a rough draft.
Currently this PR contains a copy of the map fusion transformations from DaCe that are currently under review. As soon as that PR is merged and DaCe was updated in GT4Py these files will be deleted.

This PR collects some general improvements:

  • More liberal LoopBlocking transformation (with tests).
  • Incorporate MapFusionParallel

Todo:

  • Using of can_be_applied_to() as soon as DaCe is updated (TrivialGPUMapElimination, SerialMapPromoter).
  • Removing of the BaseMapPromoter; as probably nobody needs that
  • Removing of the fusion transformations (once they are updaed in DaCe).
  • Looking at strides that the Lowering generates.

@philip-paul-mueller philip-paul-mueller marked this pull request as draft September 11, 2024 12:59
gtx_transformations.LoopBlocking(blocking_size=10, blocking_parameter="j"),
validate=True,
validate_all=True,
)
assert count > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we would not like to assert on the exact number we expect? I mean count == 1 or 2 (applies also below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually no particular reason for it.
However you are right using the exact number is a bit better, because it ensures that the transformation is not applied twice in the same place.

# this preconditions on SDFG should ensure that, but there are a few
# super hard edge cases.
# TODO(phimuell): Add an intermediate here in this case
assert isinstance(independent_node, dace_nodes.AccessNode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether we should have an assert or an exception. Probably a runtime exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right a NotImplementedError is better.

Before the transformation only did the promotion, but it now also does the fusion in one go.
It is now possible to specify values for maps fo different dimensions.
The `gt_set_gpu_blocksize()` function still has the old interface, if the values are used then they are used as default and can be overwritten.
…`wcr_nonatomic` properties of edges.

The code that does this was now relocated to the auto optimizer directly.
This also applies to some other transformations.
It was not checked that the whole array was translated.
The case of sequential trivial map was not handled.
This lead to on infinite loop since a trivial map was promoted to a trivial map.
The fix is that such promition are rejected and handled later when the promottion was propagated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants