-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CUDA early exit on empty grid #1729
Merged
Merged
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
aafed48
early exit instead of launching cuda kernel on empty grid
petiaccja 923fd89
moved domain size check one level up
petiaccja 140f6d2
replaced check on block with check on sizes argument, moved check to top
petiaccja db5dd93
messed up the predicate
petiaccja 670777e
early return
petiaccja f9750b7
moved empty domain tests to regression
petiaccja 30ef04b
forgot this one include
petiaccja 5c9e6e5
forgot cassert
petiaccja be5679a
Merge remote-tracking branch 'origin/master' into fix-empty-domain
petiaccja File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
/* | ||
* GridTools | ||
* | ||
* Copyright (c) 2014-2021, ETH Zurich | ||
* All rights reserved. | ||
* | ||
* Please, refer to the LICENSE file in the root directory. | ||
* SPDX-License-Identifier: BSD-3-Clause | ||
*/ | ||
|
||
#include <gtest/gtest.h> | ||
|
||
#include <gridtools/fn/unstructured.hpp> | ||
|
||
#include <fn_select.hpp> | ||
#include <test_environment.hpp> | ||
|
||
namespace { | ||
using namespace gridtools; | ||
using namespace fn; | ||
using namespace literals; | ||
|
||
struct empty_stencil { | ||
GT_FUNCTION constexpr auto operator()() const { | ||
return []() { return 0.0f; }; | ||
} | ||
}; | ||
|
||
GT_REGRESSION_TEST(empty_domain_stencil, test_environment<>, fn_backend_t) { | ||
auto out = TypeParam::make_storage([](...) { return 0; }); | ||
|
||
auto domain = unstructured_domain(tuple{0, 0}, tuple{0, 0}); | ||
auto backend = make_backend(fn_backend_t{}, domain); | ||
backend.stencil_executor()().arg(out).assign(0_c, empty_stencil{}).execute(); | ||
} | ||
|
||
struct empty_column : fwd { | ||
static GT_FUNCTION constexpr auto prologue() { | ||
return tuple(scan_pass([](auto acc) { return acc; }, host_device::identity())); | ||
} | ||
|
||
static GT_FUNCTION constexpr auto body() { | ||
return scan_pass([](auto acc) { return acc; }, host_device::identity()); | ||
} | ||
}; | ||
|
||
GT_REGRESSION_TEST(empty_domain_column, vertical_test_environment<>, fn_backend_t) { | ||
auto out = TypeParam::make_storage([](...) { return 0; }); | ||
|
||
auto domain = unstructured_domain(tuple{0, 0}, tuple{0, 0}); | ||
auto backend = make_backend(fn_backend_t{}, domain); | ||
backend.vertical_executor()().arg(out).assign(0_c, empty_column{}, 0.0f).execute(); | ||
} | ||
|
||
} // namespace |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like the following (as the first thing in the function) would be more readable. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and maybe even better to move it outside of gpu.hpp to the common part where we are dispatching to the backends (but didn't check where that is and if it really makes sense).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created an
is_domain_empty
function. I usually use anif
block instead of an early return when the code is simple, but I'm happy with either.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer Hannes early return in this case as I my eyes don’t have to track any braces opening and closing. And my proposed changes would be even a bit more minimalistic, just before the kernel launch:
(of course this does not include the optimization of skipping unnecessary computation of ptr_holder, strides, etc., but I don’t think we need any optimization of such trivial computations for an empty domain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, uhh, it was with the
blocks
before Hannes suggested to make it a function :) Now that I wrote the empty domain function, I'd rather keep it, because that satisfies the requirements independent of how you calculate the blocks.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also put an early return