Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dsl: Introduce ability to define Functions on Subdomains #2245

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

EdCaunt
Copy link
Contributor

@EdCaunt EdCaunt commented Oct 24, 2023

Add Functions on SubDomains functionality.

Outstanding todos:

  • Fix bug in subdomain injection/interpolation where having multiple injection/interpolation operations in an operator leads to incorrect subdimension thicknesses appearing in the conditional in the generated code.
  • Fix leftovers from rebase

Remaining todo (for future PRs I think):

  • Tutorial notebook showing
    • Demo for seafloor-like configuration
    • Demo for damping-like configuration
    • Demo for snapshotting only one region of the grid
  • Tests for propagator on subdomain
    • Including with MPI
  • MPI in full mode + functions on subdomains leads to an error which I haven't had time to debug

@codecov
Copy link

codecov bot commented Oct 24, 2023

Codecov Report

Attention: Patch coverage is 26.90722% with 709 lines in your changes missing coverage. Please review.

Project coverage is 45.93%. Comparing base (71e7eda) to head (cfd2a9b).

Files with missing lines Patch % Lines
tests/test_subdomains.py 14.83% 287 Missing ⚠️
tests/test_interpolation.py 15.06% 141 Missing ⚠️
devito/mpi/distributed.py 36.66% 75 Missing and 1 partial ⚠️
tests/test_mpi.py 14.11% 73 Missing ⚠️
devito/types/grid.py 47.77% 46 Missing and 1 partial ⚠️
devito/operations/interpolators.py 54.16% 19 Missing and 3 partials ⚠️
devito/data/decomposition.py 45.94% 14 Missing and 6 partials ⚠️
devito/types/basic.py 45.45% 11 Missing and 1 partial ⚠️
devito/operator/operator.py 56.52% 6 Missing and 4 partials ⚠️
tests/test_pickle.py 45.45% 6 Missing ⚠️
... and 5 more

❗ There is a different number of reports uploaded between BASE (71e7eda) and HEAD (cfd2a9b). Click for more details.

HEAD has 9 uploads less than BASE
Flag BASE (71e7eda) HEAD (cfd2a9b)
16 7
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2245       +/-   ##
===========================================
- Coverage   87.28%   45.93%   -41.35%     
===========================================
  Files         238      238               
  Lines       45703    46561      +858     
  Branches     4057     4124       +67     
===========================================
- Hits        39892    21389    -18503     
- Misses       5126    24279    +19153     
- Partials      685      893      +208     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mloubout mloubout added the API api (symbolics, types, ...) label Oct 24, 2023
return None

@property
def has_grid(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overkill


@property
def distributor(self):
if self.has_grid:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pythonic way of doing this is:

try:
    return self.grid._distributor
except AttributeError:
    return None

for k, v, s in zip(self.define(grid.dimensions).keys(),
self.define(grid.dimensions).values(), grid.shape):

for k, v, s in zip(self.define(self.grid.dimensions).keys(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for .keys()

@@ -483,12 +500,17 @@ class SubDomain(AbstractSubDomain):
"""

def __subdomain_finalize__(self, grid, **kwargs):
if self.has_grid:
Copy link
Contributor

@FabioLuporini FabioLuporini Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if self.grid:

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@EdCaunt EdCaunt added the WIP Still work in progress label Dec 13, 2023
Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the amount of changes introduced, I'd prefer to see a lot more new tests.

Shouldn't all (most) of the old tests that @rhodrin wrote be here as well?

class SubDomainDistributor(DenseDistributor):

"""
Decompose a subdomain over a set of MPI processes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking: SubDomain

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this class should be (or really is aside from needing a couple of tweaks) tied specifically to SubDomain's - why not make the slightly more more general SubDistributor class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think just SubDistributor would be good

Parameters
----------
subdomain : SubDomain
The subdomain to be decomposed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking: full stop

decomposition of the grid.
"""
dist_interval = {dim: Interval(s.start, s.stop-1)
for dim, s in self.parent.glb_slices.items()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for comprehensions, use d instead of dim . Not just for homogeneity, but also to minimize verbosity

bounds_map = {**{dim.symbolic_min: 0
for dim in self.parent.dimensions},
**{dim.symbolic_max: sha-1
for dim, sha in zip(self.parent.dimensions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps you could add properties

subdomain dimensions -> subdims
self.parent.dimensions -> dimensions
self.parent.glb_shape -> glb_shape

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to remember that making SubDistributor.dimensions not match SubDomain.dimensions requires a bunch of extra complexity anywhere it's accessed, so I want to leave these properties intact. Have introduced par_dimensions and par_shape for the latter two however, as I agree it would be tidier with the attributes.

# The domain decomposition
decompositions = []
for dec, dim in zip(self.parent._decomposition, self.parent.dimensions):
decompositions.append([d if sdim_interval[dim] is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to find a way of decreasing verbosity because in its current form the deciphering overhead is too high

@@ -333,8 +353,17 @@ def _size_outhalo(self):
self._distributor.myrank,
min(self.grid.shape_local))
if not self._distributor.is_boundary_rank:
# FIXME: Will throw an error if a Function on a Subdomain doesn't
# enter any boundary ranks -> Needs to not do this
print("First warning")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

@@ -552,13 +582,17 @@ def _data_in_region(self, region, dim, side):
Typically, this accessor won't be used in user code to set or read
data values.
"""
if self._distributor:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes in this method seem very odd to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were leftover from something previous I think? They weren't doing anything so have been reverted.

@@ -169,6 +188,9 @@ def __str__(self):
__repr__ = __str__


_uxreplace_registry.register(Eq)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah it's here now 😂 why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent a circular import. However, given what Rhodri said about explicitly specifying the subdomain, this could probably be put back

"""The counter for a SubDomain"""
if self.grid:
try:
_subdomain_count[self.grid].add(id(self))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to rework all this because:

  1. it's too ugly, being based on a global variable
  2. it can leak memory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@georgebisbas indeed..

if isinstance(v, Dimension):
sub_dimensions.append(v)
sdshape.append(s)
else:
name = 'i%d%s' % (self._counter, k.name)
try:
# Case ('middle', int, int)
side, thickness_left, thickness_right = v
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change it into side, ltkn, rtkn = v

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick pass

@@ -12,7 +12,8 @@

from devito.data import LEFT, CENTER, RIGHT, Decomposition
from devito.parameters import configuration
from devito.tools import EnrichedTuple, as_tuple, ctypes_to_cstr, filter_ordered
from devito.tools import EnrichedTuple, as_tuple, ctypes_to_cstr, filter_ordered, \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convention through devito is parenthesise not line break

return self._comm

@property
def myrank(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rank

return 0

@property
def mycoords(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coords

devito/mpi/distributed.py Outdated Show resolved Hide resolved
devito/types/dimension.py Outdated Show resolved Hide resolved
devito/types/dimension.py Outdated Show resolved Hide resolved
devito/types/grid.py Outdated Show resolved Hide resolved
devito/types/grid.py Outdated Show resolved Hide resolved
@@ -831,10 +831,13 @@ def test_solve(self, operate_on_empty_cache):
# created by the finite difference (u.dt, u.dx2). We would have had
# three extra references to u(t + dt), u(x - h_x) and u(x + h_x).
# But this is not the case anymore!
# TODO: This cache doesn't clear currently as retained legacy SubDomain API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds quite dangerous as it might end up preventing Function and TimeFucntion to be reachable by the gc

@EdCaunt EdCaunt requested a review from mloubout April 4, 2024 13:01
@EdCaunt EdCaunt removed the WIP Still work in progress label Apr 4, 2024
@EdCaunt EdCaunt force-pushed the funcs_on_subdims branch from 4268c54 to af06ba9 Compare April 8, 2024 15:41
@@ -204,9 +204,19 @@ def index_glb_to_loc(self, *args, rel=True):
>>> d.index_glb_to_loc((1, 6), rel=False)
(5, 6)
"""
# Need to offset the loc_abs_min, loc_abs_min, glb_min, and glb_max
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loc_abs_max

devito/data/decomposition.py Show resolved Hide resolved
devito/data/decomposition.py Show resolved Hide resolved
devito/types/grid.py Show resolved Hide resolved
@@ -22,6 +24,8 @@

GlobalLocal = namedtuple('GlobalLocal', 'glb loc')

_subdomain_count = {} # Track attachment of grids to SubDomains for dimension labelling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is good to have! Need to check!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it isn't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to fix it

self._distributor = SubDistributor(self)

# Intervals of form Interval(n, n) automatically become FiniteSet
# +1 as intervals are in terms of indices (inclusive of endpoints)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this +1 mean here?

raise ValueError("`SubDomain` %s has already been attached to a `Grid`"
% self)
else:
self._grid = grid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

Comment on lines 865 to 851
if grid:
if self.grid:
raise ValueError("`SubDomain` %s has already been attached to a `Grid`"
% self)
else:
self._grid = grid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done in less lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped a line. Could I do this with a decorator, since it's used at the start of SubDomain.__init_finalize__?

Comment on lines +245 to +267
if spec[0] == 'middle':
start = spec[1]
end = size - spec[2] - 1
elif spec[0] == 'left':
start = 0
end = spec[1] - 1
else:
start = size - spec[1]
end = size - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be parametrized, having an 'expected' argument with the expected to be asserted result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would probably make it messier and less extensible, given the combinatorial use of subdomain specifications and their orientation vs the domain decomposition

@@ -226,6 +237,24 @@ def _interp_idx(self, variables, implicit_dims=None, pos_only=()):
"""
Generate interpolation indices for the DiscreteFunctions in ``variables``.
"""
# Check if any Functions are defined on SubDomains
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this new chunk of code deserves an utility function

@@ -585,25 +585,34 @@ def _prepare_arguments(self, autotune=None, **kwargs):
discretizations = {getattr(kwargs[p.name], 'grid', None) for p in overrides}
discretizations.update({getattr(p, 'grid', None) for p in defaults})
discretizations.discard(None)
# Remove subgrids if multiple grids

# Remove subgrids if multiple Grids
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this chunk of code that pulls the grids and sanitizes them to ensure there's just one Grid could/should be now moved to a separate utility function, as I feel it could be reused in some other places as well (surely I found myself in need to extract the unique grid from a list of something... I think IIRC cluster.grid is one such case)

return self._dimension_map

@cached_property
def _d_interval(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw _domain_interval would be clearer, I wouldn't have guessed it w/o a _sd_interval

crosses = {}

for d, d_i, s_i in zip(self.dimensions, self._d_interval, self._sd_interval):
if d.is_Sub:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

devito/types/grid.py Outdated Show resolved Hide resolved
@EdCaunt EdCaunt force-pushed the funcs_on_subdims branch 2 times, most recently from c6b95d7 to 12d7169 Compare May 21, 2024 16:03
cdmapper = {}
# Need to adjust bounds if Function defined on a SubDomain
if subdomain:
adjust_interp_indices(subdomain, mapper, smapper, cdmapper)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems overlly intricated and overkill to build everything then redo a pass to replace everything with the input dimensions. What's the problem with doing self._rrdims(grid) and self._gdims(grid) and catching directly the right dimensions instead of rebuilding everything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was thinking something along those lines. I figured it would be best to get it working then refactor it

if d in a.free_symbols:
# Shift by origin d -> d - o.
indices.append(sympy.sympify(a.subs(d, d - o).xreplace(s)))
# Shift by origin and offset d -> d - o - of.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have origin just directly return the right one, or maybe local_origin ?

@@ -177,7 +177,7 @@ def parallel(item, m):
args = ["-n", "1", pyversion, "-m", "pytest", "-s", "--runxfail", "-qq", testname]
if nprocs > 1:
args.extend([":", "-n", "%d" % (nprocs - 1), pyversion, "-m", "pytest",
"-s", "--runxfail", "--tb=no", "-qq", "--no-summary", testname])
"-s", "--runxfail", "--tb=no", "-qq", testname])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a rebase leftover?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API api (symbolics, types, ...)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants