feat[next]: Optimisations for icon4py #1536

samkellerhals · 2024-04-22T10:38:05Z

Description

Changes to speed up diffusion granule execution from ICON.

Removing isinstance checks from extract_connectivity_args and convert_args.
Removing warning from field _maker.
Allowing connectivities to be passed in (which effectively lets the caller cache them).

Further (future) optimisations:

All connectivities are cached on the icon4py side, so ensure_is_on_device is now only needed in order to run certain gt4py tests. It should be removed in the future (and done in the gt4py tests themselves), as it adds unnecessary overhead here.
convert_args still takes up the bulk of the time. Can we get rid of it or improve it?
We have to pass array sizes for each dimension to gt4py stencils if no explicit domain bounds are defined. Can this be standardised?

src/gt4py/next/program_processors/runners/gtfn.py


havogt

Forwarding the review to @egparedes for improving Python patterns in performance-critical code parts.

havogt · 2024-05-06T12:49:38Z

src/gt4py/next/iterator/embedded.py

@@ -1000,8 +999,6 @@ def _shift_field_indices(
 def np_as_located_field(
    *axes: common.Dimension, origin: Optional[dict[common.Dimension, int]] = None
 ) -> Callable[[np.ndarray], common.Field]:
-    warnings.warn("`np_as_located_field()` is deprecated, use `gtx.as_field()`", DeprecationWarning)  # noqa: B028 [no-explicit-stacklevel]


Why this change? Can you undo?

I see it was intentional. Still should be undone and fixed on the icon4py side.

So basically emitting the warning was slowing things down on the icon4py side when creating the gt4py fields when calling the granule. In what way could this be fixed on the icon4py side?

The reason we currently still use np_as_located_field is because it gives you back a numpy array view of the pointer we pass from fortran. If we switch to as_field it does not use the same memory location any longer (I guess it makes a copy?) which should be fixed on the gt4py side.
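The view-versus-copy distinction raised here can be illustrated with plain NumPy. This is only a sketch of the general behaviour, assuming a NumPy array as a stand-in for the memory passed from Fortran; it makes no claim about what `as_field` does internally.

```python
import numpy as np

# Stand-in for memory owned by the caller (e.g. an array handed over from Fortran).
buffer = np.zeros(4, dtype=np.float64)

view = np.asarray(buffer)  # no copy: wraps the same memory
copy = np.array(buffer)    # explicit copy: new, independent memory

view[0] = 1.0  # write is visible through `buffer`
copy[1] = 2.0  # write is not visible through `buffer`

print(np.shares_memory(buffer, view))
print(np.shares_memory(buffer, copy))
print(buffer)
```

`np.shares_memory` is a convenient way to check whether a wrapper still points at the caller's memory.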

I propose to just wrap the warning in an if __debug__ for now, so that it can be disabled in -O mode.

Ok, I see, then it probably makes sense to use (temporarily) the common._field() function to wrap an array into a field.
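For reference, the `if __debug__` proposal from this thread could look roughly like the following. `make_field_legacy` is a hypothetical stand-in, not the actual `np_as_located_field` implementation.

```python
import warnings


def make_field_legacy(data):
    """Hypothetical stand-in for a deprecated constructor."""
    if __debug__:
        # __debug__ is False when Python runs with -O, and the compiler then
        # drops this block, so the warning costs nothing in optimised mode.
        warnings.warn(
            "`make_field_legacy()` is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
    return data


# Record warnings so the effect can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = make_field_legacy([1, 2, 3])

print(len(caught))
```

Run normally, one warning is recorded; run with `python -O`, none is.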

havogt · 2024-05-06T12:57:58Z

src/gt4py/next/program_processors/runners/gtfn.py

+        # If we don't pass them as in the case of a CachedProgram extract connectivities here.
+        if conn_args is None:
+            conn_args = extract_connectivity_args(offset_provider, device)
+


Feels like the code should be refactored so that we always pass conn_args here. It probably makes sense to do this change in the context of a coarser refactoring. @DropD might have ideas.

Once I get size args extraction moved to this stage, I will look into adding that.
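The optional `conn_args` pattern in this hunk, together with caller-side caching, might be sketched as follows. All names and data structures here are simplified assumptions for illustration (a plain dict as offset provider, `None` for non-table entries); the real gt4py signatures differ.

```python
from typing import Any, Optional

calls = {"extract": 0}  # counts how often the extraction actually runs


def extract_connectivity_args(offset_provider: dict[str, Any]) -> tuple[Any, ...]:
    """Hypothetical, simplified extraction: keep every non-None table."""
    calls["extract"] += 1
    return tuple(table for table in offset_provider.values() if table is not None)


def run_program(offset_provider: dict[str, Any], conn_args: Optional[tuple] = None):
    # Fall back to extracting the connectivities only if the caller passed none.
    if conn_args is None:
        conn_args = extract_connectivity_args(offset_provider)
    return conn_args


offset_provider = {"C2E": [[0, 1], [1, 2]], "Koff": None}

# The caller extracts once and reuses the result across calls.
cached = extract_connectivity_args(offset_provider)
for _ in range(3):
    run_program(offset_provider, conn_args=cached)

# Without cached arguments, the extraction runs again.
run_program(offset_provider)
print(calls["extract"])
```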

havogt · 2024-05-06T12:59:22Z

src/gt4py/next/program_processors/runners/gtfn.py

+    return arr, origin
+
+
+type_handlers_convert_args = {


would functools.singledispatch work?

I think it could work; however, whilst nicer, I have the feeling that it would be slower than a simple dictionary lookup.

I think it would be preferable to sacrifice readability only on the basis of hard evidence.
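One way to gather such evidence is a small `timeit` comparison. The sketch below uses hypothetical placeholder classes (`Table`, `Dim`) rather than the real gt4py types, and the numbers it prints depend on the machine and Python version, so it shows only how the measurement could be set up.

```python
import functools
import timeit


class Table:  # hypothetical stand-in for a connectivity type
    pass


class Dim:  # hypothetical stand-in for a dimension type
    pass


@functools.singledispatch
def handle_sd(arg):
    return None


@handle_sd.register(Table)
def _(arg):
    return "table"


# Exact-type dictionary dispatch, as in the PR.
HANDLERS = {Table: lambda arg: "table", Dim: lambda arg: None}


def handle_dict(arg):
    return HANDLERS[type(arg)](arg)


args = [Table(), Dim()] * 500


def run(handler):
    return [handler(a) for a in args]


t_sd = timeit.timeit(lambda: run(handle_sd), number=200)
t_dict = timeit.timeit(lambda: run(handle_dict), number=200)
print(f"singledispatch: {t_sd:.3f}s, dict lookup: {t_dict:.3f}s")
```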

havogt · 2024-05-06T13:01:18Z

But in general, could make sense to discuss this in the context of going towards frozen programs.

DropD · 2024-05-16T07:25:32Z

src/gt4py/next/program_processors/runners/gtfn.py

+def handle_connectivity(
+    conn: NeighborTableOffsetProvider, zero_tuple: tuple[int, ...], device: core_defs.DeviceType
+) -> ConnectivityArg:
+    return (_ensure_is_on_device(conn.table, device), zero_tuple)
+
+
+def handle_other_type(*args: Any, **kwargs: Any) -> None:
+    return None
+
+
+type_handlers_connectivity_args = {
+    NeighborTableOffsetProvider: handle_connectivity,
+    common.Dimension: handle_other_type,
+}
+
+


If this pattern proves to be a significant optimization over singledispatch, it still needs to be made more readable.

I propose to encode the pattern in a class. That class then needs a docstring explaining when to use it over standard approaches and why, for example, subclasses of the relevant types won't work with it unless they are added explicitly to the dict.

Sketch:

class FastDispatch:
    """
    Optimized version of functools.singledispatch, does not take into account
    inheritance or protocol membership.

    This leads to a speed-up of XXX, as documented in ADR YYY.

    Usage:
    >>> @FastDispatch.fastdispatch(Any)
    ... def extract_connectivity_args(connectivity, *args, **kwargs):
    ...     return None
    ...
    >>> @extract_connectivity_args.register(NeighborTableOffsetProvider)
    ... def extract_connectivity_args_from_nbtable(connectivity, device, *args, **kwargs):
    ...     return (_ensure_is_on_device(connectivity.table, device), zero_tuple)
    """

    _registry: dict[type, Callable]

    def __init__(self):
        self._registry = {}

    def __call__(self, dispatch_arg, *args, **kwargs):
        # Exact-type lookup: subclasses must be registered explicitly.
        return self._registry[type(dispatch_arg)](dispatch_arg, *args, **kwargs)

    def register(self, type_):
        def decorator(function):
            self._registry[type_] = function
            return function

        return decorator

    @classmethod
    def fastdispatch(cls, default_type):
        def decorator(function):
            dispatcher = cls()
            dispatcher.register(default_type)(function)
            return dispatcher

        return decorator
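The caveat that subclasses are not dispatched unless added explicitly can be shown in a few lines. `Base`, `Derived` and the handlers below are hypothetical; the sketch only contrasts `functools.singledispatch` with an exact-type dictionary lookup.

```python
import functools


class Base:  # hypothetical registered type
    pass


class Derived(Base):  # subclass that is never registered explicitly
    pass


@functools.singledispatch
def describe(arg):
    return "default"


@describe.register(Base)
def _(arg):
    return "base"


# Exact-type registry: only `Base` itself is a key.
EXACT = {Base: lambda arg: "base"}


def describe_exact(arg):
    handler = EXACT.get(type(arg))
    return handler(arg) if handler is not None else "default"


print(describe(Derived()))        # singledispatch walks the MRO and finds Base
print(describe_exact(Derived()))  # the dict lookup only sees type(arg)
```

With `singledispatch` the subclass resolves to the `Base` handler, while the exact-type lookup falls through to the default, which is the behaviour such a class would need to document.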


optimise extract_connectivity_args

f16d03b

havogt reviewed Apr 22, 2024


samkellerhals added 8 commits April 22, 2024 17:18

Remove isinstance checks from convert_args

4e2a31f

More small optimisations, and connecitivities caching

ddf16f1

Only do asserts in debug mode

0f364a9

Import CuPyArrayField only when cp is available

789b211

Run precommit

a674725

Merge branch 'main' into optimisations-for-icon4py

f5fe70d

Merge remote-tracking branch 'origin/main' into optimisations-for-icon4py

c5cbe5d

Merge remote-tracking branch 'samkellerhals/optimisations-for-icon4py' into optimisations-for-icon4py

b080621

samkellerhals mentioned this pull request May 2, 2024

[Py2F]: add profiling support & optimisations C2SM/icon4py#449

Merged

samkellerhals added 3 commits May 3, 2024 17:33

Merge remote-tracking branch 'origin' into optimisations-for-icon4py

4400ede

Add _ensure_is_on_device checks to run tests

0829ab4

Place _ensure_is_on_device in right place

e717b33

samkellerhals requested review from egparedes and havogt May 6, 2024 08:13

samkellerhals changed the title ~~Optimisations for icon4py~~ feat[next]: Optimisations for icon4py May 6, 2024

havogt requested changes May 6, 2024


Add deprecation warning

d921bca

samkellerhals requested a review from DropD May 15, 2024 13:12

DropD requested changes May 16, 2024


samkellerhals and others added 2 commits May 24, 2024 11:48

Merge branch 'main' of https://github.com/GridTools/gt4py into optimisations-for-icon4py

33bd45b

Merge branch 'main' into optimisations-for-icon4py

c68b970
