Add regridding benchmark #1557

Merged
phofl merged 2 commits into main from hendrik/regridding-benchmark
Oct 10, 2024

hendrikmakait
Member

Closes #1556

@mrocklin
Member

mrocklin commented Oct 9, 2024

Do we have enough context here to add this to the benchmark post? If you give me bullet points and an image (if one makes sense) I'm happy to write up words.

@aulemahal aulemahal left a comment

Hi! I'm one of the de facto maintainers of xESMF and I think this benchmark is a good start!

To make the test more demanding and exercise areas where xESMF needs more improvement, I would suggest using a very large grid as either the input or the output, with some chunking across the spatial dimensions.

IIUC, these benchmarks are more geared towards Dask? Another bottleneck of xESMF is the generation of the weights (the Regridder initialization) with very large grids and more complex methods ("conservative"). But that part is neither parallelized nor lazy, so benchmarking it might be out of scope here.
We do have some code to generate the weights in parallel, but I would say it is still experimental and of limited scope.

@hendrikmakait
Member Author

@aulemahal: Thanks for the input. I'll add a follow-up issue to look into some of the suggestions for increasing the complexity of this workload.

IIUC, these benchmarks are more geared towards Dask?

They are, but they're also aimed at reflecting real workloads. Would more complex methods also result in more complex computations or just in more complex weight generation?

We do have some code to generate the weights in parallel, but I would say it is still experimental and of limited scope.

Please let us know if we can help with anything from a Dask perspective.

@mrocklin:

Do we have enough context here to add this to the benchmark post? If you give me bullet points and an image (if one makes sense) I'm happy to write up words.

I'll whip something up. At first glance, this benchmark seems to do alright; it's mostly an embarrassingly parallel computation. Performance and the number of tasks could probably look better, but that's already a lot better than some of the other benchmarks.

@aulemahal

Would more complex methods also result in more complex computations or just in more complex weight generation?

Mostly more complex weight generation, which is totally on the ESMF side, so partly in C/Fortran I think. Maybe two very different grids (curvilinear ones for example) and more complex methods would make for weights with more connected nodes, but I don't think this would affect the computation so much.
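
To make "more connected nodes" concrete: applying regridding weights amounts to a sparse matrix-vector product, so the work per target cell grows with the number of source cells it draws from. The following is a minimal pure-Python sketch under that assumption. The weights are made up (not produced by ESMF) and `apply_weights` is a hypothetical helper, so it says nothing about whether the difference matters at benchmark scale.

```python
def apply_weights(rows, values):
    """Apply sparse weights stored as one {source_index: weight} dict per target cell."""
    return [sum(w * values[j] for j, w in row.items()) for row in rows]


source = [1.0, 2.0, 3.0, 4.0]

# Made-up weights: each target cell draws from two source cells...
sparse_rows = [{0: 0.5, 1: 0.5}, {2: 0.5, 3: 0.5}]
# ...versus a more "connected" set where each target cell draws from all four.
dense_rows = [{0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}, {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}]

for rows in (sparse_rows, dense_rows):
    # One multiply-add per nonzero weight.
    multiply_adds = sum(len(row) for row in rows)
    print(multiply_adds, apply_weights(rows, source))
```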

@slevang

slevang commented Oct 9, 2024

xarray-regrid is a much less established tool than xesmf, but could alternatively (or in addition) be used as a more "pure" test of the dask workload. It assumes rectilinear grids and therefore separates the operations along each axis, which makes weight generation near instantaneous.

The resulting dask workload is very similar to xesmf, just an einsum(data, weights), at least for the conservative method. I wrote up a little notebook comparing the two tools here. Both libraries now use sparse weights.

The case in which we have chunking along the dimensions to regrid would also be interesting to add to your benchmark, but I don't know of any publicly available equivalents to the GCP ERA5 ARCO stores with that sort of chunking.
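
As a rough illustration of the separable approach described above (not the actual code of xarray-regrid or xesmf), the sketch below builds 1-D conservative weights per axis from cell edges and applies them with a single `einsum`. The helper `overlap_weights`, the grid sizes, and the planar treatment of cells (no spherical area weighting) are all assumptions made for brevity, and dense arrays stand in for the sparse weights the libraries use.

```python
import numpy as np


def overlap_weights(src_edges, dst_edges):
    """1-D conservative weights: share of each target cell covered by each source cell."""
    src_lo, src_hi = src_edges[:-1], src_edges[1:]
    dst_lo, dst_hi = dst_edges[:-1], dst_edges[1:]
    # Pairwise overlap length between target cells (rows) and source cells (columns).
    overlap = np.minimum(dst_hi[:, None], src_hi[None, :]) - np.maximum(
        dst_lo[:, None], src_lo[None, :]
    )
    overlap = np.clip(overlap, 0.0, None)
    # Normalize so the weights of each target cell sum to 1.
    return overlap / overlap.sum(axis=1, keepdims=True)


# Hypothetical rectilinear grids, given as cell edges in degrees.
lat_src = np.linspace(-90, 90, 9)   # 8 source cells
lon_src = np.linspace(0, 360, 17)   # 16 source cells
lat_dst = np.linspace(-90, 90, 5)   # 4 target cells
lon_dst = np.linspace(0, 360, 9)    # 8 target cells

w_lat = overlap_weights(lat_src, lat_dst)  # shape (4, 8)
w_lon = overlap_weights(lon_src, lon_dst)  # shape (8, 16)

rng = np.random.default_rng(0)
data = rng.random((3, 8, 16))  # (time, lat, lon)

# Contract the source lat/lon axes against the per-axis weights.
regridded = np.einsum("tyx,Yy,Xx->tYX", data, w_lat, w_lon, optimize=True)
print(regridded.shape)
```

When each chunk spans the full `lat` and `lon` axes, this contraction can run independently per chunk. Chunking along the regridded dimensions would instead require combining partial results across chunks.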

@hendrikmakait
Copy link
Member Author

@slevang, thanks for the additional input! Would you be interested in contributing a benchmark implemented with xarray-regrid?

Contributor

@phofl phofl left a comment

small comment, otherwise lgtm

AB_environments/AB_sample.conda.yaml (review thread resolved)
@phofl phofl merged commit ae4f68b into main Oct 10, 2024
5 checks passed
@phofl phofl deleted the hendrik/regridding-benchmark branch October 10, 2024 16:57
@phofl
Contributor

phofl commented Oct 10, 2024

thx

Development

Successfully merging this pull request may close these issues.

Regridding
5 participants