apply_ufunc with dask="parallelized" and "allowed" #315

dougiesquire · 2021-05-09T04:21:01Z

Many xskillscore methods call xarray's apply_ufunc with dask="parallelized" to support dask arrays. When the wrapped function natively supports dask arrays, dask="allowed" is the preferred option. See here for details.
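To make the difference concrete, here is a minimal sketch of the two modes. The `_mae` helper and the toy data are made up for illustration and are not xskillscore code. With dask="parallelized", xarray applies the function to each block as a numpy array; with dask="allowed", the dask arrays are passed straight through, so the function must only use operations that dask arrays support.

```python
import numpy as np
import xarray as xr


def _mae(a, b, axis):
    # Hypothetical metric: uses only operations that both numpy and dask
    # arrays implement, so it stays lazy when handed dask arrays.
    return abs(a - b).mean(axis=axis)


rng = np.random.default_rng(0)
# Chunk along "x" only: dask="parallelized" needs each core dimension
# ("time" here) to sit in a single chunk.
a = xr.DataArray(rng.random((4, 6)), dims=["x", "time"]).chunk({"x": 2})
b = xr.DataArray(rng.random((4, 6)), dims=["x", "time"]).chunk({"x": 2})

common = dict(input_core_dims=[["time"], ["time"]], kwargs={"axis": -1})

# xarray wraps _mae so that it runs block by block on numpy arrays.
parallelized = xr.apply_ufunc(
    _mae, a, b, dask="parallelized", output_dtypes=[float], **common
)

# xarray hands the dask arrays to _mae unchanged.
allowed = xr.apply_ufunc(_mae, a, b, dask="allowed", **common)
```

Both results are still lazy here and should agree once computed. A function that converts its inputs to numpy internally would instead trigger a compute under dask="allowed".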

This issue lists all current methods within xskillscore that use apply_ufunc and tries to summarise, for each method, how much work is involved in enabling dask="allowed".

xskillscore.contingency

gerrity_score : already dask="allowed"

xskillscore.deterministic

xskillscore.probabilistic

crps_gaussian : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
crps_quadrature : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
crps_ensemble : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
brier_score : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
threshold_brier_score : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
rank_histogram : need to wrap bottleneck.nanrankdata with dask.map_blocks or equivalent, although I don't know if this is any better than using dask="parallelized"
reliability : already dask="allowed"

xskillscore.resampling

resample_iterations_idx : use dask's moveaxis when the input is a dask array. This would be easily handled with duck array ops - see below.

dougiesquire · 2021-05-09T04:22:34Z

As a general note, I'd suggest that we implement in xskillscore something like xarray's duck_array_ops module. This would make some of the suggestions above very easy to implement/read and replace, for example, the _get_numpy_funcs function in xskillscore.deterministic. I'll try to open a PR for this when I next find some time.
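As a rough idea of what I mean, a minimal sketch of such a module. The names `is_dask_array` and `_dask_or_numpy` are made up for illustration; this is not xarray's actual implementation, only the same dispatch idea: pick the dask version of a function when any positional argument is a dask array, otherwise fall back to numpy.

```python
import numpy as np

try:
    import dask.array as dsa
except ImportError:  # dask is treated as an optional dependency here
    dsa = None


def is_dask_array(x):
    # True only if dask is installed and x is a dask array.
    return dsa is not None and isinstance(x, dsa.Array)


def _dask_or_numpy(name):
    # Build a function that dispatches to dask.array.<name> when any
    # positional argument is a dask array, and to numpy.<name> otherwise.
    def func(*args, **kwargs):
        module = dsa if any(is_dask_array(a) for a in args) else np
        return getattr(module, name)(*args, **kwargs)

    func.__name__ = name
    return func


moveaxis = _dask_or_numpy("moveaxis")
nanmean = _dask_or_numpy("nanmean")
```

Calling code can then write `moveaxis(x, 0, -1)` without checking the array type itself, which is the sort of thing resample_iterations_idx needs.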

dougiesquire · 2021-05-09T04:36:28Z

I should also point out that even for those functions that ostensibly can just be switched to dask="allowed", I think @ahuang11 encountered some issues when the forecasts, observations and weights are not all dask or all numpy arrays. Details here. We'll need to resolve these issues with the first PR.

dougiesquire mentioned this issue May 9, 2021

Replace parallelized with allowed #221

Closed

aaronspring mentioned this issue Dec 22, 2021

Using dask="allowed" is slightly faster #207

Open

aaronspring added refactor component: dask labels Oct 29, 2023
