Parallel model runs: separate forcing files for each model #35

Open
2 tasks
BSchilperoort opened this issue May 8, 2024 · 7 comments

Comments

@BSchilperoort
Collaborator

BSchilperoort commented May 8, 2024

In eWaterCycle/leakybucket-bmi#3 we debugged an issue related to resource locking when multiple models try to access a shared file.
In this specific case the model code is easy to change (as you are the creator of the model, @Daafip), but for a general solution it would be best that models do not share the same resources on disk.

Therefore I propose that the default behavior of ewatercycle-da is to have separate forcing objects for separate models (at least for parallel runs).

Still todo:

  • If a single forcing object is passed, duplicate the forcing so that each ensemble member has its own forcing object (see the sketch below).
  • HBVForcing will have to change (long-term todo) to accommodate this
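
A rough sketch of what that duplication could look like (the helper name, n_ensembles, and ensemble_forcings_dir are placeholders, and it assumes a forcing is fully self-contained in forcing.directory):

import shutil
from pathlib import Path

def duplicate_forcing_directory(forcing, ensemble_forcings_dir: Path, n_ensembles: int) -> list[Path]:
    # Copy the forcing's directory once per ensemble member, so no two
    # models ever open the same files on disk.
    copies = []
    for i in range(n_ensembles):
        target = ensemble_forcings_dir / f"ensemble_{i}"
        shutil.copytree(src=forcing.directory, dst=target)
        copies.append(target)
    return copies

Each copied directory would then back its own forcing object for one ensemble member.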
@Daafip
Owner

Daafip commented May 8, 2024

to not share the same resources on disk

For now I've left this choice up to the user, but it is definitely something to document and make clear.
I'd like to keep providing a single forcing object as an option for initial users who, say, want to use it for just a quick sensitivity analysis.
- A workaround could be to duplicate the forcing object in case only one is provided?

at least for parallel runs

Currently I've implemented parallel runs only when running models in Docker; for local models it seems to introduce too much overhead. But I still encountered the locking issue.

@BSchilperoort
Collaborator Author

For now i've left this choice up to the user but definitely something to document and make clear.

I think having it as an option is OK, while using separate forcing files by default.

A workaround could be to duplicate the forcing object in case only one is provided?

I would not call this a workaround, but rather the desired functionality.

Currently I've implemented parallel runs only when running models in Docker

Yes, I would keep it that way. Local models should only be used for developing/testing whether the model works as an ewatercycle plugin.

@Daafip
Owner

Daafip commented May 8, 2024

Should be an easy enough fix; it will also ensure this works nicely with DA.generate_forcing, which allows for generating forcing in parallel.

I'll discuss today how much more development I'll be doing towards the end of my thesis.

Edit: not much

@Daafip
Owner

Daafip commented May 10, 2024

but for a general solution it would be best that models do not share the same resources on disk.

So the issue was with using the same forcing object. But when generating the forcing I also had issues: if I used the same source .txt file containing the time series, it would also fail:

...
  File ".../ewatercycle_HBV/model.py", line 77, in _make_cfg_file
    self.forcing.from_camels_txt()
  File "..../ewatercycle_HBV/forcing.py", line 204, in from_camels_txt
    ds = xr.Dataset(data_vars=df,
  File ".../xarray/core/dataset.py", line 694, in __init__
    variables, coord_names, dims, indexes, _ = merge_data_and_coords(
  File ".../xarray/core/dataset.py", line 423, in merge_data_and_coords
    return merge_core(
...

So even shared source files can cause issues, it seems...

Edit: I have now tried adjusting this so there is one .txt file per forcing object. This works fine (a minimal sketch follows below).
From testing on Windows Subsystem for Linux (not ideal), I still have a memory issue causing instability.
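
For reference, a minimal sketch of the "one .txt file per forcing object" approach (the helper name and paths are hypothetical; it only assumes the time series lives in a single source .txt file):

import shutil
from pathlib import Path

def copy_source_txt_per_forcing(source_txt: Path, forcing_dirs: list[Path]) -> list[Path]:
    # Give every forcing its own copy of the source time-series file,
    # so parallel forcing generation never reads the same .txt file.
    copies = []
    for forcing_dir in forcing_dirs:
        forcing_dir.mkdir(parents=True, exist_ok=True)
        target = forcing_dir / source_txt.name
        shutil.copy2(source_txt, target)
        copies.append(target)
    return copies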

Edit edit: It seems to be a memory issue on WSL; when running (slower) on SURF, this doesn't seem to be an issue. I/O is just hit first, as it loads the data into memory.

@Daafip
Owner

Daafip commented May 15, 2024

After more testing, it is a memory issue with WSL. Running bare Ubuntu on my machine only used a fraction of the memory WSL consumed (which makes sense).

@Daafip
Owner

Daafip commented May 16, 2024

it is a memory issue with WSL

For this reason I'll leave it as is for the moment, but it is still something to look at. I have therefore moved the todo points up to your initial comment so they show up nicely on GitHub (sorry for editing your comment, Bart).

@BSchilperoort
Collaborator Author

it is a memory issue with WSL

Glad that you figured it out!

For this reason I'll leave it as is for the moment, but it is still something to look at. I have therefore moved the todo points up to your initial comment so they show up nicely on GitHub (sorry for editing your comment, Bart).

No worries haha.

HBVForcing will have to change (long-term todo) to accommodate this

We can also add a copy method to the base class (ewatercycle's DefaultForcing). Forcings should be self-contained in a directory and should be movable (only relative paths), so you would just have to copy the directory to a new location.

for i in range(n_ensembles):
    target_dir = forcing.copy(ensemble_forcings_dir / f"ensemble_{i}")

which internally would call:

import shutil
from pathlib import Path

class DefaultForcing:
    ...
    def copy(self, target: Path | str) -> Path:
        # Copy the self-contained forcing directory to the target location.
        return Path(shutil.copytree(src=self.directory, dst=target))
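
As a follow-up to the snippet above (reusing forcing, n_ensembles and ensemble_forcings_dir from it), each copied directory could be turned back into its own forcing object. This assumes the forcing class exposes a load classmethod that reads the saved forcing metadata from a directory, as recent ewatercycle versions do; if not, the forcing would have to be reconstructed manually:

ensemble_forcings = []
for i in range(n_ensembles):
    # Copy the self-contained forcing directory, then load it as a new,
    # independent forcing object for ensemble member i.
    target_dir = forcing.copy(ensemble_forcings_dir / f"ensemble_{i}")
    ensemble_forcings.append(type(forcing).load(target_dir))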
