Hack Wrapup #7

Open
1 of 7 tasks
jbusecke opened this issue Jun 3, 2024 · 2 comments

jbusecke commented Jun 3, 2024

Great day hacking with @sashakames on using virtualizarr to produce ref files on ESGF.

We successfully produced a few ref files, exposed them via HTTP, and were able to access and compute on them in some environments.

Bugs:

  • The metadata is lost in the combination step (@jbusecke will provide fix). (fixed in 3baa6a9)

Optional:

  • @jbusecke parallelize virtual dataset generation
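
One possible shape for the parallelization, as a minimal sketch only: fan the per-file work out over a thread pool and collect the results in input order. The names `generate_virtual_datasets`, `open_one`, and `fake_open` are hypothetical and do not come from the script in this repo. `open_one` stands in for whatever function builds a single virtual dataset from one file URL, and `fake_open` is a stand-in used only so the snippet runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_virtual_datasets(urls, open_one, max_workers=4):
    """Apply `open_one` to every URL concurrently, preserving input order.

    `open_one` is a hypothetical callable that turns one file URL into
    one virtual dataset; it is an assumption, not part of the thread.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `urls`,
        # regardless of which task finishes first.
        return list(pool.map(open_one, urls))


def fake_open(url):
    """Stand-in opener so the sketch is runnable without any data files."""
    return {"url": url}


results = generate_virtual_datasets(["a.nc", "b.nc", "c.nc"], fake_open)
```

Threads suit this kind of work when each task mostly waits on remote reads; a process pool or a dask cluster would be alternatives if the per-file step turns out to be CPU-bound.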

Plan to wrap up proof of concept:

  • Produce a few more demo references
  • Write instructions on how to access them
  • Send out to users to test (@jbusecke was not able to compute on the data via Google Cloud)
  • Write up for report.

We should also wait until zarr-developers/VirtualiZarr#126 is fully tested and merged before we produce a lot of references. I believe there is currently a bug in the PR, but it is easy enough to circumvent.

jbusecke commented Jun 3, 2024

@sashakames I just cleaned up the code a little bit.
Let's use https://github.com/jbusecke/esgf-virtual-zarr-data-access/blob/main/virtual-zarr-script.py and https://github.com/jbusecke/esgf-virtual-zarr-data-access/blob/main/requirements.txt to produce the next files.

I will add dependencies and code there to parallelize virtual dataset generation and fix the bugs above.

TomNicholas commented Jun 3, 2024

The metadata is lost in the combination step

Do you mean the metadata stored in the xarray .attrs or something else?

Yes, that is right. In my initial attempt, the metadata was lost during the concat step because xarray's default behavior there is to drop all of it. I have now set this option to only drop conflicts.
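
For reference, a minimal sketch of what "only drop conflicts" can look like in xarray, using the `combine_attrs` option. The thread does not say which combine function the script calls, so `xr.concat` is an assumption here, and the toy datasets, the `make_ds` helper, and the attribute names are made up for illustration.

```python
import numpy as np
import xarray as xr


def make_ds(time_start, attrs):
    """Build a tiny in-memory dataset standing in for one input file."""
    time = np.arange(time_start, time_start + 3)
    return xr.Dataset(
        {"tas": ("time", time.astype("float64"))},
        coords={"time": time},
        attrs=attrs,
    )


# Two "files" that agree on one attribute and disagree on another.
ds_a = make_ds(0, {"source_id": "demo-model", "tracking_id": "file-a"})
ds_b = make_ds(3, {"source_id": "demo-model", "tracking_id": "file-b"})

# combine_attrs="drop" discards every attribute on the result.
dropped = xr.concat([ds_a, ds_b], dim="time", combine_attrs="drop")

# combine_attrs="drop_conflicts" keeps attributes whose values agree
# across all inputs and drops only the ones that differ.
combined = xr.concat([ds_a, ds_b], dim="time", combine_attrs="drop_conflicts")
```

With these inputs, `combined.attrs` keeps `source_id` and loses `tracking_id`, while `dropped.attrs` is empty.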
