hi from xarray #88

Open
rabernat opened this issue Feb 8, 2019 · 4 comments

Comments

@rabernat

rabernat commented Feb 8, 2019

Thanks for working on this important project.

I am a developer on xarray, a Python library for working with netCDF-style data. You can think of an xarray dataset as an in-memory representation of a netCDF file. However, xarray datasets don't have to be backed by netCDF files. We are experimenting with new storage backends such as zarr.

We are exploring ways to export the metadata from xarray objects (see e.g. pydata/xarray#2659). I just discovered binary-array-ld, netcdf-ld, etc. We might be interested in finding a way to implement these standards from xarray.

I thought I would just drop by to say hi and open a conversation about how xarray and bald might interact in the future.

@marqh
Member

marqh commented Nov 1, 2019

Hi there @rabernat

we're spinning up work on this activity again, after a hiatus

we're interested in conversations about producing RDF metadata graphs in a generalised fashion. the majority of recent work has gone into implementing a first standard for interpretation directly from netCDF; this is one of a number of potential avenues we want to explore.

The work is being sponsored by the Open Geospatial Consortium (OGC) and is being managed here:
https://github.com/opengeospatial/netCDF-Classic-LD

This library is now working towards an implementation of that standard. However, it is already not limited to netCDF: a partial implementation for HDF is already set up within the library, although standardisation work on HDF has not yet progressed.

May I ask:

  1. Is it the export of metadata that is the key deliverable for you?
  2. How would you assess the requirements for which information should be exported, and what is best kept in the binary payload?
  3. Do you see RDF graph as a useful encoding and processing format for this export?

many thanks
marqh

@nikokaoja

Hi @rabernat and @marqh ,

I've been working for the last few years on converting large amounts of our legacy and current data to be more 'FAIR'. I have been using xarray and netCDF, and I was looking for a package that can convert netCDF and/or xarray to RDF to be in line with GO FAIR recommendations ((meta)data). As I am dealing with wind energy related data (atmosphere + turbine, modeling + measurements), you can imagine I am dealing with basically any possible combination of spatio-temporal data structures (time-series, single point, multi-point, grids, volumes, etc.) and with large data volumes.

Basically, I would like to contribute to the project. I have a plethora of use cases, and in parallel I am building ontologies/restricted vocabularies for the wind energy domain and re-using existing ones (e.g., CF).

Let me know how I can join the development.

Best,

Nikola

@marqh
Member

marqh commented Nov 19, 2019

Hi @niva83

it is interesting to hear of your interest, it'd be great to explore your input more

in terms of joining the development, you've already made the first step.

I would suggest exploring the integration tests and considering submitting example CDL files. If we've set up the tests right, then just adding a CDL file in a Pull Request should trigger some limited testing of the file.

If you do this, please leave data payloads unspecified to keep the CDL files small.
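
As a rough illustration of a CDL file with the data payload left unspecified, the sketch below builds header-only CDL text (dimensions, variables, and attributes, with no `data:` section). The helper name, the dataset name, and the example variables are hypothetical and are not taken from this repository's test suite.

```python
# Hypothetical sketch: the helper and the example content are illustrative
# and are not part of the bald test suite.

def header_only_cdl(name: str, dimensions: dict, variables: dict) -> str:
    """Build CDL text that declares structure and metadata but no data values."""
    lines = [f"netcdf {name} {{", "dimensions:"]
    for dim, size in dimensions.items():
        lines.append(f"    {dim} = {size} ;")
    lines.append("variables:")
    for var, (dtype, dims, attrs) in variables.items():
        dim_list = f"({', '.join(dims)})" if dims else ""
        lines.append(f"    {dtype} {var}{dim_list} ;")
        for key, value in attrs.items():
            # Only string-valued attributes are handled in this sketch.
            lines.append(f'        {var}:{key} = "{value}" ;')
    lines.append("}")
    return "\n".join(lines) + "\n"


# A small vertical-profile example with three height levels.
cdl_text = header_only_cdl(
    "profile_example",
    {"height": 3},
    {
        "height": ("float", ["height"], {"units": "m"}),
        "wind_speed": (
            "float",
            ["height"],
            {"standard_name": "wind_speed", "units": "m s-1"},
        ),
    },
)
print(cdl_text)
```

Because there is no `data:` section, the file stays a few lines long regardless of the dimension sizes it declares.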

In terms of use cases, I suggest creating specific issues that each address an individual facet. These can be for information, as some things may already be supported, or for future work; we can assess their status as we go

the issue tracker should be open to all, so please feel free to raise issues and begin to explore where this capability has progressed to

all the best
mark

@nikokaoja

@marqh sounds like a plan!

I will prepare several test cases, i.e. small CDL files with different dimensional 'challenges' (point, profile, spatial, volumetric measurements). Also, I have bumped into some issues running the library on my computer, which I will report on Git.

Best,

Nikola


3 participants