xr.open_dataset no groups info #2916

Open
ThetomekK opened this issue Apr 24, 2019 · 7 comments

@ThetomekK

Code Sample, a copy-pastable example if possible

I need to write some data to disk with the Dataset.to_netcdf() method. The data must be organized in groups, so I use the group keyword. Reading the .nc file back from disk gives empty data variables if no group is supplied. Here are some samples:

data_ds.to_netcdf(path=savepath,mode='w',format='NETCDF4',group='Audio',engine='netcdf4')

datafromdisk = xr.open_dataset(savepath)
datafromdisk

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*

datafromdisk = xr.open_dataset(savepath,group='Audio')

<xarray.Dataset>
Dimensions:  (time: 15360000)
Coordinates:
  * time     (time) datetime64[ns] 2017-05-30T07:40:00 ... 2017-05-30T07:49:59.992280938
Data variables:
    audio    (time) float32 ...
Attributes:
    unit:     Pa

Problem description

Actually, this is not a real problem as long as you remember which data groups are stored in a .nc file.
At the moment, I work around it by using netCDF4 to get information about the groups in a .nc file.

from netCDF4 import Dataset
rootgrp = Dataset(savepath)
rootgrp

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): 
    variables(dimensions): 
    groups: Audio

# or

rootgrp.groups

OrderedDict([('Audio', <class 'netCDF4._netCDF4.Group'>
              group /Audio:
                  unit: Pa
                  dimensions(sizes): time(15360000)
                  variables(dimensions): float32 audio(time), float64 time(time)
                  groups: )])
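
The workaround above can be wrapped in a small helper that lists every group path, including nested ones. This is only a sketch: `iter_group_paths` and `open_all_groups` are hypothetical names, not part of xarray or netCDF4, and the walker only assumes that each group object exposes a `.groups` mapping the way `netCDF4.Dataset` and `netCDF4.Group` do.

```python
def iter_group_paths(group, prefix=""):
    """Yield the path of every group nested below `group`.

    `group` is assumed to behave like a netCDF4.Dataset or netCDF4.Group,
    i.e. it has a `.groups` mapping of name -> child group.
    """
    for name, child in group.groups.items():
        path = f"{prefix}/{name}"
        yield path
        # Recurse so that nested groups such as /Audio/Raw are listed too.
        yield from iter_group_paths(child, prefix=path)


def open_all_groups(path):
    """Open every group of a netCDF4 file as its own xarray.Dataset.

    Hypothetical convenience wrapper; the imports are local so that the
    walker above can be used without netCDF4 or xarray installed.
    """
    from netCDF4 import Dataset
    import xarray as xr

    with Dataset(path) as rootgrp:
        paths = list(iter_group_paths(rootgrp))
    return {p: xr.open_dataset(path, group=p) for p in paths}
```

For the file in this report, `open_all_groups(savepath)` should return a dict with a single `"/Audio"` key, since `Audio` is the only group in the file.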

Expected Output

Well, I would appreciate at least something like this:

datafromdisk = xr.open_dataset(savepath)
datafromdisk

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): 
    variables(dimensions): 
    groups: Audio

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.1 | packaged by conda-forge | (default, Mar 13 2019, 13:32:59) [MSC v.1900 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.0
distributed: 1.27.0
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: None
IPython: 7.4.0
sphinx: 2.0.1

@dcherian
Contributor

I'm in favour of printing out a nice warning message when the netcdf file has groups. Listing them would be even better.

Also it looks like the docs need to be updated to mention netCDF groups: https://xarray.pydata.org/en/stable/io.html#netcdf

@shoyer
Member

shoyer commented Apr 24, 2019

It might make sense to print a warning if a group was not explicitly selected, the root group is empty and there is another non-empty group. Though I'm a little reluctant to do this since empty groups are perfectly valid, and it's a little annoying to get warnings for things that may not be programmer errors.
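
The condition described here could look roughly like the sketch below. It is not xarray code: `warn_if_only_subgroups_have_data` is a hypothetical helper, it only inspects the direct children of the root group, and it assumes group objects with `.variables` and `.groups` mappings like those of `netCDF4.Dataset`.

```python
import warnings


def warn_if_only_subgroups_have_data(root, group=None):
    """Warn if no group was selected, the root is empty and a child group is not.

    Returns True when a warning was issued, False otherwise.
    """
    if group is not None:
        # The caller picked a group explicitly, so there is nothing to flag.
        return False
    if len(root.variables) > 0:
        # The root group has data of its own; reading it is not suspicious.
        return False
    non_empty = [
        name for name, child in root.groups.items() if len(child.variables) > 0
    ]
    if not non_empty:
        # An empty file, or only empty groups: perfectly valid, stay silent.
        return False
    warnings.warn(
        "The root group has no variables, but these groups do: "
        f"{', '.join(non_empty)}. Pass group=... to open one of them.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

With the file from this issue, calling the helper on `netCDF4.Dataset(savepath)` should name `Audio` in the warning, while a file whose root group holds variables stays silent.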

@ThetomekK
Author

@shoyer
In my opinion, if xarray offers an optional group argument in the to_netcdf() method, it should, for consistency, also provide a .groups attribute when a .nc file that contains groups is read without a group selection.

@shoyer
Member

shoyer commented Apr 25, 2019

see #1092 for discussion about Dataset groups in xarray's data model

@dcherian
Contributor

dcherian commented May 3, 2019

@shoyer What about adding groups to just the repr? That way the user knows there are group names they can pass to open_dataset

<xarray.Dataset>
Dimensions:  ()
Groups: Audio
Data variables:
    *empty*

@zdgriffith
Contributor

I work with netCDF groups regularly and am interested in this issue. I agree with @dcherian that adding groups to the Dataset repr when a netCDF file with groups is loaded would be helpful (regardless of whether the root group is empty) and pretty unobtrusive.

@dcherian
Contributor

Actually, I don't think that is a good idea anymore. A Dataset represents a single group, so it's weird to print group info under a dataset repr.
