xr.open_dataset user experience on S1-SLC #4

Closed
corrado9999 opened this issue Apr 11, 2021 · 5 comments
Labels
design Design choices

Comments

@corrado9999
Collaborator

Issue to analyse the possible behaviour when calling xr.open_dataset on Sentinel 1 SLC data.

@corrado9999
Collaborator Author

For acquisition modes with multiple bursts (all but stripmap), bursts "live" in separate spaces, because they differ at least in the range or in the azimuth dimension. Thus, we cannot put them in a single dataset. Once Xarray provides a data structure that holds multiple datasets we will be able to exploit it but, for the moment, we will expose each burst as one group.

We are left with the problem of how to tell the user which groups are available.

  • Option 1 (currently implemented): return a dataset with just basic metadata and a groups attribute listing which groups are available. To shorten the list, we can limit it to the first level (e.g. only subswaths).
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes: (12/13)
    ...                         ...
    groups:                     ['orbit', 'IW1', 'IW2', 'IW3']
  • Option 2a: expose a specific function to advertise which groups are available:
>>> xarray_sentinel.list_groups("S1B_IW_SLC__1SDV_20210401T052622_20210401T052650_026269_032297_EFA4.SAFE")
['orbit', 'IW1', 'IW2', 'IW3']
>>> xarray_sentinel.list_groups("S1B_IW_SLC__1SDV_20210401T052622_20210401T052650_026269_032297_EFA4.SAFE", "IW1")
['IW1/1', 'IW1/2', 'IW1/3', 'IW1/4', 'IW1/5', 'IW1/6', 'IW1/7', 'IW1/8', 'IW1/9']
  • Option 2b: similarly to what rioxarray.open_dataset does, expose a specific function to load all the available groups as datasets (with options to filter in/out?). Can return either a dict or a custom class.

  • Option 3: raise an error, listing the available groups. Here, I do not think it would be correct to shorten the list, because a user would not expect to get a further error when following the advice given by the first exception.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-2-bfe1a817db51> in <module>
----> 1 xr.open_dataset("tests/data/S1B_IW_SLC__1SDV_20210401T052622_20210401T052650_026269_032297_EFA4.SAFE")

~\miniconda3\envs\xr-sentinel\lib\site-packages\xarray\backends\api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    507
    508     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 509     backend_ds = backend.open_dataset(
    510         filename_or_obj,
    511         drop_variables=drop_variables,

~\xarray-sentinel\xarray_sentinel\sentinel1.py in open_dataset(self, filename_or_obj, drop_variables, group)
    155     ) -> xr.Dataset:
    156         if group is None:
--> 157             raise NotImplementedError("Cannot access to root dataset, please select one of the groups: orbit, IW1/1, IW1/2, ..., IW2/1, IW2/2, ..., IW3/1, IW3/2, ...")
    158         elif group == "gcp":
    159             ds = open_gcp_dataset(filename_or_obj)

NotImplementedError: Cannot access to root dataset, please select one of the groups: orbit, IW1/1, IW1/2, ..., IW2/1, IW2/2, ..., IW3/1, IW3/2, ...
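
To make the difference between the shortened listing (Options 1 and 2a) and the full listing (Option 3) concrete, here is a minimal sketch. It assumes the group paths are already known as a flat list of strings; `ALL_GROUPS`, `list_child_groups` and `root_error_message` are hypothetical names used only for illustration and are not part of xarray_sentinel.

```python
from typing import List, Optional

# Hypothetical flat listing of group paths (an assumption for this sketch);
# a real product would have many more bursts per subswath.
ALL_GROUPS = ["orbit", "IW1", "IW1/1", "IW1/2", "IW2", "IW2/1", "IW3", "IW3/1"]


def list_child_groups(all_groups: List[str], parent: Optional[str] = None) -> List[str]:
    """Return the direct children of `parent`, or the top-level groups if it is None."""
    if parent is None:
        return [g for g in all_groups if "/" not in g]
    prefix = parent.rstrip("/") + "/"
    # Keep only paths exactly one level below the parent.
    return [
        g for g in all_groups
        if g.startswith(prefix) and "/" not in g[len(prefix):]
    ]


def root_error_message(all_groups: List[str]) -> str:
    """Build an Option 3 style message listing every group that has no children."""
    leaves = [
        g for g in all_groups
        if not any(other.startswith(g + "/") for other in all_groups)
    ]
    return "Please select one of the groups: " + ", ".join(leaves)


print(list_child_groups(ALL_GROUPS))         # first level only
print(list_child_groups(ALL_GROUPS, "IW1"))  # bursts of one subswath
print(root_error_message(ALL_GROUPS))        # unshortened list for the error
```

The first call corresponds to the shortened list, which requires a second lookup before a burst can be opened; the error message lists only groups that can be opened directly, so following it never leads to another error.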

@alexamici
Member

alexamici commented Apr 13, 2021

I just realised there is a possible Option 4: expose the groups through a dataset accessor.

Example usage:

>>> slc = xr.open_dataset(".../manifest.safe", engine="sentinel1")
>>> list(slc.sentinel.swaths)
["IW1", "IW2", "IW3"]
>>> iw1 = slc.sentinel.swaths["IW1"]
>>> iw1
<xarray.Dataset>
...
>>> list(iw1.sentinel.bursts)
["N430_W0120_VV", ...]
>>> iw1.sentinel.bursts["N430_W0120_VV"]
<xarray.Dataset>
...
>>>

Or for a flatter experience:

>>> slc = xr.open_dataset(".../manifest.safe", engine="sentinel1")
>>> list(slc.sentinel.dataset)
["IW1/orbit", "IW1/gcp", "IW1/N430_W0120_VV", ...]
>>> slc.sentinel.dataset["IW1/N430_W0120_VV"]
<xarray.Dataset>
...
>>>

@corrado9999
Collaborator Author

Option 4 looks nice, but AFAIK it has one main drawback: the accessor will be present on every dataset, independently of whether it has been opened with xarray_sentinel or not, which looks pretty odd.
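
A minimal sketch of that drawback, using xarray's `register_dataset_accessor`. The accessor name `sentinel_demo` and the class below are hypothetical, chosen only for illustration and not taken from xarray_sentinel:

```python
import xarray as xr


# Hypothetical accessor for illustration; registration is global to xarray.
@xr.register_dataset_accessor("sentinel_demo")
class SentinelDemoAccessor:
    def __init__(self, ds: xr.Dataset) -> None:
        self._ds = ds

    @property
    def swaths(self) -> list:
        # Read the group names from the attributes, if there are any.
        return list(self._ds.attrs.get("groups", []))


# A dataset that has nothing to do with Sentinel-1 still gets the accessor.
plain = xr.Dataset()
print(hasattr(plain, "sentinel_demo"))  # True
print(plain.sentinel_demo.swaths)       # []
```

Because registration happens once at import time, the accessor cannot tell how a dataset was opened; it can only inspect the dataset's contents and, for example, return an empty result or raise.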

@alexamici
Member

alexamici commented Apr 13, 2021

W00t! You are right. That makes option 4 quite ugly, indeed.

@alexamici
Member

Current agreement with @aurghs is to map the data onto the group option of open_dataset and keep it as close as possible to the original structure. The structure is described in https://github.com/bopen/xarray-sentinel/blob/main/docs/DATATREE.md

Once xarray solves pydata/xarray#4118 we may reassess (tracked by #60).

I would propose to close this issue.
