Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add File Info Extension v1.0.0 -> v2.0.0 migration #472

Closed
duckontheweb opened this issue Jun 23, 2021 · 13 comments
Closed

Add File Info Extension v1.0.0 -> v2.0.0 migration #472

duckontheweb opened this issue Jun 23, 2021 · 13 comments
Labels
bug Things which are broken extension Implement a STAC extension in PySTAC question
Milestone

Comments

@duckontheweb
Copy link
Contributor

Support for v2.0.0 of the File Info Extension was added in #442, but the migrate hook was not updated to include a migration from v1.0.0 to the new version. That hook should be updated to include the migration.

Based on my understanding of the changes between versions, the following fields should be migrated to Raster Bands Object fields (dropping the file: prefix as well):

  • file:bits_per_sample
  • file:data_type
  • file:nodata
  • file:unit

@emmanuelmathot @matthewhanson @m-mohr Feel free to correct this if I got anything wrong.

@duckontheweb duckontheweb added the bug Things which are broken label Jun 23, 2021
@duckontheweb duckontheweb added this to the 1.0.0 milestone Jun 23, 2021
@emmanuelmathot
Copy link
Contributor

I am not sure we can easily transfer those information that were describing the file to one or more raster bands. There are several cases that should be supported and thus several implementation rules must be defined. A basic rule could be to apply the file fields to all raster bands corresponding fields. If the assets of the original item does not describe any raster band, is a default raster band created?

@duckontheweb
Copy link
Contributor Author

A basic rule could be to apply the file fields to all raster bands corresponding fields.

I was thinking that if raster:bands is defined we would check if the field is already defined in any of the Raster Band Objects. If it is, we would take no action since we would not want to overwrite a value defined by the Raster Extension. If it is not defined, then we would populate that field with the file:* value.

If the assets of the original item does not describe any raster band, is a default raster band created?

Yeah, this is a tricky. I was thinking that we should create a default Raster Band Object so that we don't lose the information. However, that could be misleading if an asset has more than 1 band but we create only a single Raster Band Object. It also doesn't seem possible to discover the number of Raster Band Objects that should be created without opening the file, which is not something we want to do here. What is your opinion on this @emmanuelmathot?

@emmanuelmathot
Copy link
Contributor

That is really tricky. We should avoid add raster bands to non raster assets. I would create a default raster band so that we do not loose the information . But I would add some kind of filters to be sure that the asset is a potential raster (e.g. mime-type, file extension)

@duckontheweb
Copy link
Contributor Author

We should avoid add raster bands to non raster assets. I would create a default raster band so that we do not loose the information . But I would add some kind of filters to be sure that the asset is a potential raster (e.g. mime-type, file extension)

Are there legitimate cases in File Info Extension v1.0.0 where someone would have used these fields for something besides a raster? It seems like these fields were always implicitly intended for raster files and that we just made that more explicit by splitting the extensions (I may not be aware of the full evolution of these, though, so this could be off-base). If this is true, we may just want to give the publisher the benefit of the doubt and always create the Raster Band rather than trying to filter on all of the possible MIME types and extensions. However, if there are other legitimate uses this may be our only viable option.

cc @m-mohr

@m-mohr
Copy link
Contributor

m-mohr commented Jun 28, 2021

I guess I could construct use cases, but I doubt someone has actually implemented file for anything other than raster.

I'm not sure though how to convert to raster:bands. Is it always the case that a global e.g. file:nodata, can be converted to a "single band" raster:bands containing file:nodata? We are assuming it is one band, but that may not be true.

@duckontheweb duckontheweb modified the milestones: 1.0.0, 1.1.0 Jul 8, 2021
@duckontheweb duckontheweb modified the milestones: 1.1.0, 1.2.0 Jul 30, 2021
@duckontheweb
Copy link
Contributor Author

Given some of the uncertainties around handling the case where no Raster Band exist it might be best not to attempt any migration in that case.

In that case, the logic would be to migrate file:bits_per_sample, file:data_type, file:nodata, file:unit as follows:

  1. If no Raster Bands already exist on the object, do nothing
  2. If Raster Bands do exist, then for each of these properties:
    i) If any of the Raster Band objects contain the field already, do nothing
    ii) If none of the Raster Band object contain the field, add it to each existing Raster Band

@m-mohr @emmanuelmathot How do you feel about this approach?

@m-mohr
Copy link
Contributor

m-mohr commented Aug 17, 2021

Maybe additionally: If no Raster Bands already exist, create as many as eo:bands exist and add the field to each of them.

But maybe I may not even try to migrate it in JS and just handle leave implementation on v1.0.0.

@duckontheweb duckontheweb modified the milestones: 1.2.0, 1.3.0, 1.4.0 Nov 8, 2021
@duckontheweb duckontheweb added the extension Implement a STAC extension in PySTAC label Jan 24, 2022
@duckontheweb duckontheweb removed this from the 1.4.0 milestone Feb 16, 2022
@duckontheweb duckontheweb added this to the 1.7 milestone Jul 22, 2022
@gadomski gadomski modified the milestones: 1.7, 1.8 Jan 31, 2023
@jsignell
Copy link
Member

jsignell commented Jun 1, 2023

I am wondering if this is still necessary given its age? I would think that most providers have probably moved to the newer version over the last 2 years.

@gadomski
Copy link
Member

gadomski commented Jun 1, 2023

Let's keep the issue open but not roadmap it. IMO dropping support for an old extension version should be avoided if possible, at least after v1 of that extension.

I'm soft-banking on our (as of yet undeveloped) solution for #448 being so good that it makes issues like this trivial 🤞🏼.

@m-mohr
Copy link
Contributor

m-mohr commented Jun 1, 2023

FYI, some implementations still use v1 as some removed v1 functionality can't easily be mapped to other extensions (e.g. global no data value). But that also means the migration path is not trivial.

@gadomski gadomski removed this from the 1.8 milestone Jun 5, 2023
@gadomski gadomski added this to the 1.9 milestone Sep 28, 2023
@gadomski gadomski self-assigned this Sep 28, 2023
@gadomski gadomski moved this from Todo to In Progress in STAC Sprint 2023 Sep 28, 2023
@m-mohr
Copy link
Contributor

m-mohr commented Sep 28, 2023

With the bands PR this can now indeed be solved :-)

@jsignell
Copy link
Member

#1265 makes it so if you load an object with any of those removed fields you get a warning. Not sure if that is enough to close this ticket, but maybe it is enough until raster bands is fully baked.

@gadomski
Copy link
Member

I think it's good enough. If folks start complaining about the migration w/ warnings, we can open a ticket to actually move things over once raster bands is baked. But that's fine as a new issue if it arises, IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Things which are broken extension Implement a STAC extension in PySTAC question
Projects
None yet
Development

No branches or pull requests

5 participants