Duplicate Sentinel-2 L2A items removed #275
TomAugspurger
announced in
Announcements
Hi all,
While doing some quality checks on our sentinel-2-l2a collection, we noticed some duplicate items. A previous version of our processing pipeline failed to handle how sen2cor embeds a processing timestamp in the asset paths when converting L1C data to L2A (this is the second datetime in our sentinel-2-l2a item IDs).
This resulted in multiple sentinel-2-l2a STAC items for a single L1C scene, and therefore duplicate results for queries that should have returned a single item for a given area of interest and datetime.
We've fixed that issue in our processing pipeline and have removed the duplicates from the STAC database. We'll be deleting the duplicate assets in the next week or so.
I've uploaded a parquet file with the duplicates and originals (both IDs and prefixes in Blob Storage) to https://ai4edatasetspublicassets.blob.core.windows.net/assets/sentinel-2-l2a-duplicates.parquet.
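If you cached search results before the cleanup, a rough way to spot affected scenes locally is to group item IDs by everything except the trailing processing timestamp. The snippet below is a minimal sketch, not the pipeline's actual fix. It assumes the ID layout `<platform>_MSIL2A_<sensing datetime>_<orbit>_<tile>_<processing datetime>`, and the helper names and example IDs are made up for illustration. It only reports groups and does not decide which item in a group is the original; the parquet file above is the authoritative list for that.

```python
import re
from collections import defaultdict

# Assumed ID layout (an assumption, not confirmed by this post):
#   <platform>_MSIL2A_<sensing datetime>_<orbit>_<tile>_<processing datetime>
_TIMESTAMP = re.compile(r"\d{8}T\d{6}")


def scene_key(item_id: str) -> str:
    """Return the item ID with the trailing processing timestamp removed."""
    prefix, sep, suffix = item_id.rpartition("_")
    if not sep or not _TIMESTAMP.fullmatch(suffix):
        raise ValueError(f"unexpected item ID format: {item_id!r}")
    return prefix


def find_duplicate_groups(item_ids):
    """Group IDs by scene key and keep only groups with more than one ID."""
    groups = defaultdict(list)
    for item_id in item_ids:
        groups[scene_key(item_id)].append(item_id)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Hypothetical IDs, for illustration only.
example_ids = [
    "S2A_MSIL2A_20200101T100411_R122_T33UUP_20200102T010203",
    "S2A_MSIL2A_20200101T100411_R122_T33UUP_20200915T040506",
    "S2B_MSIL2A_20200103T101319_R022_T33UUP_20200104T070809",
]
duplicates = find_duplicate_groups(example_ids)
```

Here `duplicates` maps each scene key that appears more than once to its sorted list of item IDs, so the first two example IDs end up in one group and the third is left out.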