Mosaicking #1
Comments
Agreed that it would be better to implement mosaicking as a reduction instead of using GDAL. VRTs in particular have some weird scaling behavior related to the order in which files are read and how they interact with the various GDAL caches. Some of those caches are per file handle, which can cause memory leaks on VRTs with many files. I think you have the right approach overall; if you can do it natively with numpy/dask then let's use that, and fall back to GDAL as needed.
This is an interesting question. After looking over the repo, my assumption was that it would support mosaics (although this isn't stated anywhere), mostly because Item Collections (basically a list of items) don't always stack perfectly, so support for mosaicking is implicit in my opinion.
Another option would be to combine GDAL + Dask. For example:
Certainly an option. As we discussed above, though, there aren't many benefits to using GDAL for it, and likely a few downsides.
`stackstac.show` and `stackstac.add_to_map` display Dask-backed DataArrays on ipyleaflet maps.
Other changes:
* Exposed some handy spatial and miscellaneous operations in the public API (`reproject_array`, `xyztile_of_array`, etc.)
* Exposed `stackstac.mosaic`! Closes #1.
* Reorganized docs to have an examples subsection and base API reference page
* Added visualization notebook from my webinar
From pangeo-data/cog-best-practices#4 (comment):
I didn't add any mosaicking directly (at the GDAL level), since you can actually do it pretty easily with plain dask/numpy. Something like:
As far as I know, there aren't really any advantages to doing the mosaic in GDAL versus in dask. The one advantage GDAL could theoretically have is that it could short-circuit, and stop loading additional datasets as soon as the output image is already fully-filled-in—however, I don't know if GDAL actually implements this logic. And even if it does, the performance gains of early termination would quickly lose out to the cost of loading each dataset serially. Basically, I think you're better off letting dask read everything in parallel, then throwing away some data, compared to worst-case having GDAL read hundreds of datasets in serial.
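To make the short-circuit idea concrete, here is a minimal NumPy sketch of a serial mosaic that stops loading once every output pixel is filled. It does not describe what GDAL actually does; `mosaic_with_early_exit` and the `loaders` callables are hypothetical names invented for illustration, and NaN is assumed to mark missing pixels.

```python
import numpy as np

def mosaic_with_early_exit(loaders, shape):
    """Serially fill `out` from each loader, stopping once no NaN remains.

    `loaders` is a sequence of zero-argument callables, each returning an
    array of `shape` (hypothetical stand-ins for opening one dataset).
    Returns the mosaic and the number of datasets actually loaded.
    """
    out = np.full(shape, np.nan)
    loads = 0
    for load in loaders:
        if not np.isnan(out).any():
            break  # output fully filled: skip the remaining datasets
        layer = load()
        loads += 1
        # Keep pixels that are already filled; take new ones only where missing.
        out = np.where(np.isnan(out), layer, out)
    return out, loads

# Three tiny "datasets"; the first two together cover every pixel.
layers = [
    np.array([[1.0, np.nan]]),
    np.array([[5.0, 2.0]]),
    np.array([[7.0, 7.0]]),
]
loaders = [lambda a=a: a for a in layers]
result, n_loaded = mosaic_with_early_exit(loaders, (1, 2))
```

Here the third dataset is never loaded, but every load still happens one after another. That serial cost is the trade-off against reading everything in parallel and discarding what is not needed.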
So short answer: yes, this is focused only on "stacking", because I think of "mosaic" as just one among many reduction operations you might want to do to a stack (mean, median, quality-band mosaic, etc.).
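As an illustration of "mosaic as a reduction", here is a minimal NumPy sketch that flattens the first axis of a stack by keeping the first valid pixel at each position, next to a mean over the same axis. It is not stackstac's implementation; `mosaic_first_valid` is a hypothetical helper, and NaN is assumed to be the nodata value.

```python
import numpy as np

def mosaic_first_valid(stack: np.ndarray) -> np.ndarray:
    """Reduce axis 0 of `stack`, keeping the first non-NaN value per pixel."""
    if stack.shape[0] == 0:
        raise ValueError("stack must contain at least one layer")
    out = stack[0].copy()
    for layer in stack[1:]:
        # Fill only the pixels that are still missing.
        out = np.where(np.isnan(out), layer, out)
    return out

# A stack of three 2x2 layers along axis 0 (e.g. time); NaN marks nodata.
stack = np.array([
    [[1.0, np.nan], [np.nan, np.nan]],
    [[9.0, 2.0], [np.nan, 4.0]],
    [[9.0, 9.0], [3.0, 9.0]],
])

mosaic = mosaic_first_valid(stack)   # first valid pixel wins
mean = np.nanmean(stack, axis=0)     # a different reduction over the same axis
```

Both results have the shape of a single layer. Only the rule for combining pixels along the stacked axis differs, which is why a mosaic fits alongside mean, median, and similar reductions.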
The bigger question is whether offering a `mosaic` function is in scope for this project. Personally, I'd like it to be, but it should probably be on an xarray accessor, which starts to bump up against the territory of rioxarray.