Add example notebooks + multiple major changes #1470

Merged
merged 31 commits into from
Sep 21, 2022


Collaborator

@AdeelH AdeelH commented Sep 2, 2022

Overview

This PR adds the following 5 notebooks:

In service of the above, it makes the following major code changes:

  • Removes ActivateMixin entirely.
  • Replaces STRtree usage with GeoPandas GeoDataFrame-based spatial joins in ChipClassificationLabelSource and RasterizedSource.
    • This also fixes the following warnings:
      • ShapelyDeprecationWarning: STRtree will be changed in 2.0.0 and will not be compatible with versions < 2.
      • ShapelyDeprecationWarning: Setting custom attributes on geometry objects is deprecated, and will raise an AttributeError in Shapely 2.0
  • Removes the mask-to-polygons dependency.
    • Because it blocked bumping up the Shapely version.
    • As part of this, also removes the "vector evaluation" in SS Evaluator which depended on it.
      • It was not generally useful enough.
  • Adds numpy-like array indexing and slicing to RasterSource and LabelSource.
  • Refactors x_shift and y_shift in RasterioSource(Config) into a VectorTransformer.
  • Removes extent_crop, which was added in #1030 (Add ability to crop raster source extent).
    • The user can now manually specify an extent instead.
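
To illustrate the numpy-like indexing and slicing item above, here is a minimal sketch of how a `__getitem__` could translate index expressions into window reads. The names `IndexableSourceDemo`, `slice_to_bounds`, and `get_window` are hypothetical and used only for illustration; this is not the Raster Vision implementation, and it assumes step-less slices over (row, column) axes only.

```python
from typing import Tuple, Union

Index = Union[int, slice]


def slice_to_bounds(index: Index, size: int) -> Tuple[int, int]:
    """Convert an int or a step-less slice into (start, stop) pixel bounds."""
    if isinstance(index, int):
        # Normalize negative indices the way Python sequences do.
        if index < 0:
            index += size
        if not 0 <= index < size:
            raise IndexError(f'index out of range for axis of size {size}')
        return index, index + 1
    # slice.indices() clamps the slice to the axis and fills in defaults.
    start, stop, step = index.indices(size)
    if step != 1:
        raise NotImplementedError('this sketch only supports step == 1')
    return start, max(start, stop)


class IndexableSourceDemo:
    """Hypothetical stand-in for a raster source (assumption, not the real class)."""

    def __init__(self, height: int, width: int):
        self.height = height
        self.width = width

    def get_window(self, ymin: int, xmin: int, ymax: int, xmax: int):
        # A real source would read pixels here; the sketch echoes the window.
        return (ymin, xmin, ymax, xmax)

    def __getitem__(self, key):
        # A single index addresses rows only; columns default to "all".
        if not isinstance(key, tuple):
            key = (key, slice(None))
        row_key, col_key = key
        ymin, ymax = slice_to_bounds(row_key, self.height)
        xmin, xmax = slice_to_bounds(col_key, self.width)
        return self.get_window(ymin, xmin, ymax, xmax)
```

With this sketch, `IndexableSourceDemo(100, 200)[10:20, 30:50]` resolves to the window `(10, 30, 20, 50)`, so callers can request a chip without constructing a window object themselves.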

Other minor changes:

  • Enhances CRSTransformer so that it can now operate directly on: RV Boxes, Rasterio Windows, and Shapely geoms.
  • Simplifies MultiRasterSource by removing SubRasterSourceConfig and raw_channel_order.
  • Adds StatsTransformer.from_raster_sources() for easy initialization.
  • Enhances RasterioSource so that it only reads the required bands.
  • Disambiguates RasterSource.num_channels into RasterSource.num_channels and RasterSource.num_channels_raw.
  • Adds more progress-bars
    • To geojson util functions and LocalFileSystem.
    • All progress-bars are now tqdm-based.
    • Progress-bars now have a 5-second delay, which means they only show up if processing takes longer than 5 seconds. This gets rid of spurious progress-bars.
  • Makes it easier to get colors in a standard format from ClassConfig.
  • Refactors several getters in RasterSource and other classes into properties.
  • Slightly refactors ChipClassificationLabels.
  • Fixes a bug in buffer_geoms() plus other minor improvements in geojson.py.
  • Misc. Learner tweaks:
    • Makes scene_dataset and window_opts optional in GeoDataConfig.
    • Allows calling Learner.train() with a custom number of epochs.
  • Misc. Box tweaks.
  • Fixes a CI bug by pinning pyopenssl version.
  • Removes the no-longer-needed patch for torch.hub (see #1271, External Model Definition Functionality Potentially Broken in PyTorch 1.9).
  • Updates the grep regex in Dockerfile to handle blank lines and comment lines.

Checklist

  • [ ] Added needs-backport label if PR is bug fix that applies to previous minor release
  • Ran scripts/format_code and committed any changes
  • Documentation updated if needed
  • PR has a name that won't get you publicly shamed for vagueness

Notes

Requirements changes:

  • mask-to-polygons removed
  • Shapely bumped to ==1.8.4.
  • GeoPandas and PyGEOS added.

Backward-compatibility breaking changes:

  • Removal of SubRasterSourceConfig.
  • Refactoring of x/y_shift.
  • Make background_class_id required in ChipClassificationLabelSourceConfig if inferring cells.

Testing Instructions

  • See updated unit tests.
  • See notebooks.

Closes #1275
Part of #1460
Part of #1459

@AdeelH AdeelH force-pushed the nb branch 13 times, most recently from 2c0dc46 to e7cb076 Compare September 9, 2022 11:32
@AdeelH AdeelH force-pushed the nb branch 7 times, most recently from 69accd2 to cab9765 Compare September 13, 2022 17:24
@AdeelH AdeelH changed the title from "[WIP] Add example notebooks" to "Add example notebooks + multiple major changes" Sep 13, 2022
- remove get_transformed_window() method from RasterioSource
- remove SubRasterSourceConfig
- update usages in examples and tests
What was previously supposed to be num_channels is now num_channels_raw. num_channels is now defined to be len(channel_order). This is still problematic because there is no guarantee that the raster transformers won't change the number of output channels.
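
The distinction between the two channel counts can be sketched as follows. `ChannelCountDemo` is a hypothetical class written for illustration only, and the default of "all raw channels in order" when no `channel_order` is given is an assumption, not something stated in this PR.

```python
from typing import Optional, Sequence


class ChannelCountDemo:
    """Hypothetical illustration of the two channel counts (not the real class)."""

    def __init__(self, num_channels_raw: int,
                 channel_order: Optional[Sequence[int]] = None):
        # Number of channels in the underlying raw imagery.
        self.num_channels_raw = num_channels_raw
        # Assumption: default to all raw channels, in their original order.
        if channel_order is None:
            channel_order = list(range(num_channels_raw))
        self.channel_order = list(channel_order)

    @property
    def num_channels(self) -> int:
        # Defined purely from channel_order; transformers that add or drop
        # channels would make the actual output count differ from this.
        return len(self.channel_order)
```

For example, an 8-band source with `channel_order=[3, 2, 1]` would report `num_channels_raw == 8` and `num_channels == 3`.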
- move minimum necessary code to core/data/utils/vectorization.py
- update usages
- remove other references to mask_to_polygons
- remove unused min_aspect_ratio field from BuildingVectorOutputConfig
- set default denoise radius value to 8
- remove "vector evaluation"
- fix bug in buffer_geoms()
- fix shapely deprecation warning:
```
ShapelyDeprecationWarning: __len__ for multi-part geometries is deprecated and will be removed in Shapely 2.0. Check the length of the `geoms` property instead to get the  number of parts of a multi-part geometry.
```
guard against a sneaky bug
    - https://stackoverflow.com/questions/26320899/why-is-the-empty-dictionary-a-dangerous-default-value-in-python
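
For readers unfamiliar with the pitfall behind the linked question, here is a minimal, generic sketch of it. The function names are hypothetical and unrelated to the code changed in this PR; the sketch only shows why a mutable default such as `{}` is shared across calls and how a `None` sentinel avoids that.

```python
from typing import Any, Dict, Optional


def tag_buggy(key: str, value: Any, tags: Dict[str, Any] = {}) -> Dict[str, Any]:
    # The default dict is created once, when the function is defined,
    # so every call that omits `tags` mutates the same object.
    tags[key] = value
    return tags


def tag_safe(key: str, value: Any,
             tags: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    # Using None as the sentinel gives each call its own fresh dict.
    if tags is None:
        tags = {}
    tags[key] = value
    return tags
```

Calling `tag_buggy` twice without `tags` leaves the first call's entry in the second call's result, whereas `tag_safe` returns an independent dict each time.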
- allow geojson file to be cached in GeoJSONVectorSource
- vector source performance improvements
    - cache transformed geojson
    - allow geojson utils to skip processing if no eligible geoms
- Add delay of 5 sec to progressbars in other FileSystems. This will cause the bars to only appear if it takes longer than 5 sec.
- doc fixes
- make scene_dataset and window_opts optional in GeoDataConfig
- allow calling Learner.train() with custom number of epochs
- add an extract() function to pipeline.file_system.utils that uses shutil.unpack_archive().
- add an item_limit arg to parse_stac()
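
As a rough idea of what a helper built on `shutil.unpack_archive()` can look like, here is a minimal sketch. The signature and return value are assumptions made for illustration; the actual `extract()` added in this PR may differ.

```python
import shutil
from pathlib import Path


def extract(archive_path: str, target_dir: str) -> str:
    """Unpack an archive into target_dir and return target_dir.

    Hypothetical signature (assumption); only shutil.unpack_archive() is
    taken from the commit message.
    """
    # Create the destination if it does not exist yet.
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # unpack_archive() infers the format (zip, tar, gztar, ...) from the
    # file extension when no explicit format is passed.
    shutil.unpack_archive(archive_path, target_dir)
    return target_dir
```

Relying on `shutil.unpack_archive()` keeps the helper format-agnostic, so the same call handles both `.zip` and `.tar.gz` downloads.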
These can be passed directly to matplotlib.colors.ListedColormap.
- Set skip_validation=True in torch.hub.load() to avoid the validation step that was the cause of the bug in the first place, since it's of dubious usefulness and still contains an infinite loop.
- update unit tests
Unlike the parent class, RasterioSource applies channel_order before raster_transformers. So the number of output channels is not guaranteed to be equal to `len(channel_order)`.
- RasterSource.get_crs_transformer()
- CRSTransformer: get_affine_transform(), get_image_crs(), get_map_crs()
- ClassConfig.get_null_class_id()
codecov bot commented Sep 19, 2022

Codecov Report

Merging #1470 (4b022ca) into master (5a1e6f1) will decrease coverage by 1.65%.
The diff coverage is 66.02%.

@@            Coverage Diff             @@
##           master    #1470      +/-   ##
==========================================
- Coverage   73.35%   71.70%   -1.66%     
==========================================
  Files         184      186       +2     
  Lines        8884     8947      +63     
==========================================
- Hits         6517     6415     -102     
- Misses       2367     2532     +165     
Impacted Files Coverage Δ
...stervision_core/rastervision/core/data/__init__.py 100.00% <ø> (ø)
...aluation/semantic_segmentation_evaluator_config.py 40.90% <0.00%> (-7.37%) ⬇️
rastervision_core/rastervision/core/predictor.py 19.40% <ø> (+0.28%) ⬆️
...stervision/core/rv_pipeline/chip_classification.py 40.74% <0.00%> (ø)
.../rastervision/core/rv_pipeline/object_detection.py 21.11% <0.00%> (ø)
...ervision/core/rv_pipeline/semantic_segmentation.py 18.27% <0.00%> (ø)
.../pytorch_backend/pytorch_learner_backend_config.py 66.00% <0.00%> (+1.29%) ⬆️
.../pytorch_learner/dataset/classification_dataset.py 92.59% <ø> (ø)
...ytorch_learner/dataset/object_detection_dataset.py 33.65% <ø> (ø)
...a/label_store/semantic_segmentation_label_store.py 25.12% <7.40%> (-1.22%) ⬇️
... and 66 more

Contributor

@lewfish lewfish left a comment

Your clarifications made sense. I think we should leave things as is.

@AdeelH AdeelH merged commit 08454a4 into azavea:master Sep 21, 2022
@AdeelH AdeelH deleted the nb branch September 28, 2022 11:19
@AdeelH AdeelH mentioned this pull request Oct 24, 2022
13 tasks
Successfully merging this pull request may close these issues.

Refactor to remove mask-to-polygons dependency
2 participants