Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gtc 3025/filter alerts by forest #170

Merged
merged 9 commits into from
Nov 7, 2024

Conversation

solomon-negusse
Copy link
Member

@solomon-negusse solomon-negusse commented Oct 30, 2024

Pull request checklist

Please check if your PR fulfills the following requirements:

  • Make sure you are requesting to pull a topic/feature/bugfix branch (right side). Don't request your master!
  • Make sure you are making a pull request against the develop branch (left side). Also you should start your branch off our develop.
  • Check the commit's or even all commits' message styles matches our requested structure.
  • Check your code additions will fail neither code linting checks nor unit test.

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

What is the new behavior?

  • Adds ability to filter DIST alerts layer by forest parameters (tree cover loss, density and height)

Does this introduce a breaking change?

  • Yes
  • No

Other information

@solomon-negusse solomon-negusse marked this pull request as draft October 30, 2024 20:19
@codecov-commenter
Copy link

codecov-commenter commented Oct 31, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 58.82353% with 21 lines in your changes missing coverage. Please review.

Project coverage is 79.85%. Comparing base (bc349c0) to head (693e0c2).

Files with missing lines Patch % Lines
app/routes/titiler/umd_glad_dist_alerts.py 15.00% 17 Missing ⚠️
app/settings/globals.py 66.66% 4 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #170      +/-   ##
==========================================
- Coverage   80.29%   79.85%   -0.45%     
==========================================
  Files          61       61              
  Lines        2005     2050      +45     
==========================================
+ Hits         1610     1637      +27     
- Misses        395      413      +18     
Flag Coverage Δ
unittests 79.85% <58.82%> (-0.45%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@solomon-negusse solomon-negusse marked this pull request as ready for review November 1, 2024 16:42
Copy link
Contributor

@danscales danscales left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I'm just wondering about my questions related to the TCL mask.

# 2018, but has tree cover height (2020) that meets the forest threshold, the pixel meets
# the forest criteria for alerts.
mask *= (
self.tree_cover_loss_data.array[0, :, :] >= self.tree_cover_loss_mask
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the mask be logically negated in this case? I think the mask is True is we want to include the alerts. But we don't want to include an alert if this pixel has a loss year of 2021, for example.

But separately, in Sarah's document, they said that we shouldn't remove alerts for pixels that have loss year of 2022 or 2023, because we actually want those alerts to show up in our history. I.e. the loss may have been the cause of the alert we want to show.

So, it somewhat seems like you only want to ever remove alerts here that have a loss year of exactly 2021. If loss year is <= 2020, then that is covered by the tree cover height, as you said, so you ignore that too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for catching - this needs to be self.tree_cover_loss_data.array[0, :, :] > self.tree_cover_loss_mask.
I included this filter to allow them the flexibility of masking out the later tree cover loss years (2022, 2023) as needed.

Copy link
Member Author

@solomon-negusse solomon-negusse Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And by default, it's going to filter out just 2021. actually didn't implement this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, maybe you can give a comment on the meaning of tree_cover_loss_mask above? So, it's actually the highest loss year that should be used to exclude alerts. And by default we never use TCL from 2020 or less (including 0) to exclude alerts. So, if tree_cover_loss_mask is 2021, then we only use loss year 2021 to exclude alerts.


if tree_cover_loss_cutoff:
with COGReader(
f"s3://{DATA_LAKE_BUCKET}/umd_tree_cover_loss/v1.10.1/raster/epsg-4326/cog/default.tif"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put in a TODO here to fix the version name (to v1.11) once we go from staging to production?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often will we need to change the code? Should this be a config variable somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCL is updated once a year I believe, so not that frequent but still good idea to not hardcode it here.

Copy link
Member Author

@solomon-negusse solomon-negusse Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gtempus: I hit what's potentially a terraform bug setting an env variable that's a json string with the dataset versions - same variable that works fine with the docker-compose file. So, for now I have moved out the dataset versions into the global config file where we'll have to update it (693e0c2) as interim solution and opened an issue in terraform aws provider that I hope will get sorted out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good interim solution, @solomon-negusse. Thanks!
It's much better than having to dig into the codebase. 😄
Bummer about the terraform bug. 😦

Copy link
Contributor

@gtempus gtempus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate all the attention to documentation (descriptions and tests), @solomon-negusse! 🚀
I just was wondering about the hardcoded versions in the URL paths of the layers.

if self.tree_cover_loss_mask:
# Tree cover loss before 2020 can't be used to filter out pixels as not forest
# if they had tree cover loss. Instead we use tree cover height taken that year
# is used as source of truth.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the informative comment, Solomon. 💪

Comment on lines +53 to +67
tree_cover_density: Optional[int] = Query(
None,
ge=0,
le=100,
description="Alerts in pixels with tree cover density (in percent) below this threshold won't be displayed. `umd_tree_cover_density_2010` is used for this masking.",
),
tree_cover_height: Optional[int] = Query(
None,
description="Alerts in pixels with tree cover height (in meters) below this threshold won't be displayed. `umd_tree_cover_height_2020` dataset in the API is used for this masking.",
),
tree_cover_loss_cutoff: bool = Query(
False,
ge=2021,
description="""This filter is to be used in conjunction with `tree_cover_density` and `tree_cover_height` filters to detect only alerts in forests, by masking out pixels that have had tree cover loss prior to the alert.""",
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice descriptions here, @solomon-negusse ! 💪


if tree_cover_loss_cutoff:
with COGReader(
f"s3://{DATA_LAKE_BUCKET}/umd_tree_cover_loss/v1.10.1/raster/epsg-4326/cog/default.tif"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often will we need to change the code? Should this be a config variable somewhere?

@@ -104,3 +149,33 @@ def test_encoded_rgba():

# test high confidence in alpha channel
assert rgba.array[3, 154, 71] == 8


def test_forest_mask():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for continuing to improve the test coverage, @solomon-negusse ! ❤️

Copy link
Contributor

@danscales danscales left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

tree_cover_height_data: Optional[ImageData] = None

# the highest loss year that is used to exclude alerts for
# the purpose of showing only alerts in forests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in a following change you can just add the extra sentence "Should be 2021 or later, since 2020 tree height data will cover tree loss before 2021.".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I have already added a validator to enforce that in the endpoint itself here too: https://github.com/wri/gfw-tile-cache/blob/gtc-3025/filter-alerts-by-forest/app/routes/titiler/umd_glad_dist_alerts.py#L65

@solomon-negusse solomon-negusse merged commit 2e04a59 into dev Nov 7, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants