
use resample many_to_many in outlier detection #1260

Merged · 2 commits merged into main Jun 18, 2024

Conversation

@braingram (Collaborator) commented Jun 3, 2024

This PR changes outlier detection to use many_to_many when resampling during median calculation. This PR produces 1 expected difference in regression test results.
https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-Developers-Pull-Requests/805/

With the current code on main the test_level3_mos_pipeline test runs an association with 3 models (each from a different exposure) through the mosaic pipeline. Setting a breakpoint just prior to the call to create_median:

# Perform median combination on set of drizzled mosaics
median_model.data = Quantity(
    self.create_median(drizzled_models), unit=median_model.data.unit
)

we can see that len(drizzled_models) == 1. This conflicts with the algorithm description in the docs, which states:

> Each dither position will result in a separate grouped mosaic, so only a single exposure ever contributes to each pixel in these mosaics.

Looking at the outlier detection unit tests, I don't see one that both:

  • uses resampling
  • introduces and detects outliers

Changing test_outlier_do_detection_find_outliers (which introduces and detects outliers) to use resampling is insufficient to show the impact of this PR. Without this PR, the introduced CRs (value=1E5) are "drizzled" together with empty pixels from the other image (value=0). This produces a single drizzled_model with a value of 5E4 for each CR. The median (N=1) produces the same value; when blotted back to the input image WCS, the 5E4 values are far enough below the 1E5 values of the CRs to allow them to be detected. This is true for any value used for the CR (since the input images are noiseless with all 0 error).
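As a sketch of why the old test passes regardless of the CR value (toy numbers only, not romancal code):

```python
import numpy as np

# Toy numbers (not romancal code): a CR of 1e5 in one image and an empty
# pixel (0) at the same sky position in the other image.
cr_value = 1e5
img1_pixel = cr_value  # pixel hit by a CR in image 1
img2_pixel = 0.0       # same position in image 2, no CR

# With single=True (many_to_one) both exposures are drizzled into ONE
# mosaic, so the CR is averaged with the empty pixel:
drizzled = np.array([(img1_pixel + img2_pixel) / 2])  # single model, 5e4

# The median over a single model is just that model:
median = float(np.median(drizzled))

# Blotted back, the CR (1e5) is still well above the median (5e4), so the
# test flags it for ANY cr_value -- it cannot distinguish many_to_one
# from many_to_many.
print(median)  # 50000.0
```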

Furthermore, the test appears to be checking that all introduced CRs (even those introduced in img2) are flagged in img1:

img_1_outlier_output_coords = np.where(step.input_models[0].dq > 0)

returns 10 flagged CRs:

(array([ 5, 15, 25, 35, 45, 65, 75, 85, 95, 99]), array([45, 25, 25,  5, 85, 65, 65,  5, 45,  5]))

whereas only 5 were introduced into img1:

img_1_input_coords = np.array(
    [(5, 45), (25, 25), (45, 85), (65, 65), (85, 5)], dtype=[("x", int), ("y", int)]
)
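Comparing the two sets directly makes the mismatch explicit (coordinates copied from the output above; the set-based comparison here is illustrative, not the actual test code):

```python
# Flagged coordinates from np.where(step.input_models[0].dq > 0) above:
flagged_x = [5, 15, 25, 35, 45, 65, 75, 85, 95, 99]
flagged_y = [45, 25, 25, 5, 85, 65, 65, 5, 45, 5]
flagged = set(zip(flagged_x, flagged_y))

# CRs actually introduced into img1:
introduced = {(5, 45), (25, 25), (45, 85), (65, 65), (85, 5)}

# Only 5 of the 10 flagged pixels match img1's CRs; the other 5 were
# introduced into img2 and should not appear in img1's dq.
print(len(flagged & introduced), len(flagged - introduced))  # 5 5
```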

Switching off single (so that many_to_many is now used) revealed a few other issues, including:

  • drizzled models overwriting each other due to filenames sharing the same base
  • drizzled models output including only the last used group
  • resample background correction check not matching the check in resample

The updated unit test in this PR uses 3 images (each with 1 "source"; CRs are added to the first 2) and checks that:

  • CRs added to image 0 are flagged in image 0
  • CRs added to image 1 are flagged in image 1
  • no CRs are flagged in image 2
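A minimal sketch of those three checks (illustrative stand-ins, not the actual test code; make_model fakes a datamodel whose dq plane was already set by outlier detection):

```python
import numpy as np
from types import SimpleNamespace


def make_model(cr_coords=(), shape=(10, 10)):
    # Stand-in for a datamodel whose dq plane was set by outlier detection.
    dq = np.zeros(shape, dtype=np.uint32)
    for y, x in cr_coords:
        dq[y, x] = 1
    return SimpleNamespace(dq=dq)


# CRs added per image: images 0 and 1 get CRs, image 2 gets none.
crs = [[(2, 3), (7, 7)], [(1, 1)], []]
models = [make_model(c) for c in crs]

for i in (0, 1):
    flagged = set(zip(*np.where(models[i].dq > 0)))
    assert set(crs[i]) <= flagged  # CRs added to image i are flagged in image i
assert not models[2].dq.any()      # no CRs flagged in image 2
print("checks pass")
```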

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below. How to run regression tests on a PR

@braingram force-pushed the single branch 2 times, most recently from 509b088 to 819699a on June 3, 2024 14:42

codecov bot commented Jun 3, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (79d3a30) to head (b923030).
Report is 193 commits behind head on main.

Files with missing lines Patch % Lines
romancal/resample/resample.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1260      +/-   ##
==========================================
+ Coverage   79.24%   79.30%   +0.06%     
==========================================
  Files         117      117              
  Lines        8075     8065      -10     
==========================================
- Hits         6399     6396       -3     
+ Misses       1676     1669       -7     
Flag      Coverage Δ        *Carryforward flag
nightly   62.78% <ø> (ø)    Carried forward from 79d3a30

*This pull request uses carry forward flags.

@schlafly (Collaborator) commented Jun 3, 2024

This looks fine to me, but @mairanteodoro and you should discuss, and I suspect that there are related unit tests that will need updating.

@braingram (Collaborator, Author):

Thanks @schlafly!

@mairanteodoro and I have a meeting tomorrow to discuss the sky subtraction (which is partially handled in this PR and partially in #1233). After updating the flag_outlier unit test for this PR I don't think the sky subtraction was the only issue so there are a number of changes in this PR.

I'll open this PR for review once the unit and regtests finish (I expect at least the mosaic regtest to fail and any others that use outlier detection).

@braingram force-pushed the single branch 4 times, most recently from af5a19e to 54994b4 on June 4, 2024 15:42
@github-actions bot added the "dependencies" label June 4, 2024
@braingram marked this pull request as ready for review June 4, 2024 16:07
@braingram requested a review from a team as a code owner June 4, 2024 16:07
@@ -34,7 +34,7 @@ dependencies = [
     "tweakwcs >=0.8.6",
     "spherical-geometry >= 1.2.22",
     "stsci.imagestats >= 1.6.3",
-    "drizzle >= 1.13.7",
+    "drizzle >= 1.14.0",
@braingram (Collaborator, Author) commented on this diff:

This was needed as an outlier at the very edge of the image wasn't being picked up in drizzle <1.14.0.

@braingram requested a review from mairanteodoro June 4, 2024 16:10
@schlafly (Collaborator) commented Jun 4, 2024

Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

@braingram (Collaborator, Author):

> Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

I think the only regtest that is impacted is the mosaic pipeline test. This uses only 3 images and produces very minor differences in the number of detected outliers. With main:

2024-06-04 16:06:53,489 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6373 (0.04%)
2024-06-04 16:06:54,670 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 11289 (0.07%)
2024-06-04 16:06:55,898 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6441 (0.04%)

with this PR:

2024-06-04 16:01:20,422 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6144 (0.04%)
2024-06-04 16:01:21,795 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 15462 (0.09%)
2024-06-04 16:01:22,888 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6151 (0.04%)

Overall I don't think the output change is very meaningful given the current regression tests.
Here is the stpreview by <fn> 16 16 output for the truth file:
[image: truth]
and the new output with this PR:
[image: new]

@schlafly (Collaborator) commented Jun 6, 2024

I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

@mairanteodoro (Collaborator) left a review:

Looks good to me! Thanks for the fix, @braingram!!

@braingram (Collaborator, Author):

> I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

I looked at the outlier masks and there aren't large differences. Since main has seen some changes since I made those comparisons they'll need to be updated. The mosaic test only uses 3 images so the median across groups (many_to_many) vs the drizzled combinations of all groups (many_to_one) produces similar "median" data.
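A toy comparison of the two modes (equal-weight combination assumed; not romancal code) shows why they stay close for small N on most pixels yet differ exactly where a CR lands:

```python
import numpy as np

# Three "drizzled groups", two pixels each; group 1 has a CR in pixel 1.
groups = np.array([
    [10.0, 10.0],
    [12.0, 1e5],   # CR
    [11.0, 11.0],
])

# many_to_many: drizzle each group separately, then take a pixel-wise
# median across the groups -- the CR is rejected.
median_many = np.median(groups, axis=0)  # [11., 11.]

# many_to_one: drizzle everything into one mosaic (an equal-weight mean
# here), so the CR leaks into the comparison image.
median_one = groups.mean(axis=0)  # [11., ~33340.]

print(median_many, median_one)
```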

@schlafly (Collaborator) left a review:

Great, thanks, approving!

@braingram (Collaborator, Author):

@schlafly Is there more you'd like to see before merging?

I re-ran the regtests here:
https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FRoman-Developers-Pull-Requests/detail/Roman-Developers-Pull-Requests/816/tests

Looking at the output image (a cropped region [200:800, 400:1000]), the truth file shows the following (log scaled):
[image: Screen Shot 2024-06-07 at 12 47 35 PM]
with this PR the image is similar but with a few fewer CRs:
[image: Screen Shot 2024-06-07 at 12 47 41 PM]

@braingram (Collaborator, Author):

@schlafly Is there anything else you'd like to see for this PR?

@schlafly (Collaborator):

No, thanks, I thought I approved, thank you!

@braingram enabled auto-merge June 18, 2024 15:59
@braingram merged commit bcfdb71 into spacetelescope:main Jun 18, 2024
29 of 30 checks passed
@braingram deleted the single branch June 18, 2024 16:33
Labels: dependencies, testing
3 participants