
use resample many_to_many in outlier detection #1260

Merged · 2 commits merged into main Jun 18, 2024

Conversation

@braingram (Collaborator) commented Jun 3, 2024

This PR changes outlier detection to use many_to_many when resampling during median calculation. This PR produces 1 expected difference in regression test results.
https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-Developers-Pull-Requests/805/

With the current code on main the test_level3_mos_pipeline test runs an association with 3 models (each from a different exposure) through the mosaic pipeline. Setting a breakpoint just prior to the call to create_median:

# Perform median combination on set of drizzled mosaics
median_model.data = Quantity(
    self.create_median(drizzled_models), unit=median_model.data.unit
)

we can see that len(drizzled_models) == 1. This conflicts with the algorithm description in the docs, which states:

> Each dither position will result in a separate grouped mosaic, so only a single exposure ever contributes to each pixel in these mosaics.

Looking at the outlier detection unit tests, I don't see one that both:

  • uses resampling
  • introduces and detects outliers

Changing test_outlier_do_detection_find_outliers (which introduces and detects outliers) to use resampling is insufficient to show the impact of this PR. Without this PR, the introduced CRs (value=1E5) are "drizzled" together with empty pixels from the other image (value=0). This produces a single drizzled_model with a value of 5E4 for each CR. The median (N=1) produces the same value; when blotted back to the input image WCS, the 5E4 values are far enough below the 1E5 values of the CRs to allow them to be detected. This is true for any value used for the CR (since the input images are noiseless with all 0 error).
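As a sketch of why the old test passes regardless of the CR value (toy numbers only, not romancal code):

```python
import numpy as np

# Toy numbers (not romancal code): a CR of 1e5 in one image and an empty
# pixel (0) at the same sky position in the other image.
cr_value = 1e5
img1_pixel = cr_value  # pixel hit by a CR in image 1
img2_pixel = 0.0       # same position in image 2, no CR

# With single=True (many_to_one) both exposures are drizzled into ONE
# mosaic, so the CR is averaged with the empty pixel:
drizzled = np.array([(img1_pixel + img2_pixel) / 2])  # single model, 5e4

# The median over a single model is just that model:
median = float(np.median(drizzled))

# Blotted back, the CR (1e5) is still well above the median (5e4), so the
# test flags it for ANY cr_value -- it cannot distinguish many_to_one
# from many_to_many.
print(median)  # 50000.0
```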

Furthermore, the test appears to be checking that all introduced CRs (even those introduced in img2) are flagged in img1:

img_1_outlier_output_coords = np.where(step.input_models[0].dq > 0)

returns 10 flagged CRs:

(array([ 5, 15, 25, 35, 45, 65, 75, 85, 95, 99]), array([45, 25, 25,  5, 85, 65, 65,  5, 45,  5]))

whereas only 5 were introduced into img1:

img_1_input_coords = np.array(
    [(5, 45), (25, 25), (45, 85), (65, 65), (85, 5)], dtype=[("x", int), ("y", int)]
)
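Comparing the two sets directly makes the mismatch explicit (coordinates copied from the output above; the set-based comparison here is illustrative, not the actual test code):

```python
# Flagged coordinates from np.where(step.input_models[0].dq > 0) above:
flagged_x = [5, 15, 25, 35, 45, 65, 75, 85, 95, 99]
flagged_y = [45, 25, 25, 5, 85, 65, 65, 5, 45, 5]
flagged = set(zip(flagged_x, flagged_y))

# CRs actually introduced into img1:
introduced = {(5, 45), (25, 25), (45, 85), (65, 65), (85, 5)}

# Only 5 of the 10 flagged pixels match img1's CRs; the other 5 were
# introduced into img2 and should not appear in img1's dq.
print(len(flagged & introduced), len(flagged - introduced))  # 5 5
```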

Switching off single (so that many_to_many is now used) revealed a few other issues, including:

  • drizzled models overwriting each other due to filenames sharing the same base
  • drizzled models output including only the last used group
  • resample background correction check not matching the check in resample

The updated unit test in this PR uses 3 images (each with 1 "source"; CRs are added to the first 2) and checks that:

  • CRs added to image 0 are flagged in image 0
  • CRs added to image 1 are flagged in image 1
  • no CRs are flagged in image 2
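A minimal sketch of those three checks (illustrative stand-ins, not the actual test code; make_model fakes a datamodel whose dq plane was already set by outlier detection):

```python
import numpy as np
from types import SimpleNamespace


def make_model(cr_coords=(), shape=(10, 10)):
    # Stand-in for a datamodel whose dq plane was set by outlier detection.
    dq = np.zeros(shape, dtype=np.uint32)
    for y, x in cr_coords:
        dq[y, x] = 1
    return SimpleNamespace(dq=dq)


# CRs added per image: images 0 and 1 get CRs, image 2 gets none.
crs = [[(2, 3), (7, 7)], [(1, 1)], []]
models = [make_model(c) for c in crs]

for i in (0, 1):
    flagged = set(zip(*np.where(models[i].dq > 0)))
    assert set(crs[i]) <= flagged  # CRs added to image i are flagged in image i
assert not models[2].dq.any()      # no CRs flagged in image 2
print("checks pass")
```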

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below. How to run regression tests on a PR

@braingram force-pushed the single branch 2 times, most recently from 509b088 to 819699a on June 3, 2024 14:42

codecov bot commented Jun 3, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (79d3a30) to head (b923030).
Report is 193 commits behind head on main.

Files with missing lines Patch % Lines
romancal/resample/resample.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1260      +/-   ##
==========================================
+ Coverage   79.24%   79.30%   +0.06%     
==========================================
  Files         117      117              
  Lines        8075     8065      -10     
==========================================
- Hits         6399     6396       -3     
+ Misses       1676     1669       -7     
Flag      Coverage Δ        *Carryforward flag
nightly   62.78% <ø> (ø)    Carried forward from 79d3a30

*This pull request uses carry forward flags.

@schlafly (Collaborator) commented Jun 3, 2024

This looks fine to me, but @mairanteodoro and you should discuss, and I suspect that there are related unit tests that will need updating.

@braingram (Collaborator, Author):

Thanks @schlafly!

@mairanteodoro and I have a meeting tomorrow to discuss the sky subtraction (which is partially handled in this PR and partially in #1233). After updating the flag_outlier unit test for this PR I don't think the sky subtraction was the only issue so there are a number of changes in this PR.

I'll open this PR for review once the unit and regtests finish (I expect at least the mosaic regtest to fail and any others that use outlier detection).

@braingram force-pushed the single branch 4 times, most recently from af5a19e to 54994b4 on June 4, 2024 15:42
@github-actions bot added the "dependencies" label June 4, 2024
@braingram marked this pull request as ready for review June 4, 2024 16:07
@braingram requested a review from a team as a code owner June 4, 2024 16:07
@@ -34,7 +34,7 @@ dependencies = [
     "tweakwcs >=0.8.6",
     "spherical-geometry >= 1.2.22",
     "stsci.imagestats >= 1.6.3",
-    "drizzle >= 1.13.7",
+    "drizzle >= 1.14.0",
@braingram (Collaborator, Author) commented on this diff:

This was needed as an outlier at the very edge of the image wasn't being picked up in drizzle <1.14.0.

@braingram requested a review from mairanteodoro June 4, 2024 16:10
@schlafly (Collaborator) commented Jun 4, 2024

Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

@braingram (Collaborator, Author):

> Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

I think the only regtest that is impacted is the mosaic pipeline test. This uses only 3 images and produces very minor differences in the number of detected outliers. With main:

2024-06-04 16:06:53,489 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6373 (0.04%)
2024-06-04 16:06:54,670 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 11289 (0.07%)
2024-06-04 16:06:55,898 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6441 (0.04%)

with this PR:

2024-06-04 16:01:20,422 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6144 (0.04%)
2024-06-04 16:01:21,795 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 15462 (0.09%)
2024-06-04 16:01:22,888 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6151 (0.04%)

Overall I don't think the output change is very meaningful given the current regression tests.
Here is the stpreview by <fn> 16 16 output for the truth file:
[image: truth]
and the new output with this PR:
[image: new]

@schlafly (Collaborator) commented Jun 6, 2024

I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

@mairanteodoro (Collaborator) left a review:

Looks good to me! Thanks for the fix, @braingram!!

@braingram (Collaborator, Author):

> I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

I looked at the outlier masks and there aren't large differences. Since main has seen some changes since I made those comparisons they'll need to be updated. The mosaic test only uses 3 images so the median across groups (many_to_many) vs the drizzled combinations of all groups (many_to_one) produces similar "median" data.
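A toy comparison of the two modes (equal-weight combination assumed; not romancal code) shows why they stay close for small N on most pixels yet differ exactly where a CR lands:

```python
import numpy as np

# Three "drizzled groups", two pixels each; group 1 has a CR in pixel 1.
groups = np.array([
    [10.0, 10.0],
    [12.0, 1e5],   # CR
    [11.0, 11.0],
])

# many_to_many: drizzle each group separately, then take a pixel-wise
# median across the groups -- the CR is rejected.
median_many = np.median(groups, axis=0)  # [11., 11.]

# many_to_one: drizzle everything into one mosaic (an equal-weight mean
# here), so the CR leaks into the comparison image.
median_one = groups.mean(axis=0)  # [11., ~33340.]

print(median_many, median_one)
```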

@schlafly (Collaborator) left a review:

Great, thanks, approving!

@braingram (Collaborator, Author):

@schlafly Is there more you'd like to see before merging?

I re-ran the regtests here:
https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FRoman-Developers-Pull-Requests/detail/Roman-Developers-Pull-Requests/816/tests

Looking at the output image (a cropped region [200:800, 400:1000]), the truth file shows the following (log scaled):
[image: Screen Shot 2024-06-07 at 12 47 35 PM]
with this PR the image is similar but with a few fewer CRs:
[image: Screen Shot 2024-06-07 at 12 47 41 PM]

@braingram (Collaborator, Author):

@schlafly Is there anything else you'd like to see for this PR?

@schlafly (Collaborator):

No, thanks, I thought I approved, thank you!

@braingram enabled auto-merge June 18, 2024 15:59
@braingram merged commit bcfdb71 into spacetelescope:main Jun 18, 2024
29 of 30 checks passed
@braingram deleted the single branch June 18, 2024 16:33
Labels: dependencies, testing
3 participants