Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

Merged
merged 63 commits into from
Sep 19, 2024

Conversation

kmacdonald-stsci
Copy link
Collaborator

Resolves JP-3576

This PR implements the likelihood algorithm for ramp fitting developed by Timothy Brandt in Optimal Fitting and Debiasing for Detectors Read Out Up-the-Ramp.

Checklist

  • added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)

@kmacdonald-stsci kmacdonald-stsci requested a review from a team as a code owner August 7, 2024 18:23
@github-actions github-actions bot added documentation Improvements or additions to documentation ramp_fitting testing labels Aug 7, 2024
Copy link

codecov bot commented Aug 7, 2024

Codecov Report

Attention: Patch coverage is 84.85915% with 129 lines in your changes missing coverage. Please review.

Project coverage is 84.69%. Comparing base (d85f859) to head (375f310).
Report is 144 commits behind head on main.

Files with missing lines Patch % Lines
tests/test_ramp_fitting_likely_fit.py 82.99% 67 Missing ⚠️
src/stcal/ramp_fitting/likely_algo_classes.py 72.56% 31 Missing ⚠️
src/stcal/ramp_fitting/likely_fit.py 91.26% 29 Missing ⚠️
src/stcal/ramp_fitting/ramp_fit.py 81.81% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #278      +/-   ##
==========================================
+ Coverage   84.63%   84.69%   +0.06%     
==========================================
  Files          41       44       +3     
  Lines        7628     8501     +873     
==========================================
+ Hits         6456     7200     +744     
- Misses       1172     1301     +129     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kmacdonald-stsci kmacdonald-stsci changed the title Jp 3576 JP-3576: Implement the Ramp Fitting Likelihood Algorithm Aug 7, 2024
Copy link

@t-brandt t-brandt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are all minor docstring edits apart from one check to make sure that definitions of read noise are consistently referring to single read (=CDS/sqrt(2)) for all algorithms. I think the answer is yes, but I want to be completely certain.

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/ramp_fit.py Outdated Show resolved Hide resolved
CHANGES.rst Outdated Show resolved Hide resolved
@melanieclarke
Copy link
Contributor

melanieclarke commented Sep 3, 2024

@kmacdonald-stsci - in this comment from @t-brandt:
#278 (comment)
he suggests leaving jump as is, and just ignoring previously set jump flags.

It looks to me like the PR on JWST is skipping jump:
https://github.com/spacetelescope/jwst/blob/2452fddf897c617c5b066f791054863c24e36938/jwst/pipeline/calwebb_detector1.py#L87

I'm just starting to look at these code changes, but did the code here get updated to expect previously set jump flags? Or did we decide to skip jump in the detector1 pipeline when algorithm=LIKELY after all?

Copy link
Contributor

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs quite a bit of clean up before it's ready to merge.

Most of it is code style related and documentation clarification, but I have a few significant structural concerns:

  1. Will jump be skipped in detector1 or no? See previous comment - we may need to run jump for 1/f purposes, and ignore the input flags here.
  2. We should add parameter passing structures for the detection threshold - currently nsig, but should be rejection_threshold, I think? And we need a test for that branch of the code.
  3. There is code related to a pedestal calculation that looks unused and untested. If we don't plan to offer it, we should remove it.

Also - I'd like to test this on real data, but will wait for the issue with skipping jump to be settled.

docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/ramp_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/ramp_fit.py Outdated Show resolved Hide resolved
tests/test_ramp_fitting_likly_fit.py Outdated Show resolved Hide resolved
tests/test_ramp_fitting_likly_fit.py Outdated Show resolved Hide resolved
tests/test_ramp_fitting_likly_fit.py Outdated Show resolved Hide resolved
@kmacdonald-stsci
Copy link
Collaborator Author

I think this needs quite a bit of clean up before it's ready to merge.

Most of it is code style related and documentation clarification, but I have a few significant structural concerns:

  1. Will jump be skipped in detector1 or no? See previous comment - we may need to run jump for 1/f purposes, and ignore the input flags here.
  2. We should add parameter passing structures for the detection threshold - currently nsig, but should be rejection_threshold, I think? And we need a test for that branch of the code.
  3. There is code related to a pedestal calculation that looks unused and untested. If we don't plan to offer it, we should remove it.

Also - I'd like to test this on real data, but will wait for the issue with skipping jump to be settled.

Of course. As has been discussed many times, this is simply the initial implementation of the likelihood algorithm for ramp fitting. The implementation right now is to make the algorithm available for those interested in investigating it, which includes investigating the pedestal. The details for full integration into the existing pipeline have not been completely resolved, such as the separation of jump detection so it can be used as the preceding step for 1/f before running ramp fitting.

@zacharyburnett
Copy link
Collaborator

would this need a corresponding PR in jwst if it were included here?

@kmacdonald-stsci
Copy link
Collaborator Author

would this need a corresponding PR in jwst if it were included here?

This does have a corresponding JWST PR here: spacetelescope/jwst#8631

Copy link
Contributor

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer, thanks for all the updates!

A few lingering clean up issues still, noted below.

docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved
@melanieclarke
Copy link
Contributor

melanieclarke commented Sep 9, 2024

Testing this on some real data now to verify the various options are working as expected.

I see significant differences between running with standard jump detection turned on in the pipeline and with it set to skip. With jump on and algorithm=LIKELY, the output is very similar to standard ramp fitting. With jump off and algorithm=LIKELY, the results from ramp fitting are very different (and very poor) a little different.

@kmacdonald-stsci - I expected that the likelihood algorithm should be ignoring any previously set jump flags, so running with and without jump detection on should have identical results, but that's not the case. Can you please investigate?

@t-brandt - I'm not sure what output is expected to look like from the likelihood algorithm, but have you tested the implementation on this branch with real data?

Edit: reinstalling and testing again, I can't reproduce the very poor results I first saw with LIKELY + no jump - I assume something was off in my environment. However, I do still see differences between running with and without standard jump on, so that needs investigation.

@kmacdonald-stsci
Copy link
Collaborator Author

Testing this on some real data now to verify the various options are working as expected.

I see significant differences between running with standard jump detection turned on in the pipeline and with it set to skip. With jump on and algorithm=LIKELY, the output is very similar to standard ramp fitting. With jump off and algorithm=LIKELY, the results from ramp fitting are very different (and very poor) a little different.

@kmacdonald-stsci - I expected that the likelihood algorithm should be ignoring any previously set jump flags, so running with and without jump detection on should have identical results, but that's not the case. Can you please investigate?

@t-brandt - I'm not sure what output is expected to look like from the likelihood algorithm, but have you tested the implementation on this branch with real data?

Edit: reinstalling and testing again, I can't reproduce the very poor results I first saw with LIKELY + no jump - I assume something was off in my environment. However, I do still see differences between running with and without standard jump on, so that needs investigation.

Please send me the data you are using and the specific pixels you are noticing as different.

@drlaw1558
Copy link
Contributor

Finally had a chance to test this out a little. A few quick comments:

  • The likelihood routine may be doing a better job of eliminating CR showers in MIRI data, in at least some cases (jw01236004001_02101_00001_mirifulong).

  • As for @melanieclarke , I see some small differences in results with jump on/off for both jw01236004001_02101_00001_mirifulong and jw01523003001_03102_00001_mirifushort_uncal.fits. E.g., pixel (X,Y)=(547,576) 1-indexed in the jw1523 exposure.

  • I'm seeing an issue when processing jw01264014001_02101_00001_mirifushort_uncal.fits with
    2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - /Users/dlaw/jwcode/stcal_jwst/stcal/src/stcal/ramp_fitting/likely_fit.py:288: RuntimeWarning: invalid value encountered in multiply
    2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - new_cts[i_d2] = np.sum(result.countrate_two_omit * droptwo, axis=0)[i_d2]
    Similarly with jw01247667001_02101_00001_mirifulong_uncal.fits, but for both i_d1 and i_d2 on lines 286 and 288.

  • I'm also seeing an issue for jw01247667001_02101_00001_mirifulong_uncal.fits with the combination of individual integration data into the final rate value. This is a bright multi-int data set, and many integrations have NaNs in them which isn't necessarily surprising. However, any NaNs in a single integration are producing NaNs in the final rate image as well as if they're being mean-combined rather than nanmean-combined? See snake-like features near the top left of rate image created from this exposure. Reminds me a lot of JP-3004 but for NaNs.

  • Question perhaps for @t-brandt ; one of the appeals of this approach was runtime, which we expected to be a few seconds or so (depending on the observation of course). Actual runtime in some of the examples above though is factor of 10 longer than the C-based OLS code. We shouldn't optimize right now (let's study science performance gains first), but what's your opinion on the parameter space available for such speed gains in future?

@t-brandt
Copy link

For @drlaw1558: this ramp fitting approach was never designed to outperform an efficient implementation of the Fixsen algorithm (which has a cost linear in the number of groups, with a small prefactor). This approach retains the linear scaling albeit with a larger prefactor. So if both the C implementation and this one are perfectly implemented, I would expect the C version to run faster by a factor of at least 5. This approach would offer significant performance gains if the Fixsen approach were implemented very inefficiently, as was previously the case. The appeal to the likelihood-based approach is the science gain (no bias, optimal SNR, optimal CR sensitivity), especially for long ramps, combined with perfect scaling with ramp size and a modest performance hit--a fixed factor--relative to the Fixsen algorithm. My expectation is that this approach should take 5-10 seconds for 10 groups full frame with CR flagging.

@tapastro
Copy link
Collaborator

Just wanted to check in, @kmacdonald-stsci : have you had a chance to investigate the files that David mentioned above? We're hoping to get this in soon, i.e. prior to CoB Thursday, so I want to make sure we address these issues found during testing.

@kmacdonald-stsci
Copy link
Collaborator Author

Just wanted to check in, @kmacdonald-stsci : have you had a chance to investigate the files that David mentioned above? We're hoping to get this in soon, i.e. prior to CoB Thursday, so I want to make sure we address these issues found during testing.

No. I am still investigating the files Melanie gave me.

@kmacdonald-stsci
Copy link
Collaborator Author

Finally had a chance to test this out a little. A few quick comments:

  • The likelihood routine may be doing a better job of eliminating CR showers in MIRI data, in at least some cases (jw01236004001_02101_00001_mirifulong).
  • As for @melanieclarke , I see some small differences in results with jump on/off for both jw01236004001_02101_00001_mirifulong and jw01523003001_03102_00001_mirifushort_uncal.fits. E.g., pixel (X,Y)=(547,576) 1-indexed in the jw1523 exposure.
  • I'm seeing an issue when processing jw01264014001_02101_00001_mirifushort_uncal.fits with
    2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - /Users/dlaw/jwcode/stcal_jwst/stcal/src/stcal/ramp_fitting/likely_fit.py:288: RuntimeWarning: invalid value encountered in multiply
    2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - new_cts[i_d2] = np.sum(result.countrate_two_omit * droptwo, axis=0)[i_d2]
    Similarly with jw01247667001_02101_00001_mirifulong_uncal.fits, but for both i_d1 and i_d2 on lines 286 and 288.
  • I'm also seeing an issue for jw01247667001_02101_00001_mirifulong_uncal.fits with the combination of individual integration data into the final rate value. This is a bright multi-int data set, and many integrations have NaNs in them which isn't necessarily surprising. However, any NaNs in a single integration are producing NaNs in the final rate image as well as if they're being mean-combined rather than nanmean-combined? See snake-like features near the top left of rate image created from this exposure. Reminds me a lot of JP-3004 but for NaNs.
  • Question perhaps for @t-brandt ; one of the appeals of this approach was runtime, which we expected to be a few seconds or so (depending on the observation of course). Actual runtime in some of the examples above though is factor of 10 longer than the C-based OLS code. We shouldn't optimize right now (let's study science performance gains first), but what's your opinion on the parameter space available for such speed gains in future?

I fixed the jump detection differences and suppressed the warnings you were seeing.

The integrations get combined here:

https://github.com/kmacdonald-stsci/stcal/blob/f97d568c4fa5bcb9bebe4b058e231e19557fc068/src/stcal/ramp_fitting/likely_fit.py#L355

@t-brandt is the behavior @drlaw1558 notes above expected with respect to NaN's?

@drlaw1558
Copy link
Contributor

Based on the code link, I think the issue is the np.median and np.sum statements. np.median would probably replace easily with np.nanmedian, although the np.sum calls would be a little trickier. np.nansum returns 0 for all-NaN inputs which isn't desirable either, so we'd want something that ignores NaNs if present but return NaN if all inputs are NaN.

@melanieclarke
Copy link
Contributor

Based on the code link, I think the issue is the np.median and np.sum statements. np.median would probably replace easily with np.nanmedian, although the np.sum calls would be a little trickier. np.nansum returns 0 for all-NaN inputs which isn't desirable either, so we'd want something that ignores NaNs if present but return NaN if all inputs are NaN.

^^^
For the nansum, I usually do a check for all-NaN values in the input and restore them after. E.g.

all_nan = np.all(np.isnan(components), axis=0)
nan_summed = np.nansum(components, axis=0)
nan_summed[all_nan] = np.nan

Copy link
Contributor

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing again, I think all my comments on the code and docs clean up have been addressed, except for one more documentation clarification, below.

Thanks for fixing the jump flags and excessive warnings! I'm testing again on my data examples now.

I do think we should address the issue David pointed out with the NaNs in integration combination, but assuming local tests look good, we should be okay to get this in after that.

docs/stcal/ramp_fitting/description.rst Show resolved Hide resolved
@melanieclarke
Copy link
Contributor

For my local tests, I am still seeing some differences between running with and without the jump step on, but in a lot fewer pixels. Diffs are now in ~2000 pixels total, <1% of the image size.

It looks like they might be snowball related? E.g. file jw02301006001_03101_00001_nrs1_rate.fits, near pixel x,y=493,44. Top is skipping jump, bottom is not skipping jump (default behavior). Left to right are SCI, ERR, DQ extensions.

jw02301006001_03101_00001_nrs1_rate

@t-brandt
Copy link

I am not sure if this comment is still relevant, but my suggestion for the NaNs when combining multiple integrations is to replace NaN values with zeros in the count rate and in the inverse variance. Then, they will receive zero weight when adding the to valid measurements, and if there are no valid measurements, you will wind up with 0/0.

@kmacdonald-stsci
Copy link
Collaborator Author

For my local tests, I am still seeing some differences between running with and without the jump step on, but in a lot fewer pixels. Diffs are now in ~2000 pixels total, <1% of the image size.

It looks like they might be snowball related? E.g. file jw02301006001_03101_00001_nrs1_rate.fits, near pixel x,y=493,44. Top is skipping jump, bottom is not skipping jump (default behavior). Left to right are SCI, ERR, DQ extensions.

jw02301006001_03101_00001_nrs1_rate

My testing is showing zero differences in the science array and the variance arrays.

Screenshot 2024-09-19 at 7 58 54 AM

@melanieclarke
Copy link
Contributor

My testing is showing zero differences

Hmm, okay, I'll install again in a clean environment and test again.

Have you or @drlaw1558 checked that the NaN combination issues with jw01247667001_02101_00001_mirifulong_uncal.fits are now resolved?

@melanieclarke
Copy link
Contributor

I still see the same differences after pulling and reinstalling, but even if it isn't just me, they are small, and there is a workaround -- just don't run the jump step, as is recommended. I think we can leave that for future work.

I also tried reducing jw01247667001_02101_00001_mirifulong_rate.fits with your latest changes, and I don't see excessive NaNs, so I think that's fixed too.

Copy link
Contributor

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments have been addressed. Thanks for all the hard work on getting this ready!

@tapastro tapastro merged commit 448413b into spacetelescope:main Sep 19, 2024
24 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation ramp_fitting testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants