Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3698: Fixed Memory Leak in Ramp Fitting C-Extension #281

Merged
merged 5 commits into from
Aug 23, 2024

Conversation

kmacdonald-stsci
Copy link
Collaborator

@kmacdonald-stsci kmacdonald-stsci commented Aug 17, 2024

…eturned too early, so never decremented reference counts for python objects.

Resolves JP-3698

Closes jwst/#8680
Closes #276

This PR addresses a memory leak found in the ramp fitting C-extension. Added decrement reference counters to objects referenced that were passed from the python code. These missing decrements meant that reference counts were not properly garbage collected.

Attached is a screen shot of the memory usage from memray demonstrating there is no memory leak.

jp_3698_memory_flat

Checklist

  • added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)

…eturned too early, so never decremented reference counts for python objects.
Copy link

codecov bot commented Aug 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.38%. Comparing base (2112773) to head (cd47dc6).
Report is 179 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #281      +/-   ##
==========================================
+ Coverage   84.32%   84.38%   +0.05%     
==========================================
  Files          41       41              
  Lines        7529     7557      +28     
==========================================
+ Hits         6349     6377      +28     
  Misses       1180     1180              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

Copy link
Contributor

@drlaw1558 drlaw1558 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to the details of the C changes, but empirically it looks like this has fixed the memory leak in my tests as well. Previously I saw the memory used grow by about 6-7 GB over the course of processing about 25 files from PID 1523, and with this branch that number is more like 1 GB (which I'd consider within uncertainty given how I've been measuring usage).

@braingram
Copy link
Collaborator

Thanks for the responses.

Changes look good to me. Is there a link to jwst regression tests with this PR?

Do you think it's worth adding a unit test to check for the re-introduction of this problem? Running main the extra references fixed in this PR can be seen (with sys.getrefcount) for data, err, pixeldq groupdq and averagedarkcurrent after a call to ramp_fit_data. I looked for a close unit test that might be modifiable but I'm not seeing any that use OLS_C. Do the unit tests only cover the python version?

@kmacdonald-stsci
Copy link
Collaborator Author

Thanks for the responses.

Changes look good to me. Is there a link to jwst regression tests with this PR?

Do you think it's worth adding a unit test to check for the re-introduction of this problem? Running main the extra references fixed in this PR can be seen (with sys.getrefcount) for data, err, pixeldq groupdq and averagedarkcurrent after a call to ramp_fit_data. I looked for a close unit test that might be modifiable but I'm not seeing any that use OLS_C. Do the unit tests only cover the python version?

I'll think of a unit test that can check the reference counts. I am working on a PR now to change all unit tests to test OLS_C, since it is now the default.

Copy link
Collaborator

@braingram braingram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding and fixing the bug!

…ere causing a memory leak in the ramp fitting C-extension.
@kmacdonald-stsci
Copy link
Collaborator Author

Thanks for the responses.

Changes look good to me. Is there a link to jwst regression tests with this PR?

Do you think it's worth adding a unit test to check for the re-introduction of this problem? Running main the extra references fixed in this PR can be seen (with sys.getrefcount) for data, err, pixeldq groupdq and averagedarkcurrent after a call to ramp_fit_data. I looked for a close unit test that might be modifiable but I'm not seeing any that use OLS_C. Do the unit tests only cover the python version?

I developed a unit test for this. Take a look. It creates a basic RampData class, along with some other necessary objects, then runs only the C-extension, examining the reference count before and after the call for objects of interest, then asserts they must be the same.

@braingram
Copy link
Collaborator

braingram commented Aug 21, 2024

Thanks for the responses.
Changes look good to me. Is there a link to jwst regression tests with this PR?
Do you think it's worth adding a unit test to check for the re-introduction of this problem? Running main the extra references fixed in this PR can be seen (with sys.getrefcount) for data, err, pixeldq groupdq and averagedarkcurrent after a call to ramp_fit_data. I looked for a close unit test that might be modifiable but I'm not seeing any that use OLS_C. Do the unit tests only cover the python version?

I developed a unit test for this. Take a look. It creates a basic RampData class, along with some other necessary objects, then runs only the C-extension, examining the reference count before and after the call for objects of interest, then asserts they must be the same.

Thanks! Does the unit test use OLS_C? I'm not seeing where it overwrites the python default.

EDIT: nevermind... I see where it imports the c extension directly. Looks great!

Copy link
Collaborator

@tapastro tapastro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Ran regression tests just to ensure no unforeseen bugs, which passed (all failures known to be from other issues): https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1669/

@nden do we need to request review from a Roman rep to get this merged?

@nden
Copy link
Collaborator

nden commented Aug 23, 2024

@tapastro This doesn't need Roman approval. Feel free to merge when good.

@tapastro tapastro merged commit bd5191c into spacetelescope:main Aug 23, 2024
25 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak in OLS_C ramp fitting Memory leak in OLS_C ramp fitting.
5 participants