-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JP-2037 and JP-1926: Fixing ramp fitting multiprocessing to work with new STCAL interface. #30
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to these, I may have a few other comments.
src/stcal/ramp_fitting/ols_fit.py
Outdated
new_model.err : ndarray | ||
The output total variance for each pixel, 2-D float32 | ||
rows_per_slice: list | ||
The number of rows in each row. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should that be "The number of rows in each slice" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Thank you for catching that.
new_model.var_rnoise : ndarray | ||
The variance in each pixel due to read noise, 2-D float32 | ||
pool_results: list | ||
The list of return values from ols_ramp_fit_single for each slice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps state what these values are in the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expanded this docstring.
group_time=ramp_data.group_time, | ||
groupgap=ramp_data.groupgap, | ||
nframes=ramp_data.nframes, | ||
drop_frames1=ramp_data.drop_frames1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this meta data for same for each slice ? If so perhaps it can be moved outside the loop this function is within.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This meta data is needed by each process in multiprocessing, therefore each RampData
class needs this information.
return out_model, int_model, opt_model | ||
|
||
|
||
def create_output_models(input_model, number_of_integrations, save_opt, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has create_output_models( ) been replaced by another function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. There are no models being returned from ramp fitting anymore. Ramp fitting returns tuples of arrays that the step code in each pipeline will now use to create the desired data model specific to that pipeline.
For multiprocessing, however, output arrays need to be created that are assembled from the sliced return values. The function create_output_info
does this in assemble_pool_results
.
if save_opt: | ||
get_opt_slice(opt_info, opt_slice, current_row_start, nrows) | ||
current_row_start = current_row_start + nrows | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does get_integ_slice( ) assemble the output info from the slices whose output info was assembled from each image by get_image_slice( ) ? Maybe add a bit more documentation to clarify what these functions do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The sliced (by row) return values are returned from starmap
where each sliced input is used as parameters for ols_ramp_fit_single
. These sliced return values are used by the get_xxxx_slice()
functions to assemble each slice into a single xxxx
product to be returned from ramp fitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good!
slope[:, :oslope.shape[1], srow:erow, :] = oslope | ||
sigslope[:, :osigslope.shape[1], srow:erow, :] = osigslope | ||
var_poisson[:, :ovar_poisson.shape[1], srow:erow, :] = ovar_poisson | ||
var_rnoise[:, :ovar_rnoise.shape[1], srow:erow, :] = ovar_rnoise | ||
yint[:, :oyint.shape[1], srow:erow, :] = oyint | ||
sigyint[:, :osigyint.shape[1], srow:erow, :] = osigyint | ||
weights[:, :oweights.shape[1], srow:erow, :] = oweights | ||
crmag[:, :ocrmag.shape[1], srow:erow, :] = ocrmag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the use of mnemonics here. Much easier to read code. What's the significance of shape[1]
? Is that number of integrations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The significance of shape[1]
is that it is of variable size. All but the crmag
depend on the maximum number of segments in each ramp. The crmag
size depends on the maximum number of cosmic rays. Since it's possible for each slice to have different maximum sizes, the full optional results arrays will be set to the maximum across all slices. Since it's possible for the different slices to be of different sizes simple assignment can throw an error trying to assign arrays of different sizes.
For example, it is not possible to do
crmag[:, :, srow:erow, :] = ocrmag
if crmag[:, :, srow:erow, :].shape
is (1, 5, 256, 2048), while ocrmag.shape
is (1, 2, 256, 2048).
The shapes must have the same dimensions. By construction the first, third, and fourth dimensions will be the same. The second dimension of crmag
will always be at least as large as the second dimension of ocrmag
, but it may be larger. To ensure that this dimension is the same, it needs to be explicitly applied during assignment.
…=nd.float32, but should be dtype=np.uint32. Changed to allocate an array with the correct dtype.
Codecov Report
@@ Coverage Diff @@
## main #30 +/- ##
=======================================
Coverage ? 10.30%
=======================================
Files ? 13
Lines ? 1563
Branches ? 0
=======================================
Hits ? 161
Misses ? 1402
Partials ? 0 Continue to review full report at Codecov.
|
@kmacdonald-stsci This needs a change log entry and it's ready to be merged. |
I realize this PR has already been merged; nevertheless LGTM. |
Ramp fitting multiprocessing is now working as it did in JWST 1.1.0 with the optional results bug detailed in JP-1926. For JP-1926, the optional results product varies in size depending on the maximum number of segments. When sliced results from each process was being reassembled into the final optional results product returned from ramp fitting, the maximum sizes per slice may be different, so this needed to be accounted for.
All unit tests pass on JWST.