Mph norm workflow #125

Merged
merged 27 commits into from
Jun 15, 2022

Member

@ngreenwald ngreenwald commented Jun 7, 2022

If you haven't already, please read through our contributing guidelines before opening your PR

What is the purpose of this PR?

  1. Adds functionality to the set_up_toffy notebook that lets the user construct a tuning curve for their instrument. It takes a detector sweep as input and produces a tuning curve relating MPH to signal intensity.
  2. Creates the 4b_normalize_image_data notebook, which walks the user through the normalization process. It assumes the user has already constructed a tuning curve and lets them easily normalize their data.
  3. Switches the logic for normalization. Previously, we tried to fit a relationship between mass and MPH within an FOV, so that MPH values from a small number of selected masses could be used to normalize. However, this created more problems than it solved. Instead, we now compute the MPH for every mass in the panel. This is more compute-intensive, but it is still fast and won't be a major bottleneck. We then fit a separate curve for each mass over the course of the run, modeling how the MPH for that mass changes as a function of run length. This results in very smooth estimates of MPH for each marker in the panel. We then use those adjusted mass-specific MPH values to normalize the data.
  4. Adds plotting and QC outputs so the user can visualize the curve fit and be notified of any normalization values that fall outside a prescribed range.
  5. Updates the 4a_rosetta notebook to use the same default paths as the rest of the notebooks. Also removes the nested for-loop structure in favor of processing a single run, which simplifies the code.
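
For readers skimming the description, items 3 and 4 can be sketched roughly as follows. This is a minimal illustration and not the code in toffy/normalize.py: the function names are hypothetical, the polynomial fit is an assumed curve type (the PR does not say which curve is fit), and the tuning curve is assumed to be a callable mapping MPH to a relative signal intensity.

```python
import numpy as np


def smooth_mph_per_mass(acq_order, mph_by_mass, degree=2):
    """Fit one curve per mass over the run and return the fitted MPH values.

    acq_order: each FOV's position in the run.
    mph_by_mass: dict mapping mass -> measured MPH per FOV (same length).
    degree: polynomial degree (an assumption; the real fit may differ).
    """
    x = np.asarray(acq_order, dtype=float)
    fitted = {}
    for mass, mph in mph_by_mass.items():
        coeffs = np.polyfit(x, np.asarray(mph, dtype=float), deg=degree)
        fitted[mass] = np.polyval(coeffs, x)
    return fitted


def normalization_coefficients(fitted_mph, tuning_curve):
    """Map fitted MPH to a per-FOV coefficient via the (assumed) tuning curve."""
    return {mass: tuning_curve(mph) for mass, mph in fitted_mph.items()}


def flag_out_of_range(coeffs, low, high):
    """Return (mass, fov_index) pairs whose coefficient is outside [low, high]."""
    flagged = []
    for mass, values in coeffs.items():
        for idx in np.flatnonzero((values < low) | (values > high)):
            flagged.append((mass, int(idx)))
    return flagged
```

Under these assumptions, each channel image would be divided by its coefficient for the matching mass and FOV, and any flagged pairs would be surfaced to the user as QC warnings.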

Closes #37

* combine metrics doesn't require bins

* helper function for formatting df

* refactor saving function

* fov helper function

* refactored top level function

* pycodestyle

* use median for small datasets

* update notebook

* close plots

* fix bug in df construction

* sort df by channels

* outlier identification

* plot outliers separately

* remove outlier functionality

* change outlier detection

* updated docstrings

* make test more robust

Contributor

@ackagel ackagel left a comment

Looks good so far. It looks like a lot of my review comments in review-nb are in the works in normalize.py

toffy/normalize.py (outdated, resolved)
toffy/normalize.py (outdated, resolved)
toffy/normalize.py (resolved)
@@ -15,9 +15,10 @@
"id": "e36293c5-aa89-4029-a3fa-e8ea841bb8b5",
Contributor

Can't the sensitivity curve generation go in 4b, since it's only used there? Or is the idea that putting it here will encourage people to run a sweep before data acquisition?



Member Author

No, it's because it only needs to happen once per instrument, not separately for each run. So having it here means it won't be present in the notebook each time people are normalizing

@@ -15,9 +15,10 @@
"id": "e36293c5-aa89-4029-a3fa-e8ea841bb8b5",
Contributor

Can't this be done programmatically here?



Member Author

Sweeps aren't created in their own folder; they're just put into the main /Data folder. So you need to separately identify each FOV from the sweep, and those FOVs are given generic names. This was initially what the find_detector_sweeps function was for, but then Erin had a couple of sweeps where FOVs were missing, so it gave an error.

We could change it so that it would give a warning when an FOV is missing, rather than an error, and then ask people to list the first FOV and last FOV of their sweep and it would find the rest, but at that point it started to feel like the solution was almost as complicated as the problem. Up to you though, it would be an easy change
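
The alternative described above (warn instead of error, with the user giving the first and last FOV of the sweep) might look something like this. It is only a sketch: `collect_sweep_fovs` is a hypothetical name, not the existing find_detector_sweeps function, and the `fov-{}` folder naming is an assumption made for illustration.

```python
import warnings


def collect_sweep_fovs(available, first, last, template="fov-{}"):
    """Return the sweep FOV names between first and last, warning on gaps.

    available: names of the folders actually present in the data directory.
    first, last: integer indices of the first and last FOV in the sweep.
    template: assumed naming pattern for the generic FOV folder names.
    """
    found, missing = [], []
    for i in range(first, last + 1):
        name = template.format(i)
        (found if name in available else missing).append(name)
    if missing:
        # Warn rather than raise, so a sweep with gaps can still be used.
        warnings.warn(f"Sweep FOVs missing, skipped: {', '.join(missing)}")
    return found
```

Here `available` could come from something like `os.listdir` on the data folder; the tuning curve would then be built from whichever FOVs were found.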

templates/1_set_up_toffy.ipynb (resolved)
templates/4b_normalize_image_data.ipynb (resolved)
Contributor

@alex-l-kong alex-l-kong left a comment

A few aesthetic comments, mostly in the notebook.

toffy/normalize.py (outdated, resolved)
@ngreenwald ngreenwald merged commit e15f1d1 into main Jun 15, 2022
@ngreenwald ngreenwald deleted the mph_norm_workflow branch June 15, 2022 23:36
@camisowers camisowers added the enhancement New feature or request label Sep 28, 2022
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Image normalization todos
4 participants