Calibration
OpenSwath calibrates the RT, m/z and IM values using a set of designated peptides (formerly called "iRT peptides"). The idea is to establish a mapping between the library values of these coordinates and the recorded values in the current run and compute a transformation function to map between the two. Linear calibration is performed by the performRTNormalization function (called by performCalibration_()). This function first calls simpleExtractChromatograms_ to extract the iRT chromatograms and then calls doDataNormalization_() to perform RT, m/z and IM calibration.
Chromatogram Extraction (simpleExtractChromatograms_())
simpleExtractChromatograms_() extracts chromatograms for iRT calibration. It uses methods similar to those of the normal chromatogram extraction. Note that there are alternative methods for SONAR which are not discussed here. For more information on extraction, please see the Chromatogram Extraction page.
- Determine the SWATH which the transition is located in (selectSwathTransitions).
- Note: this function is also used for PASEF and PRM, which allows multiple windows to be associated with an iRT peptide. The best feature is determined later.
- For each transition/precursor, compute the coordinates of extraction (prepareExtractionCoordinates_(), prepare_coordinates())
- Extract Chromatograms (extractChromatograms())
- Convert Chromatogram Ptr to MSChromatogram and add metainformation (return_chromatogram())
- Check that the chromatograms are not empty. If a chromatogram is empty, warn the user.
After exiting the function the chromatogram can be written out if desired.
Calibration is performed across the RT, m/z and IM axes. First, RT calibration is performed using the MRMRTNormalizer module, based on methods described in this publication. RT calibration involves iteratively dropping outlier compounds until a suitable regression (high R2) is computed.
Then m/z and IM calibration is performed with the filtered set of iRT peptides using the SwathMapMassCorrection module.
1. Estimate the RT range of the iRT peptides (estimateRTRange())
This is done by the estimateRTRange() function by iterating through all peptides and storing the peptides with the maximum and minimum RT.
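The range estimation can be sketched as follows. This is a minimal illustration, assuming each iRT peptide is given as a (peptide id, library RT) pair; the function name and data layout are hypothetical and not the OpenSwath API.

```python
def estimate_rt_range(peptides):
    """Return (min_rt, max_rt) over (peptide_id, library_rt) pairs.

    Hypothetical helper: the real estimateRTRange() works on the
    library objects, which are not modelled here.
    """
    if not peptides:
        raise ValueError("no iRT peptides given")
    rts = [rt for _, rt in peptides]
    return min(rts), max(rts)


# Illustrative values only
irt_peptides = [("PEP_A", -12.5), ("PEP_B", 40.0), ("PEP_C", 101.3)]
rt_range = estimate_rt_range(irt_peptides)
```

The resulting range is what a later binned-coverage check would use to normalize the library RT values.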
Here input chromatograms are picked to identify RT pairs from the input data using MRMFeatureFinderScoring. Since we do not want to assume any RT information when determining features, the following parameters are adjusted for feature detection:
- Scores:use_rt_score - false
- Scores:use_elution_model_score - false
- stop_report_after_feature - 1
- TransitionGroupPicker:PeakPickerMRM:signal_to_noise - 1.0
- TransitionGroupPicker:compute_peak_quality - false (if estimateBestPeptides is off)
- TransitionGroupPicker:compute_peak_quality - true (if estimateBestPeptides is on)
- TransitionGroupPicker:minimal_quality
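As a compact summary, the overrides listed above could be collected in a plain dictionary. This is only a sketch: the helper name is hypothetical, boolean values are written as the strings "true"/"false" by assumption, and TransitionGroupPicker:minimal_quality is left out because its value is not stated here.

```python
def feature_finder_overrides(estimate_best_peptides):
    """Parameter overrides for iRT feature detection (hypothetical helper)."""
    return {
        # RT must not influence scoring, since it is not calibrated yet
        "Scores:use_rt_score": "false",
        "Scores:use_elution_model_score": "false",
        "stop_report_after_feature": 1,
        "TransitionGroupPicker:PeakPickerMRM:signal_to_noise": 1.0,
        # Peak quality is only computed when estimateBestPeptides is on
        "TransitionGroupPicker:compute_peak_quality":
            "true" if estimate_best_peptides else "false",
    }


overrides = feature_finder_overrides(estimate_best_peptides=True)
```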
Feature finding is done in a similar manner to the main feature finder; see the corresponding page for more details.
4A. Find the most likely correct feature for each compound (simpleFindBestFeature())
This method calls getBestFeature() to determine the feature with the highest score. A quality cutoff can be provided to ensure that the best feature returned is above a certain threshold. If a compound has no good features, no feature is returned for it.
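The selection logic described above can be sketched as follows. This is a simplified illustration, assuming each feature is a dictionary with an "rt" and a single "score" field that is used both for ranking and for the quality cutoff; the names are hypothetical and the real scores may differ.

```python
def simple_find_best_feature(features_by_compound, quality_cutoff=None):
    """Pick the highest-scoring feature per compound (illustrative sketch).

    Compounds without features, or whose best feature falls below
    the cutoff, are omitted from the result.
    """
    best = {}
    for compound, features in features_by_compound.items():
        if not features:
            continue
        top = max(features, key=lambda f: f["score"])
        if quality_cutoff is not None and top["score"] < quality_cutoff:
            continue  # no good feature for this compound
        best[compound] = top
    return best


# Illustrative values only
features = {
    "PEP_A": [{"rt": 10.0, "score": 0.2}, {"rt": 20.0, "score": 0.9}],
    "PEP_B": [{"rt": 33.0, "score": 0.1}],
    "PEP_C": [],
}
best = simple_find_best_feature(features, quality_cutoff=0.5)
```

Only the compounds that survive this step contribute an (exp_rt, theoretical_rt) pair to the regression.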
Store a vector of (exp_rt, theoretical_rt) tuples.
The function/algorithm used for detecting outliers depends on the RTNormalization::outlierMethod parameter.
iter_residual or iter_jackknife - removeOutliersIterative
This method iteratively removes outliers until one of the following holds:
- We have too few points (fewer points than -min_coverage)
- The R2 is higher than the limit (-min_rsq)
- No more points can be removed (only occurs if -useIterativeChauvenet is true)
  - Chauvenet's checks are computed using chauvenet
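For readers unfamiliar with it, Chauvenet's criterion flags a point as an outlier when the probability of seeing a deviation at least that large, multiplied by the number of points, is below 0.5. The sketch below shows the textbook form of the test; the function names, the use of the sample standard deviation and the handling of zero spread are assumptions, not a description of the OpenSwath code.

```python
import math
import statistics


def chauvenet_probability(residuals, pos):
    """Two-sided normal tail probability of the residual at index pos."""
    mean = statistics.fmean(residuals)
    stdev = statistics.stdev(residuals)  # sample stdev (assumption)
    if stdev == 0.0:
        return 1.0  # no spread, nothing can be an outlier
    d = abs(residuals[pos] - mean) / stdev
    return math.erfc(d / math.sqrt(2.0))


def chauvenet(residuals, pos):
    """True if the residual at pos is an outlier under Chauvenet's criterion."""
    criterion = 1.0 / (2.0 * len(residuals))
    return chauvenet_probability(residuals, pos) < criterion


# Illustrative residuals: the last one is clearly off
residuals = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 5.0]
last_is_outlier = chauvenet(residuals, 6)
first_is_outlier = chauvenet(residuals, 0)
```

With such a check enabled, the iteration stops as soon as the current candidate no longer qualifies as an outlier.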
To remove outliers:
- Compute the linear regression
- Compute the R2 of the linear regression
- If R2 < -min_rsq, execute the steps below; otherwise exit the loop
  - Compute the residuals
  - Determine the outlier to remove (1 point)
    - If iter_jackknife, determine which datapoint removal results in the best R2 (jackknifeOutlierCandidate_())
    - If iter_residual, remove the datapoint with the largest residual (residualOutlierCandidate_())
  - If Chauvenet checking is turned off or the Chauvenet conditions are met, remove the outlier. If not, exit the loop
If, after removing all possible outliers, the R2 is still below the cutoff, we throw an exception. If the R2 of the fit is good, we return a filtered vector of (exp_rt, theoretical_rt) tuples.
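The residual-based variant of this loop can be sketched in plain Python. This is a simplified illustration under several assumptions: min_coverage is read as the minimum fraction of the original points that must remain, the default values are illustrative, the Chauvenet check and the jackknife variant are omitted, and all function names are hypothetical.

```python
def linear_fit(pairs):
    """Least-squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    b = sxy / sxx
    return my - b * mx, b, (sxy * sxy) / (sxx * syy)


def remove_outliers_iterative(pairs, min_rsq=0.95, min_coverage=0.6):
    """Drop the largest-residual point until R2 >= min_rsq (sketch)."""
    pairs = list(pairs)
    # Assumption: coverage is a fraction of the initial number of points
    min_points = max(3, int(min_coverage * len(pairs)))
    a, b, r2 = linear_fit(pairs)
    while r2 < min_rsq and len(pairs) > min_points:
        residuals = [abs(y - (a + b * x)) for x, y in pairs]
        worst = max(range(len(pairs)), key=lambda i: residuals[i])
        del pairs[worst]
        a, b, r2 = linear_fit(pairs)
    if r2 < min_rsq:
        raise ValueError("could not reach the required R2")
    return pairs


# (exp_rt, theoretical_rt) pairs on the line y = 2x + 1, with one outlier
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
data[5] = (5.0, 41.0)
filtered = remove_outliers_iterative(data)
```

In this example a single removal is enough: the remaining nine points lie on a straight line, so the loop exits on the R2 condition.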
ransac - removeOutliersRANSAC
- uses -RTNormalization::RANSACMaxPercentRTThreshold, -RTNormalization::RANSACMaxIterations and -RTNormalization::RANSACSamplingSize
TODO
6. Check whether the found peptides fulfill the binned coverage criteria set by the user (computeBinnedCoverage)
computeBinnedCoverage is only performed if -RTNormalization::estimateBestPeptides is true. The relevant parameters are:
- -RTNormalization::NrRTBins
- -RTNormalization::MinPeptidesPerBin
- -RTNormalization::MinBinsFilled
This method proceeds as follows:
- For each (exp_rt, theoretical_rt) tuple, determine which bin the compound is in
  - Normalize the theoretical_rt to a value between 0 and 1
  - Multiply by -RTNormalization::NrRTBins
  - Truncate the value (convert to integer, dropping the decimal part); this is the bin number the peptide is in
- For each bin, check if it is filled (has more than -RTNormalization::MinPeptidesPerBin)
- If #BinsFilled >= -RTNormalization::MinBinsFilled, return True
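The binning steps above translate into a short function. This is a minimal sketch with hypothetical names; two details are assumptions rather than facts from this page: a bin is treated as filled when it holds at least MinPeptidesPerBin peptides, and a peptide at the maximum RT is clamped into the last bin.

```python
def compute_binned_coverage(rt_range, pairs, nr_bins,
                            min_peptides_per_bin, min_bins_filled):
    """Check whether enough RT bins contain enough peptides (sketch)."""
    rt_min, rt_max = rt_range
    counts = [0] * nr_bins
    for _exp_rt, theoretical_rt in pairs:
        # Normalize to [0, 1], scale to the bin count, then truncate
        norm = (theoretical_rt - rt_min) / (rt_max - rt_min)
        bin_index = int(norm * nr_bins)
        # Assumption: the maximum RT falls into the last bin
        bin_index = min(bin_index, nr_bins - 1)
        counts[bin_index] += 1
    # Assumption: "filled" means at least min_peptides_per_bin entries
    bins_filled = sum(1 for c in counts if c >= min_peptides_per_bin)
    return bins_filled >= min_bins_filled


# Illustrative (exp_rt, theoretical_rt) pairs over a library range of 0..100
pairs = [(310.0, 5.0), (620.0, 15.0), (900.0, 25.0),
         (3400.0, 95.0), (3600.0, 100.0)]
covered = compute_binned_coverage((0.0, 100.0), pairs, nr_bins=10,
                                  min_peptides_per_bin=1, min_bins_filled=4)
```

Here the peptides land in bins 0, 1, 2 and 9, so four bins are filled and the check passes; requiring five filled bins would fail.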
8A. Select the "correct" peaks for m/z correction (e.g. remove those not part of the linear regression)
This involves removing the peaks that were considered "outliers" in the RT regression (not used in RT regression)
8B. Perform m/z calibration: correctMZ() (SwathMapMassCorrection)
This function computes a regression for m/z correction. The workflow for this function is as follows:
- For each transition group
  - Fetch the peptide meta information
  - Fetch the best feature associated with the transition group (findBestFeature)
    - In this method, iterate through all features, find the one with the best overall quality (getOverallQuality()) and fetch its retention time.
  - Get the swath maps
  - Get the spectrum at this RT (fetchSpectrumSwath)
  - For each transition
    - Integrate the (adjusted) spectrum at the position of the theoretical mass (adjustExtractionWindow, integrateWindow)
    - Store the data
      - theoretical mz = what is stored in the library
      - experimental mz = the intensity-weighted mz of the transition
      - weight = log2 of intensity
      - difference in m/z (ppm)
- Compute the regression (based on the -mz_correction_function parameter)
  - unweighted_regression - compute unweighted regression (computeRegression)
  - weighted_regression - compute weighted linear regression (computeRegressionWeighted)
  - quadratic_regression - compute quadratic regression (computeRegression)
  - weighted_quadratic_regression - compute weighted quadratic regression (computeRegressionWeighted)
  - quadratic_regression_delta_ppm - compute quadratic regression using ppm as weights (computeRegressionWeighted)
  - regression_delta_ppm - compute linear regression using ppm as weights (computeRegressionWeighted)
  - weighted_quadratic_regression_delta_ppm - compute quadratic regression using ppm and intensity weights (computeRegressionWeighted)
- If in debug mode, compute the sum of squared residuals before and after the regression
- Replace the swath files with a transforming wrapper (SpectrumAccessQuadMZTransforming)
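To make the stored quantities concrete, the sketch below computes the ppm difference and fits a weighted linear regression with log2(intensity) weights, corresponding in spirit to the weighted_regression option. The function names, the choice of regressing theoretical m/z on experimental m/z, and the example values are assumptions for illustration only.

```python
import math


def ppm_difference(experimental_mz, theoretical_mz):
    """Signed m/z error in parts per million."""
    return (experimental_mz - theoretical_mz) / theoretical_mz * 1e6


def weighted_linear_fit(xs, ys, weights):
    """Weighted least squares for y = intercept + slope * x."""
    wsum = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / wsum
    my = sum(w * y for w, y in zip(weights, ys)) / wsum
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Illustrative data: every experimental m/z is shifted by +5 ppm
theoretical = [400.0, 600.0, 800.0, 1000.0]
experimental = [mz * (1.0 + 5e-6) for mz in theoretical]
intensities = [1e4, 5e4, 2e4, 8e4]
weights = [math.log2(i) for i in intensities]  # weight = log2 of intensity

# Assumption: map experimental m/z onto the library (theoretical) m/z
intercept, slope = weighted_linear_fit(experimental, theoretical, weights)
corrected = [intercept + slope * mz for mz in experimental]
```

After the correction, the residual ppm error of this toy data set is essentially zero, which is the kind of before/after comparison the debug output is meant to support.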
8C. Perform IM calibration correctIM()
This function computes a regression for IM correction. The type of regression performed is specified by the -Calibration:im_correction_function parameter. Currently only linear regression is supported. This function is very similar to the correctMZ() function above.
- For each transition group
  - Fetch the best feature associated with the transition group (findBestFeature)
    - In this method, iterate through all features, find the one with the best overall quality (getOverallQuality()) and fetch its retention time.
  - Get the swath maps
  - Get the spectrum at this RT (fetchSpectrumSwath)
  - For each transition (including MS1 level if enabled)
    - Integrate the (adjusted) spectrum at the position of the theoretical mass (adjustExtractionWindow, integrateWindow)
    - Store the data
      - theoretical mz = what is stored in the library
      - experimental mz = the intensity-weighted mz of the transition
      - weight = log2 of intensity
      - difference in m/z (ppm)
- Compute the regression; currently only linear regression is supported (computeRegression())
- Fit the model (fitModel)
9. Store RT transformation, using the selected model (fitModel)
If there is a non-linear iRT file present, then non-linear calibration is performed after linear calibration.
Non-linear calibration uses the simpleExtractChromatograms_() and doDataNormalization_() functions discussed above. For non-linear extraction, estimateBestPeptides is always true and -rt_extraction_window is set to 600.