Add minimal terms for ion mobility frames #365

mobiusklein · 2025-01-22T16:27:41Z

Closes #361

This adds a minimal set of terms that were important for #361, omitting special cases and obsolete representations.

Open questions remaining:

Do we need analogs to scan window lower limit and scan window upper limit for ion mobility?
Need ProteoWizard implementation. Should be straightforward to add after the OBO is updated.

chambm · 2025-01-22T16:34:04Z

Seems reasonable, except what instrument has profile IM values? Waters has 200 bins and their masses are profile; 200 doesn't seem enough to count as profile. Bruker is 1000ish bins but the masses are explicitly centroided at acquisition. I'll be honest, I didn't follow much of the linked issue discussion.

mobiusklein · 2025-01-22T17:07:09Z

@chambm What we want to express is whether the IM dimension is expected to contain a series of points for an analyte, or whether it has been processed such that each point refers to a distinct analyte (or two analytes so close we can't tell them apart). This is the same idea as profile spectrum vs. centroid spectrum.

Put another way: "Can I treat this spectrum as a peak list, or do I need to do something to it before I can use it?"

The bin spacing is a valid point for IM peak resolution, but I don't think that's something we can get from the vendor software and will vary in definition by unit and/or vendor.
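
To make the profile vs. centroid question concrete, here is a minimal Python sketch of collapsing a profile-like mobilogram (a series of ion mobility points for one ion) into one point per contiguous run of signal. The function name `centroid_mobilogram`, the `MobilityPeak` type, and the zero-intensity run boundary are illustrative assumptions, not anything defined by this PR, the vendor software, or ProteoWizard.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MobilityPeak:
    ion_mobility: float  # intensity-weighted mean ion mobility of the run
    intensity: float     # summed intensity of the run


def centroid_mobilogram(
    mobility: Sequence[float],
    intensity: Sequence[float],
    min_intensity: float = 0.0,
) -> List[MobilityPeak]:
    """Collapse each contiguous run of above-threshold points into one peak.

    This is a deliberately naive rule (hypothetical): a run ends whenever
    the intensity drops to `min_intensity` or below.
    """
    peaks: List[MobilityPeak] = []
    weighted_sum = 0.0
    total = 0.0
    for im, inten in zip(mobility, intensity):
        if inten > min_intensity:
            weighted_sum += im * inten
            total += inten
        elif total > 0.0:
            # The run just ended, so emit its centroid and reset.
            peaks.append(MobilityPeak(weighted_sum / total, total))
            weighted_sum = total = 0.0
    if total > 0.0:
        # Emit a run that extends to the end of the arrays.
        peaks.append(MobilityPeak(weighted_sum / total, total))
    return peaks


# Made-up values: two runs of signal separated by an empty bin.
profile_im = [0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86]
profile_intensity = [0.0, 10.0, 20.0, 10.0, 0.0, 5.0, 5.0]
centroided = centroid_mobilogram(profile_im, profile_intensity)
```

Under this sketch, the input arrays would be the "profile" representation and `centroided` the "centroid" one. A real implementation would likely split on valleys or fit peak shapes rather than rely on empty bins.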

chambm · 2025-01-22T17:13:49Z

For centroid spectra it's still extremely common to have multiple peaks for a single analyte, e.g. isotopes and charge states. Only a deconvolved and deisotoped centroided spectrum would (potentially?) be one peak per analyte (even then it depends how you define "analyte" I think).
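
As a small worked example of that point, the sketch below lists the m/z values at which a single analyte would be expected to appear in a centroided spectrum across two charge states and the first three isotopic peaks. The neutral mass of 1000 Da and the helper names are hypothetical; the constants are the proton mass and the 13C/12C mass difference.

```python
PROTON_MASS = 1.007276466   # Da
ISOTOPE_SPACING = 1.0033548  # Da, mass difference between 13C and 12C


def expected_mz(neutral_mass: float, charge: int, isotope: int = 0) -> float:
    """m/z of a protonated ion carrying `isotope` extra 13C-like neutrons."""
    return (neutral_mass + isotope * ISOTOPE_SPACING + charge * PROTON_MASS) / charge


def centroid_peaks_for(neutral_mass: float, charges=(1, 2), n_isotopes: int = 3):
    """Return (charge, isotope index, m/z) for every expected centroid peak."""
    return [
        (z, k, expected_mz(neutral_mass, z, k))
        for z in charges
        for k in range(n_isotopes)
    ]


# Hypothetical analyte with a neutral monoisotopic mass of 1000 Da.
peaks = centroid_peaks_for(1000.0)
for z, k, mz in peaks:
    print(f"z={z} isotope={k} m/z={mz:.4f}")
```

One analyte yields six centroid peaks here, which is why only a deconvolved and deisotoped spectrum could approach one peak per analyte.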

But if I understand what you're saying, when coming straight from ProteoWizard, IM representation would always be ion mobility profile frame?

mobiusklein · 2025-01-22T17:23:53Z

Apologies, I should have said ion, not analyte. But yes, you have the idea, I think.

I'm not aware of any filters in MSConvert that collapse the IM dimension, except maybe scanSumming?

chambm · 2025-01-22T17:53:04Z

ScanSumming just drops it entirely. I'm still not sure about this "centroid/profile" distinction in the IM dimension. I think it needs more consideration from other IM-interested folks.

edeutsch

This seems fine to me, but I am not very close to ion mobility data and software, so I am uncertain what the needs really are.

mobiusklein · 2025-01-24T03:08:27Z

I admit I worked primarily on larger molecules with fairly wide mobilograms/arrival time distributions using Waters Cyclic and Bruker timsTOF data, where frames/cycles do form well-defined peaks, and my approach to both was to gather everything into isotopic-pattern-fitted groups and look for structure over the mobility and RT dimensions. The example file in the linked issue contains single-frame IM feature extractions at a single RT.

The same idea is applied to timsTOF data in IonQuant, where m/z-IM-rt abundances are extracted from contiguous observations, but they use identification seeds instead of using traditional feature detection techniques. I think this article states the same idea is used in MaxQuant for timsTOF data.

jspaezp · 2025-02-04T00:46:32Z

Hey there! I'm new to this whole process, so here are my 2 cents (as someone who currently writes software for timsTOF data). I think the distinction between the two representations makes a lot of sense. I have certainly struggled with what name to give intermediate representations of data processing where the ion-mobility dimension is ... 'centroided' ... without it being discarded and I also see the process as analogous to centroiding in the m/z dimension.

(graphical illustration to make sure we are talking about the same thing, disregard the cluster index in the right panel)

This is essentially what I implemented here: lazear/sage#166 if anyone is interested ... Right now it all happens in memory and not exported, but I can imagine wanting to export it to disk at some point.

What issues do I see?

It is very common for a single frame to represent multiple pieces of data ... for instance in dda/diaPASEF, a single frame will contain the spectra for multiple isolated precursors (think ... from ims 0.7-0.8 the quad is set to isolate from x-y, in ims 0.8-0.95 from z-w ...). It is unclear to me how/whether the current implementation allows communicating that. I am not sure if the "right" understanding of the frame is as a property of a spectrum, rather than a collection of them (I think this is most likely lack of knowledge of the ontology rather than a real issue with the proposed terms).

Regarding this point:

> This seems fine to me, but I am not very close to ion mobility data and software, so I am uncertain what the needs really are.

I certainly feel like it would have the same utility as the profile-centroid distinction in the m/z dimension, is there any reason why you feel that might not be the case?

mobiusklein · 2025-02-04T20:11:36Z

> It is very common for a single frame to represent multiple pieces of data ... for instance in dda/diaPASEF, a single frame will contain the spectra for multiple isolated precursors (think ... from ims 0.7-0.8 the quad is set to isolate from x-y, in ims 0.8-0.95 from z-w ...). It is unclear to me how/whether the current implementation allows communicating that. I am not sure if the "right" understanding of the frame is as a property of a spectrum, rather than a collection of them (I think this is most likely lack of knowledge of the ontology rather than a real issue with the proposed terms).

The 3D spectrum / ion mobility frame concept in general doesn't. That behavior is up to the reader to understand from the instrument metadata. That's why ProteoWizard defines "raw" frames differently depending upon what it infers from the TDF file contents by reading the distinct SQL tables: https://github.com/ProteoWizard/pwiz/blob/8a73de245643eeddb423505f11dd3e841bb78ec4/pwiz_aux/msrc/utility/vendor_api/Bruker/TimsData.cpp#L297-L471. The timsrust library does something similar. For Waters, depending upon the combination of configurations with cyclic IMS, SONAR, and other tricks, you get different topologies, but only the trivial MS1/low energy configuration matches the timsTOF configuration.
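
To illustrate what "up to the reader" could look like in practice, here is a minimal Python sketch that splits one frame's points by the quadrupole isolation window active over each ion mobility range, following the dda/diaPASEF example quoted above. The `IsolationWindow` type, the function names, and the m/z bounds are hypothetical stand-ins for whatever a reader derives from the instrument metadata. This is not the ProteoWizard or timsrust logic.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

Point = Tuple[float, float, float]  # (ion mobility, m/z, intensity)


@dataclass(frozen=True)
class IsolationWindow:
    im_start: float  # inclusive lower ion mobility bound
    im_end: float    # exclusive upper ion mobility bound
    mz_low: float    # lower quadrupole isolation bound (hypothetical value)
    mz_high: float   # upper quadrupole isolation bound (hypothetical value)


def window_for(im: float, windows: List[IsolationWindow]) -> Optional[IsolationWindow]:
    """Return the window whose ion mobility range covers `im`, if any."""
    for window in windows:
        if window.im_start <= im < window.im_end:
            return window
    return None


def split_frame(
    points: Iterable[Point], windows: List[IsolationWindow]
) -> Dict[IsolationWindow, List[Point]]:
    """Group frame points into one sub-spectrum per isolation window."""
    groups: Dict[IsolationWindow, List[Point]] = {w: [] for w in windows}
    for im, mz, intensity in points:
        window = window_for(im, windows)
        if window is not None:
            groups[window].append((im, mz, intensity))
    return groups


# Made-up scheme: IM 0.7-0.8 isolates one m/z range, IM 0.8-0.95 another.
windows = [
    IsolationWindow(0.70, 0.80, 400.0, 425.0),
    IsolationWindow(0.80, 0.95, 600.0, 625.0),
]
frame_points = [(0.75, 210.1, 100.0), (0.85, 310.2, 50.0), (0.99, 150.0, 5.0)]
sub_spectra = split_frame(frame_points, windows)
```

Points outside every window are dropped in this sketch; a real reader would need to decide how to handle them.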

mobiusklein · 2025-02-04T21:41:02Z

> I have certainly struggled with what name to give intermediate representations of data processing where the ion-mobility dimension is ... 'centroided' ... without it being discarded and I also see the process as analogous to centroiding in the m/z dimension.

@jspaezp I forgot to respond to this point. Are you referring to #361 (comment)'s ion mobility feature frame idea, or did you find you were interested in that separately?

add minimal terms for IM frames

2612ef0

mobiusklein requested review from chambm and edeutsch January 22, 2025 16:28

fix duplicate ID

24d0afc

edeutsch approved these changes Jan 24, 2025
