From 03505ac643e1a08d0615535eac0f9f6280771e33 Mon Sep 17 00:00:00 2001
From: James Bonfield
Date: Mon, 28 Oct 2024 10:25:24 +0000
Subject: [PATCH] Explicitly permit eg N+m in MM tag (PR#799)

The text already states that an unmodified base of N means we count
any base type, but base N code N in the table is a little misleading
as to the intention. It was intended to mean any unspecified
modification, in the same way C+C is any unspecified C mod, but in
this case it's against all bases rather than a specific base type.

However that doesn't solve the issue of whether we can record specific
mods against any "fundamental" source base. Clarified this by adding
an extra line to the table and some text.

(However note this doesn't necessarily imply downstream processing
tools will not do any compatibility assessment and reject N+m when the
SEQ base is a T.)

Fixes #785
---
 SAMtags.tex | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/SAMtags.tex b/SAMtags.tex
index eba269c7..422aa375 100644
--- a/SAMtags.tex
+++ b/SAMtags.tex
@@ -532,6 +532,7 @@ \subsection{Base modifications}
 An unmodified base of `{\tt N}' means count any base in {\sf SEQ}, not only those of `{\tt N}'.
 Thus `{\tt N+n,100;}' means the 101st base is Xanthosine (n),
 irrespective of the sequence composition.
+A fundamental base of `{\tt N}' may also be used with a base-specific modification code to force the counting to be applied per base rather than per base-type.
 
 The standard code types and their associated ChEBI values are listed below,
 taken from Viner {\it et al.}%
@@ -567,6 +568,7 @@ \subsection{Base modifications}
 \hline
 N & n & Xao & Xanthosine & 18107 \\
 N & N & & Ambiguity code; any mod & \\
+N & any & & Mod applied to any base & \\
 \end{tabular}
 \end{center}
 
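
Reviewer note: the difference between per-base-type and per-base counting can be illustrated with a small sketch. It is not taken from any existing tool. The function name `mm_positions` and its arguments are hypothetical, and it assumes a top-strand (`+`) entry only, ignoring the `.`/`?` status flags and any reverse-complement handling.

```python
def mm_positions(seq, fundamental_base, deltas):
    """Return 0-based SEQ indices selected by one MM entry (hypothetical helper).

    Assumes a '+' strand entry. Each delta is the number of counted bases
    to skip before the next modified base. A fundamental base of 'N'
    counts every base in SEQ; any other value counts only matching bases.
    """
    positions = []
    remaining = iter(deltas)
    to_skip = next(remaining, None)
    for index, base in enumerate(seq):
        if to_skip is None:
            break  # every delta has been consumed
        if fundamental_base != "N" and base != fundamental_base:
            continue  # this base type is not counted for this entry
        if to_skip == 0:
            positions.append(index)
            to_skip = next(remaining, None)
        else:
            to_skip -= 1
    return positions


seq = "ACGTCCGA"
per_type = mm_positions(seq, "C", [1, 0])  # as in C+m,1,0; counts only C
per_base = mm_positions(seq, "N", [1, 0])  # as in N+m,1,0; counts every base
```

With the same delta list, `C+m` skips one C and then marks the next two Cs, whereas `N+m` skips one base of any type and marks the two bases that follow. As the commit message notes, a downstream tool may still choose to reject `N+m` where the SEQ base is incompatible with the modification.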