How to get the sequence of C region? #244
TRUST4 only assembles the first portion of C genes, roughly the first 200 bp. To get those sequences, use the AIRR-format output: in the "sequence" column, everything after the J_align position may correspond to the C gene. Alternatively, extract the part of "sequence" that follows the "sequence_alignment" portion. I can add a "c_cigar" column in TRUST4 later, which will give you a more accurate range for the C gene on the sequence.
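The second approach above (taking whatever follows the "sequence_alignment" portion inside "sequence") can be sketched in Python as below. This is only a minimal sketch, not part of TRUST4: the helper names `c_region_tail` and `tails_from_airr` are hypothetical, and it assumes the AIRR file is tab-separated with "sequence_id", "sequence" and "sequence_alignment" columns, and that "sequence_alignment" is a substring of "sequence" once any gap characters are removed. The result is the partial C gene candidate, not a verified C gene range.

```python
import csv
import io

# Assumption: gaps in sequence_alignment, if any, are written as "-" or ".".
GAP_CHARS = "-."


def c_region_tail(sequence, sequence_alignment):
    """Return the part of `sequence` after `sequence_alignment`, or "" if not found."""
    aligned = "".join(ch for ch in sequence_alignment if ch not in GAP_CHARS)
    if not aligned:
        return ""
    start = sequence.upper().find(aligned.upper())
    if start < 0:
        return ""
    return sequence[start + len(aligned):]


def tails_from_airr(handle):
    """Map each sequence_id in a tab-separated AIRR table to its tail sequence."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["sequence_id"]: c_region_tail(row["sequence"], row["sequence_alignment"])
        for row in reader
    }


# Tiny made-up table standing in for a real AIRR file.
example = (
    "sequence_id\tsequence\tsequence_alignment\n"
    "contig1\tAAACCCGGGTTTACGT\tCCCGGG\n"
)
tails = tails_from_airr(io.StringIO(example))
print(tails)
```

For a real file, pass an open file handle instead of the in-memory example, e.g. `with open(path, newline="") as fh: tails_from_airr(fh)`. An empty tail means either the contig ends at the J gene or the alignment string could not be located in the sequence.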
Yes, I get the C gene sequence by processing the AIRR file: in the "sequence" column, I extract whatever follows the sequence given in the "sequence_alignment" column, which is the partial sequence of the C gene. But I have two questions. First, is the FR4 region (the J gene part) not included in "sequence"? Second, with the current version we can only get a partial sequence of the C gene, right?
If the assembled contig contains the J gene part, it will appear in both the "sequence" and "sequence_alignment" columns. And yes, the current version only gives a partial C gene sequence. The C gene is much less diverse, so there is no need for full-length C gene assembly to identify it. Just curious, why do you need the full sequence of the C gene?
I forgot to mention that the header in the _annot.fa file from the smartseq wrapper also contains the coordinates of the C gene, which are probably more accurate than taking all the sequence after the J gene.
Thanks very much! I got it. The main reason I want the C gene sequence is to better understand Smart-seq data and to get clear on how to use TRUST4. Thank you again for your timely reply! Hey hey^_^
Excuse me again. I analyzed the TCR information from Smart-seq3 data with trust-smartseq.pl, but I did not find the sequence of the constant (C) region in the report or AIRR files. Could you help me with how to obtain the sequence of the C region?