Not clear how to make a custom DB #112
Comments
Hello, here are some tips on how to build a custom database or add sequences to an existing database at #20, from previous MOB-suite versions. Hope this helps. Sorry for the delayed response.
Thanks for the help.

I have a multi-FASTA database (new_database.fasta) with some plasmids in it. I formatted the taxonomy file as tab-delimited with the header below. The plasmids are from Klebsiella pneumoniae, but I cannot figure out what to type (PROBLEM1):

sample_id organism

After that, using "sample_mobtyper_results.txt", I ran mob_cluster, with everything in a desktop folder:

$ mob_cluster --mode build -f new_database1.fasta -p sample_mobtyper_results.txt -t taxonomy.txt --outdir new_databse1 --num_threads 12

In this case I obtain 10 files in the output directory, including clusters.txt.

At this point I try to use mob_recon, but it does not produce an output (PROBLEM2):

$ mob_recon -o prova -i KP1057_ST512_KPC-3.fasta -s KP1057 -d new_databse1/

It re-installs the default NCBI database and overwrites the clusters.txt file. I then re-specified the two files as described in the issue page, but the output did not change, so I replaced the clusters.txt file in the conda folder:

$ mob_recon -n 12 -o prova577c -i KP577_Complete.fasta -s KP577c --plasmid_db new_database/references_updated.fasta --plasmid_mash_db new_database/references_updated.fasta.msh

At this point the output changes, but the program assigns contigs that I know are part of the plasmids (I am checking known sequences) to the chromosome and excludes them. The program is not able to finish the analysis correctly.

Is what I have done correct? It is a bit tricky and it is not explained this way, so maybe I am doing something wrong. Any suggestions for increasing accuracy in mob_recon? I can share my inputs and database if that would help.

Thanks for all, GL
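For readers stuck on the same taxonomy file (PROBLEM1 above), here is a minimal Python sketch that writes a tab-delimited file with the `sample_id` and `organism` columns mentioned in the comment, one row per sequence in a multi-FASTA. It assumes, without confirmation from this thread, that `sample_id` should match the first token of each FASTA header and that a single organism name applies to every plasmid. The function names `fasta_ids` and `write_taxonomy` are hypothetical helpers, not part of MOB-suite.

```python
import csv
import io


def fasta_ids(handle):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def write_taxonomy(fasta_handle, out_handle, organism):
    """Write a two-column TSV (sample_id, organism); return the number of rows.

    Assumption: the column names follow the header quoted in the comment above;
    check them against the MOB-suite documentation before relying on them.
    """
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample_id", "organism"])
    count = 0
    for seq_id in fasta_ids(fasta_handle):
        writer.writerow([seq_id, organism])
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo; replace with open("new_database.fasta") and
    # open("taxonomy.txt", "w", newline="") for real files.
    demo_fasta = io.StringIO(">plasmid_A some description\nACGT\n>plasmid_B\nGGCC\n")
    out = io.StringIO()
    write_taxonomy(demo_fasta, out, "Klebsiella pneumoniae")
    print(out.getvalue(), end="")
```

Generating the file from the FASTA headers keeps the identifiers in the taxonomy file and the sequence file in sync, which avoids one possible source of mismatch when building the database.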
Good day,

Indeed, there are small practical aspects that need to be clarified, as mob_cluster is sparsely documented because it was designed more for internal use. The plasmid database building command is correct and generated all the necessary files. The most important files are … If a given contig is assigned to …

I am curious why your expected contigs were classified as chromosomal and not plasmid. Do they have any replicon or relaxase sequences on them? Check by BLAST against the … Have you BLASTed those problematic contigs against the entire nucleotide collection and got any plasmid hits (https://blast.ncbi.nlm.nih.gov/Blast.cgi)?

The database initialization routine is triggered if the …
How do I make/use a custom DB?
There are some input files that are not specified
% mob_cluster --mode build -f new_plasmids.fasta -p new_plasmids_mobtyper_report.txt -t new_plasmids_host_taxonomy.txt --outdir output_directory
-f new_plasmids.fasta MY PLASMIDS
-p new_plasmids_mobtyper_report.txt MOBTYPER OUTPUT
-t new_plasmids_host_taxonomy.txt WHAT IS THIS?
Then, how do I specify the folder where the new DB is created?
Thanks