Index example
Pierre Peterlongo edited this page Oct 27, 2021 · 3 revisions
This example shows how to build a k-mer index from 2 samples, D1 and D2.
data
├── 1.fasta
├── 2.fasta
└── kmtricks.fof
> cat data/kmtricks.fof
D1: data/1.fasta
D2: data/2.fasta
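With more than a couple of samples, the file of files can be generated rather than written by hand. The sketch below is only an illustration: the helper name `make_fof` is hypothetical, and it assumes every sample is a single `<name>.fasta` file whose ID is `D<name>`, as in the listing above.

```shell
# Print one "ID: path" line per FASTA file found in the given directory.
# Assumption: the sample ID is "D" followed by the file name without ".fasta".
make_fof() {
  local dir=$1 f id
  for f in "$dir"/*.fasta; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matches
    id="D$(basename "$f" .fasta)"
    echo "$id: $f"
  done
}

# Usage (run from the directory that contains data/):
#   make_fof data > data/kmtricks.fof
```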
kmtricks pipeline --file ./data/kmtricks.fof \
--run-dir ./index_example \
--kmer-size 31 \
--mode hash:bft:bin \
--hard-min 2 \
--soft-min 3 \
--share-min 1 \
--bloom-size 100000 \
--bf-format howdesbt \
--cpr
- --hard-min 2 -> All k-mers with an abundance >= 2 are kept.
- --soft-min 3 -> During merging, a k-mer with an abundance less than 3 in a sample is kept only if it is solid in other samples.
- --share-min 1 -> Keep a non-solid k-mer in a sample if it is solid in one other sample.
- --bloom-size 100000 -> Requested Bloom filter size; final size = ROUND_UP(size/nb_parts, 8) * nb_parts.
- --bf-format howdesbt -> Dump Bloom filters in HowDeSBT format.
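The final Bloom filter size formula given for --bloom-size can be worked through with shell arithmetic. This is a minimal sketch, not kmtricks code: the partition counts (4 and 3) are made-up values, since nb_parts depends on the run, and size/nb_parts is assumed to be an integer division.

```shell
# Round x up to the next multiple of m.
round_up() {
  local x=$1 m=$2
  echo $(( (x + m - 1) / m * m ))
}

# final size = ROUND_UP(size/nb_parts, 8) * nb_parts
final_bloom_size() {
  local size=$1 nb_parts=$2
  local per_part=$(( size / nb_parts ))   # integer division: an assumption
  echo $(( $(round_up "$per_part" 8) * nb_parts ))
}

final_bloom_size 100000 4   # 100000 (25000 per partition is already a multiple of 8)
final_bloom_size 100000 3   # 100008 (33333 is rounded up to 33336 per partition)
```

The final size can therefore be slightly larger than the requested one, because each partition is padded to a whole number of bytes.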
If the rescue is not necessary, the --skip-merge parameter can be used to save space and time. In this case, hashes are represented by bit vectors from the counting stage.
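The rescue controlled by --hard-min, --soft-min and --share-min can be sketched as a per-sample decision. This is an illustration derived from the option descriptions on this page, not the kmtricks implementation; the function name `keep_kmer` is hypothetical and the thresholds are the ones used in this example.

```shell
# Decide whether a k-mer is kept in one sample.
#   $1: abundance of the k-mer in this sample
#   $2: number of OTHER samples in which the k-mer is solid
keep_kmer() {
  local abundance=$1 solid_elsewhere=$2
  local hard_min=2 soft_min=3 share_min=1
  if [ "$abundance" -lt "$hard_min" ]; then
    echo discard      # below --hard-min: never kept
  elif [ "$abundance" -ge "$soft_min" ]; then
    echo keep         # solid in this sample
  elif [ "$solid_elsewhere" -ge "$share_min" ]; then
    echo keep         # non-solid, rescued by other samples
  else
    echo discard      # non-solid and not rescued
  fi
}

keep_kmer 1 1   # discard
keep_kmer 2 0   # discard
keep_kmer 2 1   # keep
keep_kmer 5 0   # keep
```

Only the third case depends on other samples, which is why the merge step is needed when rescue is enabled.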
kmtricks index --run-dir ./index_example --howde
- --howde -> Build a determined brief tree; see kmtricks index for other options.
kmtricks query --run-dir ./index_example --query query.fasta --threshold 0.8 --sort > results.txt
- --query query.fasta -> A set of queries.
- --threshold 0.8 -> 80% of the query k-mers must be present in a leaf to consider it a match.
- --sort -> Sorted results with additional information.
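To get a feel for what --threshold 0.8 means for one query, the number of k-mers that must be found can be estimated with shell arithmetic. This is a rough sketch under assumptions: the 130 bp query length is made up, the threshold is written as an integer percentage because the shell has no floating point, and the exact comparison and rounding used by kmtricks are not stated here.

```shell
# Number of k-mers in a sequence of the given length.
kmer_count() {
  local length=$1 k=$2
  echo $(( length - k + 1 ))
}

# Smallest number of present k-mers that reaches the threshold,
# with the threshold given as an integer percentage (0.8 -> 80).
min_hits() {
  local n=$1 percent=$2
  echo $(( (n * percent + 99) / 100 ))
}

n=$(kmer_count 130 31)   # 100 k-mers for a 130 bp query with k = 31
min_hits "$n" 80         # 80
```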