Releases: bbuchfink/diamond

DIAMOND v2.1.13

23 Jul 11:06
  • Fixed an invalid error message for the cluster, deepclust and linclust workflows.
  • Added the option --oid-output to output ordinal IDs instead of accessions for the clustering workflows, reducing their memory use.
  • Added support for using the --multiprocessing feature on Windows.
  • Using --multiprocessing requires explicitly setting --parallel-tmpdir.
  • Fixed a bug that could cause a crash when the --target-indexed option was used.
  • A macOS binary is now available with the GitHub release, supporting both x86 and Apple silicon CPUs. Using BLAST databases is also supported.
  • Added compatibility with later CMake versions (tested up to v4.0.3).
  • Added CMake option -DCROSS_COMPILE to disable auto-detection of host architecture.
  • Added a compilation script to produce a macOS fat binary.
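
The requirement that --multiprocessing be combined with an explicit --parallel-tmpdir can be illustrated with a small command builder. This is only a sketch: the helper name `build_multiprocessing_command` and the file names are hypothetical, the blastp workflow and the --db, --query and --out options are assumed to be the ones intended, and the command is only assembled, not executed.

```python
def build_multiprocessing_command(db, query, out, parallel_tmpdir):
    """Assemble an argv list for a multiprocessing run (hypothetical helper).

    Mirrors the rule from the release note: --multiprocessing is only
    emitted together with an explicitly chosen --parallel-tmpdir.
    """
    if not parallel_tmpdir:
        raise ValueError("--multiprocessing requires an explicit --parallel-tmpdir")
    return [
        "diamond", "blastp",
        "--db", str(db),
        "--query", str(query),
        "--out", str(out),
        "--multiprocessing",
        "--parallel-tmpdir", str(parallel_tmpdir),
    ]


# Placeholder paths; the temporary directory is assumed to be shared by all workers.
cmd = build_multiprocessing_command(
    "ref.dmnd", "queries.faa", "hits.tsv", "/shared/diamond_tmp"
)
print(" ".join(cmd))
```

Passing the list to `subprocess.run` would launch the job; refusing to build the command without a directory surfaces the missing option before any worker starts.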

DIAMOND v2.1.12

02 Jun 16:20
  • Added support for the new NCBI taxonomic ranks "cellular root", "acellular root", "domain" and "realm".
  • Added support for using BLAST databases to the Bioconda release (thanks @mencian).
  • Fixed compiler errors for Clang 20.
  • Enabled transitive closure computation in earlier clustering rounds and for bi-directional coverage clustering.
  • Fixed an issue that could cause hits to be partially lost in frameshift alignment mode when they occurred in both query strands for the same target.
  • Fixed an error parsing FASTQ files when quality value lines started with the @ character.
  • Fixed a compiler error on macOS.
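
The FASTQ issue above comes from the fact that @ is both the record marker and a valid quality character. The following is a minimal sketch of a reader that avoids the ambiguity by counting lines instead of scanning for @. It assumes the common four-line record layout without wrapped sequences and is not DIAMOND's actual parser; `read_fastq` is a hypothetical name.

```python
def read_fastq(lines):
    """Yield (title, sequence, quality) tuples from an iterable of FASTQ lines.

    Records are consumed four lines at a time, so a quality line that
    begins with '@' is never mistaken for the header of a new record.
    """
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got {header!r}")
        try:
            seq = next(it)
            plus = next(it)
            qual = next(it)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not plus.startswith("+"):
            raise ValueError(f"expected '+' separator, got {plus!r}")
        if len(qual) != len(seq):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


example = ["@read1\n", "ACGT\n", "+\n", "@IIA\n", "@read2\n", "GG\n", "+\n", "II\n"]
for title, seq, qual in read_fastq(example):
    print(title, seq, qual)
```

The quality string of `read1` starts with @, yet the second record is still recognized correctly because the reader tracks its position within each record.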

DIAMOND v2.1.11

25 Jan 09:37
  • Improved the performance and sensitivity of the cluster, deepclust and linclust workflows.
  • The --faster mode will by default use a minimizer sketch of fixed size per sequence instead of window-based minimizers.
  • Added the option --sketch-size to enable seeding using a minimizer sketch of the given size per sequence.
  • Cascaded clustering and iterated search will by default use the --fast mode with linearization in the second round.
  • The --round-coverage parameter is now also applied to uni-directional coverage clustering.
  • Cluster output files will correctly contain carriage returns on Windows.
  • Fixed generation of the Docker container against the latest version of the NCBI toolkit.
  • Fixed a bug that caused target coordinates not to be reported correctly in the tabular format in frameshift alignment mode.
  • Added the options --ungapped-evalue and --ungapped-evalue-short to set e-value thresholds for the ungapped hit filter.
  • Linearization of search or clustering rounds is limited to seeds of weight >= 10.
  • Fixed an issue that could cause an array size overflow error when using very large .dmnd databases with taxonomic annotation.
  • Fixed a bug that caused query letters to be printed as ARND instead of ACGT in the view workflow.
  • Fixed a bug that caused paired-end input files to fail with an error message.
  • Fixed a bug that could produce clustering errors when clustering at sequence identities >= 50% and processing the database in multiple super blocks.
  • Fixed a bug that could cause a crash in global ranking mode.
  • Accession parsing rules applied to database sequence accessions for the purpose of matching them to accessions in the taxonomy mapping file are now by default also applied to the accessions in the mapping file (disable using --no-parse-seqids).
  • Fixed an issue that could cause increased memory use in the hash join stage.
  • Added support for FASTA headers containing multiple sequence IDs separated by blank spaces (so far only the \1 character was supported as a separator).
  • Fixed an issue that could cause hanging or crashes in the Computing alignments stage.
  • --linsearch can now be used in conjunction with --iterate.
  • Fixed a compiler error for GCC 4.8.5.
  • Fixed a compiler error on Solaris.
  • Fixed compiler errors on systems that do not support the sysinfo function.
  • Fixed a bus error occurring on Sparc systems.
  • Compilation on Sparc systems can be performed without setting -DX86=OFF.
  • Fixed two issues that could cause increased memory use in the computing alignments stage.
  • Fixed a bug that caused superfluous quote characters in the JSON output format.
  • Linear search modes will by default use full-matrix extension.
  • Fixed an issue that could cause reduced performance in the masking sequences stage.
  • Fixed a bug that could cause a crash when using mutual coverage thresholds in blastx mode.
  • Fixed a bug that could cause a crash when the --include-lineage option was used.
  • When reading protein sequences that unexpectedly only contain DNA letters, an error message is only produced if the first 10 sequences in the input file all exhibit the problem.
  • Fixed a bug that caused setting --top 100 not to function correctly.
  • Fixed a bug that caused target coordinates not to be reported correctly in the output of the realign workflow.
  • Fixed a bug that prevented the --memory-limit/-M option from being used for the realign workflow.
  • Fixed an issue that could cause non-deterministic output in frameshift alignment mode.
  • Fixed a bug that could cause a crash when using the XML output format in the view workflow.
  • Fixed an issue that could cause non-deterministic output for identically-scoring HSPs in the same target.
  • Disabled the default use of increased coverage and identity cutoffs in earlier clustering rounds.
  • Optimized the performance of the extension stage when coverage or approximate identity filters are used.
  • Optimized the performance of the extension stage when not using output fields that require alignment traceback.
  • Fixed an issue that could cause an incorrect order of cascaded clustering rounds.
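
The difference between window-based minimizers and a minimizer sketch of fixed size per sequence (the --faster and --sketch-size entries above) can be sketched as follows. This is an illustration of the general idea only: the k-mer length, the hash function (BLAKE2b here) and the selection details are assumptions and do not reflect DIAMOND's internal seeding code.

```python
import hashlib


def kmer_hashes(seq, k):
    """Return (hash, position) pairs for every k-mer; the hash choice is arbitrary."""
    pairs = []
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        pairs.append((int.from_bytes(digest, "big"), i))
    return pairs


def window_minimizers(seq, k, w):
    """Positions of the smallest-hash k-mer in each window of w consecutive k-mers."""
    pairs = kmer_hashes(seq, k)
    picked = set()
    for start in range(len(pairs) - w + 1):
        picked.add(min(pairs[start:start + w])[1])
    return sorted(picked)


def fixed_size_sketch(seq, k, size):
    """Positions of the `size` k-mers with the smallest hashes in the whole sequence."""
    return sorted(pos for _, pos in sorted(kmer_hashes(seq, k))[:size])


protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVK"
print(len(window_minimizers(protein, k=5, w=10)), "window minimizers")
print(len(fixed_size_sketch(protein, k=5, size=4)), "sketch seeds")
```

The number of window minimizers grows with sequence length, while the sketch yields at most `size` seeds per sequence regardless of length, which bounds the seeding work per sequence.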

DIAMOND v2.1.10

19 Oct 12:10
  • Fixed a bug that could cause a crash when using a bi-directional coverage cutoff in query-indexed mode.
  • Fixed a bug that caused the --include-lineage option to malfunction for targets with no taxonomic assignment available.

DIAMOND v2.1.9

31 Jan 14:20
  • Corrected the prefix of the query length field for the SAM format.
  • Added the size modifiers 'T', 'M' and 'K' for the --memory-limit/-M option.
  • Added the option --mutual-cover to cluster sequences by mutual coverage percentage of the cluster representative and member sequence.
  • Added the option --symmetric for computing greedy vertex cover with symmetric edges.
  • Fixed an issue that caused the --approx-id option and the approx_pident output field not to work correctly when using the --anchored-swipe option.
  • Added the option --no-reassign to prevent reassignment to closest representative for the greedy vertex cover and clustering workflows.
  • Added the option --connected-component-depth to activate clustering of connected components at a given maximum depth for the greedy vertex cover and the clustering workflows.
  • Fixed a compiler error for Clang v17.
  • Improved search performance when searching with mutual coverage threshold by filtering for sequence length ratio.
  • Added the sensitivity mode --shapes-30x10 with sensitivity approximately equivalent to --mid-sensitive.
  • Added the options --round-coverage and --round-approx-id to set per round cutoffs for cascaded clustering.
  • The CMake switch -DKEEP_TARGET_ID is now obsolete and the corresponding function is always available.
  • Added the option --include-lineage to the taxonomic classification format to include taxonomic lineage in the output.
  • Added native support for the ARM NEON instruction set (contributed by @althonos).
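
The mutual coverage criterion (--mutual-cover) and the sequence length ratio filter mentioned above can be sketched as below. The function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and the length-ratio bound holds exactly only under the simplifying assumption of an ungapped alignment; DIAMOND's actual filter may differ.

```python
def coverage_percent(aligned_start, aligned_end, length):
    """Percentage of a sequence covered by an aligned range (1-based, inclusive)."""
    return 100.0 * (aligned_end - aligned_start + 1) / length


def passes_mutual_cover(rep_range, rep_len, member_range, member_len, min_cover):
    """True if both the representative and the member reach the coverage cutoff."""
    rep_cov = coverage_percent(rep_range[0], rep_range[1], rep_len)
    member_cov = coverage_percent(member_range[0], member_range[1], member_len)
    return min(rep_cov, member_cov) >= min_cover


def length_ratio_prefilter(len_a, len_b, min_cover):
    """Cheap pre-check: an ungapped alignment cannot cover min_cover percent of
    both sequences unless the shorter one is at least that fraction of the longer."""
    return 100.0 * min(len_a, len_b) / max(len_a, len_b) >= min_cover


print(passes_mutual_cover((1, 90), 100, (5, 94), 100, min_cover=80))
print(length_ratio_prefilter(100, 200, min_cover=80))
```

Pairs that fail the length check can be discarded before any alignment is computed, which is the kind of saving the performance entry describes.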

DIAMOND v2.1.8

21 Jun 07:48
  • Fixed an issue that could cause reduced performance when running in query-indexed mode.
  • Added support for the JSON output format (option -f json-flat).
  • Added the option --sam-query-len to output query length in SAM format.

DIAMOND v2.1.7

31 May 12:25
  • Fixed a bug that caused taxonomy names not to be loaded correctly for the makedb workflow.
  • Fixed a bug that caused a crash when using the --target-indexed option.
  • Fixed an error when using the --tmpdir option for the makedb workflow.
  • Added a warning message when sequence accessions are shortened due to parsing rules for the makedb workflow.
  • Added the option --no-parse-seqids to disable parsing of sequence accessions.
  • Changed the command line help to print options separated by command.
  • Fixed an issue that prevented the --ignore-warnings option from being used for the makedb workflow.

DIAMOND v2.1.6

18 Mar 11:13
  • Fixed compatibility issues on older systems without support for AVX2.
  • Fixed linker errors when compiled with -DX86=OFF.
  • Fixed a compiler error on macOS systems.
  • Fixed a bug that could cause tags to be missing in the XML output format and unaligned queries to be reported incorrectly.
  • Fixed a bug that caused the PAF output format not to work correctly.

DIAMOND v2.1.5

10 Mar 09:40
  • Disabled the use of frequency based seed masking when using the linear-time search feature with respect to the targets.
  • Fixed a bug that caused a "Database file is not a BLAST database" error message for the prepdb workflow.
  • Fixed a bug that caused a segmentation fault when using BLAST databases.
  • Added line numbers for error messages when reading taxonomy mapping files.
  • Fixed a bug that could cause a crash when using the greedy-vertex-cover workflow without the --out and --centroid-out options.
  • Fixed a bug that caused the greedy-vertex-cover workflow to only produce a trivial clustering.
  • Fixed a bug that caused the last codon of the -2 reading frame to be translated incorrectly.
  • Reduced the memory use of the clustering workflow.
  • Updated the bundled NCBI toolkit to the latest version.
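
To make the reading-frame entry above concrete, here is a minimal sketch of six-frame translation with the standard genetic code. It assumes the common convention that frame -n starts at offset n-1 of the reverse complement; DIAMOND's internal frame numbering and implementation may differ, and `translate_frame` is a hypothetical name. The point of interest is the handling of the last complete codon.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate_frame(dna, frame):
    """Translate one of the frames +1, +2, +3, -1, -2, -3.

    Only complete codons are translated; trailing bases that do not
    fill a codon are dropped, and unknown codons become 'X'.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of +/-1, +/-2, +/-3")
    strand = dna if frame > 0 else reverse_complement(dna)
    offset = abs(frame) - 1
    usable = (len(strand) - offset) // 3 * 3
    return "".join(
        CODON_TABLE.get(strand[i:i + 3], "X")
        for i in range(offset, offset + usable, 3)
    )


dna = "ATGGCCTAAG"
for frame in (1, -1, -2):
    print(frame, translate_frame(dna, frame))
```

For this ten-base example, frame -2 reads the reverse complement CTTAGGCCAT from its second base, so the last codon CAT ends exactly at the strand boundary, which is the position where an off-by-one error would show up.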

DIAMOND v2.1.4

27 Feb 13:23
  • Leading spaces are now trimmed and tabulator characters escaped as \t in sequence titles, and a warning message is produced.
  • Blank sequence titles are now replaced by N/A, and a warning message is produced.
  • Fixed a bug that could cause a Traceback error in certain cases.
  • Fixed a bug that caused the qlen and score output fields not to be reported correctly for the realign workflow.
  • Added an error message when using unsupported output fields for the realign workflow.
  • Fixed an issue that could cause a "Missing fields in input line" error when clustering.
  • Optimized the performance of the linclust workflow.
  • Reduced the memory use of the clustering workflow.
  • Fixed a bug that prevented standard input from being used as the query.
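
The sequence title rules introduced in this release (leading spaces trimmed, tabulator characters escaped as \t, blank titles replaced by N/A, each with a warning) can be sketched as a small function. The function name, the warning texts and the order in which the rules are applied are assumptions made for illustration, not DIAMOND's actual behavior.

```python
def sanitize_title(title):
    """Return (clean_title, warnings) following the rules in the release note."""
    warnings = []
    clean = title.lstrip(" ")
    if clean != title:
        warnings.append("leading spaces trimmed")
    if "\t" in clean:
        # Replace each tab with the two characters backslash and 't'.
        clean = clean.replace("\t", "\\t")
        warnings.append("tabulator characters escaped")
    if not clean:
        clean = "N/A"
        warnings.append("blank title replaced by N/A")
    return clean, warnings


for raw in ["  sp|P12345\tExample protein", ""]:
    print(sanitize_title(raw))
```

Escaping tabs keeps titles from breaking tab-separated output, and a placeholder for blank titles keeps every output row addressable.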