Releases: proycon/colibri-core

v2.4.10

05 Dec 14:48

Important bugfix release:

  • Fixes a data-clipping bug when loading large corpora into memory (used by indexed pattern models) (issue #41)

(All users are urged to upgrade!)

v2.4.9

23 May 17:29
  • Added metadata
  • macOS fix

v2.4.8

01 Mar 10:40
  • Minor update: made setup.py more robust in manual installation mode (without compiling the C++ library). Note that v2.4.7 was skipped.

v2.4.6

07 Sep 15:34
  • Fix: the colibri-classencode -t (threshold) option behaved incorrectly (the given value was interpreted as value + 1)

v2.4.5

21 Feb 13:27
  • Refactored alignment model
  • Added BasicPatternAlignmentModel
  • Major cleanup of warnings and possible issues (thanks to @kosloot)

v2.4.4

02 Dec 13:19
  • Bugfix: fixes covered token count per category/n (issue #26)
  • New feature: colibri-patternmodeller has a --simplereport (-r) option that generates a report without coverage information (more limited, but a lot faster)

v2.4.3

19 Aug 14:59

v2.4.2 was released prematurely; one minor test was corrupt. This release fixes it.

v2.4.2

19 Aug 14:26

Bugfix release, fixes issue #25

v2.4.1

15 Jun 11:45

Minor fix release prior to paper publication:

  • Python 2.7 compatibility fix
  • Updated Python tutorial
  • Added benchmarks

v2.4.0

02 Jun 09:35

Various fixes:

  • Speedup in ngrams() computation (issue #21)
  • Performance fix for processing long lines
  • Pattern.instanceof() should be faster and is now available from Python too
  • Attempt to fix compilation issue on certain platforms (issue #22), unconfirmed

New features:

  • Implemented new filtering mechanism that supports actively checking whether patterns are instances of a limited set of specified skipgrams, or a superset of specified ngrams.
  • Implemented the ignorenewlines option in class encoding. Useful if your source text is split into, for instance, sentences (one per line), but you want a model that crosses sentence boundaries.
  • Implemented vocabulary import for the class encoding stage (issue #2)
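
To illustrate the ideas behind the skipgram-instance filter and the ignorenewlines option, here is a minimal pure-Python sketch. It does not use the Colibri Core API: the names `is_instance_of` and `ngrams`, the `ignore_newlines` parameter, and the representation of a skipgram as a tuple with `None` for each one-token gap are all assumptions made for this example only.

```python
from typing import Iterator, Optional, Sequence, Tuple

# Hypothetical representation: a skipgram is a tuple of tokens in which
# None marks a gap of exactly one token.
Skipgram = Tuple[Optional[str], ...]


def is_instance_of(ngram: Sequence[str], skipgram: Skipgram) -> bool:
    """Return True if `ngram` matches `skipgram` at every non-gap position."""
    if len(ngram) != len(skipgram):
        return False
    return all(slot is None or slot == token
               for token, slot in zip(ngram, skipgram))


def ngrams(lines: Sequence[str], n: int,
           ignore_newlines: bool = False) -> Iterator[Tuple[str, ...]]:
    """Yield n-grams per line, or over all lines joined if ignore_newlines is set."""
    if ignore_newlines:
        # Treat the whole input as one token stream, so n-grams may
        # cross line (for example, sentence) boundaries.
        units = [[tok for line in lines for tok in line.split()]]
    else:
        units = [line.split() for line in lines]
    for tokens in units:
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


lines = ["to be or", "not to be"]
wanted = ("or", None, "to")  # skipgram with a single one-token gap

per_line = [g for g in ngrams(lines, 3) if is_instance_of(g, wanted)]
crossing = [g for g in ngrams(lines, 3, ignore_newlines=True)
            if is_instance_of(g, wanted)]
```

In this sketch `per_line` stays empty, because no trigram within a single line matches the skipgram, while `crossing` contains the trigram that spans the line break.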