Releases: LanguageMachines/foliautils

v0.23

16 Dec 13:13
  • adapted to the most recent libfolia (2.21) and ticcutils
  • updated to C++17
  • numerous code refactorings
  • updated GitHub CI

v0.22

27 May 14:37
  • requires libfolia 2.19 or higher
  • fixes GitHub Actions on macOS
  • small code updates and improvements

v0.21

26 Apr 10:14
  • a lot of code changes, many regarding hyphens
  • added an extract_final_hyphen function, used by several programs:
    FoLiA-abby, FoLiA-page and FoLiA-txt
  • FoLiA-txt: filter out ZWNJ characters. Avoid spurious LineBreaks
  • fix for #68
  • FoLiA-idf was not quite working. Fixed
  • FoLiA-page: removed the --sent option and updated the man page;
    lots of other fixes too
  • added experimental FoLiA-merge program:
    merging lemma/pos information into FoLiA files
  • more and better tests added
  • updates in README.MD
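
The hyphen and ZWNJ items above can be illustrated with a minimal sketch. This is not the foliautils code (which is C++): the function names `strip_zwnj` and `split_final_hyphen` are hypothetical, and the assumption that a "final hyphen" is either an ASCII hyphen or a soft hyphen (U+00AD) at the end of a line is made for illustration only.

```python
from typing import Tuple

ZWNJ = "\u200c"         # ZERO WIDTH NON-JOINER
SOFT_HYPHEN = "\u00ad"  # SOFT HYPHEN
# Assumption: only these two characters count as a line-final hyphen.
HYPHENS = ("-", SOFT_HYPHEN)


def strip_zwnj(line: str) -> str:
    """Remove every ZWNJ character from a line of text."""
    return line.replace(ZWNJ, "")


def split_final_hyphen(line: str) -> Tuple[str, str]:
    """Split a line into (text, hyphen).

    Trailing whitespace is ignored. `hyphen` is the trailing hyphen
    character if the line ends in one, otherwise the empty string.
    (Hypothetical helper, not the real extract_final_hyphen.)
    """
    stripped = line.rstrip()
    if stripped and stripped[-1] in HYPHENS:
        return stripped[:-1], stripped[-1]
    return stripped, ""


if __name__ == "__main__":
    text, hyphen = split_final_hyphen(strip_zwnj("exam\u200cple-  "))
    print(repr(text), repr(hyphen))
```

Returning the hyphen separately, rather than just dropping it, lets a caller decide whether to keep it or to join the word with the start of the next line.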

v0.20

13 Mar 08:59

[Ko van der Sloot]

  • Fix in FoLiA-txt. A <t-hbr> signals a newline, so adding an extra <br/> is
    not correct

v0.19

21 Feb 10:13

[Ko van der Sloot]

  • general C++ cleanup and refactoring
  • Some fixes for building on Mac OSX
  • FoLiA-txt:
    • now we handle soft-hyphens
    • modifications to solve #67
      --remove-end-hyphens is the default now. We create <t-hbr> nodes
    • modifications for proycon/foliapy#25
    • Unicode awareness
  • FoLiA-2text:
    • added a --restore-formatting option, which outputs the text inside
      <t-hspace> and <t-hbr> nodes
  • FoLiA-abby:
    • handling of soft-hyphens
    • fixes for <br/> and <t-hbr>
    • preserve original spaces in <t-hspace>'s text
  • FoLiA-correct: small fix in program logic.

v0.18

22 Jul 17:50

[Ko van der Sloot]

  • FoLiA-page: only add LineBreak annotation when needed
  • added more tests to make check
  • adapted and fixed tests
  • fixed the ugly problem of temporarily disabling text checking
  • start using the "system" foliadiff
  • fix declarations

[Maarten van Gompel]

  • FoLiA-page: added a --nomarkup parameter to revert to the old behaviour, and an extra --nostrings parameter to omit the strings #65
  • added a note for the --sent option #65
  • Added some comments for the ugly disable set_checktext patch, I don't like this but it seems needed (underlying libfolia issue?) #65
  • Add linebreaks and t-str to the paragraph text (currently fails text validation)
  • added Dockerfile and instructions
  • codemeta.json: updated according to (proposed) CLARIAH requirements (CLARIAH/clariah-plus#38)

v0.17

12 Jul 11:44
  • needs libfolia 2.9 or above
  • replaced TravisCI by GitHub Actions
  • FoLiA-correct:
    • fixed a problem with correcting FoLiA with both p and s nodes
    • added support for the FoLiA 'tag' feature
    • clearer error messages
    • fixed bugs in HEMP handling
    • better handling of Ucto's ABBREVIATION* tokens
    • fixed corrections when a word has 'space="no"'.
    • some smaller fixes
    • added more tests
  • FoLiA-clean:
    • improved, using new features from libfolia 2.9
  • FoLiA-2text:
    • replaced '--original' parameter by a '--correction-handling' parameter
    • implemented a --honour-tags option, to interpret tag="token" tags
    • some improvement in output-file naming
  • FoLiA-abby:
    • completely reworked the code
    • added '-S' and '-C' as alternatives for '--setname' and '--classname'
    • added a --keephyphens option
    • added a --addbreaks option
    • added a --addmetrics option to optionally add positional info to the
      paragraphs
    • improved handling of '-' (Hyphen)
    • add 'font_properties', 'font_id' and 'font_style' as a feature node
    • improved handling of text with spaces at 'unexpected' locations
  • all modules:
    • Code refactoring and cleaning
    • added and improved tests
    • adapted man pages

v0.16

07 Jan 11:35

[Ko vd Sloot]

  • requires libfolia 2.7 or above
  • provenance data is better for a lot of modules
  • added better checking on invalid NCnames in some modules.
  • FoLiA-abby:
    • a lot of refactoring and additions to handle font/style information
  • FoLiA-pm:
    • Notes are handled correctly now
    • fixed error in xlink attributes
  • FoLiA-page:
    • more types of Page files are handled now
    • fixed annotation declarations
    • fixed offset calculation (due to change in FoLiA's opinion on those)
    • page number is added as a
      node and in the metadata
    • added a --trusttokens option. This means that Word items in the Page file
      are added as Words in the FoLiA, embedded in Sentences.
    • added a --norefs option to avoid adding references to the original texts
  • FoLiA-correct:
    • make sure that the default is to run on 1 thread
    • added a --rebase-inputclass option
  • FoLiA-alto:
    • the -t option was not always handled correctly

[Maarten van Gompel]

  • FoLiA-benchmark: guard against compiler optimisation #48

v0.15

15 Sep 11:33

[Maarten van Gompel]

  • FoLiA-txt: check if a string is empty after normalisation (fix for #46)

[Ko vd Sloot]

  • FoLiA-correct: fix off-by-one error in HEMP handling (when no HEMP was found) #45
  • some refactoring
    • centralized definition of XML_PARSER_OPTIONS
  • bugfix in threading

v0.14

15 Apr 10:56

[Martin Reynaert]

  • updated man pages

[Ko vd Sloot]

  • added man pages
  • revised usage() in many modules
  • the default separator in FoLiA-stats is '_' now
  • fix for: #37
  • fix for: #41
  • adapted to changes in libfolia
  • many small code refactorings
  • FoLiA-correct is improved a lot, allowing ngram corrections in FoLiA
  • FoLiA-stats accepts a 'word_in_doc' mode now
  • FoLiA-alto by default created nodes now. use --oldstring to get
  • improved a lot in tests/
  • many small fixes