v1-0-1

simon-clematide released this 25 Nov 10:39
· 76 commits to main since this release
  • fix: POS tagging for lb was broken (every token was tagged X). This is now fixed.
  • feat: Generate a log file for each newspaper/year pair and upload it to S3.
  • feat: Support the agreed naming convention for output files.
  • feat: Process input data directly from S3, mirroring it on the fly per newspaper
    for slim builds.
  • note: no change to spaCy pipelines apart from lb POS tag mapping
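
As an illustration of the per-pair logging item above, here is a minimal sketch of one way to write a separate log file for each newspaper/year pair using only the Python standard library. The function name `open_pair_logger`, the logger name prefix, and the `<newspaper>-<year>.log` file name are assumptions made for this example; they are not taken from the project, and the agreed naming convention and the S3 upload step are not shown.

```python
import logging
from pathlib import Path
from typing import Tuple


def open_pair_logger(log_dir: Path, newspaper: str, year: int) -> Tuple[logging.Logger, Path]:
    """Return a logger that writes to its own file for one newspaper/year pair."""
    log_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical file name; the project's actual naming convention may differ.
    log_path = log_dir / f"{newspaper}-{year}.log"

    # One named logger per pair, so messages from different pairs never mix.
    logger = logging.getLogger(f"tagging.{newspaper}.{year}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep pair-specific messages out of the root logger

    # Attach the file handler only once, even if the function is called again.
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger, log_path
```

The returned path could then be handed to whatever upload step the pipeline uses once processing of that pair has finished.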