4.0.0

@stchris stchris released this 15 Oct 10:24
d452633

Hello Aleph community! We’re excited to announce Aleph 4.0.0, a release focused on powerful new features, performance improvements, and expanded options for investigation sharing and user metrics. In addition, this release includes a few other small enhancements, bug fixes and dependency upgrades.

🚀 Bigger Changes 🚀

  • RabbitMQ-based task queueing backend
    • Configurable AlephWorker Stages
    • Priority Buckets for Processing
    • System Status Page Enhancements
  • Updated Prometheus Metrics
  • Documentation Restructure and Enhancements
  • Improved Error Handling in Elasticsearch Upgrades

As always, we’d love to hear your feedback to keep improving. Feel free to reach out and share your thoughts!


What's Changed

Features

RabbitMQ

4.0.0 changes the way background tasks are scheduled. Previously, Aleph used a Redis-based task queue, which was well designed but showed its limitations with large payloads and carried a risk of data loss. RabbitMQ queues are persisted to disk, and the flexibility in the way messages are queued, routed and fetched allows for optimizations that benefit Aleph, whose task loads vary widely.

Migration notes from Redis to RabbitMQ

Due to the significant changes in terms of task status persistence, switching between Aleph versions with RabbitMQ and Redis-based task queues requires some manual steps in order to ensure data consistency.

Perform the following steps every time you are either upgrading to a version with the RabbitMQ task queue or rolling back to the Redis-based task queue:

  1. Let all pending jobs run to completion (check the status page).
  2. Put Aleph into maintenance mode.
  3. Stop all workers (worker, ingest-file processes).
  4. (optional) Save the current state of Redis using the BGSAVE command in case you want to roll back.
  5. Clear Redis by issuing FLUSHDB in redis-cli inside the Redis container. If you get the error message "Unknown command FLUSHDB", the command is disabled and you can fall back to this shell invocation: echo 'KEYS *' | redis-cli | grep -v '^aleph:' | sed 's/^/DEL /' | redis-cli.
  6. (optional, if previous versions had conflicting RabbitMQ queue settings) Delete existing queues by running rabbitmqctl delete_queue once for each of: ingest, pruneentity, updateentity, exportxref, analyze, flushmapping, reingest, exportsearch, index, xref, reindex, loadmapping. NOTE: queues are named after the stages found in ALEPH_WORKER_STAGES.
  7. Perform the upgrade or rollback to the desired version of Aleph.
  8. Ensure that all expected processes have started correctly.
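
The optional queue cleanup in step 6 can be scripted. The following is a minimal sketch, not part of Aleph: it assumes that `rabbitmqctl delete_queue` takes one queue name per call and that your queues use the stage names listed in step 6. The helper name `delete_queue_commands` is made up for illustration. It only builds and prints the commands, so you can review them before running anything.

```python
import shlex

# Stage names copied from step 6; queues are named after the stages
# found in ALEPH_WORKER_STAGES, so adjust this list to your configuration.
STAGES = [
    "ingest", "pruneentity", "updateentity", "exportxref", "analyze",
    "flushmapping", "reingest", "exportsearch", "index", "xref",
    "reindex", "loadmapping",
]


def delete_queue_commands(stages=STAGES):
    """Build one rabbitmqctl invocation per queue.

    Assumption: `rabbitmqctl delete_queue` accepts a single queue name,
    so each stage gets its own command.
    """
    return [["rabbitmqctl", "delete_queue", stage] for stage in stages]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in delete_queue_commands():
        print(shlex.join(argv))
```

To execute the commands rather than print them, each argument list could be passed to `subprocess.run(argv, check=True)` from inside the RabbitMQ container, once all workers are stopped.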

Related changes:

Prometheus metrics

We have extended the Prometheus metrics exposed by Aleph to provide more information about active users and the data in your Aleph instance. For example, you can now query for the number of active users within the past 30 days or the number of investigations related to a particular language. For details about the available metrics please refer to the metrics reference in the technical documentation.
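
As an illustration of how such metrics can be read, here is a small sketch that parses the Prometheus text exposition format and looks up one sample by its labels. The metric name `example_active_users` and the `window` label are hypothetical placeholders, not Aleph's actual metric names; consult the metrics reference for the real ones.

```python
import re

# Hypothetical sample in the Prometheus text exposition format. The metric
# name and labels are illustrative only and are NOT taken from Aleph.
SAMPLE = '''\
# HELP example_active_users Users active within a time window
# TYPE example_active_users gauge
example_active_users{window="24h"} 12
example_active_users{window="30d"} 87
'''

# name, optional {labels}, then the sample value
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def value_for(samples, name, **labels):
    """Return the first sample value matching the name and all given labels."""
    for sample_name, sample_labels, value in samples:
        if sample_name == name and all(
            sample_labels.get(key) == wanted for key, wanted in labels.items()
        ):
            return value
    return None


if __name__ == "__main__":
    print(value_for(parse_metrics(SAMPLE), "example_active_users", window="30d"))
```

In practice a Prometheus server scrapes and queries these values for you; the sketch only shows the shape of the data.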

Sharing investigations

Due to the sensitive nature of dataset access, we have changed the way datasets are shared: email addresses no longer autocomplete. This means you need to know the exact email address of another user in order to share an investigation with them.

  • Feature: Allow sharing of investigations by @tillprochaska in #3865
  • Remove sharing options from create investigation screen by @stchris in #3862
  • Multiple small UX enhancements related to investigation sharing/user suggestion component by @tillprochaska in #3868

Other new features

Bug fixes and other changes

Documentation updates

Dependency updates

Full Changelog: 3.17.0...4.0.0