# [FEATURE] Automatic task distribution #5000
Labels: `type: enhancement`
jfcalvo added a commit that referenced this issue on Jul 4, 2024:
…nly (#5148)

# Description

Add changes to the `responses_submitted` relationship to avoid problems with the existing `responses` relationship and to silence a warning that SQLAlchemy was reporting. Refs #5000

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- [x] The warning is not showing anymore.
- [x] Tests are passing.

**Checklist**

- I added relevant documentation
- My code follows the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (see https://keepachangelog.com/)
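The PR body doesn't include the relationship definition itself. As a hedged illustration of the pattern it describes, the sketch below defines a filtered `responses_submitted` relationship next to a plain `responses` relationship on the same foreign key; the model and column names are assumptions, not Argilla's actual schema. Marking the filtered relationship `viewonly=True` and declaring `overlaps="responses"` is the usual SQLAlchemy way to say the two relationships intentionally share rows, which avoids the warning.

```python
# Hypothetical sketch, not Argilla's actual models: two relationships over the
# same foreign key normally make SQLAlchemy warn that they overlap.
from sqlalchemy import ForeignKey, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship


class Base(DeclarativeBase):
    pass


class Response(Base):
    __tablename__ = "responses"

    id: Mapped[int] = mapped_column(primary_key=True)
    status: Mapped[str] = mapped_column(String)
    record_id: Mapped[int] = mapped_column(ForeignKey("records.id"))


class Record(Base):
    __tablename__ = "records"

    id: Mapped[int] = mapped_column(primary_key=True)

    # All responses attached to this record.
    responses: Mapped[list[Response]] = relationship()

    # Submitted responses only. This relationship shares rows with `responses`,
    # so mark it read-only and declare the overlap explicitly; otherwise
    # SQLAlchemy warns that the two relationships conflict.
    responses_submitted: Mapped[list[Response]] = relationship(
        primaryjoin="and_(Record.id == Response.record_id, "
        "Response.status == 'submitted')",
        viewonly=True,
        overlaps="responses",
    )
```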
jfcalvo added a commit that referenced this issue on Jul 12, 2024:
…riable setting (#5213)

# Description

During the distribution task testing effort we found errors raised by SQLite when the database was locked because too many writes were executed concurrently. Increasing the SQLite timeout reduces this problem; specifically, changing the connection `timeout` parameter from its default of `5` seconds to `30` seconds made the problem disappear entirely in local testing.

This PR adds a new `ARGILLA_DATABASE_SQLITE_TIMEOUT` environment variable that allows us to set this value, with a default of `15` seconds. The idea is to test this change on Spaces using a value of `30` seconds. Refs #5000

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- Tested locally using SQLite, running 20 parallel response creations in bulk.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (see https://keepachangelog.com/)
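A minimal sketch of how such a setting can be wired into a SQLAlchemy engine; the environment variable name and default come from the PR, while the database path is an assumption for illustration. SQLite's `timeout` is how long a connection waits for a database lock to clear before raising "database is locked".

```python
# Hedged sketch: read the timeout from the environment and pass it through to
# sqlite3.connect() via SQLAlchemy's connect_args.
import os

from sqlalchemy import create_engine

sqlite_timeout = int(os.environ.get("ARGILLA_DATABASE_SQLITE_TIMEOUT", "15"))

engine = create_engine(
    "sqlite:///argilla.db",  # assumed path, for illustration only
    connect_args={"timeout": sqlite_timeout},  # seconds to wait on a locked DB
)
```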
jfcalvo added a commit that referenced this issue on Jul 12, 2024:
…_DATABASE_POSTGRESQL_MAX_OVERFLOW` (#5220)

# Description

After testing a high number of concurrent requests using PostgreSQL, I received the following error:

```
QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30.00
```

This PR adds the following two environment variables so we can configure the pool size and max overflow. Refs #5000

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- [x] Manual testing with PostgreSQL.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (see https://keepachangelog.com/)
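A hedged sketch of configuring SQLAlchemy's connection pool from environment variables. Only `ARGILLA_DATABASE_POSTGRESQL_MAX_OVERFLOW` is visible in the truncated commit title; the pool-size variable name and the DSN below are assumptions for illustration.

```python
# The error above means all pool_size connections plus max_overflow temporary
# connections were in use; raising these limits is what the PR enables.
import os

from sqlalchemy import create_engine

# ARGILLA_DATABASE_POSTGRESQL_POOL_SIZE is an assumed name for the second
# variable; the PR title only shows ..._MAX_OVERFLOW.
pool_size = int(os.environ.get("ARGILLA_DATABASE_POSTGRESQL_POOL_SIZE", "5"))
max_overflow = int(os.environ.get("ARGILLA_DATABASE_POSTGRESQL_MAX_OVERFLOW", "10"))

engine = create_engine(
    "postgresql+psycopg2://user:pass@localhost/argilla",  # illustrative DSN
    pool_size=pool_size,        # persistent connections kept in the pool
    max_overflow=max_overflow,  # extra connections allowed beyond pool_size
)
```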
jfcalvo added a commit that referenced this issue on Jul 16, 2024:
# Description

After investigating timeouts for PostgreSQL, I found that timeouts do not help when a SERIALIZABLE transaction is rolled back because a concurrent update error was raised. So the only way to support concurrent updates with PostgreSQL and SERIALIZABLE transactions is to capture those errors and retry the transaction. This PR has the following changes:

- Start using the `backoff` library to retry any of the CRUD context functions that update responses and record statuses using SERIALIZABLE database sessions.
  - This change has the side effect of working with PostgreSQL and SQLite at the same time.
  - I have set a fixed maximum of 15 seconds for retrying with exponential backoff.
- I have moved search engine updates outside of the transaction block.
- This should mitigate errors in high-concurrency scenarios for PostgreSQL and SQLite:
  - For SQLite we have the additional setting to set a timeout if necessary.
  - I have changed the `DEFAULT_DATABASE_SQLITE_TIMEOUT` value to `5` seconds so the backoff logic will handle possible locked-database errors with SQLite.

Refs #5000

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- [x] Manually tested with PostgreSQL and SQLite, running benchmarks using 20 concurrent requests.
- [x] Ran the test suite for PostgreSQL and SQLite.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (see https://keepachangelog.com/)
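A minimal sketch of the retry pattern this PR describes, assuming SQLAlchemy and the `backoff` library; the function name, table, and DSN are illustrative, not Argilla's actual code. Under psycopg2, a serialization failure surfaces as an `OperationalError`, so that is the exception retried here.

```python
# Retry a SERIALIZABLE transaction with exponential backoff, capped at the
# 15-second maximum mentioned in the PR.
import backoff
from sqlalchemy import create_engine, text
from sqlalchemy.exc import OperationalError
from sqlalchemy.orm import Session

engine = create_engine("postgresql+psycopg2://user:pass@localhost/argilla")


@backoff.on_exception(backoff.expo, OperationalError, max_time=15)
def update_record_status(record_id: int, status: str) -> None:
    # Hypothetical helper: PostgreSQL raises "could not serialize access due
    # to concurrent update" on conflicting SERIALIZABLE transactions, which
    # backoff catches above and retries from scratch.
    with Session(engine.execution_options(isolation_level="SERIALIZABLE")) as session:
        with session.begin():
            session.execute(
                text("UPDATE records SET status = :status WHERE id = :id"),
                {"status": status, "id": record_id},
            )
    # Search engine updates would go here, outside the transaction block, so a
    # retried transaction never indexes partial state.
```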
Backend

- `status` column to `records` table #5058
- `status` column when a response is created/updated/upserted/deleted #5069
- `status` column for `records` table #5155

SDK

- `distribution` attribute to be assigned when creating datasets #5033
- `distribution` settings when updating datasets #5034
- `status` attribute to records #5141

UI