
Async task state tracking #148

Open · 5 of 8 tasks

nathanielrindlaub opened this issue Feb 12, 2024 · 3 comments · Fixed by #173

Comments

nathanielrindlaub (Member) commented Feb 12, 2024

We now have a growing number of potentially long-running tasks that Lambdas are not well suited to supporting, especially if the user is expecting a synchronous response. These include:

@ingalls recommends that we consider creating a consistent pattern for tracking these async tasks in the DB (a collection or collections that we update whenever the state of one of these processes changes, plus a consistent query pattern for accessing them). I think it's a great idea... right now we have two entirely different ways of checking the bulk upload state and the annotation state, and for most of the others we haven't yet implemented spinning the tasks off on separate infrastructure; instead, we only support them at pretty low thresholds.
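For illustration, a task collection along those lines might look something like the following. This is a minimal sketch assuming a Mongoose-backed MongoDB setup; the field names and task types are placeholders, not the project's actual model:

```ts
// Illustrative task schema -- field names and task types are
// placeholders, not the project's actual model.
import mongoose from 'mongoose';

const TaskSchema = new mongoose.Schema(
  {
    user: { type: String, required: true },
    type: { type: String, required: true }, // e.g. 'GetStats', 'ExportData'
    status: {
      type: String,
      required: true,
      enum: ['SUBMITTED', 'RUNNING', 'COMPLETE', 'FAIL'],
      default: 'SUBMITTED',
    },
    output: mongoose.Schema.Types.Mixed, // result payload or error details
  },
  { timestamps: true }, // createdAt/updatedAt double as submission/completion times
);

export default mongoose.model('Task', TaskSchema);
```

With a single schema like this, every async process can share one query pattern (look up a task by id, check its status) instead of each feature inventing its own.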

For the tasks we haven't yet broken out to run async, we have a couple of options:

  1. Use AWS Batch (the same process as batch upload) w/ Fargate. This would be more expensive and have a much longer cold start, but it would mean we wouldn't be limited to Lambda's 15-minute limit.
  2. Create SQS queues + Lambda workers to pull messages off and process them in separate Lambdas. Nick's main concern here is how to make the UX make sense for tasks that run longer than 15 minutes (which would require re-prompting/re-initiation by the user). A rough sketch of this approach follows below.
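Here is a minimal sketch of option 2, assuming the illustrative Task model above and a hypothetical TASK_QUEUE_URL environment variable; submitTask and runTask are placeholder names, not actual project code:

```ts
// Sketch of option 2: an API-side producer that records a task and
// enqueues it, plus a worker Lambda that processes messages and updates
// task state as it goes.
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';
import type { SQSEvent } from 'aws-lambda';
import Task from './Task'; // the illustrative model from the previous sketch

const sqs = new SQSClient({});

// API side: persist the task record, enqueue it, and return immediately.
// The client polls the task's status by the returned id.
export async function submitTask(user: string, type: string, input: unknown) {
  const task = await Task.create({ user, type });
  await sqs.send(new SendMessageCommand({
    QueueUrl: process.env.TASK_QUEUE_URL,
    MessageBody: JSON.stringify({ taskId: task._id, type, input }),
  }));
  return task._id;
}

// Placeholder dispatcher for the actual long-running operations.
async function runTask(type: string, input: unknown): Promise<unknown> {
  throw new Error(`no handler registered for task type ${type}`);
}

// Worker Lambda: pull messages off the queue, run the operation, and
// record the outcome on the task document.
export async function handler(event: SQSEvent) {
  for (const record of event.Records) {
    const { taskId, type, input } = JSON.parse(record.body);
    await Task.updateOne({ _id: taskId }, { status: 'RUNNING' });
    try {
      const output = await runTask(type, input);
      await Task.updateOne({ _id: taskId }, { status: 'COMPLETE', output });
    } catch (err) {
      await Task.updateOne({ _id: taskId }, { status: 'FAIL', output: { error: String(err) } });
    }
  }
}
```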
postfalk commented Feb 12, 2024 via email

nathanielrindlaub mentioned this issue Mar 12, 2024
nathanielrindlaub (Member, Author) commented

@ingalls, I was just made aware that the process of creating, updating, or deleting deployments on cameras with large numbers of images (like 20k) will time out, and the deployment update will fail. This is because ProjectModel.reMapImagesToDeps() has to iterate over every image in the camera to make sure it's assigned to the right deployment.

So once we get the task tracking in place I think our two priorities should be:

  1. making getStats async
  2. making reMapImagesToDeps (or all deployment CRUD ops) async (see the polling sketch after this list)
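Once these mutations return a task id immediately instead of blocking, the client needs a way to wait for the result. A minimal polling sketch, assuming a hypothetical getTask(taskId) call that returns the task's current status and output:

```ts
// Client-side polling helper. getTask is a hypothetical API call that
// fetches the task record created by the server when work was submitted.
type TaskRecord = {
  status: 'SUBMITTED' | 'RUNNING' | 'COMPLETE' | 'FAIL';
  output?: unknown;
};

async function pollTask(
  getTask: (id: string) => Promise<TaskRecord>,
  taskId: string,
  intervalMs = 2000,
): Promise<TaskRecord> {
  for (;;) {
    const task = await getTask(taskId);
    // Terminal states end the poll; otherwise wait and check again.
    if (task.status === 'COMPLETE' || task.status === 'FAIL') return task;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```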

nathanielrindlaub (Member, Author) commented Mar 27, 2024

I think this is just about done. We've migrated getStats, exportData, exportImageErrors, and create/update/deleteDeployments to the new async task lambda. We decided against using the task pattern for batches, as they are much more involved and deeply integrated into the rest of the code base. It's worth looking, however, at whether we can now move some of the other time-consuming operations (deleting large numbers of images, deleting large numbers of labels, merging large numbers of labels) to async and increase or remove the allowable operation thresholds.

Final punchlist:

  • review access patterns / interfaces between the new task handler and the various DB models it interfaces with. Right now there's some inconsistency and some semi-confusing function names (see: Async UpdateDeployment #168 (comment))
  • review the frontend task slice and see if there are any opportunities to streamline or abstract it. At the very least, generalize the getTaskFailure action handler so it works for all task types (see the sketch after this list)
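As a rough illustration of that generalization, a single failure reducer keyed by task type could replace per-task variants. This is a sketch assuming Redux Toolkit; the slice shape and task type names are illustrative, not the project's actual slice:

```ts
// Generalized task slice: loading/error state is keyed by task type, so
// one failure handler covers every kind of async task.
import { createSlice, type PayloadAction } from '@reduxjs/toolkit';

type TaskType = 'GetStats' | 'ExportData' | 'UpdateDeployment';

interface TasksState {
  loading: Partial<Record<TaskType, boolean>>;
  errors: Partial<Record<TaskType, string | null>>;
}

const initialState: TasksState = { loading: {}, errors: {} };

const tasksSlice = createSlice({
  name: 'tasks',
  initialState,
  reducers: {
    taskSubmitted(state, action: PayloadAction<{ type: TaskType }>) {
      state.loading[action.payload.type] = true;
      state.errors[action.payload.type] = null;
    },
    taskSucceeded(state, action: PayloadAction<{ type: TaskType }>) {
      state.loading[action.payload.type] = false;
    },
    // One failure handler for all task types, replacing per-task variants.
    getTaskFailure(state, action: PayloadAction<{ type: TaskType; error: string }>) {
      state.loading[action.payload.type] = false;
      state.errors[action.payload.type] = action.payload.error;
    },
  },
});

export const { taskSubmitted, taskSucceeded, getTaskFailure } = tasksSlice.actions;
export default tasksSlice.reducer;
```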
