feat: add exponential retries to parallel mode calls (#216)
Replace our parallel mode retry logic with the `backoff` library. This
gives us exponential backoff, retryable error codes, etc. with just a
decorator, which really cleans up the code.
Changes:
* Refactor `partition_file_via_api` and move the request with backoff to
`call_api`
* Add `backoff` as a dependency and `pip compile`
* Make sure we don't dump api parameters on every parallel call
* Don't allow internal calls to bypass the 503 low memory gate (this
should be handled by the retries like everything else)
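To illustrate the behavior this change is after, here is a minimal, stdlib-only sketch of a retry decorator with exponential backoff and a give-up rule for non-retryable status codes. It is a hand-rolled stand-in, not the `backoff` library and not the code in this commit: `HTTPError`, `is_non_retryable`, `retry_with_backoff`, and `make_failing_call` are hypothetical names, the rule that every 4xx is non-retryable is an assumption, and the sketch omits the jitter that `backoff` applies by default.

```python
import functools
import time


class HTTPError(Exception):
    """Hypothetical stand-in for fastapi's HTTPException (stdlib only)."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_non_retryable(exc):
    # Assumption: 4xx responses are client errors that a retry cannot fix.
    return 400 <= exc.status_code < 500


def retry_with_backoff(max_tries=3, base_delay=0.1, factor=2, sleep=time.sleep):
    """Retry the wrapped function on HTTPError, multiplying the wait each time."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except HTTPError as exc:
                    # Re-raise at once for non-retryable codes,
                    # or when the retry budget is spent.
                    if is_non_retryable(exc) or attempt == max_tries:
                        raise
                    sleep(delay)
                    delay *= factor

        return wrapper

    return decorator


def make_failing_call(status_code, sleeps):
    """Build a call_api-like function that always fails with status_code.

    Waits are recorded in `sleeps` instead of actually sleeping.
    """
    calls = []

    @retry_with_backoff(max_tries=3, base_delay=0.1, sleep=sleeps.append)
    def call_api():
        calls.append(status_code)
        raise HTTPError(status_code)

    return call_api, calls
```

With this sketch, a call that always raises a 400 is attempted once and fails immediately, while a call that always raises a 500 is attempted three times with growing waits before the error propagates.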
To test this, try adding an HTTPException to the code.
Add a non-retryable exception in `partition_pdf_splits`:
```
# If it's small enough, just process locally
# (Some kwargs need to be renamed for local partition)
if len(pdf_pages) <= pages_per_pdf:
    raise HTTPException(status_code=400)
```
When you run this and send a file, you'll get the 400 back immediately:
```
export UNSTRUCTURED_PARALLEL_MODE_ENABLED=true
export UNSTRUCTURED_PARALLEL_MODE_URL=http://localhost:8000/general/v0/general
export UNSTRUCTURED_PARALLEL_NUM_THREADS=1
make run-web-app
curl -X POST 'http://localhost:8000/general/v0/general' --form files=@sample-docs/layout-parser-paper.pdf
{"detail":"Bad Request"}
```
Now, raise a 500 error instead and run again. In this case you'll still
get a server error, but in the logs you should see that the retries happened:
```
Giving up call_api(...) after 3 tries (fastapi.exceptions.HTTPException)
```