Support partial/incomplete batch job results #430

Closed
soxofaan opened this issue Nov 29, 2021 · 5 comments · Fixed by #433

@soxofaan
Member

openeo-api/openapi.yaml

Lines 3044 to 3045 in f303d65

If processing has not finished yet requests to this endpoint MUST be
rejected with openEO error `JobNotFinished`.

The API currently requires that a batch job is fully finished before results can be requested with /jobs/{job_id}/results.
In the context of very large batch jobs, I think it could be useful to relax this and allow listing incomplete results before the job finishes (properly indicating that the result is incomplete, of course).

I'm not sure what the best option is: changing the behavior of the existing endpoint /jobs/{job_id}/results, adding a parameter to enable this partial listing, adding a new endpoint, or ...

re: https://github.com/openEOPlatform/architecture-docs/issues/12

@soxofaan
Member Author

In the context of very large batch jobs .. it could be useful to ... allow listing of incomplete results

FYI: in the "large area" feature, we aim to split a very large job into smaller jobs at the level of the aggregator, and distribute this work as separate "sub" batch jobs on one or more back-ends. It would be frustrating if partial results were not accessible because one "sub" batch job fails or is stuck.

@m-mohr
Member

m-mohr commented Nov 29, 2021

Adding this to the existing endpoint sounds reasonable, although there is a risk: existing clients could start downloading and propagating partial results as "complete", because the semantics for providing partial results were not present before. My first idea was using status code 206 instead of 200, but that doesn't fully fit. How would you envision this to work with things like start_and_wait in the clients?
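
For illustration only, here is a minimal sketch of how a client-side wait helper could avoid picking up partial results by accident: it polls the job status and only returns once the status is `finished`, so the results endpoint is never requested early. `wait_until_finished` and `get_status` are hypothetical names, not the actual client API; the status strings are the openEO batch job statuses.

```python
import time
from typing import Callable


def wait_until_finished(
    get_status: Callable[[], str],
    poll_interval: float = 10.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until the batch job reports status "finished".

    `get_status` is a hypothetical callable that returns the current
    job status string (for example by requesting the job metadata).
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "finished":
            # Only now is it safe to request the (complete) results.
            return
        if status in ("error", "canceled"):
            raise RuntimeError(f"Batch job ended with status {status!r}")
        sleep(poll_interval)
    raise TimeoutError("Batch job did not finish in time")
```

A helper like this would be unaffected by partial listings, since it keys on the job status rather than on whether the results endpoint answers with 200.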

@m-mohr m-mohr self-assigned this Nov 29, 2021
@soxofaan
Member Author

My current thought is to require an explicit opt-in to view partial results, to stay backward compatible (otherwise it would be a breaking change in behavior):

  • default:
    • job is fully finished: 200 with asset listing
    • not finished: 400 JobNotFinished error
  • user/client explicitly enabled partial listing:
    • job is fully finished: 200 with asset listing and a field to indicate that the result is complete
    • not finished: 200 with asset listing and a field to indicate that the result is incomplete

The opt-in could be a request parameter `partial` that is false by default.
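
The four cases above can be sketched as a small decision function. This is only an illustration under assumptions, not the spec or any back-end's implementation: the `Job` class, the `list_results` function, and the `complete` field name are hypothetical; only the `JobNotFinished` error, the 200/400 status codes, and the `partial` parameter come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a back-end's batch job record."""
    status: str  # e.g. "running" or "finished"
    assets: dict = field(default_factory=dict)  # assets produced so far


def list_results(job: Job, partial: bool = False):
    """Return (http_status, body) for a results listing request."""
    finished = job.status == "finished"
    if not finished and not partial:
        # Default behavior stays unchanged: reject unfinished jobs.
        return 400, {"code": "JobNotFinished"}
    body = {"assets": dict(job.assets)}
    if partial:
        # Hypothetical field name; present only when the client opted in.
        body["complete"] = finished
    return 200, body
```

With this shape, a client that never sends `partial` sees exactly the current behavior, which is the backward compatibility argument made above.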

I'm not sure about the 206 status for partial results, as that seems to be more about returning a targeted subrange of the result, chosen by the user request. But I'm not that familiar with how 206 is used in the wild, so I don't have a strong opinion against it either.

@m-mohr
Member

m-mohr commented Nov 29, 2021

Yeah, I think a parameter works for me...

@m-mohr
Member

m-mohr commented Nov 30, 2021

A first draft is now available in PR #433
