job outputs: support single file vs multiple file results better #135

Closed
soxofaan opened this issue Apr 3, 2020 · 4 comments


@soxofaan
Member

soxofaan commented Apr 3, 2020

The current client assumes in several places that a job has a single result, e.g.:

def download_results(self, target: Union[str, pathlib.Path]) -> pathlib.Path:
    """ Download job results."""
    results_url = "/jobs/{}/results".format(self.job_id)
    r = self.connection.get(results_url, expected_status=200)
    links = r.json()["links"]
    if len(links) != 1:
        # TODO handle download of multiple files?
        raise OpenEoClientException("Expected 1 result file to download, but got {c}".format(c=len(links)))

def execute_batch(
        self,
        outputfile: Union[str, pathlib.Path], out_format: str = None,

We should find a way to also support downloading a multi-file result set without complicating the client API too much.
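
One possible shape for the multi-file case, as a rough sketch only. The helper name `download_all_links`, the injected `fetch` callable and the assumption that each entry in `links` is a dict with an `"href"` key are all hypothetical, not part of the client:

```python
import pathlib
from typing import Callable, Dict, List, Union
from urllib.parse import urlparse


def download_all_links(
    links: List[Dict[str, str]],
    target_dir: Union[str, pathlib.Path],
    fetch: Callable[[str], bytes],
) -> List[pathlib.Path]:
    """Download every linked result file into target_dir (hypothetical helper)."""
    target_dir = pathlib.Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, link in enumerate(links):
        # Assumption: each link is a dict with an "href" key.
        href = link["href"]
        # Derive the file name from the URL path; fall back to a numbered name.
        name = pathlib.PurePosixPath(urlparse(href).path).name or "result-{}".format(i)
        path = target_dir / name
        # `fetch` stands in for whatever HTTP call the connection would make.
        path.write_bytes(fetch(href))
        paths.append(path)
    return paths
```

Passing the HTTP call in as `fetch` keeps the sketch independent of the connection class, so it can be tried without a back-end.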

soxofaan added a commit to soxofaan/openeo-python-client that referenced this issue Apr 3, 2020
@soxofaan
Member Author

soxofaan commented Apr 3, 2020

In #128 we already made the difference more explicit in the RESTJob class:

  • RESTJob.download_result("file.tiff") to download a single file (fails when result has multiple files)
  • RESTJob.download_results("folder") to download all files from a result to a folder

I also did some fine-tuning to make it easier for users to do either of these directly from a datacube (which currently only has single-file support in execute_batch()):

  • cube.send_job().start_and_wait().download_result("file.tiff")
  • cube.send_job().start_and_wait().download_results("folder")
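
To illustrate the single-file versus multi-file split described above, here is a minimal stand-in. `FakeJob` is a hypothetical class that keeps its "results" in a dict; it only mirrors the two method names and is not the actual RESTJob implementation:

```python
import pathlib
from typing import Dict, List, Union


class FakeJob:
    """Hypothetical stand-in for a batch job; NOT the real RESTJob class."""

    def __init__(self, files: Dict[str, bytes]):
        # Maps result file name -> content (stands in for the back-end).
        self._files = files

    def download_result(self, target: Union[str, pathlib.Path]) -> pathlib.Path:
        """Single-file variant: fail instead of guessing when there are several files."""
        if len(self._files) != 1:
            raise ValueError(
                "Expected 1 result file, but got {c}".format(c=len(self._files))
            )
        target = pathlib.Path(target)
        (content,) = self._files.values()
        target.write_bytes(content)
        return target

    def download_results(self, target: Union[str, pathlib.Path]) -> List[pathlib.Path]:
        """Multi-file variant: write every result file into the target folder."""
        target = pathlib.Path(target)
        target.mkdir(parents=True, exist_ok=True)
        paths = []
        for name, content in self._files.items():
            path = target / name
            path.write_bytes(content)
            paths.append(path)
        return paths
```

The single-file method takes a file path and the multi-file method takes a folder, so the argument itself signals which behaviour the caller expects.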

@soxofaan
Member Author

Under EP-3739 I added the JobResults and ResultAsset classes to make a clear distinction between the job results (GET /jobs/{job_id}/results) and the individual assets in these results. This gives end users more flexibility to do what works best for them.

Initial documentation: https://open-eo.github.io/openeo-python-client/batch_jobs.html#download-batch-job-results

Recommended flow:

results = job.get_results()
results.download_files("path/to/folder")

If the user knows there is just one asset in the result:

results.download_file("path/to/result.tiff")

@soxofaan
Member Author

ResultAsset also supports load_json and load_bytes to get a result directly in memory, skipping the download to a file.
It should be straightforward to also provide something like load_numpy_array (from a GeoTIFF or NetCDF asset).
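
The in-memory idea can be sketched roughly as follows. `InMemoryAsset` and its `fetch` argument are hypothetical; only the method names load_bytes and load_json come from the comment above, and their real signatures may differ:

```python
import json
from typing import Any, Callable


class InMemoryAsset:
    """Hypothetical stand-in for a result asset; NOT the real ResultAsset class."""

    def __init__(self, href: str, fetch: Callable[[str], bytes]):
        self.href = href
        # Assumption: `fetch` returns the raw bytes behind a URL.
        self._fetch = fetch

    def load_bytes(self) -> bytes:
        """Return the asset content as raw bytes, without touching the disk."""
        return self._fetch(self.href)

    def load_json(self) -> Any:
        """Decode the asset content as UTF-8 JSON."""
        return json.loads(self.load_bytes().decode("utf-8"))
```

A load_numpy_array-style method could follow the same pattern, decoding the bytes with a raster library of choice instead of `json`.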

@soxofaan
Member Author

I think we can close this ticket now
