Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gee example #128

Closed
wants to merge 6 commits into from
Closed

Fix gee example #128

wants to merge 6 commits into from

Conversation

bgoesswe
Copy link
Member

@bgoesswe bgoesswe commented Apr 1, 2020

  • Fixed GEE example for the 1.0.0rc2 version of it.

  • Made some changes in the client to be compliant with 1.0.0rc2.

  • Added the possibility to apply complex apply process graphs.

  • Implemented multiple file result download.

@bgoesswe bgoesswe requested a review from soxofaan April 1, 2020 14:31
@@ -52,7 +52,14 @@ def bands(self) -> List[Band]:

def _get_bands(self) -> List[Band]:
# TODO refactor this specification specific processing away in a subclass?
# First try `properties/eo:bands`
# First try `summaries/eo:bands`
eo_bands = self.get('summaries', 'eo:bands')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, I didn't know about this summaries field as replacement for the properties field.
Maybe we can drop support for properties in favor of summaries. I have no idea if it is necessary to keep supporting that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, also "cube:dimensions" is now one layer up in the dictionary, I updated that in my last commit.

if self._api_version.at_least("1.0.0"):
request = {"process": {"process_graph": graph}}
else:
request = {"process_graph": graph}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference for self: Open-EO/openeo-python-driver#33

"title": title,
"description": description,
"plan": plan,
"budget": budget
}

if self._api_version.at_least("1.0.0"):
pg_meta["process"] = {"process_graph": process_graph}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference for self: Open-EO/openeo-python-driver#33

if isinstance(process, str):
arguments["process_graph"] = PGNode(
process_id=process,
arguments={data_argument: {"from_parameter": "x"}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for 0.4 it has to be from_argument I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes true, done in latest commit

))

def reduce_temporal_simple(self, process_id="max") -> 'DataCube':
def reduce_temporal_simple(self, process_id="max", dim_abbr="temporal") -> 'DataCube':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use dimension instead of dim_abbr, as used in other places in DataCube methods

also, picking the correct dimension should be done automatically instead of manually by user: see #116. Can you put a comment # TODO #116 determine dimension based on datacube metadata here as well?

@@ -711,6 +725,10 @@ def merge(self, other: 'DataCube', overlap_resolver: PGNode = None) -> 'DataCube
'cube2': {'from_node': other._pg},
}
if overlap_resolver:
# TODO: for 1.0.0 support
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add #125 in this TODO line? makes it easier to find 1.0 related TODOs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or wait: as far as I understand the spec, there is no need for additional process level here. I think this is correct:

...
    "overlap_resolver": {
        "process_graph": {
            "max": { "process_id": "max" ....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, I removed the comments

process_id=func,
arguments={"data": {"from_parameter": "data"}}
)}
# TODO: Might be necessary for 1.0.0rc2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above: from my understanding of spec this additional "process" level is not necessary

""" Download job results."""
def download_results(self, target: Union[str, pathlib.Path] = None) -> list:
"""
Download job results to the target folder path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a confusing API (especially the interplay between target and how many files there are in the result)

maybe we should provide instead:

  • download_result to download a single result: fails when target is a folder or there are multiple results
  • download_results is more generic: target should always be a folder

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are these two functions in the job class.

return target

file_list = []

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a line target = Path(target or Path.cwd()) here, then you don't have to do if target anymore in the code below

handle.write(block)
return target

file_list = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it more logical to use a "to download" dict (mapping hrefs to local paths) instead of a list file_list?

@soxofaan
Copy link
Member

soxofaan commented Apr 2, 2020

to address all "process"/"process_graph" changes I created dedicated ticket #129

@soxofaan
Copy link
Member

soxofaan commented Apr 2, 2020

Inspired by this PR, I worked out the "process"/"process_graph" issue a bit more in PR #131
Unfortunately this will conflict with this (PR #128 )

If you want @bgoesswe , I can merge #131 and rebase #128 to get a view on the remaining issues

@bgoesswe
Copy link
Member Author

bgoesswe commented Apr 2, 2020

Yes sure that is a good idea.

soxofaan pushed a commit to soxofaan/openeo-python-client that referenced this pull request Apr 3, 2020
soxofaan added a commit to soxofaan/openeo-python-client that referenced this pull request Apr 3, 2020
soxofaan added a commit to soxofaan/openeo-python-client that referenced this pull request Apr 3, 2020
- based on Open-EO#128
- more flexible file/folder handling
- automatically create parent folder
- support multiple result files: `download_results` vs `download_result`
- add unit tests
soxofaan pushed a commit to soxofaan/openeo-python-client that referenced this pull request Apr 3, 2020
soxofaan added a commit to soxofaan/openeo-python-client that referenced this pull request Apr 3, 2020
soxofaan added a commit that referenced this pull request Apr 3, 2020
@soxofaan
Copy link
Member

soxofaan commented Apr 3, 2020

FYI: I just merged a rework of this PR in master

@soxofaan
Copy link
Member

soxofaan commented Apr 3, 2020

Hi @bgoesswe I think I merged now all your changes (with some finetuning here and there) into master (including commit added TUW-UseCase to examples).
Can you check if the your use cases work still on master as expected?

Because of the finetuning, this PR #128 is conflicting, and I think it is best to close it.

If you have more additions it is probably best to start with a new branch starting from master

@soxofaan
Copy link
Member

soxofaan commented Apr 3, 2020

another thing (one of the finetunings I did), is simplification of this pattern:

job = datacube.send_job()
print(job.job_id)
print(job.start_job())
print(job.describe_job())
time.sleep(30)
print(job.download_results("/tmp/"))

you can now just do

job = datacube.send_job()
res = job.start_and_wait().download_results("/tmp")
print(res)

instead of a hardcoded sleep, it will do a poll loop until job is finished

@bgoesswe
Copy link
Member Author

bgoesswe commented Apr 3, 2020

Yes works fine, thanks !

@soxofaan soxofaan deleted the fix_gee_example branch April 15, 2020 15:51
soxofaan added a commit to soxofaan/openeo-python-client that referenced this pull request May 5, 2020
…pectral_bands", "temporal")

- bump version so users can stick to older client version if dimension name checks are too aggressive for their backend
- recommended temporal dimension name is "t"
- recommended bands dimension name is "bands"

other refs: Open-EO#123, Open-EO#132, Open-EO#128
soxofaan added a commit that referenced this pull request May 11, 2020
…, "temporal")

- bump version so users can stick to older client version if dimension name checks are too aggressive for their backend
- recommended temporal dimension name is "t"
- recommended bands dimension name is "bands"

other refs: #123, #132, #128
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants