Merge pull request #12 from Open-EO/add_example
Add example
claxn authored Apr 22, 2020
2 parents dac35bc + e51a282 commit 544c7b0
Showing 6 changed files with 264 additions and 59 deletions.
23 changes: 18 additions & 5 deletions CHANGELOG.rst
@@ -2,9 +2,22 @@
 Changelog
 =========

-Version 0.1
-===========
+Version 1.0.0
+=============

-- Feature A added
-- FIX: nasty bug #1729 fixed
-- add your changes here!
+- Restructuring of graph classes and module setup. The following things changed in terms of the code:
+  - renamed `node.graph` to `node.content`
+  - all operations on a graph (dependencies, ancestors, lineage, ...) now return a subgraph
+  - a graph has two new properties: `ids` and `nodes`. `ids` are the node IDs and `nodes` the nodes. Both are views
+  - `nnodes` was removed and can be replaced by calling `len(graph)`
+  - new class method `from_list` converts a list of nodes to a graph
+  - `__getitem__` method in the graph class supports indexing by integer and node ID
+  - `get_node_by_name` method in the graph class returns the first node matching a given name
+  - `nodes_at_same_level` in the graph class was renamed and adapted to `find_siblings` (all nodes having the same parent)
+- Additional tests
+
+
+Version 0.0.1
+=============
+
+- First release for the openEO API 0.4
218 changes: 218 additions & 0 deletions README.md
@@ -0,0 +1,218 @@
# openeo-pg-parser-python

This package parses an *openEO* process graph (JSON) into a traversable Python object (`graph`) that describes process dependencies and contents.


## Installation

### Install miniconda and clone repository

```
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
export PATH="$HOME/miniconda/bin:$PATH"
git clone https://github.com/Open-EO/openeo-pg-parser-python.git
cd openeo-pg-parser-python
```

### Create the conda environment

```
conda env create -f conda_environment.yml
```

### Install package in the conda environment

```
source activate openeo-pg-parser-python
python setup.py install
```

Replace `install` with `develop` if you plan to develop the package further.


## Example

Here, we show how an *openEO* process graph can be translated into a `graph` object.
An example process graph, stored in a file named *"process_graph_example.json"*, is given below:
```json
{
  "s2a": {
    "process_id": "load_collection",
    "process_description": "Loading S2A data.",
    "arguments": {
      "id": "CGS_SENTINEL2_RADIOMETRY_V102_001",
      "spatial_extent": {
        "north": 48.40,
        "south": 47.90,
        "east": 16.84,
        "west": 15.96
      },
      "temporal_extent": ["2017-09-05", "2017-10-01"]
    }
  },
  "ndvi": {
    "process_id": "ndvi",
    "process_description": "Calculate NDVI.",
    "arguments": {
      "data": {"from_node": "s2a"},
      "name": "ndvi"
    }
  },
  "min_time": {
    "process_id": "reduce",
    "process_description": "Take the minimum value in the time series.",
    "arguments": {
      "data": {"from_node": "ndvi"},
      "dimension": "temporal",
      "reducer": {
        "callback": {
          "process_id": "min",
          "process_description": "Calculate minimum",
          "arguments": {
            "data": {"from_argument": "data"}
          },
          "result": true
        }
      }
    }
  },
  "output": {
    "process_id": "save_result",
    "description": "Save to disk",
    "arguments": {
      "data": {"from_node": "min_time"},
      "format": "Gtiff"
    }
  }
}
```
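The `from_node` entries are what link the processes together. As a rough illustration that uses only the standard library (it is not how the package itself parses the graph), the references could be collected as sketched here. The JSON string is a trimmed copy of the example without the nested callback, and `find_from_nodes` is a name made up for this sketch:

```python
import json

# Trimmed copy of the example: only the fields needed to follow `from_node` links.
process_graph_json = """
{
  "s2a": {"process_id": "load_collection", "arguments": {"id": "CGS_SENTINEL2_RADIOMETRY_V102_001"}},
  "ndvi": {"process_id": "ndvi", "arguments": {"data": {"from_node": "s2a"}}},
  "min_time": {"process_id": "reduce", "arguments": {"data": {"from_node": "ndvi"}, "dimension": "temporal"}},
  "output": {"process_id": "save_result", "arguments": {"data": {"from_node": "min_time"}, "format": "Gtiff"}}
}
"""


def find_from_nodes(value):
    """Recursively collect every `from_node` reference inside a JSON value."""
    if isinstance(value, dict):
        refs = []
        for key, item in value.items():
            if key == "from_node":
                refs.append(item)
            else:
                refs.extend(find_from_nodes(item))
        return refs
    if isinstance(value, list):
        return [ref for item in value for ref in find_from_nodes(item)]
    return []  # plain strings and numbers carry no references


raw_graph = json.loads(process_graph_json)
dependencies = {name: find_from_nodes(node["arguments"]) for name, node in raw_graph.items()}
print(dependencies)
```

Each key maps to the names of the processes it reads from, which is the dependency information the `graph` object exposes in a more convenient form.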
To translate the JSON file into a Python object, use:
```python
from openeo_pg_parser_python.translate import translate_process_graph

pg_filepath = "process_graph_example.json"
process_graph = translate_process_graph(pg_filepath)
```
Printing the resulting `graph` object (`print(process_graph)`) shows the information contained in each node:
```
Node ID: s2a_0
Node Name: s2a
{'arguments': {'id': 'CGS_SENTINEL2_RADIOMETRY_V102_001',
               'spatial_extent': {'east': 16.84,
                                  'north': 48.4,
                                  'south': 47.9,
                                  'west': 15.96},
               'temporal_extent': ['2017-09-05', '2017-10-01']},
 'process_description': 'Loading S2A data.',
 'process_id': 'load_collection'}
Node ID: ndvi_1
Node Name: ndvi
{'arguments': {'data': {'from_node': 's2a_0'}, 'name': 'ndvi'},
 'process_description': 'Calculate NDVI.',
 'process_id': 'ndvi'}
Node ID: min_time_2
Node Name: min_time
{'arguments': {'data': {'from_node': 'ndvi_1'},
               'dimension': 'temporal',
               'reducer': {'from_node': 'callback_3'}},
 'process_description': 'Take the minimum value in the time series.',
 'process_id': 'reduce'}
Node ID: callback_3
Node Name: callback
{'arguments': {'data': {'from_node': 'ndvi_1'}},
 'process_description': 'Calculate minimum',
 'process_id': 'min',
 'result': True}
Node ID: output_4
Node Name: output
{'arguments': {'data': {'from_node': 'min_time_2'}, 'format': 'Gtiff'},
 'description': 'Save to disk',
 'process_id': 'save_result'}
```
It is also possible to sort the process graph by the dependencies of each node:
```python
sorted_process_graph = process_graph.sort(by='dependency')
```
```
Node ID: s2a_0
Node Name: s2a
{'arguments': {'id': 'CGS_SENTINEL2_RADIOMETRY_V102_001',
               'spatial_extent': {'east': 16.84,
                                  'north': 48.4,
                                  'south': 47.9,
                                  'west': 15.96},
               'temporal_extent': ['2017-09-05', '2017-10-01']},
 'process_description': 'Loading S2A data.',
 'process_id': 'load_collection'}
Node ID: ndvi_1
Node Name: ndvi
{'arguments': {'data': {'from_node': 's2a_0'}, 'name': 'ndvi'},
 'process_description': 'Calculate NDVI.',
 'process_id': 'ndvi'}
Node ID: callback_3
Node Name: callback
{'arguments': {'data': {'from_node': 'ndvi_1'}},
 'process_description': 'Calculate minimum',
 'process_id': 'min',
 'result': True}
Node ID: min_time_2
Node Name: min_time
{'arguments': {'data': {'from_node': 'ndvi_1'},
               'dimension': 'temporal',
               'reducer': {'from_node': 'callback_3'}},
 'process_description': 'Take the minimum value in the time series.',
 'process_id': 'reduce'}
Node ID: output_4
Node Name: output
{'arguments': {'data': {'from_node': 'min_time_2'}, 'format': 'Gtiff'},
 'description': 'Save to disk',
 'process_id': 'save_result'}
```
If you are interested in a specific node, you can use Python indexing:
```python
print(sorted_process_graph['min_time_2'])
```
which results in:
```
Node ID: min_time_2
Node Name: min_time
{'arguments': {'data': {'from_node': 'ndvi_1'},
               'dimension': 'temporal',
               'reducer': {'from_node': 'callback_3'}},
 'process_description': 'Take the minimum value in the time series.',
 'process_id': 'reduce'}
```
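To illustrate how lookup by node ID (and, according to the changelog, by integer position) might work, here is a minimal sketch. `GraphSketch` is a hypothetical stand-in written for this example and is not the package's actual graph class; the node contents are shortened:

```python
class GraphSketch:
    """Hypothetical stand-in for the package's graph class (illustration only)."""

    def __init__(self, nodes):
        # `nodes` is a list of (node_id, content) pairs; dicts keep insertion order.
        self._nodes = dict(nodes)

    def __len__(self):
        return len(self._nodes)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Integer keys are treated as positions in insertion order.
            return list(self._nodes.values())[key]
        # Any other key is treated as a node ID.
        return self._nodes[key]

    @property
    def ids(self):
        # A dict view: it reflects later changes without copying.
        return self._nodes.keys()


sketch = GraphSketch([
    ("s2a_0", {"process_id": "load_collection"}),
    ("ndvi_1", {"process_id": "ndvi"}),
    ("min_time_2", {"process_id": "reduce"}),
])

print(sketch["min_time_2"])  # lookup by node ID
print(sketch[2])             # lookup by position, same node
print(len(sketch), list(sketch.ids))
```

Dispatching on the key type in `__getitem__` keeps a single indexing syntax for both use cases, at the cost of not allowing integers as node IDs.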
A node also offers access to its ancestors/parents/dependencies:
```python
print(sorted_process_graph['min_time_2'].dependencies)
```

```
Node ID: ndvi_1
Node Name: ndvi
{'arguments': {'data': {'from_node': 's2a_0'}, 'name': 'ndvi'},
 'process_description': 'Calculate NDVI.',
 'process_id': 'ndvi'}
Node ID: callback_3
Node Name: callback
{'arguments': {'data': {'from_node': 'ndvi_1'}},
 'process_description': 'Calculate minimum',
 'process_id': 'min',
 'result': True}
```

## Note

This project has been set up using PyScaffold 3.1. For details and usage
information on PyScaffold see https://pyscaffold.org/.
45 changes: 0 additions & 45 deletions README.rst

This file was deleted.

4 changes: 2 additions & 2 deletions src/openeo_pg_parser_python/graph.py
@@ -441,10 +441,10 @@ def sort(self, by='dependency'):
         nodes_ordered = []
         if by == "dependency":
             for node in self.nodes:
-                insert_idx = 0
+                insert_idx = len(nodes_ordered)
                 for node_dependency in node.dependencies:
                     for idx, node_ordered in enumerate(nodes_ordered):
-                        if (idx >= insert_idx) and (node_dependency.id == node_ordered.id):
+                        if (idx <= insert_idx) and (node_dependency.id == node_ordered.id):
                             insert_idx = idx + 1  # place the node after the dependency
                 nodes_ordered.insert(insert_idx, node)
         else:
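
For readers who want to trace the insertion strategy after this change, here is a stand-alone sketch. The `Node` dataclass is a hypothetical stand-in carrying only `id` and `dependencies`, and the dependency lists are read off the README example rather than produced by the parser:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical minimal node: an ID plus the nodes it depends on."""
    id: str
    dependencies: list = field(default_factory=list)


def sort_by_dependency(nodes):
    """Mirror of the loop in the hunk above, outside of the graph class."""
    nodes_ordered = []
    for node in nodes:
        insert_idx = len(nodes_ordered)  # default: append at the end
        for node_dependency in node.dependencies:
            for idx, node_ordered in enumerate(nodes_ordered):
                if idx <= insert_idx and node_dependency.id == node_ordered.id:
                    insert_idx = idx + 1  # place the node after the dependency
        nodes_ordered.insert(insert_idx, node)
    return nodes_ordered


s2a = Node("s2a_0")
ndvi = Node("ndvi_1", [s2a])
callback = Node("callback_3", [ndvi])
min_time = Node("min_time_2", [ndvi, callback])
output = Node("output_4", [min_time])

# Input order as produced by the translation step in the README example.
ordered = sort_by_dependency([s2a, ndvi, min_time, callback, output])
print([node.id for node in ordered])
```

With this input the sketch yields `s2a_0, ndvi_1, callback_3, min_time_2, output_4`, the order asserted in `tests/test_graph.py`: `callback_3` is inserted directly behind `ndvi_1`, ahead of the already placed `min_time_2`.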
14 changes: 7 additions & 7 deletions src/openeo_pg_parser_python/translate.py
@@ -110,7 +110,7 @@ def walk_process_graph(process_graph, nodes, node_ids=None, level=0, prev_level=
 if node_ids:
     filtered_node_ids = [prev_node_id for prev_node_id in node_ids if prev_node_id]
     parent_node = nodes[filtered_node_ids[-1]]
-    edge_nodes = [parent_node, node]
+    edge_nodes = [node, parent_node]  # for a callback the parent node comes after the node
     edge_id = "_".join([edge_node.id for edge_node in edge_nodes])
     edge_name = "callback"
     edge = Edge(id=edge_id, name=edge_name, nodes=edge_nodes)
@@ -302,7 +302,7 @@ def adjust_from_arguments(process_graph):
     for node in process_graph.nodes:
         keys_lineage = find_node_inputs(node.content, "from_argument")
         for key_lineage in keys_lineage:
-            nodes_lineage = process_graph.lineage(node, link="callback")
+            nodes_lineage = process_graph.lineage(node, link="callback", ancestors=False)  # for callbacks the input lineage is inverted
             if nodes_lineage:
                 root_node = nodes_lineage[-1]
                 node_other = root_node.parent('data')
@@ -335,12 +335,12 @@ def adjust_callbacks(process_graph):
"""

     for node in process_graph.nodes:
-        node_descendants = node.descendants(link="callback")
-        if node_descendants:
+        node_ancestors = node.ancestors(link="callback")  # for a callback the lineage is inverted, thus the ancestors
+        if node_ancestors:
             node_result = None
-            for node_descendant in node_descendants:
-                if ("result" in node_descendant.content.keys()) and node_descendant.content['result']:
-                    node_result = node_descendant
+            for node_ancestor in node_ancestors:
+                if ("result" in node_ancestor.content.keys()) and node_ancestor.content['result']:
+                    node_result = node_ancestor
                     break
             if node_result:
                 node.content = replace_callback(node.content, {'from_node': node_result.id})
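
The effect of the `replace_callback` call in the last line can be pictured with the sketch below. The real implementation is not part of this diff, so `replace_callback_sketch` is an assumed, simplified version that swaps any dictionary holding a `callback` key for the given reference; the node content is abbreviated from the README example:

```python
def replace_callback_sketch(content, replacement):
    """Return a copy of `content` in which every {'callback': ...} dict is replaced.

    Assumed behaviour for illustration; not the package's `replace_callback`.
    """
    if isinstance(content, dict):
        if "callback" in content:
            return dict(replacement)  # copy so callers cannot share state by accident
        return {key: replace_callback_sketch(value, replacement) for key, value in content.items()}
    if isinstance(content, list):
        return [replace_callback_sketch(item, replacement) for item in content]
    return content  # scalars are returned unchanged


min_time_content = {
    "process_id": "reduce",
    "arguments": {
        "data": {"from_node": "ndvi_1"},
        "dimension": "temporal",
        "reducer": {"callback": {"process_id": "min", "result": True}},
    },
}

adjusted = replace_callback_sketch(min_time_content, {"from_node": "callback_3"})
print(adjusted["arguments"]["reducer"])  # {'from_node': 'callback_3'}
```

This matches the `'reducer': {'from_node': 'callback_3'}` entry shown for `min_time_2` in the README output, while the input dictionary is left untouched.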
19 changes: 19 additions & 0 deletions tests/test_graph.py
@@ -0,0 +1,19 @@
import os
import unittest
from openeo_pg_parser_python.translate import translate_process_graph

from tests import PG_FOLDER


def test_sort_process_graph():
    """ Tests sorting of a process graph. """

    graph = translate_process_graph(os.path.join(PG_FOLDER, "test_1.json"))
    assert list(graph.ids) == ["s2a_0", "ndvi_1", "min_time_2", "callback_3", "output_4"]

    sorted_graph = graph.sort(by='dependency')
    assert list(sorted_graph.ids) == ["s2a_0", "ndvi_1", "callback_3", "min_time_2", "output_4"]


if __name__ == '__main__':
    unittest.main()
