Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add driver for GeoDataFrame #750

Merged
merged 28 commits into from
Feb 9, 2024
Merged

Add driver for GeoDataFrame #750

merged 28 commits into from
Feb 9, 2024

Conversation

Jaapel
Copy link
Contributor

@Jaapel Jaapel commented Jan 24, 2024

Issue addressed

Adresses for GeoDataFrame #720

Explanation

Since we want to work with multiple sources of data, we seperate functionality that the DataAdapter class was responsible for and split it until multiple classes. I implemented this just for the GeoDataFrame data type within HydroMT for a single driver (PyogrioDriver).
I created a Driver class for loading the data. This class is specific for one DataAdapter. This class should be extendable for plugins and data sources that we do not want to maintain (like custom APIs).
I created a MetaDataResolver class for resolving the URI (think naming convention, but later also a STAC catalog or a GeoNetwork). This resolver is generic for all DataAdapters. This class should be extendable to incorporate different ways of discovering data from a single URI.
I created a DataSource for loading items from the DataCatalog and verifying whether they are correct. These are specific for every DataAdapter.

Checklist

  • Updated tests or added new tests
  • Branch is up to date with v1
  • Tests & pre-commit hooks pass
  • Updated documentation if needed
  • Updated changelog.rst if needed
  • For predefined catalogs: update the catalog version in the file itself, the references in data/predefined_catalogs.yml, and the changelog in data/changelog.rst

Additional Notes (optional)

Add any additional notes or information that may be helpful.

@Jaapel Jaapel added GeoDataFrame issue related to GeoDataFrame adapter or processing V1 labels Jan 24, 2024
@Jaapel Jaapel self-assigned this Jan 24, 2024
Copy link
Contributor

@savente93 savente93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work Jaap, I think the proof of concept looks really nice. I left a lot of comments, many of them pedantic, but they're just suggestions. In general I think this is going to a really good place!

hydromt/data_adapter_v1/geodataframe_adapter.py Outdated Show resolved Hide resolved
hydromt/data_adapter_v1/geodataframe_adapter.py Outdated Show resolved Hide resolved
hydromt/data_adapter_v1/geodataframe_adapter.py Outdated Show resolved Hide resolved
hydromt/typing.py Outdated Show resolved Hide resolved
hydromt/data_sources/data_source.py Show resolved Hide resolved
hydromt/metadata_resolvers/resolver_plugin.py Outdated Show resolved Hide resolved
Comment on lines 33 to 42
def _validate_bbox(
bbox: tuple[float, float, float, float],
) -> tuple[float, float, float, float]:
assert (
bbox[0] < bbox[2]
), f"bbox minx: '{bbox[0]}' should be less than maxx: '{bbox[2]}'."
assert (
bbox[1] < bbox[3]
), f"bbox miny: '{bbox[1]}' should be less than maxy: '{bbox[3]}'."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to validate a crs here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could make a BboxWithProjection type or something and use it.

tests/data_sources/test_geodataframe_data_source.py Outdated Show resolved Hide resolved
tests/drivers/test_pyogrio_driver.py Show resolved Hide resolved
@Jaapel Jaapel marked this pull request as ready for review February 7, 2024 16:10
@Jaapel Jaapel requested a review from savente93 February 8, 2024 14:44
@Jaapel
Copy link
Contributor Author

Jaapel commented Feb 8, 2024

@savente93 I notice that the tests are not passing due to "mamba not found" on windows test nodes. Any ideas?

@savente93
Copy link
Contributor

@Jaapel yeah, I still haven't had time to properly configure the windows tests, that's one me. You can ignore that until I fixed that.

Copy link
Contributor

@savente93 savente93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work Jaap. I didn't look through al lthe tests in detail, but I'm very impressed by them. I think the real proff of the pudding will be in the integration, so for now I think this is great. I left a couple of small comments, but overal I'm really liking the direction this is going in!

hydromt/data_sources/geodataframe.py Show resolved Hide resolved
hydromt/data_sources/data_source.py Show resolved Hide resolved
hydromt/data_sources/geodataframe.py Show resolved Hide resolved

from .geodataframe_driver import GeoDataFrameDriver

GEOM_TYPES = Union[Geom, BaseGeometry]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to typing/type_def.py please (or make a TODO for it)

Comment on lines 214 to 217
geom: GPD_TYPES | None = None,
bbox: Bbox | None = None,
buffer: float = 0.0,
crs: CRS | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.10 syntax, I'm surprised this got through the tests.

crs: Optional[CRS] = None,
predicate: str = "intersects",
logger: Optional[Logger] = None,
# handle_nodata: NoDataStrategy = NoDataStrategy.RAISE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this still be commented out? if so is there a TODO for resolving this?

Comment on lines +32 to +38
if "{" in uri:
uri_expanded = ""
for literal_text, key, fmt, _ in Formatter().parse(uri):
uri_expanded += literal_text
if key is None:
continue
key_str = "{" + f"{key}:{fmt}" + "}" if fmt else "{" + key + "}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally fine to do this later, but I'm wondeirng if we can replace this whole code block with a regex.

Comment on lines +144 to +146
markers = [
"integration: marks tests as being integration tests"
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also want one for unit tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could!

Comment on lines +352 to +371
@pytest.fixture()
def mock_driver(geodf: gpd.GeoDataFrame) -> GeoDataFrameDriver:
class MockGeoDataFrameDriver(GeoDataFrameDriver):
def read(self, *args, **kwargs) -> gpd.GeoDataFrame:
return geodf

driver = MockGeoDataFrameDriver.model_validate({})
return driver


@pytest.fixture()
def mock_resolver() -> MetaDataResolver:
class MockMetaDataResolver(MetaDataResolver):
def resolve(self, uri, *args, **kwargs):
return [uri]

resolver = MockMetaDataResolver.model_validate({})
return resolver


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice!

@Jaapel Jaapel merged commit cf65a1d into v1 Feb 9, 2024
4 of 8 checks passed
@Jaapel Jaapel deleted the add_driver branch February 9, 2024 13:33
@savente93 savente93 linked an issue Apr 8, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GeoDataFrame issue related to GeoDataFrame adapter or processing V1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify data drivers
2 participants