Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve geometry nan support / default ZM support #47034

Closed
wants to merge 7 commits into from

Conversation

m-kuhn
Copy link
Member

@m-kuhn m-kuhn commented Jan 27, 2022

Description

  • We export nan into wkt but did not support parsing the same -> we now support nan for Z and M in wkt parsing
  • When upcasting geometries to Z/M counterparts, 0.0 was hardcoded value. This extends the API to provide default values for these dimensions

@github-actions github-actions bot added this to the 3.24.0 milestone Jan 27, 2022
@rouault
Copy link
Contributor

rouault commented Jan 27, 2022

* We export `nan` into wkt

just to note that this is an extension to the WKT spec and that neither OGR or PostGIS supports reading that.

@m-kuhn
Copy link
Member Author

m-kuhn commented Jan 27, 2022

I am aware of that.
Is there a more correct way to store NULL values for Z into a 3D table?

@3nids

This comment has been minimized.

@3nids

This comment has been minimized.

@m-kuhn

This comment has been minimized.

@3nids

This comment has been minimized.

@nyalldawson
Copy link
Collaborator

just to note that this is an extension to the WKT spec and that neither OGR or PostGIS supports reading that.

And it causes a lot of annoying bugs as a result 😨

@m-kuhn
Copy link
Member Author

m-kuhn commented Jan 28, 2022

This PR does not export more nan. It just brings our parsing capabilities on par with what we export already. So it effectively reduces those bugs. Worst case this is dead code at some point because no one exports nan any more 😅

I'd still be interested in hearing proposals for a more correct z null handling.

@m-kuhn
Copy link
Member Author

m-kuhn commented Jan 29, 2022

@nyalldawson are you ok with this fix or would you prefer a different approach?

@@ -6292,6 +6292,13 @@ def coerce_to_wkt(wkt, type, defaultZ=None, defaultM=None):
QgsWkbTypes.MultiLineString),
['MultiLineString ((1 1, 1 2, 2 2, 1 1),(3 3, 4 3, 4 4, 3 3))'])

# Missing z values
self.assertEqual(coerce_to_wkt('Point (1 1)', QgsWkbTypes.PointZ), ['PointZ (1 1 nan)'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, PostGIS uses 0 as the missing value when forcing dimensions:

strk=# select ST_AsEWKT(ST_Force3DZ('POINT(1 1)'::geometry)) force_z, ST_AsEWKT(ST_Force3DM('POINT(1 1)'::geometry)) force_m;
   force_z    |    force_m    
--------------+---------------
 POINT(1 1 0) | POINTM(1 1 0)
(1 row)

PostGIS will consider geometries with NaN values as invalid:

strk=# select valid, reason, ST_AsEWKT(location)  FROM ST_IsValidDetail('POINT(0 NaN 0)'::geometry);
 valid |       reason       |   st_asewkt    
-------+--------------------+----------------
 f     | Invalid Coordinate | POINT(0 NaN 0)
(1 row)

PostGIS will NOT catch NaN in Z as being invalid (but I've no idea how many warms will be out their cans):

strk=# select valid, reason, ST_AsEWKT(location)  FROM ST_IsValidDetail('POINT(0 0 NaN)'::geometry);
 valid | reason | st_asewkt 
-------+--------+-----------
 t     |        | 
(1 row)

All of the above happens with POSTGIS="3.3.0dev 3.2.0-347-gf26b2ce71"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And Postgis doesn't allow NaN in WKT (at reading):

# SELECT St_AsText(ST_MakePoint(1, 2, 'NaN'));
     st_astext
-------------------
 POINT Z (1 2 NaN)

But:

# SELECT ST_GeomFromText('POINTZ(1 2 NaN)');
ERREUR:  parse error - invalid geometry
ASTUCE : "POINTZ(1 2 NaN)" <-- parse error at position 15 within geometry

# SELECT St_AsText(ST_GeomFromText(ST_AsText(ST_MakePoint(1, 2, 'NaN'))));
ERREUR:  parse error - invalid geometry
ASTUCE : "POINT Z (1 2 NaN)" <-- parse error at position 17 within geometry

@nyalldawson
Copy link
Collaborator

@m-kuhn

are you ok with this fix or would you prefer a different approach?

I've nothing against this fix (since it makes reading more robust), but I do think it was a mistake to introduce nan support in z/m values in the first place and I believe that we should remove support for writing these (at least for the OGR and Postgres providers, possibly other providers too). I've run into many customer issues with datasets/workflows which break where the fault has eventually been tracked down to nan z values in a feature. (On both OGR and postgres datasets, and on custom workflows where geometries are exported as WKT and then sent to some other API).

@nyalldawson
Copy link
Collaborator

@rouault

I believe that we should remove support for writing these (at least for the OGR ... provider)

What's your thoughts here? Is writing nan values to OGR feature coordinates something that is ever safe to do?

@rouault
Copy link
Contributor

rouault commented Jan 30, 2022

What's your thoughts here? Is writing nan values to OGR feature coordinates something that is ever safe to do?

It is difficult to assess globally how all parts of OGR would behave when feed with NaN coordinates... OGRPoint is ready for that for the POINT EMPTY representation from WKB. For other cases, this must be mostly untested.

Perhaps instead of NaN, -DOUBLE_MAX could be used ? The good old shapefile spec says "Floating point numbers must be numeric values. Positive infinity, negative infinity, and
Not-a-Number (NaN) values are not allowed in shapefiles. Nevertheless, shapefiles
support the concept of "no data" values, but they are currently used only for measures.
Any floating point number smaller than –1e38 is considered by a shapefile reader to
represent a "no data" value."

@m-kuhn
Copy link
Member Author

m-kuhn commented Jan 31, 2022

I have also experienced such issues and this is an attempt to improve things.

Keeping nan as some magic value (-DOUBLE_MAX) inside QGIS (and cosmetically improving it e.g. in the node tool) is a good option. It will still have issues with things like calculating the avg height of features and the readability of wkt.

If external interfaces are the main concern here, we could adjust things there

  • In providers: map nan to -DOUBLE_MAX
  • In wkt export: cut dimensions with nan (PointZ(1 2 nan) -> Point(1 2))

This would still allow us to be "correct" inside QGIS but vastly improve the compatibility with the outside world. And if a provider supports a different concept for null z we can adjust the compatibility interface specifically.

@lbartoletti
Copy link
Member

I think there is a lack of information into the OGC spec. But I agree that NaN should only be available for POINT EMPTY and I like the Shapefile spec: no NaN or +/-Inf.

BUT, QGIS doesn't constraints the input of the coordinates. We have started to add constraints, but there is still work to do (and it could break things IIRC).

Today, a QgsPoint will return "EMPTY" if X and Y are NaN:

p = QgsPoint(math.nan, math.nan, math.nan, math.nan, QgsWkbTypes.PointZM)
p.asWkt()
#'PointZM EMPTY'
p.setX(1)
p.asWkt()
#'PointZM EMPTY'
p.setY(2)
p.asWkt()
#'PointZM (1 2 nan nan)'

So, Is the last line should be Point(1 2)? And after:

p.setM(4)
p.asWkt()
#'PointZM (1 2 nan 4)'

Is p should be PointM (1 2 4)?

And for the other geometries?

l = QgsLineString([(1, math.nan), (2, 3)])
l.asWkt()
# 'LineString (1 nan, 2 3)'
l = QgsLineString([(1, 3, math.nan), (2, 3, 5)])
l.asWkt()
# 'LineStringZ (1 3 nan, 2 3 5)'
l = QgsLineString([(math.nan, math.nan), (math.nan, math.nan)])
l.asWkt()
# 'LineString (nan nan, nan nan)'

Is it acceptable? I guess not.

@nyalldawson
Copy link
Collaborator

@m-kuhn

If external interfaces are the main concern here, we could adjust things there
In providers: map nan to -DOUBLE_MAX

I agree that this is a good approach -- then we can adjust individual providers depending on what value makes most sense for that provider (e.g. if a provider DOES support nan z/m we can leave things unchanged).

We could potentially also do the opposite when reading features in for that provider, and eg for OGR provider automatically flip -DOUBLE_MAX z/m values to nan.

In wkt export: cut dimensions with nan (PointZ(1 2 nan) -> Point(1 2))

This one concerns me a little, in that the round trip geom->wkt->geom we result in a different dimensioned geometry. I think a more flexible approach might be to add a flags argument to QgsAbstractGeometry/QgsGeometry::asWkt to allow the caller to control the behavior, with flags for dropping dimensions, using nan, or using -doublemax (and we could default to -doublemax)

@m-kuhn
Copy link
Member Author

m-kuhn commented Feb 1, 2022

That sounds good to me.

Dropping the dimensions might be a bad idea, as @lbartoletti pointed out correctly with LineStringZ(1 2 3, 3 2 nan, 4 5 6) will potentially drop valuable information, so I'd say we just stick to either nan or -double::max.

Thanks everyone for all the good feedback in here.

@github-actions
Copy link

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Feb 16, 2022
@nyalldawson nyalldawson modified the milestones: 3.24.0, 3.26 (Feature) Feb 20, 2022
@nyalldawson nyalldawson removed the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Feb 20, 2022
@github-actions
Copy link

github-actions bot commented Mar 7, 2022

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Mar 7, 2022
@m-kuhn
Copy link
Member Author

m-kuhn commented Feb 17, 2023

I think this is worth merging as it improves reading nan from wkt support.

@github-actions github-actions bot removed the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Feb 17, 2023
@nirvn
Copy link
Contributor

nirvn commented Feb 17, 2023

I would very much like to see this merged for 3.30 (and backported to 3.28). Getting null geometry from a PointMZ( 1 2 nan nan ) is quite bad and this PR fixes that.

Co-authored-by: Denis Rouzaud <denis.rouzaud@gmail.com>
@github-actions
Copy link

github-actions bot commented Mar 4, 2023

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Mar 4, 2023
Copy link
Contributor

@strk strk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, thanks !

@strk
Copy link
Contributor

strk commented Mar 6, 2023

This is the CI issue:

======================================================================
FAIL: testCoerce (__main__.TestQgsGeometry)
Test coerce function
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/QGIS/tests/src/python/test_qgsgeometry.py", line 6426, in testCoerce
    self.assertEqual(coerce_to_wkt('PointZ (1 1 nan)', QgsWkbTypes.PointZM), ['PointZM (1 1 nan nan)'])
AssertionError: Lists differ: [] != ['PointZM (1 1 nan nan)']

Second list contains 1 additional elements.
First extra element 0:
'PointZM (1 1 nan nan)'

- []
+ ['PointZM (1 1 nan nan)']

----------------------------------------------------------------------

Should be fixed before merge.
Please also consider adding more tests with ZM inputs and Z-only and M-only outputs

@github-actions github-actions bot removed the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Mar 6, 2023
@github-actions
Copy link

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Mar 21, 2023
@YoannQDQ YoannQDQ modified the milestones: 3.26 (Feature), 3.32.0 Mar 22, 2023
@github-actions github-actions bot removed the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Mar 23, 2023
@github-actions
Copy link

github-actions bot commented Apr 7, 2023

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale Uh oh! Seems this work is abandoned, and the PR is about to close. label Apr 7, 2023
@github-actions
Copy link

While we hate to see this happen, this PR has been automatically closed because it has not had any activity in the last 21 days. If this pull request should be reconsidered, please follow the guidelines in the previous comment and reopen this pull request. Or, if you have any further questions, just ask! We love to help, and if there's anything the QGIS project can do to help push this PR forward please let us know how we can assist.

@github-actions github-actions bot closed this Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Uh oh! Seems this work is abandoned, and the PR is about to close.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants