-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve geometry nan support / default ZM support #47034
Conversation
just to note that this is an extension to the WKT spec and that neither OGR or PostGIS supports reading that. |
I am aware of that. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
And it causes a lot of annoying bugs as a result 😨 |
This PR does not export more nan. It just brings our parsing capabilities on par with what we export already. So it effectively reduces those bugs. Worst case this is dead code at some point because no one exports nan any more 😅 I'd still be interested in hearing proposals for a more correct z null handling. |
@nyalldawson are you ok with this fix or would you prefer a different approach? |
tests/src/python/test_qgsgeometry.py
Outdated
@@ -6292,6 +6292,13 @@ def coerce_to_wkt(wkt, type, defaultZ=None, defaultM=None): | |||
QgsWkbTypes.MultiLineString), | |||
['MultiLineString ((1 1, 1 2, 2 2, 1 1),(3 3, 4 3, 4 4, 3 3))']) | |||
|
|||
# Missing z values | |||
self.assertEqual(coerce_to_wkt('Point (1 1)', QgsWkbTypes.PointZ), ['PointZ (1 1 nan)']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, PostGIS uses 0 as the missing value when forcing dimensions:
strk=# select ST_AsEWKT(ST_Force3DZ('POINT(1 1)'::geometry)) force_z, ST_AsEWKT(ST_Force3DM('POINT(1 1)'::geometry)) force_m;
force_z | force_m
--------------+---------------
POINT(1 1 0) | POINTM(1 1 0)
(1 row)
PostGIS will consider geometries with NaN values as invalid:
strk=# select valid, reason, ST_AsEWKT(location) FROM ST_IsValidDetail('POINT(0 NaN 0)'::geometry);
valid | reason | st_asewkt
-------+--------------------+----------------
f | Invalid Coordinate | POINT(0 NaN 0)
(1 row)
PostGIS will NOT catch NaN in Z as being invalid (but I've no idea how many warms will be out their cans):
strk=# select valid, reason, ST_AsEWKT(location) FROM ST_IsValidDetail('POINT(0 0 NaN)'::geometry);
valid | reason | st_asewkt
-------+--------+-----------
t | |
(1 row)
All of the above happens with POSTGIS="3.3.0dev 3.2.0-347-gf26b2ce71"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And Postgis doesn't allow NaN in WKT (at reading):
# SELECT St_AsText(ST_MakePoint(1, 2, 'NaN'));
st_astext
-------------------
POINT Z (1 2 NaN)
But:
# SELECT ST_GeomFromText('POINTZ(1 2 NaN)');
ERREUR: parse error - invalid geometry
ASTUCE : "POINTZ(1 2 NaN)" <-- parse error at position 15 within geometry
# SELECT St_AsText(ST_GeomFromText(ST_AsText(ST_MakePoint(1, 2, 'NaN'))));
ERREUR: parse error - invalid geometry
ASTUCE : "POINT Z (1 2 NaN)" <-- parse error at position 17 within geometry
I've nothing against this fix (since it makes reading more robust), but I do think it was a mistake to introduce nan support in z/m values in the first place and I believe that we should remove support for writing these (at least for the OGR and Postgres providers, possibly other providers too). I've run into many customer issues with datasets/workflows which break where the fault has eventually been tracked down to nan z values in a feature. (On both OGR and postgres datasets, and on custom workflows where geometries are exported as WKT and then sent to some other API). |
What's your thoughts here? Is writing nan values to OGR feature coordinates something that is ever safe to do? |
It is difficult to assess globally how all parts of OGR would behave when feed with NaN coordinates... OGRPoint is ready for that for the POINT EMPTY representation from WKB. For other cases, this must be mostly untested. Perhaps instead of NaN, -DOUBLE_MAX could be used ? The good old shapefile spec says "Floating point numbers must be numeric values. Positive infinity, negative infinity, and |
I have also experienced such issues and this is an attempt to improve things. Keeping nan as some magic value (-DOUBLE_MAX) inside QGIS (and cosmetically improving it e.g. in the node tool) is a good option. It will still have issues with things like calculating the avg height of features and the readability of wkt. If external interfaces are the main concern here, we could adjust things there
This would still allow us to be "correct" inside QGIS but vastly improve the compatibility with the outside world. And if a provider supports a different concept for null z we can adjust the compatibility interface specifically. |
I think there is a lack of information into the OGC spec. But I agree that NaN should only be available for BUT, QGIS doesn't constraints the input of the coordinates. We have started to add constraints, but there is still work to do (and it could break things IIRC). Today, a QgsPoint will return "EMPTY" if X and Y are NaN: p = QgsPoint(math.nan, math.nan, math.nan, math.nan, QgsWkbTypes.PointZM)
p.asWkt()
#'PointZM EMPTY'
p.setX(1)
p.asWkt()
#'PointZM EMPTY'
p.setY(2)
p.asWkt()
#'PointZM (1 2 nan nan)' So, Is the last line should be p.setM(4)
p.asWkt()
#'PointZM (1 2 nan 4)' Is p should be And for the other geometries? l = QgsLineString([(1, math.nan), (2, 3)])
l.asWkt()
# 'LineString (1 nan, 2 3)'
l = QgsLineString([(1, 3, math.nan), (2, 3, 5)])
l.asWkt()
# 'LineStringZ (1 3 nan, 2 3 5)'
l = QgsLineString([(math.nan, math.nan), (math.nan, math.nan)])
l.asWkt()
# 'LineString (nan nan, nan nan)' Is it acceptable? I guess not. |
I agree that this is a good approach -- then we can adjust individual providers depending on what value makes most sense for that provider (e.g. if a provider DOES support nan z/m we can leave things unchanged). We could potentially also do the opposite when reading features in for that provider, and eg for OGR provider automatically flip -DOUBLE_MAX z/m values to nan.
This one concerns me a little, in that the round trip geom->wkt->geom we result in a different dimensioned geometry. I think a more flexible approach might be to add a flags argument to QgsAbstractGeometry/QgsGeometry::asWkt to allow the caller to control the behavior, with flags for dropping dimensions, using nan, or using -doublemax (and we could default to -doublemax) |
That sounds good to me. Dropping the dimensions might be a bad idea, as @lbartoletti pointed out correctly with Thanks everyone for all the good feedback in here. |
The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check
|
The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check
|
I think this is worth merging as it improves reading nan from wkt support. |
I would very much like to see this merged for 3.30 (and backported to 3.28). Getting null geometry from a |
Co-authored-by: Denis Rouzaud <denis.rouzaud@gmail.com>
The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, thanks !
This is the CI issue:
Should be fixed before merge. |
The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check
|
The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check
|
While we hate to see this happen, this PR has been automatically closed because it has not had any activity in the last 21 days. If this pull request should be reconsidered, please follow the guidelines in the previous comment and reopen this pull request. Or, if you have any further questions, just ask! We love to help, and if there's anything the QGIS project can do to help push this PR forward please let us know how we can assist. |
Description
nan
into wkt but did not support parsing the same -> we now support nan for Z and M in wkt parsing