Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

st_read() fails to read mixture of XY & XYZ points generated by GPS #1683

Closed
courtiol opened this issue Jun 1, 2021 · 9 comments
Closed

Comments

@courtiol
Copy link

courtiol commented Jun 1, 2021

Hi all, I am experiencing a little problem reading some GPS data with {sf} or {rgdal}.

These GPS points produced by a GPS Garmin 64S (+ the interface GPS Trackmaker Pro) cannot be read out-of-the-box:

> unzip("Map_190627.zip")
> sf::st_read("Map_190627.gtm", layer = "Map_190627_waypoints")
Reading layer `Map_190627_waypoints' from data source `Map_190627.gtm' using driver `GPSTrackMaker'
Error in CPL_get_z_range(unclass(obj)[sel], 0) : 
  z error - expecting three coordinates

The underlying cause is that some points have an elevation attached to them and others do not.

[[133]]
POINT (35.5722 -3.186427)

[[134]]
POINT (34.94196 -3.140191)

[[135]]
POINT Z (35.56935 -3.14396 1743.033)

[[136]]
POINT Z (35.60778 -3.16088 1831.849)

This creates z_range.Set() to crash:

Browse[4]> sys.calls()
[[1]]
sf::st_read("~/Desktop/Map_190627.gtm", layer = "Map_190627_waypoints")

[[2]]
st_read.character("~/Desktop/Map_190627.gtm", layer = "Map_190627_waypoints")

[[3]]
process_cpl_read_ogr(x, quiet, check_ring_dir = check_ring_dir, 
    stringsAsFactors = stringsAsFactors, geometry_column = geometry_column, 
    ...)

[[4]]
st_sfc(geom[[i]], crs = attr(geom[[i]], "crs"))

[[5]]
z_range.Set(lst)

This is because the call is wrapped within the following test:

sf/R/sfc.R

Lines 102 to 110 in 95255c3

u <- unique(sfg_classes[1L,])
if( "XYZM" %in% u ) {
attr(lst, "z_range") = compute_z_range(lst)
attr(lst, "m_range") = compute_m_range(lst)
} else if ( "XYZ" %in% u ) {
attr(lst, "z_range") = compute_z_range(lst)
} else if ("XYM" %in% u ) {
attr(lst, "m_range") = compute_m_range(lst)
}

We have 2 classes here:

Browse[2]> u <- unique(sfg_classes[1L, ])
Browse[2]> sfg_classes[1L, ]
  [1] "XYZ" "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY" 
 [27] "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY" 
 [53] "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY" 
 [79] "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY" 
[105] "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY"  "XY" 
[131] "XY"  "XY"  "XY"  "XY"  "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ"
[157] "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ"
[183] "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ" "XYZ"
[209] "XYZ" "XYZ"
Browse[2]> u
[1] "XYZ" "XY" 

But compute_z_range() cannot handle that. Ultimately, the crash happens in C after the following R call:

.Call("_sf_CPL_get_z_range", PACKAGE = "sf", sf, depth)

And in C there is a check for dimensions:

sf/src/zm_range.cpp

Lines 43 to 54 in 95255c3

case 0: // points:
for (decltype(n) i = 0; i < n; i++) {
Rcpp::NumericVector pt = sf[i];
int pos = get_z_position(pt);
if (i == 0) {
bb[0] = pt[pos];
bb[1] = pt[pos];
} else {
bb[0] = std::min(pt[pos],bb[0]);
bb[1] = std::max(pt[pos],bb[1]);
}
}

but the following check cannot be met:

sf/src/zm_range.cpp

Lines 28 to 33 in 95255c3

int get_z_position(Rcpp::NumericMatrix& nm) {
if( nm.ncol() < 3 ) {
Rcpp::stop("z error - expecting three columns;");
}
return 2;
}

I suspect there may be complexity about handling multiple classes.

I am also surprise that the device (or Trackmaker Pro) generates such heterogeneity (perhaps it comes from poor reception...?).

Yet, it would be great to be able to get the XY coordinates for all points, even if that would imply dropping all Z readings.

Perhaps it could be done automatically with a warning displayed, or perhaps an option could be passed all the way up to sf::st_read() could make this possible.

@courtiol
Copy link
Author

courtiol commented Jun 1, 2021

Humm, the user that generated the GPS file just understood the root of the issue: some points in the GPS were recorded by the GPS directly based on the current location, others were entered manually. This does not change the sf issue though.

@edzer
Copy link
Member

edzer commented Jun 1, 2021

Ah, this is #1592. @dcooley would you mind finishing the change suggested there to zm_range.cpp into a PR?

@rsbivand
Copy link
Member

rsbivand commented Jun 1, 2021

Please also note for my GDAL 3.3.0:

> o <- rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints")
Error in ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv,  : 
  Cannot open data source
> o <- sf::st_read("Map_190627.gtm", layer = "Map_190627_waypoints")
Error: Cannot open "/home/rsb/tmp/bigshape/Map_190627.gtm"; The source could be corrupt or not supported. See `st_drivers()` for a list of supported formats.
In addition: Warning message:
In CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  :
  GDAL Error 1: Driver GTM is considered for removal in GDAL 3.5. You are invited to convert any dataset in that format to another more common one .If you need this driver in future GDAL versions, create a ticket at https://github.com/OSGeo/gdal (look first for an existing one first) to explain how critical it is for you (but the GDAL project may still remove it), and to enable it now, set the GDAL_ENABLE_DEPRECATED_DRIVER_GTM configuration option / environment variable to YES

Setting the environment variable before starting R gives:

> Sys.getenv("GDAL_ENABLE_DEPRECATED_DRIVER_GTM")
[1] "YES"
> o <- sf::st_read("Map_190627.gtm", layer = "Map_190627_waypoints")
Reading layer `Map_190627_waypoints' from data source 
  `/home/rsb/tmp/bigshape/Map_190627.gtm' using driver `GPSTrackMaker'
Error in CPL_get_z_range(unclass(obj)[sel], 0) : 
  z error - expecting three coordinates
> o <- rgdal::ogrInfo("Map_190627.gtm", layer = "Map_190627_waypoints")
Error in rgdal::ogrInfo("Map_190627.gtm", layer = "Map_190627_waypoints") : 
  Multiple # dimensions: 2:3

So fixing this should not focus on this driver, which will definitely be dropped fairly soon.

@rsbivand
Copy link
Member

rsbivand commented Jun 1, 2021

rgdal in R-forge rev. 1133 (now building):

> Sys.getenv("GDAL_ENABLE_DEPRECATED_DRIVER_GTM")
[1] "YES"
> o <- rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints", D3_if_2D3D_points=TRUE)
OGR data source with driver: GPSTrackMaker 
Source: "/home/rsb/tmp/bigshape/Map_190627.gtm", layer: "Map_190627_waypoints"
with 210 features
It has 4 fields
Warning messages:
1: In ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv,  :
  Accepting mixed 2D and 3D as 3D
2: In rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints",  :
  Setting any missing third dimension to 0
> o <- rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints", D3_if_2D3D_points=TRUE, missing_3D=-99999)
OGR data source with driver: GPSTrackMaker 
Source: "/home/rsb/tmp/bigshape/Map_190627.gtm", layer: "Map_190627_waypoints"
with 210 features
It has 4 fields
Warning messages:
1: In ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv,  :
  Accepting mixed 2D and 3D as 3D
2: In rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints",  :
  Setting any missing third dimension to -99999
> o1 <- rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints", D3_if_2D3D_points=TRUE, pointDropZ=TRUE)
OGR data source with driver: GPSTrackMaker 
Source: "/home/rsb/tmp/bigshape/Map_190627.gtm", layer: "Map_190627_waypoints"
with 210 features
It has 4 fields
Warning messages:
1: In ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv,  :
  Accepting mixed 2D and 3D as 3D
2: In rgdal::readOGR("Map_190627.gtm", layer = "Map_190627_waypoints",  :
  Dropping mixed third dimension

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Sep 20, 2021
# Please note that **rgdal** will be  retired by the end of 2023, plan
  transition to sf/stars/terra  functions using GDAL and  PROJ at your
  earliest convenience.

# Version 1.5-27 (development, rev. 1149-)
* Correcting logic error in check for MXE UCRT builds (temporary)

# Version 1.5-26 (2021-09-15, rev. 1141-1148)
* Mute use of PROJ CDN for MXE UCRT builds (temporary)

* Run autoupdate on configure.ac to handle obsolete AC_HELP_STRING etc.

# Version 1.5-25 (2021-09-08, rev. 1122-1140)
* Add environment variable access to --with-data-copy by
  PROJ_GDAL_DATA_COPY (r-spatial/sf#1605)

* Adaptations for PROJ 8.

* Handle mixed 2D/3D in readOGR():
  (r-spatial/sf#1683)

* Add tweak for UCRT builds.

* Thin examples for spTransform().

# Version 1.5-23 (2021-02-03, rev. 1120-1121)
* Further fallout after removing valgrind issues.

# Version 1.5-22 (2021-02-02, rev. 1106-1119)
* Attempt to remove further valgrind leak in proj6.cpp: PROJcopyEPSG()
  and in ogr_proj.cpp, both wrongly placed object destructors.

* Modified roundtripping all declared projections in ?project examples
  because some listed projections for PROJ >= 5 provoke valgrind
  leakages by returning very large out-of-scope values for input
  coordinates (0, 0); inversion of these is not attempted; some listed
  projections are not projections.

# Version 1.5-21 (2021-01-27, rev. 1093-1105)
* Suggest **rgeos** to write pre-SFS multipolygon objects to avoid
  unpleasant workaround.

* Try to eliminate current valgrind leaks, starting from
  (r-spatial/gstat#82).

* Try to increase robustness to installation with early PROJ 6
  versions, which often lack functionality found necessary later (for
  example visualization order); the code had assumed that this
  function always was available and behaved as it now does. There are
  now graceful failures when not available.

# Version 1.5-19 (2021-01-05, rev. 1083-1092)
* Dan Baston: raster speedups

* PROJ 7.2.1 includes a bug-fix for `+proj=ob_tran` cases that
  required changes in detection and handling of target/source CRS
  reversal,
  https://lists.osgeo.org/pipermail/proj/2020-December/009999.html

# Version 1.5-18 (2020-10-13, rev. 1071-1082)
* condition `tests/test_enforce_xy.R` on PROJ >= 6 and GDAL >= 3
  (email BDR, I forgot to re-check with PROJ-5.2.0/GDAL-2.2.4).

* Adaptation to EPSG 10 database format started (from PROJ 7.2);
  choose DB columns by name not number in vignette.

# Version 1.5-17 (2020-10-08, rev. 1051-1070)
* `"CRS"` instantiation now prefers PROJ: use
  `rgdal::set_prefer_proj(FALSE)` to return to earlier behaviour. It
  seems that access from C/C++ code to mechanisms in PROJ offers more
  depth than going through GDAL to PROJ. This `"CRS"` instantiation in
  **sp** and **raster**; Proj4 and WKT2 strings may differ depending
  on whether instantiation is straight from PROJ or goes via
  GDAL. Confirmed with multiple reverse dependency checks over almost
  900 CRAN packages.

* By default use PROJ function to extract the source CRS from a
  `"BOUNDCRS"`. When `+towgs84=` is given, PROJ and GDAL see the
  apparent source Proj4 string as implicitly implying a coordinate
  operation transforming to target WGS84, leading to the WKT2
  representation being a `"BOUNDCRS"`, not a `"PROJCRS"` or
  `"GEOGCRS"`, and thus causing misunderstandings later in searching
  for the most accurate coordinate operation for a transformation. May
  be controlled by setting the `get_source_if_boundcrs=` in
  `sp::CRS()` from **sp** 1.4-4 (2020-10-07). Confirmed with multiple
  reverse dependency checks over almost 900 CRAN packages.

* Add support for instantiating from `"OGC:CRS84"` to provide a
  guaranteed GIS/visualization axis order WGS84 instantiator
  (preferred to `"EPSG:4326"`).

* Permit empty string in `SRS_string=` argument to `sp::CRS()` and
  functions called by it.

* Use GDAL `ORSIsProjected()` instead of simply looking for
  `"+proj=longlat"` in the Proj4 string representation where possible.

# Version 1.5-16 (2020-08-07, rev. 1047-1050)
* Typo in C code; use `try()` around Area-of-Interest calculation for
  coordinate operations (email BDR, I forgot to re-check with
  PROJ-5.2.0/GDAL-2.2.4).

# Version 1.5-15 (2020-08-04, rev. 1020-1046)
* Add support for instantiating from `"ESRI:"`.

* Add Area-of-Interest to coordinate operation search (reduces the
  number of candidates found in many cases), and use in
  `rgdal::spTransform()` by default (`use_aoi=FALSE` to suppress);
  illustrate in vignette
  https://cran.r-project.org/web/packages/rgdal/vignettes/CRS_projections_transformations.html.

* Harden to condition on PROJ functions only available from 6.2.0;
  block `"+proj=ob_tran` tests for PROJ 6.0.0-6.1.1.

* Support PROJ CDN https://cdn.proj.org for on-demand download of
  transformation grids if requested by user; document in vignette
  https://cran.r-project.org/web/packages/rgdal/vignettes/CRS_projections_transformations.html.

# Version 1.5-12 (2020-06-26, rev. 1007-1019)
* Further corrections to `configure.ac` for older PROJ/GDAL versions

# Version 1.5-10 (2020-06-09, rev. 991-1006)
* Corrections to `configure.ac` for older PROJ/GDAL versions

# Version 1.5-8 (2020-05-28, rev. 846-990)
* Released to match **sp** 1.4.0 (2020-02-21) to 1.4-2 (2020-05-20)
  following months of development adapting to breaking changes in the
  external libraries used here: PROJ and GDAL; see also
  https://cran.r-project.org/web/packages/sp/news.html.

* Expose `options("rgdal_show_exportToProj4_warnings"="none")` to mute
  Proj4 string degradation warnings.

* Add new vignette
  https://cran.r-project.org/web/packages/rgdal/vignettes/CRS_projections_transformations.html.

* CRAN Windows binary uses PROJ >= 6 and GDAL >= 3

* Add PROJ-based CRS comparison: `compare_CRS()`.

* `project()` and `spTransform()` use WKT2 comment if available,
  fallback to Proj4 representation if not.

* List coordinate operations (based on pyproj code): `list_coordOps()`.

* Add `enforce_xy=` arguments to try to ensure that only
  GIS/visualization axis order is present.

* Add `"CRS"` object comment carrying WKT2 (2019) multiline string
  representation on read operations.

* Use `"CRS"` object comment carrying WKT2 (2019) multiline string
  representation on write operations.
@jacpete
Copy link

jacpete commented Oct 19, 2021

I am just wondering if this issue is still being worked on or if time has let it go by the wayside. If it is still a priority, I am providing another, longer reprex where I encountered this same issue while working with geojson data.

I am trying to grab some data from an ESRI REST API in a geojson format using this URL. I just grab two features as a reprex: OBJECTID's 8003 and 8004.

geojsonURL <- "https://jcgis.jacksongov.org/arcgis/rest/services/Cadastral/Parcel_Viewer_Layers/MapServer/18/query?where=1%3D1&text=&objectIds=8003,8004&time=&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&distance=&units=esriSRUnit_Foot&relationParam=&outFields=&returnGeometry=true&returnTrueCurves=false&maxAllowableOffset=&geometryPrecision=&outSR=&havingClause=&returnIdsOnly=false&returnCountOnly=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&returnZ=false&returnM=false&gdbVersion=&historicMoment=&returnDistinctValues=false&resultOffset=&resultRecordCount=&returnExtentOnly=false&datumTransformation=&parameterValues=&rangeValues=&quantizationParameters=&featureEncoding=esriDefault&f=geojson"

I can get the response with:

response <- httr::content(httr::POST(geojsonURL), as = "text")
response
# [1] "{\"type\":\"FeatureCollection\",\"features\":[{\"type\":\"Feature\",\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-94.194675366432776,39.166084477255353,-100000],[-94.194124464465304,39.166052999913724,-100000],[-94.194008737625296,39.166960647543149,-100000],[-94.194006122634036,39.166981160898921,-100000],[-94.194709654437375,39.167011102033314,-100000],[-94.194775544569055,39.166090202131059,-100000],[-94.194675366432776,39.166084477255353,-100000]]]},\"properties\":{\"name\":\"07-500-02-07-01-0-00-000\"}},{\"type\":\"Feature\",\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-94.195627735946516,39.139730834511894,null],[-94.195600997416108,39.140124809738701,null],[-94.195628419106058,39.140101021511889,null],[-94.195659543360506,39.140082337037093,null],[-94.195693431199146,39.140069320001835,null],[-94.195729060267695,39.140062363114012,null],[-94.195765355681161,39.140061676254334,null],[-94.195801222451692,39.140067280144507,null],[-94.19583557852313,39.140079005722093,null],[-94.195867387415205,39.140096499240876,null],[-94.196138676581825,39.139750789056656,null],[-94.195961086486776,39.139743854574135,null],[-94.195627735946516,39.139730834511894,null]]]},\"properties\":{\"name\":\"07-830-02-16-00-0-00-000\"}}]}"

You can see that 8003 has a numeric Z value (-100000 in all points) and 8004 returns 'null' for the z coordinate. Well if I were to just send this response or the url itself to sf::st_read(), it would get read 8004 as an empty geometry, because the 'null' is not an OGC standard (damnit ESRI...). All coordinate values need to be numeric and only the first 2 long/lat have to be there, z is optional.

sf::st_read(geojsonURL)
sf::st_read(response)
# replacing null geometries with empty geometries
# Simple feature collection with 2 features and 170 fields (with 1 geometry empty)
# Geometry type: POLYGON
# Dimension:     XY, XYZ
# Bounding box:  xmin: -94.194775544569 ymin: 39.1660529999137 xmax: -94.194006122634 ymax: 39.1670111020333
# z_range:       zmin: -1e+05 zmax: -1e+05
# Geodetic CRS:  WGS 84

Luckily enough I can clean up ESRI's encoding mistake for sf to work correctly. Because I can't find a good jsonpath extension built in R, I used {jsonlite} and some lapply's to edit the coordinates as strings. At first I wrote this:

#ESRI will enter null in the vertical/Z value when converting to geoJSON,
#  but that is not standard OGC practice and will cause it to error on read
#  from sf. This sequences through and removes the third value from the
#  coordinate vectors if it is 'null'.
responseEdit <- jsonlite::fromJSON(response, simplifyDataFrame = FALSE, simplifyVector = FALSE,
                               digits = NA)
responseEdit[['features']] <- lapply(responseEdit[['features']], function(feature) {
  feature[['geometry']][['coordinates']] <-
    lapply(feature[['geometry']][['coordinates']], function(featureCoordinate) {
      coordsAsString = jsonlite::toJSON(featureCoordinate, auto_unbox = TRUE, null = 'null', digits = NA)
      if (grepl("\\[-?[0-9]+\\.?[0-9]*,-?[0-9\\.]+,null\\]", coordsAsString)) {
        coordsAsString <- gsub(",null", "", coordsAsString)
      }
      jsonlite::fromJSON(coordsAsString, simplifyDataFrame = FALSE, simplifyVector = FALSE, digits = NA)
    })
  feature
})
responseEditFixed <- as.character(jsonlite::toJSON(responseEdit, auto_unbox = TRUE, null = 'null', digits = NA))
responseEditFixed
# [1] "{\"type\":\"FeatureCollection\",\"features\":[{\"type\":\"Feature\",\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-94.1946753664328,39.1660844772554,-100000],[-94.1941244644653,39.1660529999137,-100000],[-94.1940087376253,39.1669606475431,-100000],[-94.194006122634,39.1669811608989,-100000],[-94.1947096544374,39.1670111020333,-100000],[-94.1947755445691,39.1660902021311,-100000],[-94.1946753664328,39.1660844772554,-100000]]]},\"properties\":{\"name\":\"07-500-02-07-01-0-00-000\"}},{\"type\":\"Feature\",\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[[[-94.1956277359465,39.1397308345119],[-94.1956009974161,39.1401248097387],[-94.1956284191061,39.1401010215119],[-94.1956595433605,39.1400823370371],[-94.1956934311991,39.1400693200018],[-94.1957290602677,39.140062363114],[-94.1957653556812,39.1400616762543],[-94.1958012224517,39.1400672801445],[-94.1958355785231,39.1400790057221],[-94.1958673874152,39.1400964992409],[-94.1961386765818,39.1397507890567],[-94.1959610864868,39.1397438545741],[-94.1956277359465,39.1397308345119]]]},\"properties\":{\"name\":\"07-830-02-16-00-0-00-000\"}}]}"

But what I am effectively doing is making it so the 8003 shape has 3 vector coordinates and the 8004 shape has 2 vector coordinates.
When I ran sf::st_read() I get the error discussed in this issue:

sf::st_read(responseEditFixed)
#Error in CPL_get_z_range(obj, 2) : z error - expecting three columns;

For now I have the coordsAsString <- gsub(",null", "", coordsAsString) set to coordsAsString <- gsub(",null", ",0", coordsAsString) to replace the 'null' with a '0', but its only a temporary workaround until this issue gets fixed.

jacpete added a commit to jacpete/esri2sf that referenced this issue Oct 19, 2021
@dcooley
Copy link
Contributor

dcooley commented Oct 19, 2021

is null a valid coordiante in geojson?

geojsonsf::geojson_sf('{"type":"Point", "coordinates":[1,null]}')

because the 'null' is not an OGC standard

ah, you said this.


In that case, this will work for you

geojsonsf::geojson_sf(responseEditFixed)

@jacpete
Copy link

jacpete commented Oct 19, 2021

I see you confirmed above but I will leave the standards linked below for easier review. I will try out the geojsonsf function. Hadn't heard of that package until today.

All from https://datatracker.ietf.org/doc/html/rfc7946:

It seems the whole array for a geometry can be null, but says nothing about a certain element in the position vector:

A GeoJSON Geometry object of any type other than "GeometryCollection"
has a member with the name "coordinates". The value of the
"coordinates" member is an array. The structure of the elements in
this array is determined by the type of geometry. GeoJSON processors
MAY interpret Geometry objects with empty "coordinates" arrays as null
objects. 3.1

and

A Feature object has a member with the name "geometry". The value of
the geometry member SHALL be either a Geometry object as defined above
or, in the case that the Feature is unlocated, a JSON null value. 3.2

About the Position vector:

It specifies the array is in decimal degrees, but mentions nothing about
allowing a null value.

A position is an array of numbers. There MUST be two or more elements.
The first two elements are longitude and latitude, or easting and
northing, precisely in that order and using decimal numbers. Altitude
or elevation MAY be included as an optional third element. 3.1.1

About the Z value:

This is the reason I set my temp value to 0 (assumed to be on the
ellipsoid if missing)

An OPTIONAL third-position element SHALL be the height in meters above
or below the WGS 84 reference ellipsoid. In the absence of elevation
values, applications sensitive to height or depth SHOULD interpret
positions as being at local ground or sea level. 4

@dcooley
Copy link
Contributor

dcooley commented Oct 19, 2021

If you want to avoid setting the null to zero and would rather omit it, then geojsonsf can handle this

geojson <- '{
"type":"FeatureCollection",
"features":[
 {
  "type":"Feature",
  "geometry":{
    "type":"Point",
    "coordinates":[1,1,1]
    }
  },
{
  "type":"Feature",
  "geometry":{
    "type":"Point",
    "coordinates":[1,1]
    }
  }
]}'

geojsonsf::geojson_sf(geojson)  

Simple feature collection with 2 features and 0 fields
Geometry type: POINT
Dimension:     XY, XYZ
Bounding box:  xmin: 1 ymin: 1 xmax: 1 ymax: 1
z_range:       zmin: 1 zmax: 1
Geodetic CRS:  WGS 84
         geometry
1 POINT Z (1 1 1)
2     POINT (1 1)

@jacpete
Copy link

jacpete commented Oct 19, 2021

So it seems I am still kicking the can with this error. For a little background I am working on the development of a package to read ESRI REST API data to sf objects. However it is common that their REST services have a limit on the number of features you can pull in a single query. For example I am working with an API that has over 300,000 records, but I can only pull 2,000 at a time. The package is set up to iterate through all the chunks and pull the data down one query at a time. Inefficient, I know, but there is a wealth of data stored in ESRI's REST Services. It seems the geojsonsf::geojson_sf() function works great, but with the way my function works, it pulls down the data in batches and then runs an lapply to turn the geojson's to sf dataframes and then uses sf to bind the dataframes together. Example below:

#inputs
url = "https://jcgis.jacksongov.org/arcgis/rest/services/Cadastral/Parcel_Viewer_Layers/MapServer/18"
idsToPull <- 1:100

#Split pulls into chunks server can handle
maxRC <- 24 #retrieved from server (max record count)
idSplits <- split(idsToPull, seq_along(idsToPull) %/% maxRC)

#Get list of geojson returns
getGeoJson <- function(ids, url) {
  query <- list(objectIds = paste(ids, collapse = ","),
                outSR = 4326,
                datumTransformation = 108190, f = "geojson")
  
  responseRaw <- httr::content(httr::POST(paste(url, "query", sep = "/"), body = query, encode = "form",
                                          config = httr::config(ssl_verifypeer = FALSE)), as = "text")
  
  response <- jsonlite::fromJSON(responseRaw, simplifyDataFrame = FALSE, simplifyVector = FALSE,
                                 digits = NA)
  
  response[['features']] <- lapply(response[['features']], function(feature) {
    feature[['geometry']][['coordinates']] <-
      lapply(feature[['geometry']][['coordinates']], function(featureCoordinate) {
        coordsAsString = jsonlite::toJSON(featureCoordinate, auto_unbox = TRUE, null = 'null', digits = NA)
        if (grepl("\\[-?[0-9]+\\.?[0-9]*,-?[0-9\\.]+,null\\]", coordsAsString)) {
          coordsAsString <- gsub(",null", "", coordsAsString)
        }
        jsonlite::fromJSON(coordsAsString, simplifyDataFrame = FALSE, simplifyVector = FALSE, digits = NA)
      })
    feature
  })
  responseRawFixed <- as.character(jsonlite::toJSON(response, auto_unbox = TRUE, null = 'null', digits = NA))
  responseRawFixed
}

geoJsonList <- lapply(idSplits, getGeoJson, url)


#Format to sf
sfList <- lapply(geoJsonList, function(x) geojsonsf::geojson_sf(x))
sfdf <- do.call('rbind', sfList) 

On my machine, when running this nothing fails until I try to print the final sfdf to console:

> sfdf
Simple feature collection with 100 features and 1 field
Geometry type: GEOMETRY
Dimension:     XY, XYZ
Bounding box:  xmin: -94.60691 ymin: 39.07817 xmax: -94.55062 ymax: 39.10612
z_range:       zmin: -1e+05 zmax: -1e+05
Geodetic CRS:  WGS 84
First 10 features:
  Error in CPL_get_z_range(obj, 2) : z error - expecting three columns;
> print(sfdf)
Simple feature collection with 100 features and 1 field
Geometry type: GEOMETRY
Dimension:     XY, XYZ
Bounding box:  xmin: -94.60691 ymin: 39.07817 xmax: -94.55062 ymax: 39.10612
z_range:       zmin: -1e+05 zmax: -1e+05
Geodetic CRS:  WGS 84
First 10 features:
Error in CPL_get_z_range(obj, 2) : z error - expecting three columns;

Session info:

sessionInfo()
# R version 4.1.1 (2021-08-10)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Pop!_OS 21.04
# 
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
# 
# locale:
# [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
# [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] esri2sf_0.1.1  testthat_3.1.0
# 
# loaded via a namespace (and not attached):
# [1] Rcpp_1.0.7         prettyunits_1.1.1  class_7.3-19       ps_1.6.0           rprojroot_2.0.2    digest_0.6.27      utf8_1.2.2         R6_2.5.1           RSQLite_2.2.8      evaluate_0.14     
# [11] e1071_1.7-8        httr_1.4.2         pillar_1.6.2       rlang_0.4.11       curl_4.3.2         rstudioapi_0.13    callr_3.7.0        blob_1.2.2         rmarkdown_2.11     desc_1.3.0        
# [21] devtools_2.4.2     stringr_1.4.0      selectr_0.4-2      bit_4.0.4          proxy_0.4-26       xfun_0.25          compiler_4.1.1     pkgconfig_2.0.3    pkgbuild_1.2.0     htmltools_0.5.2   
# [31] tidyselect_1.1.1   tibble_3.1.4       fansi_0.5.0        crayon_1.4.1       dplyr_1.0.7        withr_2.4.2        sf_1.0-2           grid_4.1.1         jsonlite_1.7.2     lifecycle_1.0.0   
# [41] DBI_1.1.1          magrittr_2.0.1     units_0.7-2        KernSmooth_2.23-20 cli_3.0.1          stringi_1.7.4      cachem_1.0.6       pbapply_1.5-0      fs_1.5.0           remotes_2.4.0     
# [51] xml2_1.3.2         ellipsis_0.3.2     generics_0.1.0     vctrs_0.3.8        geojsonsf_2.0.1    tools_4.1.1        bit64_4.0.5        glue_1.4.2         purrr_0.3.4        yaml_2.2.1        
# [61] processx_3.5.2     pkgload_1.2.1      parallel_4.1.1     fastmap_1.1.0      sessioninfo_1.1.1  classInt_0.4-3     rvest_1.0.1        memoise_2.0.0      knitr_1.33         usethis_2.0.1     



library(sf)
# Linking to GEOS 3.9.0, GDAL 3.2.2, PROJ 7.2.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants