bug: failure to parse esriFieldTypeBlob (arcpbf) #6

Closed
JWilliamsonArch opened this issue Aug 20, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@JWilliamsonArch

Hi!

I have noticed that there is an {arcgislayers} bug in arc_select() which prevents it from loading and opening valid ArcGIS REST services.

Description
The {arcgislayers} package cannot load ArcGIS REST services with arc_select(). The REST service is valid and opens with other software.

Reproducible Example

library(arcgis)
library(tidyverse)
library(sf)

furl <- "https://geonb.snb.ca/arcgis/rest/services/GeoNB_SNB_Municipal_Information/MapServer/1"

Municipal_Roads <- arc_open(furl) %>% 
  arc_select(.)

Version:
RStudio version 2024.04.2+764 "Chocolate Cosmos"
Running under: Windows 11 Pro

Expected behavior

The features should load into R as sf linestring features.

Actual behavior

The features fail to load, and the following error message is printed:
"Error in multi_resp_process_(resps) :
User function panicked: multi_resp_process_"

@JosiahParry
Collaborator

From what I can tell, there are 10614 features in this dataset. I cannot be sure how detailed they are either.

library(arcgis)

flayer <- arc_open("https://geonb.snb.ca/arcgis/rest/services/GeoNB_SNB_Municipal_Information/MapServer/1")

arc_select(flayer, returnCountOnly = "true") |> sum()

Let's say you have 1000 vertices on average for each of these features. Each coordinate is a double value (64-bit float). Then that comes out to 1.27GB of memory for the geometry alone: fs::fs_bytes(64 * 2 * 1000 * 10614). That doesn't take into account the JSON that is used to transfer the data, the object IDs, etc.
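
A rough check of this arithmetic in base R, keeping the same guessed average of 1000 vertices per feature. One caveat: 64 is the bit width of a double, and each coordinate occupies 8 bytes, so the expression above, read as a byte count, is about eight times the raw coordinate storage (which works out to roughly 162 MiB). The variable names are illustrative, and the estimate ignores any overhead from sf objects or the transfer format.

```r
# Back-of-the-envelope geometry size under the assumptions in the comment:
# 10614 features, ~1000 vertices each (a guess), 2 coordinates per vertex.
n_features <- 10614
vertices_per_feature <- 1000
coords_per_vertex <- 2
bytes_per_double <- 8 # a 64-bit double occupies 8 bytes

geometry_bytes <- n_features * vertices_per_feature * coords_per_vertex * bytes_per_double
geometry_mib <- geometry_bytes / 1024^2

# The same product with 64 (the bit width) in place of 8 is eight times larger.
bits_as_bytes <- n_features * vertices_per_feature * coords_per_vertex * 64
bits_as_bytes_gib <- bits_as_bytes / 1024^3

print(geometry_mib)
print(bits_as_bytes_gib)
```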

So I think what is happening is that you are exhausting your memory and also possibly being rate limited by the feature service.

Do you need the entire feature service in memory? Or can you filter it down and limit the fields that you need?

@JWilliamsonArch
Author

Thanks for your response, and I'm sorry about my late reply.
I used QGIS to download and export this dataset instead.

Discussions about the memory required to handle this dataset seem premature, as vectors like this road network are common in many professional GIS situations. This particular vector dataset is not extraordinarily large, and often, it is necessary to view the whole table.
My bug report was specific, but the problem is general.

Are there any settings in my R environment that I could change that might allow this package to handle this dataset type?
I will also look for other R solutions to this problem.

@elipousson

It looks like there is a specific issue with the "SE_ANNO_CA" column:

library(arcgis)
#> Attaching core arcgis packages:
#> → arcgisutils v0.3.0
#> → arcgislayers v0.3.0.9000
#> → arcgisgeocode v0.2.1
#> → arcgisplaces v0.1.0
library(tidyverse)
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE

furl <- "https://geonb.snb.ca/arcgis/rest/services/GeoNB_SNB_Municipal_Information/MapServer/1"

Municipal_Roads <- arc_open(furl) %>% 
  arc_select(
    fields =  "SE_ANNO_CA"
  )
#> Iterating ■■■■■■                            17% | ETA:  6s
#> Iterating ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  100% | ETA:  0s
#> Error in multi_resp_process_(resps): User function panicked: multi_resp_process_

Created on 2024-09-12 with reprex v2.1.1
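
If one column is the culprit, a possible workaround is to request every field except the blob ones. The sketch below is untested against the live service. It assumes that arcgislayers::list_fields() returns a data frame with `name` and `type` columns; `non_blob_fields()` is a hypothetical helper, and the field names other than SE_ANNO_CA are placeholders rather than the layer's real schema.

```r
# Hypothetical helper: given a data frame of field metadata with `name` and
# `type` columns, return the names of all fields that are not blobs.
non_blob_fields <- function(fields) {
  fields$name[fields$type != "esriFieldTypeBlob"]
}

# Illustrative metadata standing in for the real layer's field list.
# Only SE_ANNO_CA comes from the thread; the other rows are placeholders.
fields <- data.frame(
  name = c("OBJECTID", "SE_ANNO_CA", "Shape"),
  type = c("esriFieldTypeOID", "esriFieldTypeBlob", "esriFieldTypeGeometry"),
  stringsAsFactors = FALSE
)
keep <- non_blob_fields(fields)
print(keep)

# Against the live service this might look like (requires network access):
# flayer <- arcgislayers::arc_open(furl)
# keep <- non_blob_fields(arcgislayers::list_fields(flayer))
# roads <- arcgislayers::arc_select(flayer, fields = keep)
```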

@JosiahParry added the bug label Sep 13, 2024
@JosiahParry
Collaborator

Oh interesting! This is a blob field! This should actually be a pretty easy fix. I genuinely didn't expect to ever run into blob data. We can see below that the blob type isn't handled. This just needs to be captured as a raw vector.

// FieldType::EsriFieldTypeBlob => todo!(),
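
For readers unfamiliar with raw vectors, here is a minimal base R sketch of what "captured as a raw vector" could look like on the R side: a list column with one raw vector per row and NULL for missing blobs. This is an illustration only, not how arcpbf represents blob fields, and the byte values are made up.

```r
# Sketch: a blob column as a list of raw vectors, with NULL for null blobs.
blobs <- list(
  as.raw(c(0x89, 0x50, 0x4e, 0x47)), # some binary payload (made-up bytes)
  NULL,                              # a null blob
  raw(0)                             # an empty, zero-length blob
)

df <- data.frame(id = 1:3)
df$blob <- blobs # stored as a list column, one element per row

# Distinguish null blobs from empty ones
is_null <- vapply(df$blob, is.null, logical(1))
n_bytes <- lengths(df$blob)
```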

@JosiahParry
Collaborator

This is partially addressed in this branch: https://github.com/R-ArcGIS/arcpbf/tree/blob.

I am not able to find any public feature services with non-null blob fields, so I am unsure how to process them. At present this will detect any non-null blob entries and provide a warning message if they are encountered.

library(arcgislayers)

furl <- "https://geonb.snb.ca/arcgis/rest/services/GeoNB_SNB_Municipal_Information/MapServer/1"

arc_open(furl) |> 
  arc_select(
    fields =  "SE_ANNO_CA"
  )
#> Iterating ■■■■■■                            17% | ETA:  6s
#> Iterating ■■■■■■■■■■■                       33% | ETA:  3s
#> Iterating ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  100% | ETA:  0s
#> Simple feature collection with 10614 features and 1 field
#> Geometry type: MULTILINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 2311937 ymin: 7288878 xmax: 2710576 ymax: 7674195
#> Projected CRS: NAD83(CSRS) / New Brunswick Stereographic
#> First 10 features:
#>    SE_ANNO_CA                       geometry
#> 1             MULTILINESTRING ((2537588 7...
#> 2             MULTILINESTRING ((2375380 7...
#> 3             MULTILINESTRING ((2589807 7...
#> 4             MULTILINESTRING ((2493805 7...
#> 5             MULTILINESTRING ((2557756 7...
#> 6             MULTILINESTRING ((2477038 7...
#> 7             MULTILINESTRING ((2627394 7...
#> 8             MULTILINESTRING ((2475265 7...
#> 9             MULTILINESTRING ((2524728 7...
#> 10            MULTILINESTRING ((2425289 7...

@JosiahParry changed the title from "REST Services will not load- {arcgislayers} bug in arc_select" to "bug: failure to parse esriFieldTypeBlob (arcpbf)" Sep 13, 2024
@JosiahParry
Collaborator

Note that {arcpbf} is in its 8th day pending CRAN manual checks. This will only be addressed following a decision from CRAN. Moving to the {arcpbf} repo.

@JosiahParry transferred this issue from R-ArcGIS/arcgislayers Sep 13, 2024
@JWilliamsonArch
Author

I found that this package could pull down other similar-sized datasets, so the problem does seem to be with Esri blobs as a data type.

I also found that the arcpullr package can download this dataset, even with the Esri blob in place, which is worth mentioning for other people running into this issue.

@JosiahParry
Collaborator

It may also be worth noting that arcpullr is orders of magnitude slower than arcgislayers.
It works because in this feature layer the blobs are entirely null, with no actual binary data. arcpullr uses JSON, whereas arcpbf uses protocol buffers, which have roughly 1/10th of the memory footprint. They are also processed in Rust.

@JosiahParry
Collaborator

Thank you @elipousson! I'll try exploring these.

@JosiahParry
Collaborator

These feature services unfortunately return empty strings. Right now the blob branch has handling behavior for these which behaves like so:

``` r
library(arcgislayers)
furl <- "https://services1.arcgis.com/UWYHeuuJISiGmgXx/ArcGIS/rest/services/911_Calls_For_Services_2024/FeatureServer/0"

arc_open(furl) |> 
  arc_select(fields = c("callKey", "HashedRecord"), n_max= 100, crs = 4326) |> 
  head()
#> Blob types not supported.
#> Please report an issue at https://github.com/R-ArcGIS/arcpbf/issues
#> Provide the FeatureService URL if possible
#> Simple feature collection with 6 features and 2 fields (with 6 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> Geodetic CRS:  WGS 84
#>            callKey HashedRecord    geometry
#> 1 10CC9DA965929F12              POINT EMPTY
#> 2 10CC9DA96592A321              POINT EMPTY
#> 3 10CC9DA96592B7B9              POINT EMPTY
#> 4 10CC9DA96592BCBA              POINT EMPTY
#> 5 10CC9DA96592DFE2              POINT EMPTY
#> 6 10CC9DA96592E370              POINT EMPTY
```

Would this suffice, do you think?

@elipousson

Yep! I don't need anything in the blob field – I just need the data and the error in the main branch prevents me from accessing these layers. I'm guessing these are created automatically at some point in the process but I don't know much about how these fields work.

@JosiahParry
Collaborator

Nor do I. I haven't found anyone who does either 😅. I'll merge the branch.

@JosiahParry
Collaborator

JosiahParry commented Oct 29, 2024

Closed by #10
