Merge pull request #156 from digital-land/gs/add-QA-processes
Gs/add qa processes
Showing 7 changed files with 218 additions and 98 deletions.
32 changes: 32 additions & 0 deletions
docs/data-operations-manual/Explanation/Key-Concepts/Endpoint-types.md
# Endpoint URL Types and Plugins

The pipeline can collect data published in a wide range of formats, so the URLs we add as endpoints vary considerably. Broadly, however, endpoints fall into one of two categories:
- Hosted file - usually a URL that ends in something like `.json` or `.csv`
- Standards-compliant web server - usually identifiable by parts of the URL like `MapServer` or `FeatureServer`, or by sections that look like query parameters, like `?service=WFS&version=1.0.0`
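
As a rough illustration, the two categories above could be told apart with a few URL heuristics. This is a minimal sketch, not the pipeline's own logic: the function name `classify_endpoint`, the category labels, and the marker lists are all hypothetical and only encode the hints given in the list above.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical heuristics taken from the hints in the list above.
HOSTED_FILE_SUFFIXES = (".json", ".csv")
SERVER_PATH_MARKERS = ("mapserver", "featureserver")


def classify_endpoint(url: str) -> str:
    """Guess whether a URL is a hosted file or a standards-compliant web server."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    segments = [segment for segment in path.split("/") if segment]
    # Lower-case the query keys and values so "service=WFS" and "SERVICE=wfs" match.
    query = {
        key.lower(): [value.lower() for value in values]
        for key, values in parse_qs(parsed.query).items()
    }

    # Server markers are checked first: a server URL can also end in a file-like suffix.
    if any(segment in SERVER_PATH_MARKERS for segment in segments):
        return "web-server"
    if "wfs" in query.get("service", []):
        return "web-server"
    if path.endswith(HOSTED_FILE_SUFFIXES):
        return "hosted-file"
    return "unknown"
```

A URL that matches neither pattern is reported as `unknown` rather than guessed at, since such endpoints probably need a manual check.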

## Data formats of resources that can be processed
- GeoJSON (preferred for geospatial data because the format mandates the WGS84 Coordinate Reference System)
- CSV text files containing WKT format geometry
- PDF-A
- Shapefiles
- Excel files containing WKT format geometry (xls, xlsx, xlsm, xlsb, odf, ods, odt)
- MapInfo
- Zip files containing MapInfo or Shapefiles
- GML
- GeoPackage
- OGC Web Feature Service
- ESRI ArcGIS REST service output in GeoJSON format

**Hosted files**
These can typically be added as they are. The pipeline reads most common formats and converts them into the CSV format it needs if they are not already supplied as CSV.
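
To give a feel for what such a conversion involves, the sketch below turns a small GeoJSON feature collection into CSV with a WKT geometry column. It is an illustration only and not the pipeline's converter: the function names are hypothetical, the output column name `geometry` is an assumption, and only `Point` geometries are handled.

```python
import csv
import io
import json


def point_to_wkt(geometry: dict) -> str:
    """Render a GeoJSON Point as WKT; other geometry types are out of scope here."""
    if geometry.get("type") != "Point":
        raise ValueError("this sketch only handles Point geometries")
    x, y = geometry["coordinates"][:2]
    return f"POINT ({x} {y})"


def geojson_to_csv(geojson_text: str) -> str:
    """Flatten feature properties into CSV columns plus a WKT 'geometry' column."""
    features = json.loads(geojson_text)["features"]

    # Collect property names in first-seen order so the header is stable.
    fieldnames = []
    for feature in features:
        for key in feature.get("properties") or {}:
            if key not in fieldnames:
                fieldnames.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames + ["geometry"], lineterminator="\n")
    writer.writeheader()
    for feature in features:
        row = dict(feature.get("properties") or {})
        row["geometry"] = point_to_wkt(feature["geometry"])
        writer.writerow(row)  # properties missing from a feature are left blank
    return out.getvalue()
```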

**Web servers**
Web server endpoints usually offer some flexibility in the format the data is returned in. The data provider may have shared a correctly configured URL that returns valid data, or they may have provided only a link to the server's service directory, which does not itself contain data we can process.

E.g. this URL from Canterbury points to a service directory that lists a number of planning-related layers available from their ArcGIS server, rather than to the data itself:
`https://mapping.canterbury.gov.uk/arcgis/rest/services/External/Planning_Constraints_New/MapServer`
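
A service directory URL like this one can often be turned into a data-returning URL by appending a layer ID and a `query` operation. The sketch below assumes the server supports the standard ArcGIS REST `query` parameters `where`, `outFields` and `f=geojson`; the helper name `arcgis_query_url` and the layer ID `0` are hypothetical, so check the service directory for the layer you actually need.

```python
from urllib.parse import urlencode


def arcgis_query_url(service_url: str, layer_id: int) -> str:
    """Build a query URL that asks one layer for all of its features as GeoJSON."""
    params = {
        "where": "1=1",    # match every record
        "outFields": "*",  # return every attribute
        "f": "geojson",    # assumes the server can output GeoJSON
    }
    return f"{service_url.rstrip('/')}/{layer_id}/query?{urlencode(params)}"


# Layer 0 is a placeholder, not a confirmed layer on this server.
url = arcgis_query_url(
    "https://mapping.canterbury.gov.uk/arcgis/rest/services/External/Planning_Constraints_New/MapServer",
    0,
)
```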

Depending on the endpoint, it may be necessary to either **edit the URL** to return valid data, or **use a plugin** to make sure the data is processed correctly. A plugin is typically needed for an API endpoint if the collector needs to paginate (e.g. the ArcGIS API typically limits responses to 1,000 records per fetch) or to strip unnecessary content from the response (e.g. WFS feeds can sometimes contain access timestamps, which can result in a new resource being created each day the collector runs).
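
The two plugin jobs described above can be sketched as follows. This is not the collector's plugin code: all names are hypothetical, the page size of 1,000 comes from the example above, and two further assumptions are made, namely that the timestamp appears as a `timeStamp="..."` attribute in the WFS response and that a resource is identified by a hash of its content.

```python
import hashlib
import re
from typing import Iterator

PAGE_SIZE = 1000  # typical ArcGIS per-fetch limit mentioned above


def page_offsets(total_records: int, page_size: int = PAGE_SIZE) -> Iterator[int]:
    """Yield the record offset for each page needed to fetch every record."""
    return iter(range(0, total_records, page_size))


# Assumption: the access timestamp is an XML attribute named timeStamp.
TIMESTAMP_ATTR = re.compile(rb'\s+timeStamp="[^"]*"')


def strip_timestamp(payload: bytes) -> bytes:
    """Remove the timestamp attribute so identical data yields identical bytes."""
    return TIMESTAMP_ATTR.sub(b"", payload)


def resource_hash(payload: bytes) -> str:
    """Hash the cleaned payload; unchanged data then keeps the same hash every day."""
    return hashlib.sha256(strip_timestamp(payload)).hexdigest()
```

For example, `list(page_offsets(2500))` gives `[0, 1000, 2000]`, i.e. three fetches, and two WFS responses that differ only in their `timeStamp` value produce the same `resource_hash`.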

> [!NOTE]
> Wherever possible, we prefer to collect from a URL that returns data in an open standard such as GeoJSON or WFS, rather than from the ArcGIS service directory page.