Convert NASIS database access from RODBC to DBI/odbc (#149)
* Convert NASIS-related queries from RODBC->DBI #146

* remove require RODBC

* roxygen for fetchNASIS

* Color data NA moist state handling

* use_sqlite = FALSE is the default for .openNASISchannel

* Move VARCHAR(MAX) fields to end of query, per MSSQL specs

* fix getHzErrorsNASIS

* Validations and tests RE: #149 #146

* Move to fetchNASIS tests (skip on remote)

* Add skip() and better handling of missing DB/no data

* Use local_NASIS_defined everywhere w/ odbc::odbcListDataSources

* cherry-pick: make a proper interface to sqlite NASIS queries

cherry-pick: 🧬 Stitch SQLite data source API up to fetchNASIS

cherry-pick: Update demo

cherry-pick: Default code use 25cm, as proposed

Validate/fix NASIS methods

* Change name of path argument

* fetchNASIS: Remove RIGHT JOIN in geomorphic features query and turn off the MSSQL specific syntax used for `d.rf.data.v2`

* Method for cloning local NASIS tables into static SQLite file (#154)

* Method for cloning local NASIS tables into static SQLite file #149

* local_NASIS_defined: add static_path arg and use RSQLite::dbCanConnect

* Add selected set and static path to getHzErrorsNASIS

* get_cosoilmoist / get_vegplot should use selected set argument

* Special handling for local_NASIS_defined (does not take SS arg)

* Add static_path to local_NASIS_defined

* Roxygen: markdown = TRUE

Broken docs

md-doc: Fix single bracket semantics

* Update docs

* forgot to commit this one

* NEWS / version 2.6.x

* Rd2roxygen initial conversion; old in ./manbak

* fix \href{}

* passing check_man

* Rd2roxygen (#162)

* Rd2roxygen initial conversion; old in ./manbak

* fix \href{}

* passing check_man

* fixes for R CMD check

* deprecate old manpages

* Revert existing roxygen back to human-made

* get_extended_data_from_NASIS_db: artifact data query should respect SS=FALSE

* get extended NASIS photo text notes: check for paths >260 chars

* proper sequence to get SQLite pedon snapshot fetchNASIS-ready

* Use checkHzDepthLogic (fast=TRUE) in fetchNASIS_pedons

* .fetchNASIS_pedons: Use data.table for extended data processing

* get_extended_data_from_NASIS_db: Replace plyr::join

* dbQueryNASIS: vectorize/test; dbConnectNASIS: add NASIS() alias

* docs

* Add test for local NASIS DBI issues

* Fix for get_cosoilmoist_from_NASISWebReport example

* Fix for list output of createStaticNASIS

* Close RODBC connection used in tests

* Testing some pedon_table_column checks @jskovlin

* aqp::union has been removed from namespace

* test-fetchKSSL: hide txtProgressBar when running tests

* Refactoring utils.R for data.table in fetchNASIS flattening

* Docs

* Oops DBI/odbc/RSQLite back in Imports

* Missing comma

* Docs

* Mopping up

* Move driver packages (odbc, RSQLite) to Suggests

* get_cotext_from_NASIS_db: Add support for static_path argument and use dbQueryNASIS

* get_cotext_from_NASIS_db: check for try-error

* .formatParentMaterialString: Return NA_character_ for conformal data.frame with NULL data

* Remove dangling require RODBC

* fetchNASIS: extended data flattening handle NULL table contents

* createStaticNASIS: better default arguments

* Fixes for selected set argument  (found by drop all _View_1 tables in a static DB)

* Updates to nasisDBI "demo" that runs all the NASIS methods by all the methods

* Remove requireNamespace("RODBC")--merge artifact?

* demo createStaticNASIS workflow

* Add WCS/SDA viz to demo

* Fix bug in decoding of horizon data; thanks @dylanbeaudette

* Pass through static_path argument to uncode()

* Fix get_comonth_from_NASIS_db fill=TRUE

* Update demos

* Small adjustments to default args for demo/comparisons

* Rename static_path arg to dsn

* Rename static_path arg to dsn (docs)

* Version bump + update README

* Update NEWS.md
brownag authored Mar 23, 2021
1 parent 3761526 commit d9d057d
Showing 205 changed files with 9,822 additions and 3,804 deletions.
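
The commit message above describes replacing `RODBC` calls with the generic `DBI` interface, so that the same query code can target either an ODBC connection or a static SQLite file. The following is a minimal sketch of that query pattern, not the package's implementation: it assumes the `DBI` and `RSQLite` packages are installed, uses an in-memory SQLite database as a stand-in for a static NASIS snapshot, and the table `demo_horizon` and its columns are hypothetical names chosen for illustration.

```r
# Minimal sketch of the DBI query pattern (assumes DBI and RSQLite are installed).
library(DBI)

# Assumption: an in-memory SQLite database stands in for a static NASIS snapshot.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table and columns, for illustration only (not the NASIS schema).
dbWriteTable(con, "demo_horizon", data.frame(
  peiid  = c(1L, 1L, 2L),
  hzname = c("A", "Bt", "A"),
  hzdept = c(0L, 25L, 0L),
  stringsAsFactors = FALSE
))

# One-step form: send the statement and fetch all rows as a data.frame.
hz <- dbGetQuery(
  con,
  "SELECT peiid, hzname, hzdept FROM demo_horizon ORDER BY peiid, hzdept"
)

# Two-step form: send the query, fetch the result, then release the result set.
res <- dbSendQuery(con, "SELECT COUNT(*) AS n FROM demo_horizon")
n <- as.numeric(dbFetch(res)$n)
dbClearResult(res)

# Close the connection when finished.
dbDisconnect(con)
```

Because only the `dbConnect()` call names a driver, swapping `RSQLite::SQLite()` for an ODBC driver object would leave the query code unchanged, which is presumably why the driver packages can live in `Suggests` while `DBI` is in `Imports`.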
12 changes: 6 additions & 6 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: soilDB
Type: Package
Title: Soil Database Interface
Version: 2.6.0
Date: 2021-02-18
Version: 2.6.1
Date: 2021-03-22
Authors@R: c(person(given="Dylan", family="Beaudette", role = c("aut"), email = "dylan.beaudette@usda.gov"),
person(given="Jay", family="Skovlin", role = c("aut")),
person(given="Stephen", family="Roecker", role = c("aut")),
@@ -14,11 +14,11 @@ License: GPL (>= 3)
LazyLoad: yes
Depends: R (>= 3.5.0)
Imports: aqp, grDevices, graphics, stats, utils, plyr, xml2, sp, reshape2,
raster, curl, lattice, methods, data.table
Suggests: rgdal, jsonlite, RODBC, httr, sf, rgeos, rvest,
testthat, latticeExtra,
ggplot2, gridExtra, viridisLite, mapview, rasterVis
raster, curl, lattice, methods, data.table, DBI
Suggests: rgdal, jsonlite, RODBC, httr, sf, rgeos, rvest, odbc, RSQLite,
testthat, latticeExtra, gridExtra, ggplot2, viridisLite, mapview, rasterVis
Repository: CRAN
URL: http://ncss-tech.github.io/AQP/
BugReports: https://github.com/ncss-tech/soilDB/issues
RoxygenNote: 7.1.1
Roxygen: list(markdown = TRUE)
7 changes: 7 additions & 0 deletions NAMESPACE
@@ -16,6 +16,13 @@ importFrom(data.table,
data.table,
as.data.table)

importFrom(DBI,
dbGetQuery,
dbConnect,
dbSendQuery,
dbFetch
)

importFrom(reshape2,
dcast,
melt
8 changes: 7 additions & 1 deletion NEWS.md
@@ -1,7 +1,13 @@
# soilDB 2.6.1 (2021-03-22)
* Connections to the local NASIS database now use `DBI` and `odbc` instead of `RODBC`
* Two new methods `dbConnectNASIS` and `dbQueryNASIS` facilitate access with read-only credentials, submission of queries/fetching of results, and closing the DBI connection upon completion
* Use `dsn` argument to specify a local "static" SQLite file containing NASIS tables
* Default argument `dsn = NULL` uses `"nasis_local"` [ODBC connection](http://ncss-tech.github.io/AQP/soilDB/setup_local_nasis.html) to a local NASIS SQL Server instance

# soilDB 2.6.0 (2021-02-18)
* `OSDquery` gets a new argument (`everything`) for searching the entire document
* `fetchNASIS(..., rmHzErrors=TRUE)` -- spurious removals of data due to missing "extended" records. `fetchNASIS` now uses `aqp::horizons<-` after building a minimal `SoilProfileCollection` from NASIS site and horizon tables. This allows `aqp` integrity methods to trigger where needed--preventing unintentional re-ordering or removals of "valid" horizon data.

# soilDB 2.5.9 (2021-01-26)
* `HenryTimeLine` moved to {sharpshootR} package
* new functions `mukey.wcs()` and `ISSR800.wcs()` for hitting web coverage service (WCS) for gSSURGO, gNATSGO, and ISSR-800 grids
49 changes: 22 additions & 27 deletions R/OSDquery.R
@@ -42,53 +42,50 @@
#' - combine search terms into a single expression: (grano:* | granite)
#'
#' Related documentation can be found in the following tutorials
#' \itemize{
#' \item{\href{http://ncss-tech.github.io/AQP/soilDB/soil-series-query-functions.html}{overview of all soil series query functions}}
#'
#' \item{\href{https://ncss-tech.github.io/AQP/soilDB/competing-series.html}{competing soil series}}
#'
#' \item{\href{https://ncss-tech.github.io/AQP/soilDB/siblings.html}{siblings}}
#' }
#'
#' - [overview of all soil series query functions](http://ncss-tech.github.io/AQP/soilDB/soil-series-query-functions.html)
#' - [competing soil series](https://ncss-tech.github.io/AQP/soilDB/competing-series.html)
#' - [siblings](https://ncss-tech.github.io/AQP/soilDB/siblings.html)
#'
#'
#' @references \url{https://www.nrcs.usda.gov/wps/portal/nrcs/detailfull/soils/home/?cid=nrcs142p2_053587}
#'
#'
#' @author D.E. Beaudette
#'
#'
#' @note SoilWeb maintains a snapshot of the Official Series Description data.
#'
#'
#' @seealso \code{\link{fetchOSD}, \link{siblings}, \link{fetchOSD}}
#'
#'
#' @keywords manip
#'
#'
#' @return a \code{data.frame} object containing soil series names that match patterns supplied as arguments.
#' @export
#'
#' @examples
#'
#'
#'
#'
#' \donttest{
#' if(requireNamespace("curl") &
#' curl::has_internet() &
#' require(aqp)) {
#'
#'
#' # find all series that list Pardee as a geographically associated soil.
#' s <- OSDquery(geog_assoc_soils = 'pardee')
#'
#'
#' # get data for these series
#' x <- fetchOSD(s$series, extended = TRUE, colorState = 'dry')
#'
#'
#' # simple figure
#' par(mar=c(0,0,1,1))
#' plot(x$SPC)
#' }
#' }
#'
OSDquery <- function(everything = NULL, mlra='', taxonomic_class='', typical_pedon='', brief_narrative='', ric='', use_and_veg='', competing_series='', geog_location='', geog_assoc_soils='') {

# check for required packages
if(!requireNamespace('httr', quietly=TRUE) | !requireNamespace('jsonlite', quietly=TRUE))
stop('please install the `httr` and `jsonlite` packages', call.=FALSE)

# sanity checks

# mode selection
@@ -129,27 +126,25 @@ OSDquery <- function(everything = NULL, mlra='', taxonomic_class='', typical_ped
# note: this is the load-balancer
u <- 'https://casoilresource.lawr.ucdavis.edu/osd-search/search-entire-osd.php'
}




# submit via POST
res <- httr::POST(u, body = parameters, encode='form')

# TODO: figure out what an error state looks like
# trap errors, likely related to SQL syntax errors
request.status <- try(httr::stop_for_status(res), silent = TRUE)

# the result is JSON
# should simplify to data.frame nicely
r.content <- httr::content(res, as = 'text', encoding = 'UTF-8')
d <- jsonlite::fromJSON(r.content)

# results will either be: data.frame, empty list, or NULL

# ensure result is either data.frame or NULL
if(inherits(d, 'list') & length(d) < 1)
return(NULL)

return(d)
}

39 changes: 7 additions & 32 deletions R/ROSETTA.R
@@ -96,45 +96,20 @@
#' @details Soil properties supplied in \code{x} must be described, in order, via \code{vars} argument. The API does not use the names but column ordering must follow: sand, silt, clay, bulk density, volumetric water content at 33kPa (1/3 bar), and volumetric water content at 1500kPa (15 bar).
#'
#' The ROSETTA model relies on a minimum of 3 soil properties, with increasing (expected) accuracy as additional properties are included:
#' \itemize{
#' \item{required, sand, silt, clay: }{USDA soil texture separates (percentages) that sum to 100\%}
#' \item{optional, bulk density (any moisture basis): }{mass per volume after accounting for >2mm fragments, units of gm/cm3}
#' \item{optional, volumetric water content at 33 kPa: }{roughly "field capacity" for most soils, units of cm^3/cm^3}
#' \item{optional, volumetric water content at 1500 kPa: }{roughly "permanent wilting point" for most plants, units of cm^3/cm^3}
#' }
#' - required, sand, silt, clay: USDA soil texture separates (percentages) that sum to 100\%
#' - optional, bulk density (any moisture basis): mass per volume after accounting for >2mm fragments, units of gm/cm3
#' - optional, volumetric water content at 33 kPa: roughly "field capacity" for most soils, units of cm^3/cm^3
#' - optional, volumetric water content at 1500 kPa: roughly "permanent wilting point" for most plants, units of cm^3/cm^3
#'
#' Column names not specified in \code{vars} are retained in the output.
#'
#' Three versions of the ROSETTA model are available, selected using \code{v = 1}, \code{v = 2}, or \code{v = 3}.
#'
#' \describe{
#' - version 1 - Schaap, M.G., F.J. Leij, and M.Th. van Genuchten. 2001. ROSETTA: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. Journal of Hydrology 251(3-4): 163-176. doi: \doi{10.1016/S0022-1694(01)00466-8}.
#'
#' \item{version 1}{Schaap, M.G., F.J. Leij, and M.Th. van Genuchten. 2001. ROSETTA: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. Journal of Hydrology 251(3-4): 163-176. doi: \doi{10.1016/S0022-1694(01)00466-8}}.
#' - version 2 - Schaap, M.G., A. Nemes, and M.T. van Genuchten. 2004. Comparison of Models for Indirect Estimation of Water Retention and Available Water in Surface Soils. Vadose Zone Journal 3(4): 1455-1463. doi: \doi{10.2136/vzj2004.1455}.
#'
#' \item{version 2}{Schaap, M.G., A. Nemes, and M.T. van Genuchten. 2004. Comparison of Models for Indirect Estimation of Water Retention and Available Water in Surface Soils. Vadose Zone Journal 3(4): 1455-1463. doi: \doi{10.2136/vzj2004.1455}}.
#'
#'
#' \item{version 3}{Zhang, Y., and M.G. Schaap. 2017. Weighted recalibration of the Rosetta pedotransfer model with improved estimates of hydraulic parameter distributions and summary statistics (Rosetta3). Journal of Hydrology 547: 39-53. doi: \doi{10.1016/j.jhydrol.2017.01.004}}.
#' }
#'
#' @note Input data should not contain columns names that will conflict with the ROSETTA API results: `theta_r`, `theta_s`, `alpha`, `npar`, `ksat`.
#'
#' @return a \code{data.frame} object:
#'
#' \describe{
#'
#' \item{... }{columns present in \code{x}}
#'
#' \item{theta_r: }{residual volumetric water content (cm^3/cm^3)}
#' \item{theta_s: }{saturated volumetric water content (cm^3/cm^3)}
#' \item{alpha:}{related to the inverse of the air entry suction, log10-transformed values with units of cm}
#' \item{npar: }{index of pore size distribution, log10-transformed values with units of 1/cm}
#' \item{ksat: }{saturated hydraulic conductivity, log10-transformed values with units of cm/day}
#'
#' \item{.rosetta.model}{best-available model selection (-1 signifies that prediction was not possible due to missing values in \code{x})}
#' \item{.rosetta.version}{ROSETTA algorithm version, selected via function argument \code{v}}
#'
#' }
#' - version 3 - Zhang, Y., and M.G. Schaap. 2017. Weighted recalibration of the Rosetta pedotransfer model with improved estimates of hydraulic parameter distributions and summary statistics (Rosetta3). Journal of Hydrology 547: 39-53. doi: \doi{10.1016/j.jhydrol.2017.01.004}.
#'
#' @references
#' Consider using the interactive version, with copy/paste functionality at: \url{https://www.handbook60.org/rosetta}.
91 changes: 56 additions & 35 deletions R/SDA-spatial.R
@@ -1,4 +1,3 @@

## chunked queries for large number of records:
# https://github.com/ncss-tech/soilDB/issues/71

@@ -16,22 +15,25 @@
## TODO: geometry collections are not allowed in sp objects..
## TODO: consider moving to sf

#' @title Post-process WKT returned from SDA.


#' Post-process WKT returned from SDA.
#'
#' This is a helper function, commonly used with \code{SDA_query} to extract
#' WKT (well-known text) representation of geometry to an sp-class object.
#'
#' @description This is a helper function, commonly used with \code{SDA_query} to extract WKT (well-known text) representation of geometry to an sp-class object.
#' The SDA website can be found at \url{https://sdmdataaccess.nrcs.usda.gov}.
#' See the [SDA Tutorial](http://ncss-tech.github.io/AQP/soilDB/SDA-tutorial.html) for detailed examples.
#'
#' @param d \code{data.frame} returned by \code{SDA_query}, containing WKT representation of geometry
#' @param d \code{data.frame} returned by \code{SDA_query}, containing WKT
#' representation of geometry
#' @param g name of column in \code{d} containing WKT geometry
#' @param p4s PROJ4 CRS definition, typically GCS WGS84
#'
#' @details The SDA website can be found at \url{https://sdmdataaccess.nrcs.usda.gov}. See the \href{http://ncss-tech.github.io/AQP/soilDB/SDA-tutorial.html}{SDA Tutorial} for detailed examples.
#'
#' @note This function requires the `httr`, `jsonlite`, `XML`, and `rgeos` packages.
#'
#' @author D.E. Beaudette
#'
#' @return A \code{Spatial*} object.
#'
#' @note This function requires the \code{httr}, \code{jsonlite}, \code{XML},
#' and \code{rgeos} packages.
#' @author D.E. Beaudette
#' @export processSDA_WKT
processSDA_WKT <- function(d, g='geom', p4s='+proj=longlat +datum=WGS84') {
# iterate over features (rows) and convert into list of SPDF
p <- list()
@@ -158,36 +160,53 @@ FROM geom_data;
# 10-20x speed improvement over SDA_query_features


#' @title SDA Spatial Query


#' SDA Spatial Query
#'
#' @description Query SDA (SSURGO / STATSGO) records via spatial intersection with supplied geometries. Input can be SpatialPoints, SpatialLines, or SpatialPolygons objects with a valid CRS. Map unit keys, overlapping polygons, or the spatial intersection of \code{geom} + SSURGO / STATSGO polygons can be returned. See details.
#' Query SDA (SSURGO / STATSGO) records via spatial intersection with supplied
#' geometries. Input can be SpatialPoints, SpatialLines, or SpatialPolygons
#' objects with a valid CRS. Map unit keys, overlapping polygons, or the
#' spatial intersection of \code{geom} + SSURGO / STATSGO polygons can be
#' returned. See details.
#'
#' @param geom a Spatial* object, with valid CRS. May contain multiple features.
#' @param what a character vector specifying what to return. 'mukey': \code{data.frame} with intersecting map unit keys and names, \code{geom} overlapping or intersecting map unit polygons
#' @param geomIntersection logical; \code{FALSE}: overlapping map unit polygons returned, \code{TRUE}: intersection of \code{geom} + map unit polygons is returned.
#' @param db a character vector identifying the Soil Geographic Databases
#' ('SSURGO' or 'STATSGO') to query. Option \var{STATSGO} currently works
#' only in combination with \code{what = "geom"}.
#'
#' @return A \code{data.frame} if \code{what = 'mukey'}, otherwise \code{SpatialPolygonsDataFrame} object.
#' Queries for map unit keys are always more efficient vs. queries for
#' overlapping or intersecting (i.e. least efficient) features. \code{geom} is
#' converted to GCS / WGS84 as needed. Map unit keys are always returned when
#' using \code{what = "geom"}.
#'
#' There is a 100,000 record limit and 32Mb JSON serializer limit, per query.
#'
#' SSURGO (detailed soil survey, typically 1:24,000 scale) and STATSGO
#' (generalized soil survey, 1:250,000 scale) data are stored together within
#' SDA. This means that queries that don't specify an area symbol may result in
#' a mixture of SSURGO and STATSGO records. See the examples below and the
#' [SDA Tutorial](http://ncss-tech.github.io/AQP/soilDB/SDA-tutorial.html)
#' for details.
#'
#' @aliases SDA_spatialQuery SDA_make_spatial_query SDA_query_features
#' @param geom a Spatial* object, with valid CRS. May contain multiple
#' features.
#' @param what a character vector specifying what to return. 'mukey':
#' \code{data.frame} with intersecting map unit keys and names, \code{geom}
#' overlapping or intersecting map unit polygons
#' @param geomIntersection logical; \code{FALSE}: overlapping map unit polygons
#' returned, \code{TRUE}: intersection of \code{geom} + map unit polygons is
#' returned.
#' @param db a character vector identifying the Soil Geographic Databases
#' ('SSURGO' or 'STATSGO') to query. Option \var{STATSGO} currently works only
#' in combination with \code{what = "geom"}.
#' @return A \code{data.frame} if \code{what = 'mukey'}, otherwise
#' \code{SpatialPolygonsDataFrame} object.
#' @note Row-order is not preserved across features in \code{geom} and returned
#' object. Use \code{sp::over()} or similar functionality to extract from
#' results. Polygon area in acres is computed server-side when \code{what =
#' 'geom'} and \code{geomIntersection = TRUE}.
#' @author D.E. Beaudette, A.G. Brown, D.R. Schlaepfer
#' @seealso \code{\link{SDA_query}}
#' @keywords manip
#'
#' @aliases SDA_make_spatial_query SDA_query_features
#'
#' @note Row-order is not preserved across features in \code{geom} and returned object. Use \code{sp::over()} or similar functionality to extract from results. Polygon area in acres is computed server-side when \code{what = 'geom'} and \code{geomIntersection = TRUE}.
#'
#'
#' @details Queries for map unit keys are always more efficient vs. queries for overlapping or intersecting (i.e. least efficient) features. \code{geom} is converted to GCS / WGS84 as needed. Map unit keys are always returned when using \code{what = "geom"}.
#'
#' There is a 100,000 record limit and 32Mb JSON serializer limit, per query.
#'
#' SSURGO (detailed soil survey, typically 1:24,000 scale) and STATSGO (generalized soil survey, 1:250,000 scale) data are stored together within SDA. This means that queries that don't specify an area symbol may result in a mixture of SSURGO and STATSGO records. See the examples below and the \href{http://ncss-tech.github.io/AQP/soilDB/SDA-tutorial.html}{SDA Tutorial} for details.
#'
#'
#' @examples
#'
#' \donttest{
#' if(requireNamespace("curl") &
#' curl::has_internet() &
@@ -324,6 +343,8 @@ FROM geom_data;
#' }
#' }
#'
#'
#' @export SDA_spatialQuery
SDA_spatialQuery <- function(geom, what='mukey', geomIntersection=FALSE,
db = c("SSURGO", "STATSGO")) {

40 changes: 39 additions & 1 deletion R/SSURGO_spatial_query.R
@@ -1,5 +1,43 @@

# currently only queries SoilWeb for mapunit-level data


#' Get SSURGO Data via Spatial Query
#'
#' Get SSURGO Data via Spatial Query to SoilWeb
#'
#' Data are currently available from SoilWeb. These data are a snapshot of the
#' "official" data. The snapshot date is encoded in the "soilweb_last_update"
#' column in the function return value. Planned updates to this function will
#' include a switch to determine the data source: "official" data via USDA-NRCS
#' servers, or a "snapshot" via SoilWeb.
#'
#' @param bbox a bounding box in WGS84 geographic coordinates, see examples
#' @param coords a coordinate pair in WGS84 geographic coordinates, see
#' examples
#' @param what data to query, currently ignored
#' @param source the data source, currently ignored
#' @return The data returned from this function will depend on the query style.
#' See examples below.
#' @note This function should be considered experimental; arguments, results,
#' and side-effects could change at any time. SDA now supports spatial queries,
#' consider using \code{\link{SDA_query_features}} instead.
#' @author D.E. Beaudette
#' @keywords manip
#' @examples
#'
#' \donttest{
#' if(requireNamespace("curl") &
#' curl::has_internet()) {
#'
#' # query by bbox
#' SoilWeb_spatial_query(bbox=c(-122.05, 37, -122, 37.05))
#'
#' # query by coordinate pair
#' SoilWeb_spatial_query(coords=c(-121, 38))
#' }
#' }
#'
#' @export SoilWeb_spatial_query
SoilWeb_spatial_query <- function(bbox=NULL, coords=NULL, what='mapunit', source='soilweb') {

# check for required packages