Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SoilProfileCollection minimum requirements and validity checks #137

Closed
2 tasks done
dylanbeaudette opened this issue May 28, 2020 · 4 comments
Closed
2 tasks done

Comments

@dylanbeaudette
Copy link
Member

dylanbeaudette commented May 28, 2020

What constitutes the minimum data required to create a valid SoilProfieCollection?

Should all functions attempt to deal with problematic (NA horizon depths) data, or should some functions require clean data so that code complexity is minimized? In other words, how can we strike a reasonable balance between flexibility and simplicity?

Major Tasks

  • What is the authoritative source of data returned by profile_id? It is currently @horizons but this may be problematic for the proposed horizon-less SPC members [test horizon-less members of an SPC #134].
  • Write functions to check / filter SPCs based on validity constraints.
@dylanbeaudette dylanbeaudette changed the title SoilProfileCollection minimum requirements SoilProfileCollection minimum requirements and validity checks May 28, 2020
@brownag
Copy link
Member

brownag commented May 29, 2020

We have two functions for checking and filtering SPCs for validity constraints as they pertain to horizon data.

  • depthLogic - relative ordering of top and bottom depths
  • sameDepth - matching top and bottom depth in same horizon
  • missingDepth - one or more depths NA
  • overlapOrGap - discontinuities or overlaps in horizon data

I think some of which historically would have precluded creation of SPC in the first place. This hard requirement has been relaxed for as long as I have been using these packages.

  • aqp::hzDepthTests(top, bottom) - use on single profile
  • aqp::checkHzDepthLogic(SPC) - vectorized over all profiles

@dylanbeaudette
Copy link
Member Author

Thanks for the reminder, I had forgotten entirely about those functions. You are correct, the original SPC validity checks would have stopped creation in the presence of:

  • bad hz logic
  • missing depths, top or bottom
  • overlapping horizons

I relaxed all of these checks after we began importing data from NASIS via fetchNASIS back in 2013. I adapted most functions at that time to deal with, or at least attempt to deal with and report on failure, the above violations. Moving forward we will need to decide, at the function level, what can / cannot be tolerated.

I'll review these validation functions and think some more about it.

@brownag
Copy link
Member

brownag commented Jun 15, 2020

Expanding on the above functions that focus on horizon-level validity [within profiles of a collection], the following new function in aqp/aqpdf branch spc_in_sync assesses validity of an entire collection.

Currently this logic DOES rely on the historic assumption that profile IDs in site are unique and match the levels of profile IDs found in horizon. Any deviation from that results in one or more "integrity check" failures.

It is imminently achievable (though, probably difficult to work the bugs out) to provide site indexes in metadata(spc)$target.order that are horizonless through similar means as currently used in this function. I am not sure whether these would be stored separately, or annotated, or simply allowed to exist after checking that they are horizonless.

image

brownag added a commit that referenced this issue Jul 7, 2020
brownag added a commit that referenced this issue Jul 8, 2020
@brownag
Copy link
Member

brownag commented Feb 25, 2024

* [ ]  What is the authoritative source of data returned by `profile_id`? It is currently @horizons but this may be problematic for the proposed horizon-less SPC members [https://github.com/ncss-tech/aqp/issues/134].

I think we have agreed, and many functions rely on the the authoritative source being spc@horizons[[idname(spc)]]. I think this issue can be closed.

It makes more sense to represent sites without horizons with a horizon that has depths missing. Conceptually this recognizes there is a soil being represented, but all information about the layers of that soil is missing. This is easier to account for across all functions than the possibility of sites existing with no corresponding horizon records at all--which would be difficult and error prone, if it is even possible with a "horizon-first" approach to creating SPCs.

There are a variety of methods now available for finding and fixing invalid horizon data--but all rely on the ability to have NA/missing depths, gaps, overlaps, in a SPC object. Not every function needs to be able to use invalid data, but the possibility should either be handled or error informatively.

@brownag brownag closed this as completed Feb 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants