(sort of) fixed priors with multiple covariates. #1

Open
wants to merge 26 commits into
base: master
@acannistra acannistra commented Mar 27, 2017

I've been trying for a while to figure out why the code that was written to give priors on multiple environmental covariates wasn't working, and I think I figured it out today. The purpose of this pull request is to detail what went wrong, and to start a conversation about how we might fix it. I think we should wait to merge this PR until we get a better idea of how to use multiple environmental covariates as priors.

The Problem

When I downloaded this code for the first time, we had functions e.max and e.min defined as

e.max<-function(x) ifelse(x<dat$tmax[spec.k]-10, 1, exp(-(x-dat$tmax[spec.k]+10)/5)) #max  
e.min<-function(x) ifelse(x<dat$tmin[spec.k]   , 0, 1- exp(-(x-(dat$tmax[spec.k])/10000) ) ) #min fix

Then, later, the prior function was:

e.prior= function(x)cbind(1,e.max(x[,2]),e.min(x[,3]) )

I was stuck on this, because nothing looked terribly wrong. The GRaF documentation describes the prior parameter as

An optional R function providing an a priori estimate of the probability of presence
of the species given the covariates. The function must take a dataframe
of environmental covariates (matching x) as input and return a corresponding
vector of the probability of presence. If NULL a flat prior is used which gives the
species’ prevalence as the probability of presence at all sites.

I opened an issue on the GRaF repo (goldingn/GRaF#13) to see if Nick had advice, and he told me that the prior must return a single probability of presence for each row of environmental covariates, rather than a separate estimate of presence for each covariate. This is where our function went awry. Instead of an N-element vector (an (N,1)-shaped matrix), we were returning an (N,3)-shaped matrix using cbind.
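To make the shape mismatch concrete, here is a minimal sketch in base R. The data frame, the `e.max.demo` and `e.min.demo` stand-ins, and their thresholds are all made up for illustration and are not the project's real functions. Collapsing the terms with a product is only one possible way to get a single value per row, and it is an assumption here.

```r
# Hypothetical covariate data frame: column 1 is a placeholder, columns 2 and 3
# stand in for the two temperature covariates read by e.max and e.min.
x <- data.frame(elev = c(100, 200, 300, 400),
                tmax = c(20, 25, 30, 35),
                tmin = c(-5, 0, 5, 10))

# Simplified stand-ins for e.max and e.min (thresholds are invented).
e.max.demo <- function(v) ifelse(v < 25, 0.9, 0.9 * exp(-(v - 25) / 5))
e.min.demo <- function(v) ifelse(v < 0, 0.1, 0.9)

# Shape of the original prior: one column per term, so an N x 3 matrix.
prior.matrix <- function(x) cbind(1, e.max.demo(x[, 2]), e.min.demo(x[, 3]))

# Shape the documentation asks for: one probability per row, so length N.
# Multiplying the terms is an assumption, not a recommendation.
prior.vector <- function(x) e.max.demo(x[, 2]) * e.min.demo(x[, 3])

dim(prior.matrix(x))     # 4 3
length(prior.vector(x))  # 4
```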

The "Solution"

To temporarily fix this problem (without a good understanding of what e.min and e.max are actually trying to accomplish), I simply multiplied (66d0c55#diff-9b16e6b217254fc15823def0d4289364R195) the results of e.min and e.max together, attempting to combine them into a single probability (which treats the two as independent). To my dismay, this didn't solve the problem. The following error still occurred:

Error in while (obj.old - obj > tol & it < itmax) { : 
  missing value where TRUE/FALSE needed

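Several days later (today, 3/27), I discovered that the problem lies in the downstream qnorm function, which returns Inf for qnorm(1) and -Inf for qnorm(0). Those infinite values presumably turn into NaN in later arithmetic, which would explain the missing value in the while condition. This means that the prior can seemingly never have values of exactly 1 or exactly 0.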
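The boundary behaviour of qnorm can be checked in base R. The sketch below is only an illustration of how an infinite value could end up as the "missing value" in a while condition. It does not trace GRaF's actual code path, and the variable names are invented.

```r
# Base R only; no GRaF internals are reproduced here.
qnorm(c(0, 0.5, 1))        # -Inf 0 Inf

# Arithmetic on infinities can yield NaN ...
z <- qnorm(1) - qnorm(1)   # Inf - Inf
is.nan(z)                  # TRUE

# ... and comparing NaN gives NA, which `while` refuses to evaluate.
res <- tryCatch(
  {
    while (z > 1e-6) break
    "ran"
  },
  error = function(e) conditionMessage(e)
)
res  # the error message, not "ran"
```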

Looking at e.min and e.max above, they return exactly these values: e.min returns 0 and e.max returns 1 on one side of their thresholds. I altered them (66d0c55#diff-9b16e6b217254fc15823def0d4289364R190) to use 0.1 and 0.8 in place of 0 and 1, and the code works.

Next Steps

The issue here is that I'm not at all sure whether multiplying the results of e.min and e.max makes any sense. However, this work shows that the prior has to be strictly between 0 and 1 (never exactly 0 or 1), and that it must consume a dataframe with N rows and all environmental covariates and produce an N-length, 1-dimensional vector with the probability of presence given those covariates. That's some progress.
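One way to enforce both constraints without hand-picking values like 0.1 and 0.8 is to clamp the combined prior just inside the open interval. This is a sketch only: `clamp.prob`, `make.prior`, the `eps` value, and the hard-threshold example functions are hypothetical names and choices, and the row-wise product is still an unverified modelling assumption.

```r
# Hypothetical helper: squeeze probabilities into [eps, 1 - eps].
clamp.prob <- function(p, eps = 1e-6) pmin(pmax(p, eps), 1 - eps)

# Hypothetical wrapper: takes per-covariate functions and the columns they
# read, multiplies their outputs row-wise (an assumption), then clamps.
make.prior <- function(funs, cols, eps = 1e-6) {
  function(x) {
    terms <- mapply(function(f, j) f(x[, j]), funs, cols, SIMPLIFY = FALSE)
    clamp.prob(Reduce(`*`, terms), eps)
  }
}

# Example with hard 0/1 thresholds, which would otherwise break qnorm.
x <- data.frame(a = 1, tmax = c(20, 40, 20), tmin = c(-10, 5, 5))
hard.max <- function(v) ifelse(v < 30, 1, 0)
hard.min <- function(v) ifelse(v < 0, 0, 1)

e.prior <- make.prior(list(hard.max, hard.min), c(2, 3))
p <- e.prior(x)

length(p)                 # 3, one value per row
all(is.finite(qnorm(p)))  # TRUE
```

The clamp only keeps qnorm finite. It says nothing about whether the product is the right way to combine the covariates.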
