Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot remove site level attr with SPC$x <- NULL #163

Closed
dylanbeaudette opened this issue Aug 6, 2020 · 5 comments
Closed

cannot remove site level attr with SPC$x <- NULL #163

dylanbeaudette opened this issue Aug 6, 2020 · 5 comments

Comments

@dylanbeaudette
Copy link
Member

library(aqp)
library(cluster)
library(sharpshootR)

data(sp4)
depths(sp4) <- id ~ top + bottom

# simulate some data
horizons(sp4)$bdy <- 4
p <- permute_profile(sp4[1, ], n=10, boundary.attr = 'bdy', min.thickness = 2)

# must remove the original 'id' site level attr for aqp::union() ... another issue to work out

## this doesn't work
# site(p)$id <- NULL

## this does
p@site$id <- NULL

## figure this out later
# z <- union(list(sp4[1:2, ], p))
# join condition resulted in sorting of sites, re-applying original order
# Error in if (nrow(s) != nrow(site.new)) { : argument is of length zero

# this works
z <- union(list(p, sp4[c(1, 6), ]))

plotSPC(z, color = 'K')


d <- profile_compare(z, vars = c('K', 'sand', 'CEC_7'), max_d = 40, k = 0)
dd <- diana(d)

plotProfileDendrogram(z, dd, scaling.factor = 1.5, y.offset = 5, width = 0.3, color = 'K')
@brownag
Copy link
Member

brownag commented Aug 7, 2020

So, a primary problem here was that permute_profile retained id in @site and @horizons ... even though it set the new ID name as pID. I have updated permute_profile and tests to allow for a custom pID. And now the old (only allowed duplicate!!) is removed from horizons as it should be. Another reason that this method should be refactored to use depths<- based logic... but it is fixed for now.

The next problem pertained to the serial dispatch of $<- followed by site<- in the case where it removes a column (rather than adds). This should now be fixed for site<- and horizons<-

brownag added a commit that referenced this issue Aug 7, 2020
@dylanbeaudette
Copy link
Member Author

Thanks for the fixes. site(p)$id <- NULL works as expected, and the new feature in permute_profile to remove the old ID from @horizons is a good idea.

I've noticed something new in union, possibly something that had been lurking but hidden behind a syntax error.

The profile IDs are shuffled in the top figure, associated with union(list(sp4[c(p.idx, spike.idx), ], p)).
image

I suspect making union more robust (#161) will solve the problem, but I wanted to highlight it as shuffling IDs is scary.

New test case.

library(aqp)
library(cluster)
library(sharpshootR)

data(sp4)
depths(sp4) <- id ~ top + bottom

# profile to general realizations of
p.idx <- 1

# spike profile
spike.idx <- 6

# now safely removes id from horizons, thanks @brownag 
# simulate some data
horizons(sp4)$bdy <- 4
p <- permute_profile(sp4[p.idx, ], n=10, boundary.attr = 'bdy', min.thickness = 2)

# this works now, thanks @brownag 
# remove from @site
site(p)$id <- NULL

## Whoa! What?
## IDs shuffled

# this is wrong
z.1 <- union(list(sp4[c(p.idx, spike.idx), ], p))

# this is right
z.2 <- union(list(p, sp4[c(p.idx, spike.idx), ]))

# top version has IDs shuffled, likely bug in union
par(mar=c(0,0,3,0), mfrow=c(2,1))
plotSPC(z.1, color = 'K')
plotSPC(z.2, color = 'K')


## check out the version that worked  
dev.off()

d <- profile_compare(z.2, vars = c('K', 'sand', 'CEC_7'), max_d = 40, k = 0)
dd <- diana(d)
plotProfileDendrogram(z.2, dd, scaling.factor = 1.5, y.offset = 5, width = 0.3, color = 'K')

@brownag
Copy link
Member

brownag commented Aug 7, 2020

Thanks for catching that. This is actually related to the changes I just made to site<- -- not anything with union that we weren't already aware of (assumed slots from first element)

Trying your test case after e64aad7 I obtain
image

I suspected there would be some non-target impacts -- I used your case to make tests to cover this.

The degenerate case where depths<- resorts horizon data, but the site data only contains the idname column was allowing for the column to be re-sorted by bypassing the merge (since you cannot merge something that has no "new" columns). Now there is a check that ensures that will not happen.

@brownag
Copy link
Member

brownag commented Aug 14, 2020

I originally made the above fix within union... but just moved the re-sorting bit into site<- -- to cover the other possible occurences of this instability, They would be easy to make -- and we have a test for it now -- but surely tricky to find in practice since they will arise in specific workflows/programmatic generated site data etc.

@dylanbeaudette
Copy link
Member Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants