
scaling up profileApply #112

Closed
brownag opened this issue Jan 18, 2020 · 7 comments

@brownag (Member) commented Jan 18, 2020

As has long been known, profileApply doesn't scale well.

Below is a first stab at speeding it up; benchmark times are on an old MacBook.

Splitting the operation into several runs of profileApply (producing more, smaller lists) significantly improves performance in the examples below with 1,000 and 10,000 random profiles.
The chunk size is hardcoded at 100 profiles, which seems to help in the thousands-to-tens-of-thousands range. As expected, there is some slight overhead relative to straight profileApply when using only 100 profiles. Not sure what the upper end of this will be, but presumably as the number of chunks gets large, the effective throughput (profiles processed per second) will decrease.

This is run in a single thread, but this layout lends itself reasonably well to adding a parallel computation option.

chunkApply <- function(object, FUN, simplify = FALSE, ...) {
  n <- length(object)
  # assign each profile index to a chunk of roughly 100 profiles
  grp <- sort(1:n %% max(1, round(n / 100))) + 1
  # run profileApply on each chunk, then concatenate the per-chunk results
  res <- do.call('c', lapply(split(1:n, grp), function(idx) {
    profileApply(object[idx, ], FUN, simplify = simplify, ...)
  }))
  # restore per-profile names (c() prefixes them with the chunk group names)
  names(res) <- profile_id(object)
  return(res)
}

library(aqp)

# generate 10,000 random profiles and promote to SPC
foo <- do.call('rbind', lapply(as.list(1:10000), random_profile))
depths(foo) <- id ~ top + bottom

# a "simple" function that returns a "complex" result
simpleFunction <- function(p) data.frame(horizons(p)[2,2:3])

c1 <- system.time(chunkApply(foo[1:100,], simpleFunction))
p1 <- system.time(profileApply(foo[1:100,], simpleFunction, simplify = FALSE))
c2 <- system.time(chunkApply(foo[1:1000,], simpleFunction))
p2 <- system.time(profileApply(foo[1:1000,], simpleFunction, simplify = FALSE))
c3 <- system.time(chunkApply(foo, simpleFunction))
p3 <- system.time(profileApply(foo, simpleFunction, simplify = FALSE))

100 PROFILES
c1 (chunkApply):    user 0.254    system 0.000   elapsed 0.267
p1 (profileApply):  user 0.241    system 0.001   elapsed 0.255

1,000 PROFILES
c2 (chunkApply):    user 2.335    system 0.028   elapsed 2.689
p2 (profileApply):  user 5.057    system 0.024   elapsed 5.596

10,000 PROFILES
c3 (chunkApply):    user 29.734   system 0.236   elapsed 33.839
p3 (profileApply):  user 455.586  system 1.294   elapsed 517.049

@dylanbeaudette (Member)

Excellent, this is almost a drop-in upgrade to profileApply. This change, combined with ideas from #111, would be a nice improvement.

I'll have to think about / research how we can automatically invoke parallelism without bringing in a bunch of additional dependencies. Also, enforcing a unique horizon- or site-level ID via #111 would solve the problem of non-deterministic re-ordering of results returned by parallel processing.
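
A minimal sketch of what that might look like, using only the base-R parallel package (mclapply, so forking on Unix-alikes) and re-keying results by profile ID so output order is deterministic regardless of how chunks are scheduled. The function name and arguments here are illustrative only, not a proposed API:

library(aqp)
library(parallel)

# hypothetical parallel variant of chunkApply; no dependencies beyond base R
# note: mclapply forks, so mc.cores > 1 is not supported on Windows
parallelChunkApply <- function(object, FUN, simplify = FALSE, cores = 2, ...) {
  n <- length(object)
  grp <- sort(1:n %% max(1, round(n / 100))) + 1
  chunks <- mclapply(split(1:n, grp), function(idx) {
    r <- profileApply(object[idx, ], FUN, simplify = simplify, ...)
    # name results within the chunk so the global order can be restored later
    names(r) <- profile_id(object[idx, ])
    r
  }, mc.cores = cores)
  # unname the outer list so c() keeps the per-profile names intact
  res <- do.call('c', unname(chunks))
  # deterministic ordering: index by the original profile IDs
  res[profile_id(object)]
}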

@brownag (Member, Author) commented Jan 18, 2020

I wondered about an argument that takes an optional parallel lapply-like function -- e.g. furrr::future_map() or future.apply::future_lapply(). The default could be base R lapply. In my testing so far, I have seen limited benefit from using these fancy options over good ol' lapply.

And yes, this would combine well with the frameify option in #111.
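
A rough sketch of that argument: a pluggable lapply-like backend that defaults to base lapply (names and details here are illustrative only, not the final interface):

# illustrative only: chunkApply with a user-supplied lapply-like backend
chunkApply2 <- function(object, FUN, simplify = FALSE, APPLY.FUN = lapply, ...) {
  n <- length(object)
  grp <- sort(1:n %% max(1, round(n / 100))) + 1
  res <- do.call('c', APPLY.FUN(split(1:n, grp), function(idx) {
    profileApply(object[idx, ], FUN, simplify = simplify, ...)
  }))
  names(res) <- profile_id(object)
  return(res)
}

# default: plain single-threaded lapply
# chunkApply2(foo, simpleFunction)

# swap in a parallel backend, e.g.:
# future::plan(future::multisession)
# chunkApply2(foo, simpleFunction, APPLY.FUN = future.apply::future_lapply)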

@brownag (Member, Author) commented Jan 18, 2020

chunkApply vs. profileApply scaling comparison:

library(aqp)

foo <- do.call('rbind', lapply(as.list(1:100000), random_profile))
depths(foo) <- id ~ top + bottom

simpleFunction <- function(p) data.frame(horizons(p)[2,2:3])

# benchmark at increasing numbers of profiles: 100-900, 1,000-10,000, 10,000-100,000
idx <- c(seq(100, 900, 100), seq(1000, 10000, 1000), seq(10000, 100000, 10000))
idx.sub <- idx[idx <= 100000]

# time profileApply at each step
res <- do.call('rbind', lapply(as.list(idx.sub), function(i) {
  system.time(profileApply(foo[1:i, ], simpleFunction, simplify = FALSE))
}))

# time chunkApply at each step
res2 <- do.call('rbind', lapply(as.list(idx.sub), function(i) {
  system.time(chunkApply(foo[1:i, ], simpleFunction))
}))

plot(res[,3]~idx.sub, type="l", lwd=2, main="Time to *Apply n Profiles",
     xlab="Number of Profiles",ylab="Time, seconds")
lines(res2[,3]~idx.sub, col="GREEN", lwd=2)
legend('topleft', legend = c("profileApply","chunkApply"), lty=1, lwd=2, col=c("BLACK","GREEN"))

@dylanbeaudette (Member)

Sweet! Much better than invoking parallel voodoo. It looks like this is a drop-in addition to profileApply that wouldn't require any additional work by the operator, other than optional manual adjustment of the chunk size.

@dylanbeaudette (Member)

Pending a couple more tests, this is just about ready to go. Testing on 10k profiles (see the usage sketch after the timings):

  • 60 seconds using a single chunk
  • 14 seconds using chunk.size=100
  • 17 seconds using chunk.size=1000
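
For context, a usage sketch under the assumption that the chunk size is exposed through a chunk.size argument, as in the timings above (the released argument name and default may differ):

# assumes a chunk.size argument as referenced in the timings above
system.time(profileApply(foo[1:10000, ], simpleFunction, simplify = FALSE, chunk.size = 100))
system.time(profileApply(foo[1:10000, ], simpleFunction, simplify = FALSE, chunk.size = 1000))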

@dylanbeaudette (Member)

Note that the profiling done above includes the additional overhead of [-subsetting SPC objects, although it is clearly not a large portion of the total time.
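
A quick, hypothetical way to gauge that overhead in isolation, reusing the foo SPC and chunking scheme from above but skipping the user function entirely:

# time only the per-chunk [-subsetting, with no user FUN applied
n <- 10000
grp <- sort(1:n %% max(1, round(n / 100))) + 1
system.time(invisible(lapply(split(1:n, grp), function(idx) foo[idx, ])))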

@brownag (Member, Author) commented Jan 21, 2020

Sahweet. With b4c171f I think my last dangling items on this issue are resolved and it can be closed.

I had made a note about the sort option needing to be FALSE, but didn't get around to changing it over the weekend. And good catch with stringsAsFactors -- that one always gets me.

brownag closed this as completed Jan 21, 2020