-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimizing various SoilProfileCollection methods #135
Comments
[i,]
|
Here we see notable improvements (relative to above) from optimizing various parts of heavily used methods following above 2 commits (on ncss-tech/aqp@mollic branch) -- just by being careful to re-use variables and not invoke unnecessary calls to
And with profiling (with same call as above post, generating 100,000 ids (between 1 and 10000) for a n=10,000 SPC) we complete in approximately half the time. This is not a fair comparison because the previous These revisions do not rely on any new package dependencies. There are additional efficiencies that could be gained using All tests currently pass. A major change I made is to what the EDIT: proper benchmark using identical idx
Before:
After:
|
[,j]
Before:
After above commit (~50x faster -- with gains scaling with size of SPC) and fixes long standing issue #89 pertaining to corruption of other slots when using j-only (horizon/slice) indexing
|
Excellent work. Thank you for tackling the thankless task of optimization. I'm all for merging these changes as long as all tests pass (aqp and soilDB with local NASIS data available for testing). I'd like to chat next time we are in the office about the re-use of data vs. wastefully re-computing / extracting details from the SPC. As for future updates: I'd like to build on |
I have thought some about more that we could do to re-use data and intermediate calculations. Something that I was thinking would be really cool is to store in the SPC hash table or ordering of the profile/horizon ID. Storing the LUT in memory (as an atttribute in RHS object) is how
|
I like the idea of pre-computing / pre-hashing some of the internal details of the SPC that are called all the time. The original motivation for things like |
Agreed. I understand the original intent -- but in some of these functions, as flame graphs above showed, is we wind up ensuring our results are correct, repeatedly, when the SPC hasn't changed. I think for something like the profile ID order the efficiency gain would probably be worth it for large collections -- and moot for small ones. I wanted to just get the low hanging fruit with these commits, since there definitely were some optimizations to be made that didn't really cause too much change in logic for decent gains. |
The text was updated successfully, but these errors were encountered: