Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snapshot dependency resolution behavior causes slowness #532

Closed
dpastoor opened this issue Jan 28, 2019 · 4 comments
Closed

snapshot dependency resolution behavior causes slowness #532

dpastoor opened this issue Jan 28, 2019 · 4 comments

Comments

@dpastoor
Copy link

@trestletech as we spoke about briefly at Rstudio Conf, here is an example of looking at the flamegraphs and data associated with a snapshot for a pretty vanilla data science project with shiny (eg tidyverse + shiny and a couple helper packages)

You can see that the snapshot call takes almost 2 minutes:

image

Ironically, the callHook, 49 of the 50 seconds is likewise stuck on another snapshotImpl call.

image

So nearly 97 of 99 seconds is taken just doing dependency reconciliation.

If we drill further into the snapshotImpl method, we can also see that the cause is a recursive call for getting package dependencies, which I believe during unraveling is re-solving the dependency graph and reading the lockfile many times.

image

This is also very evident from looking at how spiky the flame graph is

image

@slopp this is also why we struggle with using programmatic deployment from rsconnect and friends using the traditional bundling, as in https://github.com/rstudio/rsconnect/blob/master/R/bundle.R#L780 can take so long.

I would think by using either memoization or a separate environment as a hash table of sorts to check against before solving for a particular package would be very helpful.

IMO, this should not take more than a second or two max, which I lightly confirmed with an extremely hacky script that replicated the general gist of what is going on during the getPackageRecords call. I'm happy to help get further along with this, but figured it might be worth having a dialog around whether its worth it or if it makes sense to channel these optimization efforts elsewhere.

@dpastoor
Copy link
Author

In digging around this looks like it aligned with the findings at #347

@aronatkins
Copy link
Contributor

@dpastoor - the changes in #615 dramatically reduced the cost of getPackageRecords. Those changes were included in the packrat 0.6.0 release.

Are you able to re-test the performance of dependency resolution using the current CRAN release?

@dpastoor
Copy link
Author

dpastoor commented Nov 5, 2021

Hi Aron,

If it would be helpful, I can certainly retest, however I can confirm that the use of rsconnect to deploy apps, and the packrat invocations underneath have been much much faster in recent history, to the point that I forgot about even filing this long ago :-)

@aronatkins
Copy link
Contributor

@dpastoor I'll go ahead and close this issue given your feedback. No need to retest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants