rcmdcheck could be faster by not copying entries from Rbuildignore #74

Closed
HughParsonage opened this issue Aug 23, 2018 · 5 comments

@HughParsonage

I develop a few packages whose repositories contain a large number of files (or a number of large files). For example, data-raw/ might hold large raw data files, and docs/ might hold the many files that make up the website. Most of these files are irrelevant to R CMD check, since .Rbuildignore excludes them. However, because the build process first copies the whole package to a temporary directory and only then builds with respect to .Rbuildignore, the process is unnecessarily slow. For one package, just excluding data-raw and docs before copying to the temporary directory shaved over 2 minutes from rcmdcheck.

Copying the package without copying entries in .Rbuildignore is non-trivial and the script I wrote to do this is a bit messy, so I want to flag this suggestion before I make a pull request that's considered too complicated. (The script currently is about 100 lines and would replace one file.copy line in build.R.)

file.copy(path, tmpdir, recursive = TRUE)
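A much smaller approximation of the idea, shown only as a sketch and not as the 100-line script mentioned above: skip top-level entries (such as data-raw or docs) that match a pattern in .Rbuildignore, and copy everything else recursively. The function name `copy_without_ignored` is hypothetical, and the sketch assumes that patterns are Perl-style regular expressions matched case-insensitively against paths relative to the package root. Ignored files nested deeper in the tree are still copied and left for R CMD build to remove.

```r
# Hypothetical helper, not part of rcmdcheck: copy a package directory but
# skip top-level entries that match a pattern in .Rbuildignore.
copy_without_ignored <- function(path, tmpdir) {
  ignore_file <- file.path(path, ".Rbuildignore")
  patterns <- if (file.exists(ignore_file)) {
    readLines(ignore_file, warn = FALSE)
  } else {
    character()
  }
  patterns <- patterns[nzchar(patterns)]

  # Top-level files and directories only, hidden ones included.
  entries <- list.files(path, all.files = TRUE, no.. = TRUE)
  ignored <- vapply(entries, function(entry) {
    any(vapply(patterns, function(p) {
      grepl(p, entry, perl = TRUE, ignore.case = TRUE)
    }, logical(1)))
  }, logical(1))

  # Keep .Rbuildignore itself so the later build step can still apply it
  # to nested files.
  ignored[entries == ".Rbuildignore"] <- FALSE

  # Mirror file.copy(path, tmpdir, recursive = TRUE): the copy ends up in
  # tmpdir/<package directory name>.
  target <- file.path(tmpdir, basename(normalizePath(path)))
  dir.create(target, recursive = TRUE, showWarnings = FALSE)
  file.copy(file.path(path, entries[!ignored]), target, recursive = TRUE)
  invisible(target)
}
```

Called as `copy_without_ignored(path, tmpdir)`, it returns the path of the copy invisibly. Handling only top-level entries keeps the code short while still covering the directories that dominated the copy time in the example above.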

@gaborcsardi
Member

We could use the same approach as R CMD build: https://github.com/wch/r-source/blob/521c90a175d67475b9f1b43d7ae68bc48062d8e6/src/library/tools/R/build.R#L1007-L1012

Maybe this belongs in the pkgbuild package, and rcmdcheck should just call that.

@gaborcsardi gaborcsardi added this to the 1.3.0 milestone Aug 31, 2018
@gaborcsardi
Member

I think this is OK now, because we use pkgbuild, without copying anything, and pkgbuild supports .Rbuildignore.

@gaborcsardi
Member

Please reopen if you see problems.

@nuno-agostinho

nuno-agostinho commented Oct 21, 2021

Hey @gaborcsardi, I still see the issue reported here: R CMD check (and related commands) currently work by copying all the files in the directory and then removing the files to be ignored. This can be really slow if the ignored files are big (as with the big example datasets that I use for some of my packages).

From a quick look through the R source code you mentioned, maybe it would be enough to first determine the files to be ignored (via the inRbuildignore function) and only then copy the remaining files. So yeah, the problem seems to be upstream of R CMD check, but I would love it if this were improved! :)
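To make the "determine ignored files first" idea concrete, here is a minimal sketch that avoids calling the internal function. `files_to_keep` is a hypothetical name. It assumes patterns are Perl-style regular expressions matched case-insensitively against paths relative to the package root, and it reads only .Rbuildignore, so any exclusions that R CMD build applies on its own are not reproduced.

```r
# Hypothetical stand-in for the internal inRbuildignore(): list the files
# that are not excluded by .Rbuildignore.
files_to_keep <- function(pkgdir) {
  ignore_file <- file.path(pkgdir, ".Rbuildignore")
  patterns <- if (file.exists(ignore_file)) {
    readLines(ignore_file, warn = FALSE)
  } else {
    character()
  }
  patterns <- patterns[nzchar(patterns)]

  # Paths relative to the package root; directories are included so that a
  # pattern such as ^docs$ can match the directory itself.
  paths <- list.files(pkgdir, all.files = TRUE, recursive = TRUE,
                      include.dirs = TRUE, no.. = TRUE)
  ignored <- rep(FALSE, length(paths))
  for (p in patterns) {
    ignored <- ignored | grepl(p, paths, perl = TRUE, ignore.case = TRUE)
  }

  # Everything below an ignored directory is dropped as well.
  ignored_dirs <- paths[ignored & dir.exists(file.path(pkgdir, paths))]
  for (d in ignored_dirs) {
    ignored <- ignored | startsWith(paths, paste0(d, "/"))
  }

  # Return regular files only; directories can be recreated while copying.
  keep <- paths[!ignored]
  keep[!dir.exists(file.path(pkgdir, keep))]
}
```

The returned relative paths could then be copied one by one into the temporary directory. Listing the tree is usually far cheaper than copying it, which is where the saving would come from.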

@gaborcsardi
Member

This was discussed here: r-lib/pkgbuild#59

It is not an issue with rcmdcheck, but with base R, and it is hard to implement it for ourselves. inRbuildignore is an internal function that we are not supposed to call.
