-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use fcopy for faster file copying #3290
Conversation
666b4d1
to
ec1774c
Compare
Nice, what is the next step? |
So I tried on win32 with fcopy vs win32 CopyFile and there is no big gain to be made (~1% on the test project) but there is quite a bit of gain possible on win32 by removing the On my PC this PR take 11.75s and without the |
So I've been using it for about 2 weeks now, on os x and windows, a bit on linux.. without any issues so far. Based on the perf. reported in #2960 it seems like this performs similarly to FileCopy, but a fair bit of that would probably be from ffi. I'll be interested in seeing if he finds a way to speed it up for windows. One problem is that you need a version of the native extension built for your major version of node (specifically NODE_MODULE_VERSION from https://nodejs.org/en/download/releases/.) That makes distribution a bit tricky. If you're using a mismatching node version it'll fall back to readstream+writestream just like yarn copies files today, so it isn't the end of the world.. and if you want to keep distributing yarn as a single js file that'll work fine as well. My current setup is that it'll download the correct extension from https://github.com/sciolist/fcopy/releases/tag/v0.0.5 for your version of node when you |
I think packaging the binary with yarn would be nice. The total size per platform over supported node versions don't seem so big & the fallback is always there if the user got another version or is using the single-file version. Also in the end I checked and My times btw (Avg over 5 run, offline mode)
|
The binaries look pretty small, we could probably bundle all of them with Yarn and just load the correct one at runtime, if possible. I don't know a lot about how native extensions work though, and whether they can be dynamically loaded. |
Silly question.
Is it possible to do it without Native code?
From distribution point of view, adding extra binary make it more complex
…On Fri, 12 May 2017 at 23:36, Daniel Lo Nigro ***@***.***> wrote:
My current setup is that it'll download the correct extension from
https://github.com/sciolist/fcopy/releases/tag/v0.0.5 for your version of
node when you yarn add fcopy.. But that of course won't work if you're
installing yarn through an .msi. One approach would be to just download the
binaries for all supported node versions up front so it could pick the
correct one at runtime.
The binaries look pretty small, we could probably bundle all of them with
Yarn and just load the correct one at runtime, if possible. I don't know a
lot about how native extensions work though, and whether they can be
dynamically loaded.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#3290 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACBdWNFfy6HAd_fyMI64jb34Holgg6oqks5r5N7_gaJpZM4NMvTC>
.
|
@Daniel15 Yeah you can dynamically load them, you just @vbfox for a 1% gain it feels like it's probably preferable to just stick with libuv. :/ a bit less code to maintain, and the libuv code is probably at least somewhat battle tested. regarding utimes, you could maybe skip actually |
@bestander not really, though if fcopys post-install downloads binaries for all supported node versions there shouldn't have to be more handling in yarn than for any other library. the alternative would be to convince the node maintainers that a fast file copy function should be added to That said I do definitely agree that it feels better to not need native extensions. |
@Daniel15 Made a few updates, The function for downloading binaries is exposed now, so if there is no local version of the binary it can attempt to download it: require('fcopy/lib/bindings').downloadIfNeeded(process.versions.modules).then(
function () {},
function (ex) { /* will fall back to fs streams */ }
); Not sure if it should actually be used though.. For most situations it doesn't feel great to just download and execute binaries off the internet, but then yarn already does that, so maybe not a big deal. I figured I'd also put up a quick video comparison of the current file copying in yarn: fs streams (osx), taking 9.3 seconds to copy 42k files (304M) fcopy (osx), taking 2.4 seconds for the same files. It is a pretty significant difference, but if it's worth adding a binary dependency to get it is hard for me to say. |
This will be useful for copy-on-write filesystems and should provide a significant speed improvement for systems that use CoW filesystems. It would be good to get this in, we just need to think about the deployment strategy and how to handle it for the single-file build. Falling back to the JS code when the native library isn't available is probably sufficient. |
On the other hand, for a CoW filesystem, maybe simply generating a shell script with all the required |
Getting a copyfile function in libuv+node would be great, though I'm not sure what the likelihood is of that happening particularly soon. Creating a shell script does have a compatibility advantage over a native module, since as you said they're pretty painful to maintain.. The biggest problem it that changing your local node version stops the native module from working, unless you keep a lot of different versions of the native module around (and when a new version of node comes out, you'd need to download a new native module.) I do have a fallback for that, but it isn't great that changing node version substantially reduces copy speed. Definitely worth looking into alternatives. |
JFYI: Pre-bundled fcopy module size — only 174 KB ( |
Do we wanna still push forward with this? @bestander @arcanis @develar? |
Works well in the electron-builder (500K downloads in month) — no issues at all. One trick is required for npm 3 — empty install script, but in general it works well ( |
@sciolist, @develar how does it compare with the latest Yarn? Could you think of how it would work with single-file build? |
@bestander Results from @sciolist — #3539 (comment) In short: 14s vs 9s
No doubt, that's why all native-related dependencies like If you will decide to use fcopy, fallback should be improved — check file size and if lesser than some threshold, use current solution "do bulk file reads" (to ensure that performance will be not degraded significantly). Update: got your point about single file — will experiment with it. |
This sounds great. If this works with our standalone .js file build (ie. if we can just bundle the binary parts in our tarball and it all works properly), I think we should do it 😃 |
@Daniel15 What happens now when installing fcopy is that it will download all files for the current platform from One problem is that if a user upgrades their local nodejs, but their local version of yarn does not have a native module built for that new nodejs version, it will fall back to the slow path. Something like nodejs/node#12902 would be great, but I'm not so sure it will happen. Also there's no handling of reflink/CoW in fcopy. @bestander I saw your email, haven't had time to take a look at it.. trying to wrap some projects up for the summer. |
A PR just landed in |
That's great! A much better option than handling the issues with a native module. |
Issue for adding this to Node.js: nodejs/node#14906 |
**Summary** Fixes #4331. Supercedes #3290. Uses the newly added `fs.copyFile` on Node 8.5 hen available and falls back to old buffer based method otherwise. This patch also refactors the file copy code a bit making it more efficient. Here are the durations on my computer with [this package.json](https://github.com/sciolist/fcopy/blob/c0ded8b82bbaa3f4cdb0a67e951eb3f8b0f97fb7/perf/project/alottafiles/package.json): | master | w/o fsCopy | w/ fsCopy | | ~23s | ~19s | ~15s | This is with `yarn.lock` in place and w/o `node_modules`. **Test plan** CI should pass.
**Summary** Fixes #4331. Supercedes #3290. Uses the newly added `fs.copyFile` on Node 8.5 hen available and falls back to old buffer based method otherwise. This patch also refactors the file copy code a bit making it more efficient. Here are the durations on my computer with [this package.json](https://github.com/sciolist/fcopy/blob/c0ded8b82bbaa3f4cdb0a67e951eb3f8b0f97fb7/perf/project/alottafiles/package.json): | master | w/o fsCopy | w/ fsCopy | | - | - | - | | ~23s | ~19s | ~15s | This is with `yarn.lock` in place and w/o `node_modules`. **Test plan** CI should pass.
**Summary** Fixes #4331. Supersedes #3290. Uses the newly added `fs.copyFile` on Node 8.5 hen available and falls back to the old buffer based method otherwise. This patch also refactors the file copy code a bit making it more efficient. Here are the durations on my computer with [this package.json](https://github.com/sciolist/fcopy/blob/c0ded8b82bbaa3f4cdb0a67e951eb3f8b0f97fb7/perf/project/alottafiles/package.json): | master | w/o copyFile | w/ copyFile | | - | - | - | | ~23s | ~19s | ~14s | This is with `yarn.lock` in place and w/o `node_modules`. **Test plan** CI should pass.
**Summary** Fixes yarnpkg#4331. Supersedes yarnpkg#3290. Uses the newly added `fs.copyFile` on Node 8.5 hen available and falls back to the old buffer based method otherwise. This patch also refactors the file copy code a bit making it more efficient. Here are the durations on my computer with [this package.json](https://github.com/sciolist/fcopy/blob/c0ded8b82bbaa3f4cdb0a67e951eb3f8b0f97fb7/perf/project/alottafiles/package.json): | master | w/o copyFile | w/ copyFile | | - | - | - | | ~23s | ~19s | ~14s | This is with `yarn.lock` in place and w/o `node_modules`. **Test plan** CI should pass.
* Currently would only be utilized on windows * available in node 8.5.x * prefer Copy-On-Write if available * good benchmarks yarnpkg/yarn#3290 Although node-copy-file-sync module exists, it uses more modern JS features. Which this library cannot yet support (without major version bump)
* Currently would only be utilized on windows * available in node 8.5.x * prefer Copy-On-Write if available * good benchmarks yarnpkg/yarn#3290 Although node-copy-file-sync module exists, it uses more modern JS features. Which this library cannot yet support (without major version bump)
* Currently would only be utilized on windows * available in node 8.5.x * prefer Copy-On-Write if available * good benchmarks yarnpkg/yarn#3290 Although node-copy-file-sync module exists, it uses more modern JS features. Which this library cannot yet support (without major version bump)
(this is not something that should be merged now, needs more testing)
I dug up my old uv_sendfile code and thought it could be interesting to make a little package out of it (https://github.com/sciolist/fcopy) and see what impact it would have on yarn.
My test setup:
Results
I've ran diffs on the output and worked with it a bit today without noticing any differences or issues, but, I wouldn't be surprised if there are some.. particularly on other bsds.