Return the File instance from Entry::unpack #203
Conversation
Seems reasonable to me and looks like some good wins!
This is technically a breaking change, but that's OK. I'm probably going to try to release this as a patch version and pray it doesn't break anything; if that doesn't work out, a new major version can be published.
I expected some further speedups for rustup from removing extra syscalls on the extracted files (using the returned handle instead of reopening), but I was surprised to see that the main purpose was actually just dropping files! Isn't this a case where, if extraction happened in parallel on many threads, it would also solve the issue?
Cool. I have some further things I'm poking at - such as being able to set the content-indexing-disabled flag - which would need to let folk hook into the open() call more deeply, or would need a more complex way of telling unpack what to do; if you have API suggestions around this, that would be great.
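One possible shape for such a hook, offered purely as a sketch (none of these names exist in tar today): let the caller adjust the OpenOptions used for each extracted file before it is created, which on Windows is where an attribute like FILE_ATTRIBUTE_NOT_CONTENT_INDEXED could be applied.

```rust
use std::fs::OpenOptions;

// Hypothetical builder-style hook, offered only as an API sketch: the
// caller gets to tweak the OpenOptions used for every destination file,
// e.g. via std::os::windows::fs::OpenOptionsExt::attributes() to set
// FILE_ATTRIBUTE_NOT_CONTENT_INDEXED on Windows.
pub struct UnpackOptions {
    open_hook: Option<Box<dyn Fn(&mut OpenOptions)>>,
}

impl UnpackOptions {
    pub fn new() -> Self {
        UnpackOptions { open_hook: None }
    }

    pub fn open_hook<F: Fn(&mut OpenOptions) + 'static>(mut self, f: F) -> Self {
        self.open_hook = Some(Box::new(f));
        self
    }

    // unpack() would call this just before creating each destination file.
    fn apply(&self, opts: &mut OpenOptions) {
        if let Some(hook) = &self.open_hook {
            hook(opts);
        }
    }
}
```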
FWIW, rustup didn't fail to build with this change applied - the existing signature io::Result<()> is already must-consume, with the value ignored; as you say, "technically breaking" - I too hope no fallout happens.
Yes, if extraction happened in parallel on many threads, CloseHandle would be per-thread as a consequence. However, handling of dependencies in tar files becomes more complex with parallel extraction (e.g. extract foo/ then foo/bar). This, by comparison, is very simple to reason about (see the patch in rustup). Another thing we can parallelise for more speed is the decompression of the tar content; and another person is interested in extracting while downloading from the net for even better wall-clock times.
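As a rough illustration of the drop-offloading win (not the actual rustup patch; the entry/dest handling is elided): the File handles returned by unpack are sent to a worker thread, so the expensive CloseHandle calls happen off the extraction loop.

```rust
use std::fs::File;
use std::sync::mpsc;
use std::thread;

fn main() {
    // Worker that does nothing but drop the handles it receives; on
    // Windows this moves the slow CloseHandle calls off the hot path.
    let (tx, rx) = mpsc::channel::<File>();
    let closer = thread::spawn(move || {
        for file in rx {
            drop(file);
        }
    });

    // Extraction loop (elided): for each regular file entry, keep the
    // handle that unpack returns and hand it to the closer thread, e.g.
    // if let Unpacked::File(f) = entry.unpack(&dest)? { tx.send(f).ok(); }

    drop(tx); // hang up so the closer thread drains and exits
    closer.join().unwrap();
}
```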
Force-pushed from 049e3a3 to ac582f5
This permits optimisation of unpacking on Windows - which is perhaps overly special cased to include in tar itself? Doing this for rustup which already doesn't use the Archive wrapper :/.

With this patch, I reduced rustup's rust-docs install from 21/22s to 11s on my bench machine - details at rust-lang/rustup#1850.

I haven't filled in all the possibilities for the Unpacked enum, but happy to do so if you'd like that done - it seemed like it wouldn't be useful at this stage as there are no live objects to return.
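For reference, a minimal sketch of the shape this change takes, assuming the names used in the patch (Unpacked, Unpacked::File); the non-file variant here is a stand-in:

```rust
use std::fs::File;

// Sketch of the new return type: for regular file entries, Entry::unpack
// hands back the freshly written File so callers can keep using the
// handle (set attributes, defer the close) instead of reopening the path.
pub enum Unpacked {
    File(File),
    // Other entry kinds (directories, links, ...) are deliberately left
    // unfilled for now, as noted above; they carry no live object.
    Other,
}
```

Existing callers that write entry.unpack(&dest)? and discard the result keep compiling; new callers can match on the returned Unpacked to hold onto the file.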
Follow-on patch from #202 - again, there are text conflicts, or I would have made them standalone.