-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache tarballs streamed through pacote #74
Conversation
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Not sure the implications of changing the return type of _istream / adding a `cacache` to the end of the pipe, too unfamiliar with streams Fixes npm#73 & Fixes npm/cli#2160 Also see npm/cli#2976 for my first attempt & some discussions as to how we got here.
`pacote:tarball:${this.from}`, | ||
this.opts | ||
) | ||
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming the pipe-ing is somewhat correct, would a MinipassPipeline
be better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this should almost certainly be return new Pipeline(stream, istream, cstream)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er, wait, no. I think cstream
might not be a Passthrough?
I'll pull this locally and poke at it a bit, sorry for the uninformed comment 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have a fix for this, will squash into the commit when landing.
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream) | |
const pipeline = new Pipeline(stream, istream) | |
stream.pipe(cacache.put.stream( | |
this.opts.cache, | |
`pacote:tarball:${this.from}`, | |
this.opts | |
)) | |
return pipeline |
So, in principle, I think this is probably the right direction. We still let mfh do the caching of everything other than tarballs, but explicitly cache tarballs in the single choke-point in the logical flow. I'll pull locally and take a look now. At the very least, it will need tests to assert that Thanks for updating your use case information over in npm/cli#2976 (comment) It's helpful to know the end goal, and that does make sense. We may want to update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking as "request changes", but I have the required changes ready to land, so don't sweat it. I'll open up a separate PR to review, and mark this as closed once that lands.
Thanks!
`pacote:tarball:${this.from}`, | ||
this.opts | ||
) | ||
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have a fix for this, will squash into the commit when landing.
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream) | |
const pipeline = new Pipeline(stream, istream) | |
stream.pipe(cacache.put.stream( | |
this.opts.cache, | |
`pacote:tarball:${this.from}`, | |
this.opts | |
)) | |
return pipeline |
@@ -29,6 +29,7 @@ class RemoteFetcher extends Fetcher { | |||
spec: this.spec, | |||
integrity: this.integrity, | |||
algorithms: [ this.pickIntegrityAlgorithm() ], | |||
cache: null, // leave the caching of the tarball up to _istream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will also prevent make-fetch-happen from seeing the tarballs that pacote has cached. The options seem to be:
- Accept extra requests being made when the tarball already exists in cache (not really acceptable)
- Let it write twice into the cache (cacache prevents data integrity problems by using atomic writes, but still not great, because it's double disk work when fetching a remote tarball)
- Only pipe into the cstream for fetchers that don't expect their fetches to be cached anyway (a little bit dirty, because it requires that the fetcher knows what mfh is going to do)
- Provide a
cacheManager
object that reads from cache, but only writes packuments to cache, not tarballs (probably the "cleanest" solution, but... oof, that's a lot of duplicated code).
I'm back to thinking (3) makes the most sense here.
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 PR-URL: #74 Credit: @mjsir911 Close: #74 Reviewed-by: @isaacs EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes.
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 PR-URL: #74 Credit: @mjsir911 Close: #74 Reviewed-by: @isaacs EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes.
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 PR-URL: #74 Credit: @mjsir911 Close: #74 Reviewed-by: @isaacs EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes.
Closed in favor of #75 |
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 PR-URL: #74 Credit: @mjsir911 Close: #74 Reviewed-by: @isaacs EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes.
Tell `fetch` not to cache it's downloaded tarballs, leave that to us Also see npm/cli#2976 for my first attempt & some discussions as to how we got here. Fix: #73 Fix: npm/cli#2160 Close: npm/cli#2976 Close: #74 EDIT(@isaacs): Updated to add test, allow make-fetch-happen to use the cache, pipe into cache properly, and avoid cache double-writes. PR-URL: #75 Credit: @mjsir911, @isaacs Close: #75 Reviewed-by: @wraithgar, @mjsir911
Tell
fetch
not to cache it's downloaded tarballs, leave that to usNot sure the implications of changing the return type of _istream / adding a
cacache
to the end of the pipe, too unfamiliar with streamsFixes #73 & Fixes npm/cli#2160
Also see npm/cli#2976 for my first attempt & some discussions as to how we got here.
Also this behavior should probably be documented somewhere so it doesn't slip away again 😅