GC the CAS storage #5136
Comments
...but actually, if the
Given the atomic update nature of LMDB, we could probably do something clever where a daemon marks a blob as being "leased", and is responsible for re-leasing at least every n hours or something.
+1 on leases... so at read time, if it looks like the lease will expire within X time, renew it.
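To make the renew-at-read idea concrete, here is a minimal sketch in Rust. It assumes an in-memory `HashMap` in place of the LMDB store, and `LeaseTable`, `lease`, `touch_on_read`, and the margin parameter are hypothetical names for illustration, not anything from the pants codebase. Timestamps are passed in explicitly (as seconds) so the behaviour is easy to follow.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a lease table; the real store is LMDB-backed.
pub struct LeaseTable {
    /// Maps a blob fingerprint to the time (in seconds) at which its lease expires.
    expiry: HashMap<String, u64>,
    /// How long a freshly granted lease lasts.
    lease_secs: u64,
    /// Renew on read if the lease would expire within this many seconds.
    renew_margin_secs: u64,
}

impl LeaseTable {
    pub fn new(lease_secs: u64, renew_margin_secs: u64) -> Self {
        LeaseTable {
            expiry: HashMap::new(),
            lease_secs,
            renew_margin_secs,
        }
    }

    /// Grant (or overwrite) a lease starting at `now`.
    pub fn lease(&mut self, fingerprint: &str, now: u64) {
        self.expiry
            .insert(fingerprint.to_string(), now + self.lease_secs);
    }

    /// Called at read time. Renews the lease only if it is missing or would
    /// expire within the renewal margin. Returns true if a write happened.
    pub fn touch_on_read(&mut self, fingerprint: &str, now: u64) -> bool {
        match self.expiry.get(fingerprint).copied() {
            // Plenty of time left: skip the write entirely.
            Some(until) if until > now + self.renew_margin_secs => false,
            // Missing, expired, or close to expiry: re-lease.
            _ => {
                self.lease(fingerprint, now);
                true
            }
        }
    }

    pub fn expires_at(&self, fingerprint: &str) -> Option<u64> {
        self.expiry.get(fingerprint).copied()
    }
}
```

The point of the margin is that most reads do no write at all, which matters if storing access times turns out to be expensive.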
@illicitonion: I think that we'll want a relatively short gap between #5106 landing and this change, since it would be nice to avoid dealing with backwards compatibility of the LMDB store quite yet.
Yeah, that's a pretty reasonable concern - will try to get this put together today :)
Ok, I've pushed the mechanics of a possible lease and garbage collect implementation to https://github.com/twitter/pants/tree/dwagnerhall/fs/cas-garbage-collection - the commit message has some discussion around how to actually wire that up to be useful. I'm going to put together an implementation of ...
Actually, I think I'll wait until we have a chat before implementing ... I suspect that the correct thing to do is to give ... Alternatively, I could add a kind of hacky single-use ... WDYT?
Nodes are cloned in various places, so I think that that Drop impl would be a bit too clever.
I think this is pretty reasonable, and is what I had in mind when I mentioned "walking the graph to find all DigestFile nodes". Regarding A/B from twitter@880d7ff: it's not clear how B actually results in cleanup occurring... i.e., what's looking for expired leases?
We had some discussion on Slack; we settled on:
2 and 3 may end up being the same service (which can be modelled on the We're going to punt on things which don't appear in the graph is |
That looks great, thanks.
I think all pants/src/rust/engine/src/nodes.rs Lines 1134 to 1145 in d12ecec
We should add a method to `Scheduler` or `Graph` to garbage collect any entries in the `Store` that are not currently mentioned in the `Graph`, and which either: ... depending on whether storing access times has a significant impact on performance.
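As a rough illustration of that proposal, the sketch below collects entries that are not mentioned in the graph. The second criterion is elided above, so this sketch assumes the lease expiry discussed in the comments; that is an assumption, not the issue's stated condition. `Entry`, `garbage_collect`, and the `live` set (standing in for digests found by walking the `Graph`) are hypothetical names, and a `HashMap` stands in for the LMDB-backed `Store`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical store entry: only the metadata the collector needs.
pub struct Entry {
    pub size_bytes: usize,
    /// Time (in seconds) at which this entry's lease expires.
    pub lease_until: u64,
}

/// Removes every entry that is neither mentioned in `live` (the digests
/// currently referenced by the graph) nor covered by an unexpired lease.
/// Returns the number of bytes freed.
pub fn garbage_collect(
    store: &mut HashMap<String, Entry>,
    live: &HashSet<String>,
    now: u64,
) -> usize {
    let before: usize = store.values().map(|e| e.size_bytes).sum();
    // Keep an entry if the graph still mentions it, or if its lease
    // (an assumed criterion) has not yet expired.
    store.retain(|fingerprint, entry| live.contains(fingerprint) || entry.lease_until > now);
    let after: usize = store.values().map(|e| e.size_bytes).sum();
    before - after
}
```

Things which never appear in the graph are protected only by their lease here, which matches the decision in the comments to punt on them for now.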