-
-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New class zict.Cache #65
Conversation
crusaderky
commented
Mar 11, 2022
•
edited
Loading
edited
- Closes Spill to disk may cause data duplication distributed#3756
- Blocks Prevent data duplication on unspill distributed#5936
Have we looked at cachetools? |
cachetools.Cache does a lot more than this new class - because it's not composable - and also a lot less, for the exact same reason. Technically speaking, Everything else that zict allows - namely,
both of which are key features here, cachetools does not do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should briefly talk about naming but otherwise this looks good!
cache: MutableMapping | ||
Fast cache for reads from data. This mapping may lose keys on its own; e.g. it | ||
could be a LRU. | ||
update_on_set: bool, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example where update_on_set=False
is helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in distributed.
It boils down to the temporal locality patterns of the library user. Sometimes writing a key/value pair does not imply likelihood that the same key will be read back sooner than the rest.
For example, somebody may write the output of a nightly backup batch job to s3fs or some other very slow filesystem; being a backup, 9 times out of 10 nobody is going to read it back. When people do want to read it back though, they are very likely to access the same files multiple times (assuming the backup is not just dumped back on top of live).
zict/cache.py
Outdated
|
||
else: | ||
|
||
class WeakRefCache(weakref.WeakValueDictionary): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit confusing having the Cache
and WeakRefCache
classes in here although WeakRefCache
is not a Cache
. Maybe one of the classes should be renamed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this one should be called WeakValueMapping
to stick to zict nomenclature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed
Guessing the conflicts are related to PR ( #66 )? |
fixed conflicts |