Cache entry point lookups #6124
Interesting. So do I understand correctly that `eps()` is cached, nothing is being read from disk here, and processing the few entry points in memory still takes 25 ms? Can you please document this function to explain the quadruple-loop problem here, and why this additional cache is necessary?
Yes, I'll write a comment. You can look at the cProfile info on the forum: https://aiida.discourse.group/t/why-is-aiidalab-base-widget-import-so-slow/32/16?u=danielhollas
Together with the fact that we were calling this during each Node class build (in a metaclass), this completely explained why the `aiida.orm` import was so slow. See #6091.
Are we worried about the size of the cache? I think the number of different calls to `eps_select` should be reasonable, not exceeding the order of 100. So wouldn't we be better off with `lru_cache(maxsize=None)`, i.e. `cache`, which will be faster? I'm not sure how much faster it will be compared to an LRU cache of at most 100 items. The difference might be negligible.
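To make the trade-off concrete, the following sketch compares a bounded LRU cache with `functools.cache` (available from Python 3.9, equivalent to `lru_cache(maxsize=None)`). The functions `bounded` and `unbounded` and the tiny `maxsize=2` are hypothetical stand-ins chosen so that eviction is visible; they are not the functions from this PR.

```python
import functools


@functools.lru_cache(maxsize=2)
def bounded(group, name=None):
    # Stand-in for an expensive lookup; oldest entries are evicted when full.
    return (group, name)


@functools.cache
def unbounded(group, name=None):
    # Same stand-in, but nothing is ever evicted.
    return (group, name)


# Request three distinct keys, then the first one again.
for key in ["a", "b", "c", "a"]:
    bounded("grp", key)
    unbounded("grp", key)

bounded_info = bounded.cache_info()
unbounded_info = unbounded.cache_info()
print(bounded_info)    # the second "a" is a miss because "a" was evicted
print(unbounded_info)  # the second "a" is a hit
```

If the number of distinct argument combinations exceeds `maxsize`, the bounded cache starts recomputing evicted entries; the unbounded cache avoids that at the cost of unbounded memory growth.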
I am actually worried. I wouldn't be surprised if the number of calls was bigger than 100, especially if plugins are installed, since in some functions we're essentially iterating over all existing entry points when looking for the entry point for a given class. I'll take a closer look and do some more benchmarking.
Are we doing this iteration or is it importlib internally?
Are we trying to find an entry point for a class without knowing the name of the entry point?
These don't call this `eps_select` function though, do they? The cache here simply applies to the number of combinations of arguments with which it is called. Since it just has `group` and `name`, it should just be the set of all `(group, name)` tuples with which the function is called. This should, reasonably, not be much larger than the number of entry points that exist.
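The point about cache size can be checked with a small experiment. In this sketch, `select` is a hypothetical stand-in for the cached lookup, and the group and name values are made up for illustration. The cache grows with the number of distinct `(group, name)` pairs, not with the number of calls.

```python
import functools
import itertools


@functools.lru_cache(maxsize=100)
def select(group, name=None):
    # Stand-in for the real lookup; only the argument pair matters here.
    return (group, name)


groups = ["group_a", "group_b"]      # hypothetical entry point groups
names = ["first", "second", None]    # hypothetical entry point names

# 50 rounds over the same 6 argument combinations: 300 calls in total.
for _ in range(50):
    for group, name in itertools.product(groups, names):
        select(group, name)

info = select.cache_info()
print(info.currsize)  # number of distinct (group, name) pairs seen
```

One caveat: `lru_cache` keys on how the arguments are passed, so the same pair given positionally in one place and by keyword in another may be stored as separate entries.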