[bug] Slow package resolution from local cache when remotes are defined #6483

Closed
fourbft opened this issue Feb 6, 2020 · 9 comments · Fixed by #12807

Comments

@fourbft

fourbft commented Feb 6, 2020

Context:
Recently we have been having issues with our network infrastructure and our internal Conan remotes. In general it takes up to 5 seconds for a small package to be downloaded. We have a CI job which installs a Conan package and then runs a number of conan info and conan inspect commands on it.

Issue:
Because of this, we noticed that even when packages are available in the local Conan cache, it takes a long time to access them if remotes are defined.

Expected:

  • First conan install A/1.0.0@user/channel takes 5sec because of the slow remotes
  • Further calls to A/1.0.0@user/channel are very fast because the package is already available in the local Conan Cache

Actual experience:

  • First conan install A/1.0.0@user/channel takes 5sec because of the slow remotes
  • Further calls take very long as well, even though the final package is taken from the local Conan cache anyway
  • We do not define -U parameters anywhere, so it should not query any data from the remotes

Some strange side effects:

  • This happens whenever any remotes are configured at all, even if all of them are disabled.
  • This does not happen when there are no remotes configured. Then the package is taken from the local Conan cache in no time.

Environment Details (include every applicable attribute)

  • Operating System+version: Dockerized build-environment based on Linux Suse 12.2 and 13.1
  • Compiler+version: -
  • Conan version: 1.21.0
  • Python version: Python 3.7

Steps to reproduce (Include if Applicable)

  • Run time conan info <ref> a few times and note the execution times.
  • Run conan remote clean.
  • Run time conan info <ref> a few times again and note the execution times, which are now faster than with remotes defined.
  • Redefine the remote(s) and run time conan info <ref> again. The times are now slow again.
  • The times are still slow when all remotes are disabled via 'conan remote disable '.
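
The timing comparison in the steps above can be scripted. The following is a minimal sketch and not part of the original report; `time_command` is a hypothetical helper that only measures wall-clock time and makes no assumptions about Conan itself.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

Calling `time_command(["conan", "info", "<ref>"])` once with remotes defined and once after `conan remote clean` gives the two sets of timings to compare.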

@jgsogo
Contributor

jgsogo commented Feb 6, 2020

Hi, @fourbft, just a question: is this something you started to experience with 1.21.0, or does it happen with all versions? It looks like we are calling the remotes even though they are not needed for the operation (as you report, your use case works the same without the remotes).

Something to look into, for sure.

@monsdar
Contributor

monsdar commented Feb 6, 2020

We only began noticing this when our remotes slowed down and the issue became obvious.

@jgsogo
Contributor

jgsogo commented Feb 6, 2020

Ok, thanks! So it is probably Conan iterating over the remotes and calling them for nothing... maybe activating the trace_file is an easy way to find out whether we are making any HTTP calls.
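
A minimal sketch of such a check, assuming Conan 1.x writes one JSON object per line to the file named by the `CONAN_TRACE_FILE` environment variable and marks HTTP requests with `"_action": "REST_API_CALL"`. The field name and value are assumptions about the trace format and should be verified against a real trace; `rest_call_urls` is a hypothetical helper.

```python
import json


def rest_call_urls(trace_lines, action="REST_API_CALL"):
    """Return the `url` of every trace entry whose `_action` equals `action`.

    Assumes one JSON object per line; the `_action` and `url` keys are
    assumptions about the trace format, not a documented contract.
    """
    urls = []
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("_action") == action:
            urls.append(entry.get("url"))
    return urls
```

With `CONAN_TRACE_FILE` set to an absolute path before running `conan info <ref>`, passing the opened file to `rest_call_urls` lists whatever requests were logged; an empty list would suggest that no HTTP calls were made.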

@monsdar
Contributor

monsdar commented Feb 6, 2020

Here's what I've found so far:

  • Emptied my local Conan Cache with conan remove -f *
  • Added conan-center as my only remote
  • Running conan info bzip2/1.0.8@conan/stable will download the package recipe, but no binaries
  • Running conan info bzip2/1.0.8@conan/stable again takes time, because Conan queries the remote for the binary data (even though the user did not explicitly specify a remote). This happens in graph_binary.py
  • Running conan install bzip2/1.0.8@conan/stable will install the binary into the local Cache
  • Running conan info bzip2/1.0.8@conan/stable now will not query any remotes, as the needed binary data is already available
  • Running the above without any remotes defined will simply skip querying the remotes for binary data.

The question now is if it is really necessary to query the binary data when no local data is available. The code within graph_binary.py could simply check remotes.selected instead of iterating through all remotes regardless of what the user selected.
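
To illustrate the idea only: the sketch below is a toy model, not Conan's code. `Remote`, `Remotes`, and `remotes_to_query` are hypothetical names, and the actual logic in graph_binary.py is more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Remote:
    name: str


@dataclass
class Remotes:
    items: List[Remote] = field(default_factory=list)
    selected: Optional[Remote] = None  # remote explicitly chosen by the user, if any


def remotes_to_query(remotes: Remotes, only_selected: bool) -> List[Remote]:
    """Return the remotes that would be asked for binary data.

    only_selected=False models iterating over every configured remote;
    only_selected=True models checking just the user's selection.
    """
    if only_selected:
        return [remotes.selected] if remotes.selected is not None else []
    return list(remotes.items)
```

In this model, with no remote selected, the `only_selected=True` variant returns an empty list, so no remote would be contacted.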

@monsdar
Contributor

monsdar commented Feb 6, 2020

I've created a PR with the necessary changes. As the code I changed is used in a lot of other parts of Conan I have no idea if there's something breaking.
It works for conan info though :D

@monsdar
Contributor

monsdar commented Feb 7, 2020

To copy what's been discussed in the PR:

The "easy fix" is not solving the issue. It works for conan info then, but many other parts of Conan won't work anymore.

The question is if conan info really needs to gather info about remote binaries or not.

  • If it does, perhaps there could be a parameter to control whether to query remotes or not
  • If it doesn't, perhaps it's possible to internally avoid querying the remotes for binary data

@cassava

cassava commented Sep 22, 2022

This is also relevant to our use-case. We'd like to be able to query the state of packages in the local cache and ignore remotes.

So ideally, there would be some kind of special remote that is the local cache:

conan info -r _LOCAL_CACHE_

Or a flag that tells Conan to skip remotes for this command.
Or something to that effect.

@memsharded
Member

Hi @cassava

You can use conan remote disable * to disable remotes.

In any case please note:

  • If the packages are already in the local cache, conan info only reads the cache and does not contact the servers
  • If the packages are not in the local cache, it needs to contact the servers; otherwise the command fails because it cannot resolve the graph

So there must be something else that you are seeing, but this command should be very fast if the packages are already in the cache.

@memsharded
Member

Proposing #12808 to force the temporary disabling of remotes. That doesn't invalidate #12807: if the servers need to be contacted because the packages are not in the cache, and --no-remote is used, the command will fail to resolve the graph.
