[bug] conan install sometimes fails offline, despite all recipes/packages in cache #15339
Comments
conanfile.py doesn't contain anything exotic. Note, a good number of deps are custom recipes within our artifactory instance.
from conan import ConanFile
from conan.tools.cmake import CMake, cmake_layout
from conan.tools.cmake import CMakeDeps
from conan.tools.files import copy
import os


class TestProjectConan(ConanFile):
    name = "test_project"
    settings = "os", "compiler", "build_type", "arch"
    generators = "CMakeToolchain", "CMakeDeps", "VirtualRunEnv"
    options = {
        "with_option": [True, False],
    }
    default_options = {
        "with_option": False
    }

    def layout(self):
        cmake_layout(self)

    def generate(self):
        # Copy the shared libraries of every dependency into <build_folder>/thirdparty
        conanlibs_path = os.path.join(self.build_folder, "thirdparty")
        for dep in self.dependencies.values():
            for libdir in dep.cpp_info.libdirs:
                copy(self, "*.so", libdir, conanlibs_path)
                copy(self, "*.so.*", libdir, conanlibs_path)

    def requirements(self):
        self.requires("connextdds/7.2.0")
        self.requires("rtiddsgen/7.2.0")
        self.requires("cli11/2.3.2")
        self.requires("asio/1.24.0")
        self.requires("concurrentqueue/1.0.3")
        self.requires("readerwriterqueue/1.0.6")
        self.requires("nlohmann_json/3.11.2")
        self.requires("spdlog/1.11.0")
        self.requires("imgui/1.89.9-docking")
        self.requires("implot/0.16-docking")
        self.requires("glfw/3.3.8")
        self.requires("boost/1.81.0")
        self.requires("yaml-cpp/0.8.0")
        self.requires("eigen/3.4.0")
        self.requires("mavlink_v2/2.0.0-mr")
        self.requires("woodall_serial/1.2.3")
        self.requires("hfsm2/2.3.2")
        self.requires("threepp/0.0.1-mr")
        if self.options.with_option:
            self.requires("ffmpeg/5.0.3-mr")
            self.requires("sdl/2.26.1")
            self.requires("intel-media-driver/22.6.6-dev")
            self.requires("nvidia-vaapi-driver/0.0.9-dev")
|
Hi @spiderkeys Thanks for your report. I have been trying to reproduce again for a while without success.
A possibility to explore is that this could be related to some
If these ideas don't get us to something, then maybe I could propose some patch with extra traces, so if you could run with those traces, we could understand something more about the issue. |
Hi @memsharded , I'll try to answer your questions first, and then I've also found some form of clue/culprit that changes whether the install succeeds or not.
Just now I tried to pare down the problem a bit. First, I found that the error can be encountered simply by running:
conan graph info ./src/conanfile.py
With -vvv passed in, it fails at the same
rm -rf src/build/
conan cache clean -vvv -s -b -d -t "*"
conan remove -c "*"
# ... reconnect to internet, do a successful conan install (so all packages get downloaded again), disconnect again
conan graph info ./src/conanfile.py
# error occurs
Next, I tried to reduce the conanfile.py to bare bones to see if it might be a particular package causing the issue.
From there, I gradually added
So maybe it has something to do with the boost package? I acknowledge it could still be some issue with my local conan sqlite db relating to boost, but hopefully this can provide some additional insight. |
One more clue: boost and sdl2 are the only recipes in my conan package graph that specify dependencies with a version range, so far as I can tell:
Resolved version ranges
    openssl/[>=1.1 <4]: openssl/3.2.0
    zlib/[>=1.2.11 <2]: zlib/1.3
I now find that either package causes the install to fail, and if they are not specified as requires (and thus nothing in my graph has a version range to evaluate), the install succeeds. |
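To make the "Resolved version ranges" output above concrete, here is a minimal sketch of how a range such as `[>=1.1 <4]` can select the newest matching version from a list of candidates. It is a simplified illustration, not Conan's resolver: the helper names (`parse_version`, `satisfies`, `resolve_range`) and the candidate lists are hypothetical, and only the `>=`, `<=`, `>`, `<` and `==` operators on dotted numeric versions are handled.

```python
import operator

# Hypothetical helpers for illustration only; Conan's own resolver is more involved.
_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
        ">": operator.gt, "<": operator.lt}


def parse_version(text):
    """Turn '3.2.0' into (3, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so that 1.3 and 1.3.0 compare equal."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def satisfies(version, range_expr):
    """Check a version against space-separated conditions like '>=1.1 <4'."""
    v = parse_version(version)
    for cond in range_expr.split():
        # Two-character operators come first so '>=' is not read as '>'.
        for sym in (">=", "<=", "==", ">", "<"):
            if cond.startswith(sym):
                left, right = _pad(v, parse_version(cond[len(sym):]))
                if not _OPS[sym](left, right):
                    return False
                break
        else:
            raise ValueError(f"unsupported condition: {cond!r}")
    return True


def resolve_range(range_expr, candidates):
    """Return the newest candidate inside the range, or None if none match."""
    matching = [c for c in candidates if satisfies(c, range_expr)]
    return max(matching, key=parse_version) if matching else None
```

For example, `resolve_range(">=1.1 <4", ["1.1.1", "3.2.0", "4.0.0"])` picks `"3.2.0"`. The relevant point for this issue is that the list of candidates has to come from somewhere: the local cache, the remotes, or both.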
Thanks very much for the deep investigation and insights. |
No success. Something like this:
diff --git a/conan/internal/cache/db/cache_database.py b/conan/internal/cache/db/cache_database.py
index fffa02486..e48cb135e 100644
--- a/conan/internal/cache/db/cache_database.py
+++ b/conan/internal/cache/db/cache_database.py
@@ -28,6 +28,7 @@ class CacheDatabase:

     def get_latest_package_reference(self, ref):
         prevs = self.get_package_revisions_references(ref, True)
+        ConanOutput().info(f"CacheDatabase.get_latest_package_reference {ref}: obtained {prevs}")
         return prevs[0] if prevs else None

     def update_recipe_timestamp(self, ref):
diff --git a/conans/client/graph/graph_binaries.py b/conans/client/graph/graph_binaries.py
index db3a9e78c..47a0e3123 100644
--- a/conans/client/graph/graph_binaries.py
+++ b/conans/client/graph/graph_binaries.py
@@ -197,12 +197,17 @@ class GraphBinariesAnalyzer(object):
             return

         # Obtain the cache_latest valid one, cleaning things if dirty
+        node.conanfile.output.info(f"Checking if {node} is in the cache")
         while True:
+            node.conanfile.output.info(f"Checking DB for {node.pref}")
             cache_latest_prev = self._cache.get_latest_package_reference(node.pref)
+            node.conanfile.output.info(f"Obtained latest_prev for {node.pref}: {cache_latest_prev}")
             if cache_latest_prev is None:
                 break
             package_layout = self._cache.pkg_layout(cache_latest_prev)
+            node.conanfile.output.info(f"Checking if {node.pref} is dirty {cache_latest_prev}")
             if not self._evaluate_clean_pkg_folder_dirty(node, package_layout):
+                node.conanfile.output.info(f"Checked {node.pref} not dirty")
                 break

         if node.conanfile.upload_policy == "skip":
@@ -215,8 +220,10 @@ class GraphBinariesAnalyzer(object):
             else:
                 node.binary = BINARY_MISSING
         elif cache_latest_prev is None: # This binary does NOT exist in the cache
+            node.conanfile.output.info(f"Cache latest_prev is None for {node.pref}, evaluate download")
             self._evaluate_download(node, remotes, update)
         else: # This binary already exists in the cache, maybe can be updated
+            node.conanfile.output.info(f"Cache latest_prev is not None for {node.pref}, checking if in cache")
             self._evaluate_in_cache(cache_latest_prev, node, remotes, update)
Do you think it would be possible to add those messages in your code and run again? Would you like a patch file, a branch in the repo and running from the branch (from source, with |
A branch to pip install would be good! |
I manually made the logging changes above locally and this is what I get for a conanfile.py that specifies SDL as a requirement, first without -nr and then with -nr. Without
Output relating to
|
Ok so digging deeper, here is the code path that I see getting triggered differently when offline:
The various pieces of code that I added more instrumentation to:
def _process_node(self, node, build_mode, remotes, update):
    # ...
    if node.conanfile.upload_policy == "skip":
        # Download/update shouldn't be checked in the servers if this is "skip-upload"
        # The binary can only be in cache or missing.
        if cache_latest_prev:
            conanfile.output.info(f"process F")
            node.binary = BINARY_CACHE
            node.prev = cache_latest_prev.revision
        else:
            conanfile.output.info(f"process G")
            node.binary = BINARY_MISSING
    elif cache_latest_prev is None:  # This binary does NOT exist in the cache
        conanfile.output.info(f"process H")
        node.conanfile.output.info(f"Cache latest_prev is None for {node.pref}, evaluate download")
        self._evaluate_download(node, remotes, update)
    else:  # This binary already exists in the cache, maybe can be updated
        conanfile.output.info(f"process I")
        node.conanfile.output.info(f"Cache latest_prev is not None for {node.pref}, checking if in cache")
        self._evaluate_in_cache(cache_latest_prev, node, remotes, update)
    # The INVALID should only prevail if a compatible package, due to removal of
    # settings in package_id() was not found
    if node.binary in (BINARY_MISSING, BINARY_BUILD):
        conanfile.output.info(f"process J")
        if node.conanfile.info.invalid and node.conanfile.info.invalid[0] == BINARY_INVALID:
            conanfile.output.info(f"process K")
            node.binary = BINARY_INVALID
    # ...

def _evaluate_download(self, node, remotes, update):
    output = node.conanfile.output
    try:
        output.info("Download A")
        self._get_package_from_remotes(node, remotes, update)
        output.info("Download ~A")
    except NotFoundException:
        output.info("Download B")
        node.binary = BINARY_MISSING
    else:
        output.info("Download C")
        node.binary = BINARY_DOWNLOAD
    output.info("Download D")

# ...
# check through all the selected remotes:
# - if not --update: get the first package found
# - if --update: get the latest remote searching in all of them
def _get_package_from_remotes(self, node, remotes, update):
    results = []
    pref = node.pref
    for r in remotes:
        try:
            info = node.conanfile.info
            latest_pref = self._remote_manager.get_latest_package_reference(pref, r, info)
            results.append({'pref': latest_pref, 'remote': r})
            if len(results) > 0 and not update:
                break
        except NotFoundException:
            pass
    if not remotes and update:
        node.conanfile.output.warning("Can't update, there are no remotes defined")

    if len(results) > 0:
        remotes_results = sorted(results, key=lambda k: k['pref'].timestamp, reverse=True)
        result = remotes_results[0]
        node.prev = result.get("pref").revision
        node.pref_timestamp = result.get("pref").timestamp
        node.binary_remote = result.get('remote')
    else:
        node.binary_remote = None
        node.prev = None
        raise PackageNotFoundException(pref)
With
Without
From the above, it looks to me like what is happening is this line is throwing an unhandled exception: latest_pref = self._remote_manager.get_latest_package_reference(pref, r, info) With |
If I change the exception handling to no-op on any exception, instead of just
def _get_package_from_remotes(self, node, remotes, update):
    results = []
    pref = node.pref
    for r in remotes:
        try:
            info = node.conanfile.info
            latest_pref = self._remote_manager.get_latest_package_reference(pref, r, info)
            results.append({'pref': latest_pref, 'remote': r})
            if len(results) > 0 and not update:
                break
        except:
            pass
Of course, this may not be the desired behavior. Maybe there is some group of connection-related exceptions that could be caught here explicitly. |
This works and seems like perhaps the better way to check for the connection-specific error:
try:
    info = node.conanfile.info
    latest_pref = self._remote_manager.get_latest_package_reference(pref, r, info)
    results.append({'pref': latest_pref, 'remote': r})
    if len(results) > 0 and not update:
        break
except ConanConnectionError:
    node.conanfile.output.warning("Failed to connect to remote - will evaluate packages in local cache")
except NotFoundException:
    pass
|
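The control flow of the handler above can be reproduced in isolation. The sketch below is self-contained and does not import Conan: `ConnectionFailed`, `NotFound`, `fake_lookup` and `find_in_remotes` are hypothetical stand-ins for `ConanConnectionError`, `NotFoundException`, `get_latest_package_reference` and `_get_package_from_remotes`. It only illustrates the pattern: a connection failure is recorded as a warning and the loop moves on, a "not found" is skipped silently, and any other exception still propagates.

```python
class NotFound(Exception):
    """Stand-in for Conan's NotFoundException (hypothetical name)."""


class ConnectionFailed(Exception):
    """Stand-in for Conan's ConanConnectionError (hypothetical name)."""


def find_in_remotes(pref, remotes, lookup, warnings):
    """Return (remote, result) for the first remote that has `pref`, else None."""
    for remote in remotes:
        try:
            return remote, lookup(pref, remote)
        except ConnectionFailed:
            # Report the problem instead of hiding it, then try the next remote.
            warnings.append(f"Failed to connect to remote '{remote}'")
        except NotFound:
            # This remote is reachable but does not have the package.
            pass
    return None


def fake_lookup(pref, remote):
    """Simulated remote query: one unreachable remote, one empty, the rest fine."""
    if remote == "offline":
        raise ConnectionFailed(remote)
    if remote == "empty":
        raise NotFound(pref)
    return f"{pref}#rev1"
```

With an empty `remotes` list the loop body never runs, so no connection is attempted at all and the function simply reports that nothing was found.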
Ok, thanks very much for your detailed feedback and investigation again. The key is the line
Appearing in both cases, the successful I'll try to summarize:
So I am now inclined to think that this is expected behavior. If you force a download of
Or with another perspective, if something changed in some of your local recipes and you change the
Please let me know if this explanation helps a bit to understand the current behavior. I will reproduce the scenario in my test.
Skipping all remote exceptions is not a possibility. Hiding connection problems with the remotes is something that needs to be avoided, because it can silently produce confusing behavior: for example, after a new version is uploaded to the server, CI could fail to pick it up without complaining, even if the URL of the remote was incorrect or there was a network error. |
This test reproduces it:
import re


def test_info_not_hit_server2():
    """
    https://github.com/conan-io/conan/issues/15339
    """
    c = TestClient(default_server_user=True)
    c.save({"tool/conanfile.py": GenConanfile("tool", "0.1"),
            "math/conanfile.py": GenConanfile("math", "0.1").with_tool_requires("tool/0.1"),
            "app/conanfile.py": GenConanfile("app", "0.1").with_requires("math/0.1")})
    c.run("create tool")
    c.run("create math")
    c.run("install app")
    c.run("upload * -r=default -c")
    c.run("remove * -c")
    c.run("install app")
    assert "Downloaded" in c.out
    c.run("cache clean -s -b -d -t *")
    # break the server to make sure it is not being contacted at all
    c.servers["default"] = None
    c.run("graph info app", assert_error=True)
    assert "ERROR: 'NoneType' object has no attribute 'fake_url'. [Remote: default]" in c.out
    c.run("graph info app -nr")
    assert re.search(r"Skipped binaries(\s*)tool/0.1", c.out)
|
Thanks for the detailed description of what is happening. This makes some sense to me, in terms of why it is happening, from a mechanical perspective. That said, I'm still a little confused about the current design being expected/desirable from a user experience perspective.
While I think I generally follow that in this case
In this scenario, why is an install that succeeds with
An explanation of the above notwithstanding, for our user experience, I think I would prefer to not have situations where
Because these tool_requires are transitive, it would be nice if there was a blanket method for making sure they make it into the local cache, rather than manually hunting down and installing/building each one. |
The behavior is different, if the
There is already a conf for that, just defining
We didn't select this as the default, because many users were complaining about the extra transfer time, storage (and costs), so the current Conan default is to try to avoid transfers if possible. |
In any case, the intended and designed Conan logic is that "offline remotes" must be managed explicitly, not implicitly, and |
Thanks, glad to know that there is an existing setting for it. I think this ultimately addresses what I would want. I was typing out the below response, but I think it is ultimately achieved by having a user locally configure their development machine to use
I guess one more way to express our need as a user is that we want to be able to run the same build script invocation without the user needing to manually specify whether they are offline or not. In reference to your description of the graph computation process (numbered for reference):
It feels to me like there should be some way (an argument, config setting, etc) to specify that in the process of step 2, if the remotes are not reachable, behave in the same way as if |
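One way to get the "same invocation online or offline" behavior in the meantime is to decide on `-nr` in the build script itself. The following is a minimal sketch, assuming a remote host name (`artifactory.example.com`, hypothetical) reachable on port 443; `remote_reachable`, `conan_install_command` and `main` are illustrative helpers, not part of Conan. It probes the host with a short TCP connect and appends `-nr` when the probe fails. Because this hides connectivity problems from the caller, it carries the risk the maintainers describe in this thread, so it is probably better suited to developer machines than to CI.

```python
import socket
import subprocess


def remote_reachable(host, port=443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def conan_install_command(conanfile, online):
    """Build the argument list; add -nr (no remotes) when offline."""
    cmd = ["conan", "install", conanfile]
    if not online:
        cmd.append("-nr")
    return cmd


def main():
    # Hypothetical host name; replace with the host of your own remote.
    online = remote_reachable("artifactory.example.com")
    subprocess.run(conan_install_command("./src/conanfile.py", online), check=True)
```

Calling `main()` from the build script keeps a single entry point for both cases. A TCP probe only shows that the host answers, not that the remote is healthy, so a failed lookup while online would still surface as a normal Conan error.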
That could make sense, it is good feedback thanks. I was actually thinking of ways to improve this UX, this could be a possibility (I am still not fully discarding the possibility to have other automated flows for this) |
👍 glad to help, and thanks again for helping understand what was going on. Feel free to close this or leave open as you see fit. |
In #15516 I am experimenting with the behavior that you suggest above, and considering the possible risks. It is not guaranteed to move forward; I am just trying it at this moment. |
Finally #15516 added some clarifying error messages, and includes the |
@memsharded We just had the same issue now when the internet connection went down at the company. We don't want to disable the remote using |
Hi @realbogart
That is the problem. Hiding or silencing connectivity problems and failing servers in the general case cannot be done. It is risky: users could end up with obsolete builds and the like without noticing in their CI, just because Conan did not raise an error when a remote was unavailable. For that reason, when the internet is down or something like that, the alternatives are opt-in: using
Furthermore, it is important to highlight that when the packages are in the Conan cache, Conan will not even try to reach out to the servers, so Conan can perfectly work fully offline without |
Hi @memsharded, Thank you for the quick reply.
This is exactly what is not working for us. This is the command that we are trying to run:
We also have all of the dependencies locked in
All packages are in the cache and it works with the network enabled. We also have no
It fails like this:
(I hid our URL in the message above using <hidden>) We are currently on Conan version |
The typical situation is that you don't really have all the necessary packages installed in the cache. If you could please run first that command with
The |
Hi again, @memsharded. I tried adding the argument you provided to my install command. This is the full command:
Everything works with the network on. If we disable it, it fails with the same error message as before. |
Ok, then let's open a new ticket, this sounds like something different. Could you please create it?:
Thanks very much. |
@memsharded While creating a smaller reproducible example, I managed to boil it down to a one-liner that fails with the network disabled:
I created a ticket here: |
Environment details
We are using a private artifactory instance as our conan remote, with conan-center removed from the remotes list.
Steps to reproduce
Reproduction:
I'm still not sure how to reproduce this. Sometimes it happens, and sometimes it doesn't.
The only step I take on my end that causes it to occur is to disconnect my machine from the internet.
Notes:
conan install, so it shouldn't be reaching out to any remotes
-nr while offline (and while this error is occurring) allows for a successful build, also proving the recipes/package binaries are available
-nr is passed.
conan install invocation (with -vvv passed in) is contained within the logs attached.
-nr and one without (and contains the error output)
Logs
install_log_with_nr.txt
install_log_without_nr.txt
Related slack thread:
https://cpplang.slack.com/archives/C41CWV9HA/p1703140106967259