Failed to download compressed mapping from GitHub, even if I can wget the file. #1882

Open
traversaro opened this issue Aug 22, 2024 · 22 comments
Labels
🐞 bug Something isn't working

Comments

traversaro (Contributor) commented Aug 22, 2024

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pixi, using pixi --version: 0.27.1

Reproducible example

I am on an HPC system (whose details I do not fully know, so this is not easily reproducible) and running a pixi command fails with:

~~~
[straversaro@fnode01 jaxsim]$ pixi run -e test-gpu test
 WARN The feature 'style' is defined but not used in any environment
 WARN The feature 'testing' is defined but not used in any environment
 WARN The feature 'viz' is defined but not used in any environment
 WARN The feature 'all' is defined but not used in any environment
  × failed to download pypi mapping from https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json location
  ├─▶ Middleware error: File still doesn't exist
  ├─▶ File still doesn't exist
  ╰─▶ No such file or directory (os error 2)
~~~

Interestingly, exactly the same file downloads without any problem:
~~~
[straversaro@fnode01 jaxsim]$ wget  https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json
--2024-08-22 10:51:43--  https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 714886 (698K) [text/plain]
Saving to: ‘compressed_mapping.json’

compressed_mapping.json                     100%[===========================================================================================>] 698.13K  --.-KB/s    in 0.06s   

2024-08-22 10:51:44 (11.2 MB/s) - ‘compressed_mapping.json’ saved [714886/714886]
~~~

Issue description

I found a bunch of similar (but apparently not directly related) issues, so I thought it was clearer to open a new one. Do you have any ideas on how I could debug the problem further?

Expected behavior

That pixi run would work without any download error.

@traversaro traversaro added the 🐞 bug Something isn't working label Aug 22, 2024
ruben-arts (Contributor):

Did you try with pixi run --tls-no-verify? I'm curious whether that would fix it.

@ruben-arts ruben-arts changed the title failed to download pypi mapping from https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json location on HPC system, even if I can wget the file Failed to download compressed mapping from GitHub, even if I can wget the file. Aug 22, 2024
traversaro (Contributor, Author):

It still does not work (and the message has changed slightly: it no longer prints the URL of the file to download):

[straversaro@fe01 jaxsim]$ pixi run echo
 WARN The feature 'style' is defined but not used in any environment
 WARN The feature 'testing' is defined but not used in any environment
 WARN The feature 'viz' is defined but not used in any environment
 WARN The feature 'all' is defined but not used in any environment
  ⠒ gpugroup:linux-64    [00:00:04] loading repodata
  ⠒ cpugroup:linux-64    [00:00:04] applying JLAP patches
  ⠒ cpugroup:osx-64      [00:00:04] loading repodata
  × failed to download pypi name mapping
  ├─▶ Middleware error: File still doesn't exist
  ├─▶ File still doesn't exist
  ╰─▶ No such file or directory (os error 2)
[straversaro@fe01 jaxsim]$ pixi run --tls-no-verify echo
 WARN The feature 'style' is defined but not used in any environment
 WARN The feature 'testing' is defined but not used in any environment
 WARN The feature 'viz' is defined but not used in any environment
 WARN The feature 'all' is defined but not used in any environment
 WARN TLS verification is disabled. This is insecure and should only be used for testing or internal networks.
  ⠤ cpugroup:linux-64    [00:00:04] applying JLAP patches
  ⠤ gpugroup:linux-64    [00:00:04] loading repodata
  ⠤ cpugroup:osx-64      [00:00:04] loading repodata
  × failed to download pypi name mapping
  ├─▶ Middleware error: File still doesn't exist
  ├─▶ File still doesn't exist
  ╰─▶ No such file or directory (os error 2)

ctcjab commented Aug 27, 2024

When I add config like the following, which the docs suggest, to my pyproject.toml...

[tool.pixi.pypi-dependencies]
foo = { path = ".", editable = true }

...and then try to pixi install, I get the error:

  × failed to download pypi name mapping
  ├─▶ error sending request for url (https://conda-mapping.prefix.dev/hash-v0/2c9874a8f76f8edabb9213351be27d8cc8eed2a08428b230e2b87cd00b9a06d8)
  ├─▶ client error (Connect)
  ├─▶ dns error: failed to lookup address information: Try again
  ╰─▶ failed to lookup address information: Try again

Looks similar but not exactly the same as the error reported in this issue -- should I open a separate issue?

nichmor (Contributor) commented Aug 28, 2024

When I add config like the following, which the docs suggest, to my pyproject.toml...

[tool.pixi.pypi-dependencies]
foo = { path = ".", editable = true }

...and then try to pixi install, I get the error:

  × failed to download pypi name mapping
  ├─▶ error sending request for url (https://conda-mapping.prefix.dev/hash-v0/2c9874a8f76f8edabb9213351be27d8cc8eed2a08428b230e2b87cd00b9a06d8)
  ├─▶ client error (Connect)
  ├─▶ dns error: failed to lookup address information: Try again
  ╰─▶ failed to lookup address information: Try again

Looks similar but not exactly the same as the error reported in this issue -- should I open a separate issue?

Hey! could you please share your pixi.toml or pyproject.toml configuration?

ctcjab commented Aug 29, 2024

Sure:

[project]
name = "foo"
description = "foo"
version = "0.0.1"

[tool.pixi.project]
channels = [
  "https://artifactory.chicagotrading.com/artifactory/api/conda/ctc-curated-condaforge-main",
  ...
]
platforms = ["linux-64"]

[tool.pixi.dependencies]
# 3rd-party
matplotlib = "*"
pandas = "*"
# 1st-party
...

# The commented out config below is suggested by
# https://pixi.sh/latest/tutorials/python/#whats-in-the-pyprojecttoml:~:text=The%20pixi_py%20package%20itself%20is%20added%20as%20an%20editable%20dependency
# for a good local development workflow, but caused the following error when I tried it:
# error sending request for url (https://conda-mapping.prefix.dev/hash-v0/115b796fddc846bee6f47e3c57d04d12fa93a47a7a8ef639cefdc05203c1bf00)
# ├─▶ client error (Connect)
# ├─▶ dns error: failed to lookup address information: Try again
#
# [tool.pixi.pypi-dependencies]
# foo = { path = ".", editable = true }
#
# Install this package in the test env too so that we can remove "src" from pytest pythonpath config below:
# [tool.pixi.feature.test.pypi-dependencies]
# foo = { path = ".", editable = true }

[tool.pixi.feature.test.dependencies]
pytest = "*"

[tool.pixi.feature.test.tasks]
test = "pytest"

[tool.pixi.environments]
default = { solve-group = "default" }
test = { features = ["test"], solve-group = "default" }

[tool.pytest.ini_options]
pythonpath = ["src"]
testpaths = ["tests"]
addopts = [
  "-vv",
]

ricrogz commented Oct 11, 2024

Did you find out what was going on? I think I'm hitting the same issue on Ubuntu 22.04, and I can't help but suspect it is related to authentication and access to gnome-keyring.

ctcjab commented Oct 15, 2024

I never did find out what was going on, and I ended up switching away from pixi for the use case where I was hitting this issue, so I haven't put more time into it since then.

ricrogz commented Oct 25, 2024

Ok, I found out what was happening in my case: I was using the pixi executable from the Releases section here on GitHub, inside a Rocky 8 Docker container. That binary is built on top of musl, and musl seems to have a known issue with DNS resolution: https://stackoverflow.com/questions/65181012/does-alpine-have-known-dns-issue-within-kubernetes#65593511. The Stack Overflow post is old (2021!), but the issue apparently still exists to some degree, as I could work around it with one of the suggested workarounds: directly editing /etc/hosts to hardcode some FQDNs to their known IPs.

It might be a good idea to add a glibc-based build to the release assets.
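Before switching binaries, it can help to confirm what the downloaded executable actually links against. A minimal sketch (the check_libc helper is hypothetical and assumes the `file` utility is installed; musl-linked binaries mention "musl" in its output):

```shell
# check_libc: print whether a binary appears to be musl-linked.
# Hypothetical helper; assumes the `file` utility is available.
check_libc() {
    if file -- "$1" 2>/dev/null | grep -qi 'musl'; then
        echo "musl"
    else
        echo "not-musl"
    fi
}

# Example: inspect the system shell (substitute the path to your pixi binary).
check_libc /bin/sh
```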

RomDeffayet commented Feb 17, 2025

Hi, I get the same error on RHEL 8.8 and it prevents me from installing any package from PyPI. Is there any workaround?

nichmor (Contributor) commented Feb 17, 2025

hey @RomDeffayet! Do you have the same root cause as @ricrogz (using the musl builds)?

RomDeffayet:

No sorry, the same one as @traversaro: I just installed pixi and tried setting up a project.
pixi init went fine.
pixi add --pypi numpy (or defining it in the pixi.toml file and then running pixi install) throws the following error:

× failed to download pypi mapping from https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json location
  ├─▶ File still doesn't exist
  ╰─▶ No such file or directory (os error 2)

or just:

× failed to download pypi mapping
  ├─▶ File still doesn't exist
  ╰─▶ No such file or directory (os error 2)

And similarly to @traversaro, I can wget the file without any issues.

LunarLanding commented Feb 19, 2025

Same here, pixi 0.41.4, fresh install on a Debian HPC login node.
Compiled with cargo (glibc, no musl), and I still get the same error.
Then I stopped trying to use the fast-storage mounted filesystems, and the problem disappeared. Which is not great, since I want to put the pixi cache on the fast filesystems. I can't really see why the download would be connected to the location of the cache.

Maybe the library you use, rattler, writes downloads to disk in a way that is incompatible with the less-than-ideal HPC filesystems?

RomDeffayet:

Can confirm: the problem came from setting $PIXI_CACHE_DIR to a BeeGFS filesystem ($PIXI_HOME can point to that filesystem without any issue).
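A possible workaround along these lines is to point the cache at node-local storage while leaving $PIXI_HOME on the shared filesystem. A sketch, assuming /tmp is node-local scratch on your cluster:

```shell
# Relocate pixi's cache off the BeeGFS/NFS mount onto node-local storage.
# /tmp is an assumption here; use whatever local scratch your site provides.
export PIXI_CACHE_DIR="/tmp/${USER:-pixi-user}/pixi-cache"
mkdir -p "$PIXI_CACHE_DIR"

# Subsequent pixi commands (e.g. `pixi install`) will now write the
# pypi-mapping cache to local disk instead of the distributed filesystem.
```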

LunarLanding commented Feb 20, 2025

@RomDeffayet wow, ok, so both our cases are pinned down to BeeGFS.
It's a shame, since BeeGFS is the partition my cluster team pitched as the least congested filesystem.
I printed the output of mount, and at first glance the mount config seemed similar between the working and buggy mounts, so I wonder what is going on.

In HPC there seems to be a real disconnect between efficient shared storage, which requires minimizing calls to the metadata-handling part of the filesystem (opening and closing files), and updating and using Python environments, which requires updating/reading thousands of small files.

I am going to try some other storage locations in the cluster.

wolfv (Member) commented Feb 21, 2025

Yeah, for the mapping we are writing a high number of tiny files to the cache directory.

We could think of ways to make that more efficient.

We also have the mapping as a single bigger file, and we could make that configurable (or, ideally, automatically detect whether we are on a slow / network filesystem and prefer that). Do you see other solutions?

LunarLanding:

@wolfv thanks for the quick reply. Let me split things out more explicitly. All of these put some strain on the filesystem:

  1. the package manager updating and reading the mapping cache
  2. constructing the (hard?/sym?)links to build/updating an environment
  3. importing python modules to run code

I don't think HPC users can expect 2 or 3 to be as fast as local storage, but 1 creates an issue that micromamba did not have.
Would reducing the number of files by using a single-file database for 1, as you suggest, also improve performance for non-HPC users? I just checked, and in my local pixi installation the conda-pypi-mapping folder in the cache has 11k files.

For the small slice of users that have access to more capable distributed filesystems, fixing #3180 should make all three points less stressful on the filesystem.

wolfv (Member) commented Mar 11, 2025

We're working on some improvements in #3318

baszalmstra (Contributor):

@LunarLanding Would having one sqlite database instead of lots of small files be more performant on these filesystems?

LunarLanding:

@wolfv I had a look at the change set, but could not see whether it would reduce the large number of files created by the mapping.

@baszalmstra with SQLite on NFS, because the database is a single file, it should be more performant, since the metadata server only needs to open and close one file, once.

However, if there is a process writing, it needs to be the only process accessing the db, i.e. SQLite's built-in locking depends on assumptions that are not valid on NFS.

There are two possible solutions:

  • The typical HPC workflow is installing the environment from a login node onto an NFS-mounted directory, and later activating it in parallel on many worker nodes. Ideally, users do not update the environment while workers are using it. In practice, users will sometimes update their environment while jobs are still running, but this ends up being of little consequence, since jobs will already have loaded the necessary modules into memory, and in the worst case they can be relaunched if they crash.
    If, during activation on worker nodes, pixi can be made to use the cached environment, and thus skip any read/write of the mapping database, then using SQLite will be safe.

  • flufl.lock provides an NFS-safe locking solution. Either pixi or pixi users could wrap db access / pixi activation with such a lock, making it safe.
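For illustration, here is a minimal sketch of the hard-link trick that NFS-safe lockers such as flufl.lock build on: os.link() is atomic even over NFS, so whichever process creates the hard link first holds the lock. The helper names are hypothetical, and this omits the timeouts and stale-lock handling a real implementation needs.

```python
# Sketch of an NFS-safe lock guarding writes to a single-file SQLite
# mapping database. Hypothetical helpers; flufl.lock implements this
# pattern far more robustly.
import os
import sqlite3
import tempfile
import time

def nfs_lock(lockfile: str) -> None:
    # Create a per-process claim file, then try to hard-link it to the
    # shared lock path. os.link() is atomic, even on NFS.
    claim = f"{lockfile}.{os.getpid()}"
    open(claim, "w").close()
    while True:
        try:
            os.link(claim, lockfile)
            os.unlink(claim)
            return
        except FileExistsError:
            time.sleep(0.1)  # another process holds the lock

def nfs_unlock(lockfile: str) -> None:
    os.unlink(lockfile)

# Usage: only the lock holder touches the shared database.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "mapping.db")
lock_path = os.path.join(workdir, "mapping.lock")

nfs_lock(lock_path)
try:
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS mapping (conda TEXT, pypi TEXT)")
    con.execute("INSERT INTO mapping VALUES (?, ?)", ("numpy", "numpy"))
    con.commit()
    con.close()
finally:
    nfs_unlock(lock_path)
```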

baszalmstra (Contributor):

There are a number of different issues mixed into this one.

The original issue from @traversaro seems to be a problem with writing to the HTTP cache. "File still doesn't exist" is issued by the caching layer when it fails to write to a file. That also explains why wget works. I assume this is because the HPC filesystem acts strangely here. Could you try setting the PIXI_CACHE_DIR env variable to another location?

For the other issue, regarding musl, it might indeed make sense to ship a binary based on glibc. However, I would like to make sure that we ship a binary that links against the lowest compatible glibc possible. Maybe zigbuild can help us out there.

I don't directly see a simple way to refactor the code so it does not produce many small files. We'll have to put that on the backlog for now.

LunarLanding:

The original issue from @traversaro seems to be a problem with writing to the HTTP cache. "File still doesn't exist" is issued by the caching layer when it fails to write to a file. That also explains why wget works. I assume this is because the HPC filesystem acts strangely here. Could you try setting the PIXI_CACHE_DIR env variable to another location?

@baszalmstra yes, it is location-dependent, see #1882 (comment), and here is the corresponding issue: #3180

I don't directly see a simple way to refactor the code so it does not produce many small files. We'll have to put that on the backlog for now.

Should I create a new issue for the high number of files? Specifically, I am talking about the package manager updating and reading the mapping cache, which micromamba does not require.

baszalmstra (Contributor):

Should I create a new issue for the high number of files? Specifically, I am talking about the package manager updating and reading the mapping cache, which micromamba does not require.

Let's do that, to not conflate this issue too much.


9 participants