CDN issue when building linux wheels #1063

Closed
msaroufim opened this issue Oct 12, 2024 · 1 comment
Since Oct 7 our CI jobs on the main branch have been reliably red, because every time we download PyTorch we run into this issue.

This is a CDN issue that is out of our control, but I have escalated it internally.

+ conda run -p /__w/_temp/conda_environment_11300887954 pip install torch --pre --index-url https://d3kup0pazkvub8.cloudfront.net/whl/nightly/cpu
ERROR conda.cli.main_run:execute(41): `conda run pip install torch --pre --index-url https://d3kup0pazkvub8.cloudfront.net/whl/nightly/cpu` failed. (See above for error)
WARNING: The directory '/github/home/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
ERROR: Wheel 'torch' located at /tmp/pip-unpack-nzix8h63/torch-2.6.0.dev20241011_cpu-cp310-cp310-linux_x86_64.whl is invalid.

Looking in indexes: https://d3kup0pazkvub8.cloudfront.net/whl/nightly/cpu
Collecting torch
  Downloading https://d3kup0pazkvub8.cloudfront.net/whl/nightly/cpu/torch-2.6.0.dev20241011%2Bcpu-cp310-cp310-linux_x86_64.whl (175.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━          134.2/175.0 MB 62.5 MB/s eta 0:00:01
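To see what pip's "Wheel 'torch' ... is invalid" error corresponds to, here is a minimal sketch (not part of the CI job) that downloads the same nightly wheel from the CloudFront index and checks whether it is a readable zip archive; a truncated or corrupted CDN response fails this check. The URL and filename are copied from the log above, and the date tag may need adjusting.

```python
# Sketch: fetch the nightly wheel and verify it is a valid zip archive.
import urllib.request
import zipfile

URL = ("https://d3kup0pazkvub8.cloudfront.net/whl/nightly/cpu/"
       "torch-2.6.0.dev20241011%2Bcpu-cp310-cp310-linux_x86_64.whl")

path, headers = urllib.request.urlretrieve(URL, "torch_nightly.whl")
print("Content-Length:", headers.get("Content-Length"))

try:
    with zipfile.ZipFile(path) as whl:
        bad = whl.testzip()  # returns the first corrupt member, or None
        print("corrupt member:", bad)
except zipfile.BadZipFile as exc:
    # A truncated or otherwise corrupted download from the CDN lands here,
    # which is what pip reports as an invalid wheel.
    print("wheel is not a valid zip:", exc)
```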

atalman commented Oct 14, 2024

The PR that deployed the Meta CDN has been reverted: pytorch/test-infra#5762

yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* Initial add of distributed model

Use parallelize_module in model

[ghstack-poisoned]

* Update on "Initial add of distributed model"


Use `parallelize_module` in model.

Added files:

`model_dist.py`: a mirror of model.py with Tensor Parallelism baked in.
`dist_run.py`: toy example of how to run the model in a distributed way.

Test:
`torchrun --nproc-per-node 2 dist_run.py`


[ghstack-poisoned]
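For context on the `parallelize_module` approach mentioned in the commit above, here is a minimal sketch assuming a toy two-layer MLP (the actual `model_dist.py` from the commit is not reproduced here). It uses the public `torch.distributed.tensor.parallel` API and is meant to be launched the same way as the test command, e.g. `torchrun --nproc-per-node 2 tp_sketch.py`.

```python
# Minimal tensor-parallel sketch with parallelize_module (toy model, not model_dist.py).
import os

import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class ToyMLP(nn.Module):
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.up = nn.Linear(dim, 4 * dim)
        self.down = nn.Linear(4 * dim, dim)

    def forward(self, x):
        return self.down(torch.relu(self.up(x)))


local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# One-dimensional device mesh across the two ranks launched by torchrun.
mesh = init_device_mesh("cuda", (2,))

model = ToyMLP().cuda()

# Megatron-style plan: shard the first linear column-wise and the second
# row-wise so the intermediate activation stays sharded across ranks.
model = parallelize_module(
    model,
    mesh,
    {"up": ColwiseParallel(), "down": RowwiseParallel()},
)

out = model(torch.randn(8, 1024, device="cuda"))
print(f"rank {local_rank}: output shape {tuple(out.shape)}")
```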
