[do not merge] sccache test #170

joerunde · 2024-09-26T20:36:06Z

Testing whether this hits the public vllm cache. Based on top of #169.

- get rid of non-essential dependencies
- consolidate package installs
- do not copy wheels in final stage
- fix ccache usage
- use flashattention with triton backend by default:
- clone main_perf branch
- build rocm target
- set up triton rocm env var
- configure numba, outlines and triton cache directory

openshift-ci · 2024-09-26T20:36:14Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: joerunde
Once this PR has been reviewed and has the lgtm label, please assign heyselbi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2024-09-26T22:29:19Z

@joerunde: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/smoke-test | `8abfa38` | link | true | `/test smoke-test` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

dtrifiro · 2024-09-27T12:39:22Z

@joerunde
No luck:

2024-09-26T22:54:23.111307509Z Compile requests                      133
2024-09-26T22:54:23.111307509Z Compile requests executed             133
2024-09-26T22:54:23.111307509Z Cache hits                              0
2024-09-26T22:54:23.111307509Z Cache misses                          133
2024-09-26T22:54:23.111307509Z Cache misses (C/C++)                    4
2024-09-26T22:54:23.111307509Z Cache misses (CUDA)                   129
2024-09-26T22:54:23.111307509Z Cache timeouts                          0
2024-09-26T22:54:23.111307509Z Cache read errors                       0
2024-09-26T22:54:23.111307509Z Forced recaches                         0
2024-09-26T22:54:23.111307509Z Cache write errors                      0
2024-09-26T22:54:23.111307509Z Compilation failures                    0
2024-09-26T22:54:23.111307509Z Cache errors                            0
2024-09-26T22:54:23.111307509Z Non-cacheable compilations              0
2024-09-26T22:54:23.111307509Z Non-cacheable calls                     0
2024-09-26T22:54:23.111307509Z Non-compilation calls                   0
2024-09-26T22:54:23.111307509Z Unsupported compiler calls              0
2024-09-26T22:54:23.111307509Z Average cache write                 0.001 s
2024-09-26T22:54:23.111307509Z Average compiler                  145.212 s
2024-09-26T22:54:23.111307509Z Average cache read hit              0.000 s
2024-09-26T22:54:23.111307509Z Failed distributed compilations         0
2024-09-26T22:54:23.111307509Z Cache location                  Local disk: "/root/.cache/sccache"
2024-09-26T22:54:23.111307509Z Use direct/preprocessor mode?   yes
2024-09-26T22:54:23.111307509Z Version (client)                0.8.1
2024-09-26T22:54:23.111307509Z Cache size                            172 MiB
2024-09-26T22:54:23.111307509Z Max cache size                         10 GiB
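For anyone comparing runs, a minimal sketch of how the integer counters in a dump like the one above could be turned into a hit rate. The helper names `parse_counters` and `hit_rate` are hypothetical, and the parsing assumes each line is a timestamp, a label, and a trailing integer separated by at least two spaces, as in this log. It is not part of sccache itself.

```python
import re

# A few lines copied from the log above, timestamps included.
SAMPLE = """\
2024-09-26T22:54:23.111307509Z Compile requests                      133
2024-09-26T22:54:23.111307509Z Cache hits                              0
2024-09-26T22:54:23.111307509Z Cache misses                          133
2024-09-26T22:54:23.111307509Z Average compiler                  145.212 s
"""


def parse_counters(text):
    """Return {label: int} for every line that ends in a bare integer.

    Lines with units (e.g. "145.212 s", "172 MiB") are skipped.
    """
    counters = {}
    for line in text.splitlines():
        # Drop the leading timestamp (assumed to be the first space-separated token).
        _, _, rest = line.partition(" ")
        match = re.fullmatch(r"(.+?)\s{2,}(\d+)", rest.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def hit_rate(counters):
    """Fraction of cache lookups that were hits; 0.0 when there were no lookups."""
    hits = counters.get("Cache hits", 0)
    misses = counters.get("Cache misses", 0)
    total = hits + misses
    return hits / total if total else 0.0


if __name__ == "__main__":
    stats = parse_counters(SAMPLE)
    print(f"hit rate: {hit_rate(stats):.1%}")
```

With the numbers reported here (0 hits, 133 misses) the rate is zero, which matches the "No luck" reading of the log.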

This PR enables LoRA support in HPU.
* Implemented custom BGMV for LoRA modules using index-select operator.
* Support for both single and multi card scenarios has been tested

Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Himangshu Lahkar <hlahkar@habana.ai>

z103cb and others added 30 commits September 13, 2024 11:27

chore: add fork OWNERS

dbf639e

add ubi Dockerfile

a24f03d

Dockerfile.ubi: remove references to grpc/protos

753f948

Dockerfile.ubi: use vllm-tgis-adapter

84eb826

gha: add sync workflow

577cb43

Dockerfile.ubi: use distributed-executor-backend=mp as default

e3be40e

Dockerfile.ubi: remove vllm-nccl workaround

6844cb9

Fixed upstream in vllm-project#5091

Dockerfile.ubi: add missing requirements-*.txt bind mounts

cbffbf1

add triton CustomCacheManger

ae4ac8d

fixes RHOAIENG-8043

Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>

gha: sync-with-upstream workflow create PRs as draft

bd5ea86

add smoke/unit tests scripts

57b2cbf

extras: exit unit tests on err

5b1ada5

Dockerfile.ubi: misc improvements

3a3dae7

- get rid cuda-devel stage, use cuda 12.4
- add build flags
- remove useless installs

update OWNERS

f8e70cd

Dockerfile.ubi: use tensorizer (#64)

57fdb00

add libsodium for tensorizer encryption

Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Daniele <36171005+dtrifiro@users.noreply.github.com>

Dockerfile.ubi: pin vllm-tgis-adapter to 0.1.2

586c30c

gha: fix fetch step in upstream sync workflow

f0dc391

gha: always update sync workflow PR body/title

cfded1b

Dockerfile.ubi: bump vllm-tgis-adapter to 0.1.3

5493671

Dockerfile.ubi: get rid of --distributed-executor-backend=mp

eb82bd4

this is the default when `--worker-use-ray` is not provided and world-size > 1

Dockerfile.ubi: add flashinfer

a4bb48b

pin adapter to 2.0.0

8dcca5c

deps: bump flashinfer to 0.0.9

48aa285

Update OWNERS with IBM folks

8955a3b

Dockerfile.ubi: bind mount .git dir to allow inclusion of git commit hash

c66756a

gha: remove reminder_comment

9d29a26

Dockerfile: bump vllm-tgis-adapter to 0.2.1

24a9763

fix: update setup.py to differentiate between fork and upstream

2e83b96

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>

Dockerfile.ubi: properly mount .git dir

6cb0961

Revert "[CI/Build] fix: update setup.py to differentiate between fork and upstream"

a6ee52d

youkaichao and others added 21 commits September 16, 2024 12:43

[misc][ci] fix quant test (vllm-project#8449)

07afc6d

[Installation] Gate FastAPI version for Python 3.8 (vllm-project#8456)

2c657ec

[plugin][torch.compile] allow to add custom compile backend (vllm-project#8445)

f8d2bf0

[CI/Build] Reorganize models tests (vllm-project#7820)

754dc0f

[Doc] Add oneDNN installation to CPU backend documentation (vllm-project#8467)

bf7e710

[HotFix] Fix final output truncation with stop string + streaming (vllm-project#8468)

8d32eaf

bump version to v0.6.1.post2 (vllm-project#8473)

1ed711a

add vllm-tgis-adapter layer

66984d4

Dockerfile.ubi: bump python to 3.12

f5387d0

Dockerfile.ubi: bump flashinfer to 0.1.6

b3abd3a

Dockerfile.rocm.ubi: do not use nightly pytorch_triton

9f85dae

Dockerfile.ubi: fix PYTHON_VERSION arg usage

083f0d5

fixup to f5387d0 (openshift/release#56770)

Dockerfile.rocm.ubi: move microdnf update in base stage

c29c9f4

Dockerfile.rocm.ubi: bump torch version to 2.5.0.dev20240912+rocm6.1

69ac6c1

Dockerfile.rocm.ubi: get rid of build triton stage

399c114

this is a torch dependency when installed from the pytorch/rocm6.1 index: https://download.pytorch.org/whl/nightly/rocm6.1

Merge pull request #167 from dtrifiro/fix-amd-build

9ebf28d

Dockerfile.ubi.rocm: fix build

Sync with upstream @ v0.6.2

18d7da7

Dockerfile.rocm.ubi: add setuptools-scm build dependency

56fdd53

Dockerfile.ubi: add VLLM_FA_CMAKE_GPU_ARCHES

d151278

⚗️ try public sccache

42a5a5b

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

openshift-ci bot requested review from tarukumar and vaibhavjainwiz September 26, 2024 20:36

🔥 remove sudo

8abfa38

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

dtrifiro force-pushed the main branch from d19c747 to f63fbdd Compare September 27, 2024 15:14

joerunde closed this Sep 27, 2024
