[ci-auto] remove "--enforce-eager" for better vLLM perf #631

Closed
daisy-ycguo opened this issue Dec 10, 2024 · 0 comments · Fixed by #610
@daisy-ycguo (Contributor) commented:

GenAIExample ChatQnA compose.yaml got changed

The following files were changed in this commit:

  • ChatQnA/docker_compose/intel/hpu/gaudi/compose_vllm.yaml

Please verify whether the Helm charts and manifests need to be updated accordingly.

This issue was created automatically by CI.
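For context, the compose change behind this issue is the removal of the `--enforce-eager` flag from the vLLM service launch arguments. Below is a minimal sketch of that kind of change; the service name, image, and remaining arguments are illustrative assumptions, not copied from the actual compose_vllm.yaml.

```yaml
# Illustrative sketch only -- not the actual compose_vllm.yaml contents
services:
  vllm-service:
    image: opea/vllm-gaudi:latest   # assumed image name
    # Before: the command included --enforce-eager, which forces eager execution
    # and disables HPU graph capture.
    # After: the flag is dropped so vLLM can use HPU graphs for better throughput.
    command: --model ${LLM_MODEL_ID} --host 0.0.0.0 --port 80 --tensor-parallel-size 1
```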

@lianhao lianhao changed the title [ci-auto] GenAIExample ChatQnA compose.yaml got changed. [ci-auto] remove "--enforce-eager" for better vLLM perf Dec 10, 2024
lianhao added a commit to lianhao/GenAIInfra that referenced this issue Dec 10, 2024
- Remove --enforce-eager on hpu to improve performance
- Refactor to the upstream docker entrypoint changes

Fixes issue opea-project#631.

Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
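The equivalent change on the Helm side would be dropping the flag from the extra command-line arguments passed to vLLM in the Gaudi values. A hedged sketch follows; the key name used here (extraCmdArgs) is an assumption about the chart layout, not the chart's confirmed schema.

```yaml
# Hypothetical vllm chart values for Gaudi/HPU (key names are illustrative)
# Before:
# extraCmdArgs: ["--enforce-eager", "--max-seq-len-to-capture", "2048"]
# After: --enforce-eager removed so HPU graph mode can be used
extraCmdArgs: ["--max-seq-len-to-capture", "2048"]
```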
lianhao added further commits with the same message to lianhao/GenAIInfra that referenced this issue on Dec 10, Dec 11, and Dec 17, 2024
eero-t pushed commits with the same message to eero-t/GenAIInfra that referenced this issue on Dec 10, Dec 11, and Dec 17, 2024
yongfengdu pushed a commit that referenced this issue Dec 18, 2024
* Add monitoring support for the vLLM component

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Initial vLLM support for ChatQnA

For now vLLM replaces only TGI, but since it also supports embeddings,
TEI embed/rerank may be replaceable later on as well.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Fix HPA comments in tgi/tei/teirerank values files

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Add HPA scaling support for ChatQnA / vLLM

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

* Adapt to latest vllm changes

- Remove --enforce-eager on hpu to improve performance
- Refactor to the upstream docker entrypoint changes

Fixes issue #631.

Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>

* Clean up ChatQnA vLLM Gaudi parameters

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

---------

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Co-authored-by: Lianhao Lu <lianhao.lu@intel.com>
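The merged commit also mentions HPA scaling support for ChatQnA / vLLM, built on the monitoring support added in the same series. As a rough illustration, enabling such autoscaling in a values file might look like the sketch below; the keys, thresholds, and metric name are assumptions for illustration, not the chart's actual interface.

```yaml
# Hypothetical autoscaling snippet for a ChatQnA/vLLM values file
horizontalPodAutoscaler:
  enabled: true
  minReplicas: 1
  maxReplicas: 4
  # Scale on a queue-depth style metric exposed by vLLM via the monitoring
  # support added in this PR (metric name assumed for illustration).
  targetMetric: vllm:num_requests_waiting
```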