
Update the tgi service images for gaudi #451

Merged
merged 6 commits into from
Sep 27, 2024

Conversation

zhlsunshine
Collaborator

Description

Update the tgi service images for both xeon and gaudi.

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

n/a.

Tests

n/a.

@lianhao
Collaborator

lianhao commented Sep 23, 2024

Can we confirm that all the models used in GenAIExamples are supported by latest-xeon-cpus?

@zhlsunshine
Collaborator Author

Can we confirm that all the models used in GenAIExamples are supported by latest-xeon-cpus?

Yeah, sure. Let me confirm with @lvliang-intel.

@lianhao
Collaborator

lianhao commented Sep 23, 2024

Based on the test output, the TGI pod crashed during the test, so I guess it's still not working yet.

@yongfengdu
Collaborator

I'm seeing this error message from the TGI pod, so it looks like the latest TGI image doesn't work correctly with this model:
model_id: "Intel/neural-chat-7b-v3-3"
...
{"timestamp":"2024-09-23T05:00:54.220232Z","level":"ERROR","fields":{"message":"Shard 0 crashed"},"target":"text_generation_launcher"}

BTW, if we want to upgrade to tgi-gaudi:2.0.5, the "gaudi-values.yaml" in the TGI Helm chart should also be updated.
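For reference, a rough sketch of what that version bump might look like; the key names and repository path here are assumptions, since the actual layout depends on the chart's values file:

```yaml
# Hypothetical excerpt of gaudi-values.yaml in the TGI Helm chart
# (key names and repository path are illustrative, not confirmed).
image:
  repository: ghcr.io/huggingface/tgi-gaudi
  tag: "2.0.5"
```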

@zhlsunshine
Collaborator Author

Hi @yongfengdu and @lianhao, after discussing with the Intel Xeon TGI image support engineers, we'd better use a pinned version of the TGI image, such as ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
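Pinning to a specific build rather than a floating tag could be expressed in a values override along these lines; the key names are assumptions about the chart layout, not confirmed:

```yaml
# Hypothetical values override pinning the CPU TGI image to a specific build
# instead of a floating "latest" tag (key names are illustrative).
image:
  repository: ghcr.io/huggingface/text-generation-inference
  tag: "sha-e4201f4-intel-cpu"
```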

@zhlsunshine zhlsunshine force-pushed the rdimagech branch 2 times, most recently from 1884cd7 to ca52fc2 on September 23, 2024 at 09:58
@eero-t
Contributor

eero-t commented Sep 24, 2024

CI fail is for the CPU values file.

It also fails in #454, which, unlike this one, does not change any of the components used in CI (it skips HPA testing).

Therefore it's possible that the failure is unrelated to the TGI update (unless that update increases TGI resource usage, or slows it down).

@lianhao
Collaborator

lianhao commented Sep 25, 2024

Please wait for PR #456 to land first, then rebase to re-trigger the test and see whether the failure still occurs.

zhlsunshine and others added 6 commits September 25, 2024 16:59
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
@lianhao lianhao changed the title Update the tgi service images for both xeon and gaudi Update the tgi service images for gaudi Sep 25, 2024
@lianhao
Collaborator

lianhao commented Sep 26, 2024

@zhlsunshine for the test failure of GMC E2e Test on Xeon, which is unrelated to this PR itself, GenAIComps PR 608 could be the trigger for that failure. But could you double-check whether there is any issue in the GMC router that would cause trouble when forwarding a large amount of streaming data from tgi back to the end user?

@zhlsunshine
Collaborator Author

@zhlsunshine for the test failure of GMC E2e Test on Xeon, which is unrelated to this PR itself, GenAIComps PR 608 could be the trigger for that failure. But could you double-check whether there is any issue in the GMC router that would cause trouble when forwarding a large amount of streaming data from tgi back to the end user?

Hi @lianhao, I noticed that there are some parameter changes in the request; however, the GMC router just passes them through, so I don't think these changes cause any issue.
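As an illustration of the pass-through behaviour described above, here is a minimal sketch (this is not the actual GMC router code; the backend, chunk format, and handler names are invented) of streaming a backend response through a router unchanged:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

func main() {
	// upstream simulates a TGI-like backend that streams chunks.
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		f := w.(http.Flusher)
		for i := 0; i < 3; i++ {
			fmt.Fprintf(w, "chunk-%d\n", i)
			f.Flush() // push each chunk to the client immediately
		}
	}))
	defer upstream.Close()

	// router forwards the request and copies the streamed body back verbatim,
	// without buffering or rewriting any parameters.
	router := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		resp, err := http.Get(upstream.URL)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		io.Copy(w, resp.Body) // pass-through: stream bytes as they arrive
	}))
	defer router.Close()

	// client sees exactly what the backend emitted.
	resp, err := http.Get(router.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Print(string(body))
}
```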

@yongfengdu yongfengdu merged commit bd6f76c into opea-project:main Sep 27, 2024
25 checks passed