
Update the tgi service images for gaudi #451

Merged
merged 6 commits into from
Sep 27, 2024

Conversation

zhlsunshine
Collaborator

Description

Update the tgi service images for both xeon and gaudi.

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

n/a.

Tests

n/a.

@lianhao
Collaborator

lianhao commented Sep 23, 2024

Can we confirm that all the models used in GenAIExamples are supported by latest-xeon-cpus?

@zhlsunshine
Collaborator Author

Can we confirm that all the models used in GenAIExamples are supported by latest-xeon-cpus?

Yeah, sure. Let me confirm with @lvliang-intel.

@lianhao
Collaborator

lianhao commented Sep 23, 2024

Based on the test output, the TGI pod crashed during the test, so I guess it's still not working yet.

@yongfengdu
Collaborator

I'm seeing this error message from the TGI pod, so it looks like the latest TGI image doesn't work correctly with this model:
model_id: "Intel/neural-chat-7b-v3-3"
...
{"timestamp":"2024-09-23T05:00:54.220232Z","level":"ERROR","fields":{"message":"Shard 0 crashed"},"target":"text_generation_launcher"}

BTW, if we want to upgrade to tgi-gaudi:2.0.5, the "gaudi-values.yaml" in the TGI Helm chart should also be updated.
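For reference, a rough sketch of what that version bump might look like; the key names and repository path here are assumptions, since the actual layout depends on the chart's values file:

```yaml
# Hypothetical excerpt of gaudi-values.yaml in the TGI Helm chart
# (key names and repository path are illustrative, not confirmed).
image:
  repository: ghcr.io/huggingface/tgi-gaudi
  tag: "2.0.5"
```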

@zhlsunshine
Collaborator Author

Hi @yongfengdu and @lianhao, after discussing with the Intel Xeon TGI image support engineers, we'd better use a pinned version of the TGI image, such as ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
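Pinning to a specific build rather than a floating tag could be expressed in a values override along these lines; the key names are assumptions about the chart layout, not confirmed:

```yaml
# Hypothetical values override pinning the CPU TGI image to a specific build
# instead of a floating "latest" tag (key names are illustrative).
image:
  repository: ghcr.io/huggingface/text-generation-inference
  tag: "sha-e4201f4-intel-cpu"
```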

@zhlsunshine zhlsunshine force-pushed the rdimagech branch 2 times, most recently from 1884cd7 to ca52fc2 on September 23, 2024 at 09:58
@eero-t
Contributor

eero-t commented Sep 24, 2024

CI fail is for the CPU values file.

It also fails in #454, which, unlike this one, does not change any of the components used in CI (it skips HPA testing).

Therefore it's possible that the failure is unrelated to the TGI update (unless that update increases TGI resource usage, or slows it down).

@lianhao
Collaborator

lianhao commented Sep 25, 2024

Please wait for PR #456 to land first, then rebase to re-trigger the test and see whether the failure still occurs.

zhlsunshine and others added 6 commits September 25, 2024 16:59
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
@lianhao lianhao changed the title Update the tgi service images for both xeon and gaudi Update the tgi service images for gaudi Sep 25, 2024
@lianhao
Collaborator

lianhao commented Sep 26, 2024

@zhlsunshine for the test failure of GMC E2e Test on Xeon, which is unrelated to this PR itself, GenAIComps PR 608 could be the trigger for that failure. But could you double-check whether there is any issue in the GMC router that would cause trouble when forwarding a large amount of streaming data from tgi back to the end user?

@zhlsunshine
Collaborator Author

@zhlsunshine for the test failure of GMC E2e Test on Xeon, which is unrelated to this PR itself, GenAIComps PR 608 could be the trigger for that failure. But could you double-check whether there is any issue in the GMC router that would cause trouble when forwarding a large amount of streaming data from tgi back to the end user?

Hi @lianhao, I noticed that there are some parameter changes in the request; however, the GMC router just passes them through, so I don't think these changes cause any issue.
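As an illustration of the pass-through behaviour described above, here is a minimal sketch (this is not the actual GMC router code; the backend, chunk format, and handler names are invented) of streaming a backend response through a router unchanged:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

func main() {
	// upstream simulates a TGI-like backend that streams chunks.
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		f := w.(http.Flusher)
		for i := 0; i < 3; i++ {
			fmt.Fprintf(w, "chunk-%d\n", i)
			f.Flush() // push each chunk to the client immediately
		}
	}))
	defer upstream.Close()

	// router forwards the request and copies the streamed body back verbatim,
	// without buffering or rewriting any parameters.
	router := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		resp, err := http.Get(upstream.URL)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		io.Copy(w, resp.Body) // pass-through: stream bytes as they arrive
	}))
	defer router.Close()

	// client sees exactly what the backend emitted.
	resp, err := http.Get(router.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Print(string(body))
}
```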

@yongfengdu yongfengdu merged commit bd6f76c into opea-project:main Sep 27, 2024
25 checks passed