nv-ds-chat breaks with latest transformers #7052

Merged: 1 commit, Feb 19, 2025

loadams
Collaborator

loadams commented Feb 19, 2025

Traceback:

[rank4]: Traceback (most recent call last):
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/deepspeed/runtime/hybrid_engine.py", line 247, in generate
[rank4]:     generate_ret_vals = self._generate(*inputs, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank4]:     return func(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 2223, in generate
[rank4]:     result = self._sample(
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 3214, in _sample
[rank4]:     outputs = model_forward(**model_inputs, return_dict=True)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank4]:     return self._call_impl(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1845, in _call_impl
[rank4]:     return inner()
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1793, in inner
[rank4]:     result = forward_call(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 1182, in forward
[rank4]:     outputs = self.model.decoder(
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank4]:     return self._call_impl(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1845, in _call_impl
[rank4]:     return inner()
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1793, in inner
[rank4]:     result = forward_call(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 853, in forward
[rank4]:     past_key_values = DynamicCache.from_legacy_cache(past_key_values)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
[rank4]:     return func(*args, **kwargs)
[rank4]:   File "/scratch/azureml/cr/j/fd0226bc522341fc895296811334080f/exe/wd/actions-runner/_work/DeepSpeed/DeepSpeed/unit-test-venv/lib/python3.10/site-packages/transformers/cache_utils.py", line 478, in from_legacy_cache
[rank4]:     key_states, value_states = past_key_values[layer_idx]
[rank4]: ValueError: too many values to unpack (expected 2)
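
The last frame of the traceback fails on a two-name unpacking of `past_key_values[layer_idx]`. The sketch below reproduces that failure mode in plain Python, without importing transformers. `unpack_legacy_cache` is a hypothetical, simplified stand-in for the loop around the failing line, and the three-element layer entry is purely illustrative: the traceback does not show what shape the hybrid engine actually passes.

```python
def unpack_legacy_cache(past_key_values):
    """Hypothetical stand-in for the legacy-cache conversion loop.

    Assumes the legacy format is one (key, value) pair per layer, which is
    what the unpacking statement in the traceback expects.
    """
    layers = []
    if past_key_values is not None:
        for layer_idx in range(len(past_key_values)):
            # Same statement as the failing line in the traceback.
            key_states, value_states = past_key_values[layer_idx]
            layers.append((key_states, value_states))
    return layers


def describe(past_key_values):
    """Return a short status string instead of raising."""
    try:
        return f"ok: {len(unpack_legacy_cache(past_key_values))} layer(s)"
    except ValueError as exc:
        return f"ValueError: {exc}"


# Placeholder strings stand in for tensors; only the tuple shape matters here.
well_formed = (("k0", "v0"), ("k1", "v1"))
malformed = (("k0", "v0", "extra"),)  # illustrative: more than two entries per layer

print(describe(well_formed))
print(describe(malformed))
```

Any per-layer entry with more than two elements triggers the same `ValueError`, so checking `len(past_key_values[0])` right before the call is a quick way to confirm whether this is the failure being hit.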

The error occurs with the latest transformers release (4.49.0 and later).
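
To check whether an environment falls in the affected range, a minimal sketch is shown here. The helper names `parse_release` and `is_affected` are hypothetical, the parser only reads the leading numeric release segment (so a pre-release such as `4.49.0.dev0` is treated as affected), and this is not necessarily how this PR addresses the problem.

```python
import re

# First transformers release reported as affected in this PR.
AFFECTED_FROM = (4, 49, 0)


def parse_release(version):
    """Parse the leading 'X.Y.Z' part of a version string into a 3-tuple.

    Missing components default to 0; suffixes such as '.dev0' are ignored.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_affected(version):
    """Return True if the version is at or above the first affected release."""
    return parse_release(version) >= AFFECTED_FROM


for candidate in ("4.48.3", "4.49.0", "4.50.0.dev0"):
    print(candidate, is_affected(candidate))
```

In a real environment the installed version string can be read with `importlib.metadata.version("transformers")` and passed to `is_affected`.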

Fixes: #7048

Signed-off-by: Logan Adams <loadams@microsoft.com>
loadams added this pull request to the merge queue Feb 19, 2025
Merged via the queue into master with commit 33dd2e2 Feb 19, 2025
11 checks passed
loadams deleted the loadams/transformers-ds-chat branch February 19, 2025 17:14
Linked issue: nv-ds-chat CI test failure