[BUG] [0.7.4] Attribute error with DeepSpeedTransformerInference #2478

Closed
tomeras91 opened this issue Nov 5, 2022 · 5 comments
Assignees: cmikeh2
Labels: bug (Something isn't working), inference

Comments


tomeras91 commented Nov 5, 2022

Describe the bug
Running a forward pass on a DeepSpeedTransformerInference layer results in an AttributeError saying that a torch.nn.Parameter object has no attribute scale.

To Reproduce
Here is a minimal reproducible example that shows the bug:

from deepspeed.ops.transformer import DeepSpeedInferenceConfig, DeepSpeedTransformerInference
import torch

torch.cuda.set_device(0)

hidden_size = 256
heads = 8
num_layers = 12
fp16 = True
layernorm_epsilon = 1e-5
deepspeed_config = DeepSpeedInferenceConfig(hidden_size=hidden_size,
                                            intermediate_size=hidden_size * 4,
                                            heads=heads,
                                            num_hidden_layers=num_layers,
                                            layer_norm_eps=layernorm_epsilon,
                                            # encoder_decoder=False,
                                            fp16=fp16,
                                            pre_layer_norm=True,
                                            stochastic_mode=False,
                                            scale_attention=True,
                                            triangular_masking=True,
                                            local_attention=False,
                                            window_size=256,
                                            )
transformer = DeepSpeedTransformerInference(config=deepspeed_config)
transformer.half()
new_state_dict = {k: 0.01*torch.ones(*v.shape, dtype=v.dtype, device=v.device)
                  for k,v in transformer.state_dict().items()}
transformer.load_state_dict(new_state_dict)
transformer.cuda()
device = list(transformer.parameters())[0].device

batch_size = 1
seq_len = 10
inputs = torch.ones((batch_size, seq_len, hidden_size), dtype=torch.float16, device=device)
input_mask = torch.ones(*inputs.shape[:2], dtype=bool, device=device)

output, _ = transformer(
    input=inputs,
    input_mask=input_mask)

print(f"outupt: \n {output}")

Running the code resulted in the following exception:

AttributeError: 'Parameter' object has no attribute 'scale'

Expected behavior
I was expecting to get a correct output, without the exception.

ds_report output

--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
 [WARNING]  using untested triton version (1.1.1), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/opt/conda/lib/python3.8/site-packages/torch']
torch version .................... 1.8.0a0+1606899
torch cuda version ............... 11.1
torch hip version ................ None
nvcc version ..................... 11.1
deepspeed install path ........... ['/opt/conda/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.7.4, unknown, unknown
deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1

System info (please complete the following information):

  • OS: Ubuntu 20.04
  • GPU count and types: a single A100 GPU
  • Python version: 3.8.5

Launcher context
Launching directly using Python interpreter.

Additional info
I'm not 100% sure, but it seems like this bug is a result of PR #2217 by @RezaYazdaniAminabadi, more specifically the change in line 419 of transformer_inference.py.

tomeras91 added the bug and inference labels Nov 5, 2022
cmikeh2 (Contributor) commented Nov 7, 2022

Hi @tomeras91,

That's an attribute that is populated on the relevant Tensors during the module injection process (which is triggered by deepspeed.init_inference). Do you have a use case in which you are directly instantiating these inference layers rather than accessing them through the top-level interface?
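
For reference, the usual top-level path looks something like the sketch below (the model name and arguments are illustrative, not taken from this issue): init_inference walks the module tree, swaps in DeepSpeedTransformerInference layers, and populates the extra attributes (such as scale) they expect.

import torch
import deepspeed
from transformers import AutoModelForCausalLM

# Illustrative model; any supported HuggingFace model follows the same path
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Kernel injection replaces the transformer layers with DeepSpeedTransformerInference
# and sets the attributes the fused kernels expect on the relevant tensors
model = deepspeed.init_inference(model,
                                 mp_size=1,
                                 dtype=torch.half,
                                 replace_with_kernel_inject=True)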

cmikeh2 self-assigned this Nov 7, 2022
tomeras91 (Author) commented:

Yes. I want to instantiate inference layers and load weights into them. I have pretrained weights from a different source. Basically, I want to use DeepSpeed Inference to serve my custom models.
Is this possible using deepspeed.init_inference as well?

cmikeh2 (Contributor) commented Nov 8, 2022

Yes, you can supply a custom Policy for your specific modules. The relevant abstract class can be found in the DeepSpeed repository, and example implementations for HuggingFace models can be found there as well.

The policy is then passed to init_inference through the injection_policy argument:

model = deepspeed.init_inference(model,
                                 injection_policy={CustomModule: CustomModulePolicy},
                                 dtype=torch.float16,
                                 # Other arguments
                                 )
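
Putting this together for the use case above, a rough sketch (the module, layer, policy, and checkpoint names below are hypothetical, not part of the DeepSpeed API): load the pretrained weights into an ordinary PyTorch module, then let init_inference replace its layers via the custom policy.

import torch
import deepspeed

# Hypothetical nn.Module built from standard layers, holding the custom pretrained weights
model = MyCustomTransformer(hidden_size=256, heads=8, num_layers=12)
model.load_state_dict(torch.load("pretrained_weights.pt"))

# DeepSpeed replaces each MyCustomTransformerLayer according to the custom policy,
# populating the attributes that DeepSpeedTransformerInference expects
model = deepspeed.init_inference(model,
                                 injection_policy={MyCustomTransformerLayer: MyCustomTransformerLayerPolicy},
                                 dtype=torch.float16)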

tomeras91 (Author) commented:

Thanks. I'll try it out.

jeffra (Collaborator) commented Nov 18, 2022

Please re-open if you still have an issue.
