[BUG] [0.7.4] Attribute error with DeepSpeedTransformerInference #2478

Closed
tomeras91 opened this issue Nov 5, 2022 · 5 comments
Assignees: cmikeh2
Labels: bug (Something isn't working), inference

Comments


tomeras91 commented Nov 5, 2022

Describe the bug
Running a forward pass on a DeepSpeedTransformerInference layer results in an AttributeError saying that a torch.nn.Parameter object has no attribute scale.

To Reproduce
Here is a minimal reproducible example that shows the bug:

from deepspeed.ops.transformer import DeepSpeedInferenceConfig, DeepSpeedTransformerInference
import torch

torch.cuda.set_device(0)

hidden_size = 256
heads = 8
num_layers = 12
fp16 = True
layernorm_epsilon = 1e-5
deepspeed_config = DeepSpeedInferenceConfig(hidden_size=hidden_size,
                                            intermediate_size=hidden_size * 4,
                                            heads=heads,
                                            num_hidden_layers=num_layers,
                                            layer_norm_eps=layernorm_epsilon,
                                            # encoder_decoder=False,
                                            fp16=fp16,
                                            pre_layer_norm=True,
                                            stochastic_mode=False,
                                            scale_attention=True,
                                            triangular_masking=True,
                                            local_attention=False,
                                            window_size=256,
                                            )
transformer = DeepSpeedTransformerInference(config=deepspeed_config)
transformer.half()
new_state_dict = {k: 0.01*torch.ones(*v.shape, dtype=v.dtype, device=v.device)
                  for k,v in transformer.state_dict().items()}
transformer.load_state_dict(new_state_dict)
transformer.cuda()
device = list(transformer.parameters())[0].device

batch_size = 1
seq_len = 10
inputs = torch.ones((batch_size, seq_len, hidden_size), dtype=torch.float16, device=device)
input_mask = torch.ones(*inputs.shape[:2], dtype=bool, device=device)

output, _ = transformer(
    input=inputs,
    input_mask=input_mask)

print(f"outupt: \n {output}")

Running the code resulted in the following exception:

AttributeError: 'Parameter' object has no attribute 'scale'

Expected behavior
I was expecting to get a correct output, without the exception.

ds_report output

--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
 [WARNING]  using untested triton version (1.1.1), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/opt/conda/lib/python3.8/site-packages/torch']
torch version .................... 1.8.0a0+1606899
torch cuda version ............... 11.1
torch hip version ................ None
nvcc version ..................... 11.1
deepspeed install path ........... ['/opt/conda/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.7.4, unknown, unknown
deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1

System info (please complete the following information):

  • OS: Ubuntu 20.04
  • GPU count and types: a single A100 GPU
  • Python version: 3.8.5

Launcher context
Launching directly using Python interpreter.

Additional info
I'm not 100% sure, but it seems like this bug is a result of PR #2217 by @RezaYazdaniAminabadi, more specifically the change in line 419 of transformer_inference.py.

tomeras91 added the bug and inference labels Nov 5, 2022
cmikeh2 (Contributor) commented Nov 7, 2022

Hi @tomeras91,

That's an attribute that is populated on the relevant Tensors during the module injection process (which is triggered by deepspeed.init_inference). Do you have a use case in which you are directly instantiating these inference layers rather than accessing them through the top-level interface?
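
For reference, the usual top-level path looks something like the sketch below (the model name and arguments are illustrative, not taken from this issue): init_inference walks the module tree, swaps in DeepSpeedTransformerInference layers, and populates the extra attributes (such as scale) they expect.

import torch
import deepspeed
from transformers import AutoModelForCausalLM

# Illustrative model; any supported HuggingFace model follows the same path
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Kernel injection replaces the transformer layers with DeepSpeedTransformerInference
# and sets the attributes the fused kernels expect on the relevant tensors
model = deepspeed.init_inference(model,
                                 mp_size=1,
                                 dtype=torch.half,
                                 replace_with_kernel_inject=True)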

cmikeh2 self-assigned this Nov 7, 2022
tomeras91 (Author) commented:

Yes. I want to instantiate inference layers and load weights into them. I have pretrained weights from a different source. Basically, I want to use DeepSpeed Inference to serve my custom models.
Is this possible using deepspeed.init_inference as well?

cmikeh2 (Contributor) commented Nov 8, 2022

Yes, you can supply a custom Policy for your specific modules. The relevant abstract class can be found in the DeepSpeed repository, and example implementations for HuggingFace models can be found there as well.

The policy is then passed to init_inference through the injection_policy argument:

model = deepspeed.init_inference(model,
                                 injection_policy={CustomModule: CustomModulePolicy},
                                 dtype=torch.float16,
                                 # Other arguments
                                 )
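
Putting this together for the use case above, a rough sketch (the module, layer, policy, and checkpoint names below are hypothetical, not part of the DeepSpeed API): load the pretrained weights into an ordinary PyTorch module, then let init_inference replace its layers via the custom policy.

import torch
import deepspeed

# Hypothetical nn.Module built from standard layers, holding the custom pretrained weights
model = MyCustomTransformer(hidden_size=256, heads=8, num_layers=12)
model.load_state_dict(torch.load("pretrained_weights.pt"))

# DeepSpeed replaces each MyCustomTransformerLayer according to the custom policy,
# populating the attributes that DeepSpeedTransformerInference expects
model = deepspeed.init_inference(model,
                                 injection_policy={MyCustomTransformerLayer: MyCustomTransformerLayerPolicy},
                                 dtype=torch.float16)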

tomeras91 (Author) commented:

Thanks. I'll try it out.

jeffra (Collaborator) commented Nov 18, 2022

Please re-open if you still have an issue.
