Tensor size mismatch when using static quantization #665

Closed
yiliu30 opened this issue Aug 13, 2024 · 2 comments

@yiliu30
Contributor

yiliu30 commented Aug 13, 2024

In the static_quant tutorial (tutorials/calibration_flow/static_quant.py), changing ToyLinearModel(1024, 1024, 1024) to ToyLinearModel() raises the following error:

Traceback (most recent call last):
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 132, in <module>
    m(*example_inputs)
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 109, in forward
    x = self.linear2(x)
        ^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 27, in forward
    observed_weight = self.weight_obs(self.weight)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/ao/quantization/observer.py", line 739, in forward
    return self._forward(x_orig)
           ^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/ao/quantization/observer.py", line 762, in _forward
    min_val = torch.min(min_val_cur, min_val)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The size of tensor a (64) must match the size of tensor b (32) at non-singleton dimension 0

cc @jerryzh168

@jerryzh168 jerryzh168 self-assigned this Aug 13, 2024
@jerryzh168
Contributor

Thanks. I think there is a bug in replacement_fn; it is fixed in #650.
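
One plausible reading of the traceback, not confirmed against the actual change in #650, is that a single per-channel observer ended up accumulating statistics for two weights with different numbers of output channels (32 and 64, going by the error message). The dependency-free sketch below illustrates that failure mode. `PerChannelMinObserver` and `make_weight` are hypothetical stand-ins written for this example, not torch or torchao APIs, and the weight shapes are assumptions inferred from the error text.

```python
class PerChannelMinObserver:
    """Toy stand-in for a per-channel min observer (hypothetical, not torch's class)."""

    def __init__(self):
        self.min_val = None  # one running minimum per output channel

    def __call__(self, weight):
        # weight is a list of rows; each row holds the values of one output channel
        cur = [min(row) for row in weight]
        if self.min_val is None:
            self.min_val = cur
        elif len(cur) != len(self.min_val):
            # Mirrors the spirit of the RuntimeError in the traceback
            raise RuntimeError(
                f"size mismatch: {len(cur)} channels now, "
                f"{len(self.min_val)} recorded"
            )
        else:
            self.min_val = [min(a, b) for a, b in zip(cur, self.min_val)]
        return weight


def make_weight(out_features, in_features):
    """Build a dummy (out_features x in_features) weight as nested lists."""
    return [[float(i + j) for j in range(in_features)] for i in range(out_features)]


w1 = make_weight(32, 64)  # assumed shape of linear1.weight
w2 = make_weight(64, 32)  # assumed shape of linear2.weight

# Problem pattern: one observer instance reused for both layers.
shared = PerChannelMinObserver()
shared(w1)
try:
    shared(w2)
    shared_failed = False
except RuntimeError as exc:
    shared_failed = True
    print("shared observer:", exc)

# Safer pattern: a fresh observer per layer, so state is never mixed.
obs1, obs2 = PerChannelMinObserver(), PerChannelMinObserver()
obs1(w1)
obs2(w2)
print(len(obs1.min_val), len(obs2.min_val))
```

Under this reading, ToyLinearModel(1024, 1024, 1024) would not trigger the error because every weight has the same number of output channels, so mixed observer state would go unnoticed rather than raise.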

@msaroufim msaroufim added the bug Something isn't working label Aug 13, 2024
@jerryzh168
Contributor

This should be fixed now.

yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* executable README

* fix title of CI workflow

* markup commands in markdown

* extend the markup-markdown language

* Automatically identify cuda from nvidia-smi in install-requirements (pytorch#606)

* Automatically identify cuda from nvidia-smi in install-requirements

* Update README.md

---------

Co-authored-by: Michael Gschwind <61328285+mikekgfb@users.noreply.github.com>

* Unbreak zero-temperature sampling (pytorch#599)

Fixes pytorch#581.

* Improve process README

* [retake] Add sentencepiece tokenizer (pytorch#626)

* Add sentencepiece tokenizer

* Add white space

* Handle white space:

* Handle control ids

* More cleanup

* Lint

* Use unique_ptr

* Use a larger runner

* Debug

* Debug

* Cleanup

* Update install_utils.sh to use python3 instead of python (pytorch#636)

As titled. On some devices `python` and `python3` are pointing to different environments so good to unify them.

* Fix quantization doc to specify dytpe limitation on a8w4dq (pytorch#629)

Co-authored-by: Kimish Patel <kimishpatel@fb.com>

* add desktop.json (pytorch#622)

* add desktop.json

* add fast

* remove embedding

* improvements

* update readme from doc branch

* tab/spc

* fix errors in updown language

* fix errors in updown language, and [skip]: begin/end

* fix errors in updown language, and [skip]: begin/end

* a storied run

* stories run on readme instructions does not need HF token

* increase timeout

* check for hang un hf_login

* executable README improvements

* typo

* typo

---------

Co-authored-by: Ian Barber <ian.barber@gmail.com>
Co-authored-by: Scott Wolchok <swolchok@meta.com>
Co-authored-by: Mengwei Liu <larryliu0820@users.noreply.github.com>
Co-authored-by: Kimish Patel <kimishpatel@fb.com>
Co-authored-by: Scott Roy <161522778+metascroy@users.noreply.github.com>