Difference with weight normalization #7

Open
blueardour opened this issue Apr 27, 2022 · 0 comments

blueardour commented Apr 27, 2022

Hi,

Thanks for sharing this interesting work. It gives me new insight into quantization.

In the quantization literature I have read, weight normalization is a common trick for improving performance. By weight normalization I mean that the latent weights have their mean subtracted and are divided by their standard deviation before being quantized. Based on my understanding of RobustQuantization, weight normalization aims to drive Kt to zero rather than 1.8. In practice, it improves quantization performance on many tasks.
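
To make the trick concrete, here is a minimal PyTorch sketch of what I mean (the quantizer and the function names are my own placeholders, not this repo's API):

```python
import torch

def quantize_uniform(w, num_bits=4):
    # Simple symmetric uniform quantizer, used only for illustration.
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

def quantize_with_weight_norm(latent_w, num_bits=4, eps=1e-5):
    # Weight normalization as described above: standardize the latent
    # weight (zero mean, unit std) before sending it to the quantizer.
    w = (latent_w - latent_w.mean()) / (latent_w.std() + eps)
    return quantize_uniform(w, num_bits)
```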

From Figure 3(b) in this paper, accuracy goes up as Kt decreases. I wonder if you have tried any Kt lower than 1.8. I am just very curious whether it shares some common benefit with weight normalization.

Thanks

Peng
