Thanks for providing this interesting work. It gives me new insight into quantization.
In my previous reading of quantization work, weight normalization is a common trick for improving performance. By weight normalization, I mean that the latent weight is first centered by subtracting its mean and then divided by its std before being sent to the quantizer. Based on my understanding of RobustQuantization, weight normalization aims to push Kt toward zero rather than 1.8. In practice, it improved quantization performance on many tasks.
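To make sure we are talking about the same thing, here is a minimal sketch of what I mean by weight normalization, paired with a generic symmetric uniform quantizer (the quantizer and all names here are placeholders for illustration, not the implementation from your paper):

```python
import torch

def normalize_weight(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Weight normalization as I mean it: center the latent weight and
    # scale it to unit std before it is passed to the quantizer.
    return (w - w.mean()) / (w.std() + eps)

def quantize_uniform(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    # A generic symmetric uniform quantizer, just for illustration.
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

# Usage: normalize the latent weight first, then quantize it.
w_latent = torch.randn(256, 256)
w_q = quantize_uniform(normalize_weight(w_latent), num_bits=4)
```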
From Figure 3(b) in this paper, accuracy goes up as Kt decreases. I wonder if you have tried any Kt lower than 1.8. I am just very curious whether it shares some common benefit with weight normalization.
Thanks
Peng