High perplexity and FoSS adjustment in GPT-2 kNN-LM #17
Hi @urialon!
Thank you for your excellent work and the resources provided in this repository.
I am currently trying to replicate the kNN-LM results for GPT-2, but the perplexity I get is higher than expected: with the default hyperparameters I obtain a PPL of 17.34 versus the reported 12.57.
What have I tried?
Tuning k, knn_temp, and lambda:
Adjusting these hyperparameters improved the PPL to some extent; for example, setting knn_temp to 50 brought it down to 14.84. Despite trying numerous combinations, I could not reach the reported PPL of 12.57. Could you share the specific hyperparameter settings used to obtain that result?
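For reference, this is how I understand these three hyperparameters entering the standard kNN-LM interpolation (a minimal sketch following the original kNN-LM formulation; the function name knn_lm_probs is my own, not this repository's code):

```python
import torch

def knn_lm_probs(lm_probs, dists, neighbor_tokens, knn_temp=1.0, lmbda=0.25):
    # lm_probs:        (vocab_size,) next-token probabilities from GPT-2
    # dists:           (k,) distances to the k nearest datastore keys,
    #                  so k is set by how many neighbors were retrieved
    # neighbor_tokens: (k,) LongTensor of the target token id stored
    #                  with each retrieved key

    # Softmax over negative distances: a larger knn_temp flattens the
    # neighbor distribution, a smaller one sharpens it.
    knn_weights = torch.softmax(-dists / knn_temp, dim=-1)

    # Pool the weights of neighbors that predict the same token into a
    # vocabulary-sized distribution.
    knn_probs = torch.zeros_like(lm_probs)
    knn_probs.scatter_add_(0, neighbor_tokens, knn_weights)

    # lambda controls how much the retrieved distribution overrides the LM.
    return lmbda * knn_probs + (1.0 - lmbda) * lm_probs
```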
Modifying the Fraction of Saved Searches (FoSS):
In the README, I noticed that changing the FoSS value significantly impacts kNN-LM performance. However, in traditional kNN-LM implementations a full kNN search is performed for every token, which seems to make the FoSS value fixed. I am unsure how to adjust FoSS effectively in this context. Could you provide any guidance or examples on how to modify it to optimize performance?
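To make my confusion concrete, here is a toy sketch of how I currently understand FoSS (the helpers full_knn_search and extend_pointers are hypothetical placeholders, not functions from this repository):

```python
def fraction_of_saved_searches(tokens, full_knn_search, extend_pointers):
    # full_knn_search and extend_pointers are hypothetical stand-ins:
    # the first runs an expensive FAISS-style search, the second tries to
    # advance the previous step's datastore pointers without searching.
    saved = 0
    pointers = None
    for token in tokens:
        extended = extend_pointers(pointers, token) if pointers else None
        if extended:
            # Enough pointers survived: the full search is skipped.
            saved += 1
            pointers = extended
        else:
            # Otherwise fall back to a full kNN search.
            pointers = full_knn_search(token)
    return saved / len(tokens)  # this ratio is the FoSS

# In vanilla kNN-LM, extend_pointers never succeeds (every token runs a
# full search), so the ratio is pinned at 0 and cannot be tuned directly.
```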
Screenshot (_20241205002258.png): https://github.com/user-attachments/assets/3f06a789-05fd-41a4-8ce4-a5dd289880c9
Thank you!
Hi,
Thank you for your interest in our work!
Did you try running with our datastore, our index, etc. (without tuning anything)?
Thank you for your reply! Yes, I have tried running with both your datastore and index, and I am able to achieve the same PPL as yours with both the base model and RetoMaton.
So where is the problem?
The issue is with the GPT-2 kNN-LM's PPL. Without changing any hyperparameters, I get a PPL of 17.34 compared to your reported 12.57, which is significantly higher. (Although for the base model and RetoMaton, I do achieve results consistent with yours.)
But it is happening only when you create the datastore, right?
With my datastore it works fine?