IUXray: train test validation split affecting token/id mappings #7
Hi, I'm sorry to bother you. May I ask you some questions about the code in R2Gen? The return is
Hi, generally speaking you are right: the vanilla Transformer uses post-LN, as you described. However, as discussed in the paper On Layer Normalization in the Transformer Architecture, Layer Normalization can be placed in two positions in Transformer implementations, pre-LN and post-LN. For example, the Transformer-encoder-based BERT uses post-LN, while the Vision Transformer uses pre-LN. Hope this helps you figure out the problem. This may be a modification by the R2Gen authors to make training more stable, possibly based on their experiments; I am not sure, and you may need to ask the R2Gen authors for the details.
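If it helps to see the difference concretely, here is a minimal sketch of the two placements in PyTorch. This is my own illustration, not the R2Gen code; the module names and sizes are made up.

```python
# Sketch only: contrasts post-LN (vanilla Transformer) with pre-LN (e.g. ViT).
import torch.nn as nn


class PostLNBlock(nn.Module):
    """Post-LN: sublayer -> residual add -> LayerNorm."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.norm1(x + self.attn(x, x, x)[0])  # LayerNorm applied after the residual
        x = self.norm2(x + self.ff(x))
        return x


class PreLNBlock(nn.Module):
    """Pre-LN: LayerNorm -> sublayer -> residual add."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.norm1(x)                           # LayerNorm applied before the sublayer
        x = x + self.attn(h, h, h)[0]
        x = x + self.ff(self.norm2(x))
        return x
```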
Thank you very much. With your reply, I understand completely now.
Hi! I'm sorry to bother you again! I hope to get your help.
Hello, I've attempted to reproduce the results reported in the paper, but after loading your weights I get a shape error caused by the size of the word mappings (token_to_id and id_to_token). I made my own annotations.json file to split the data (70-10-20, as indicated in the paper); however, the random split changes how many words are mapped to meaningful tokens and how many fall back to the unknown token. May I ask how you split the data, and whether it's possible to get access to the original annotations.json file?
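For context, here is a small sketch (not the repository's actual code) of why a different random split changes the size of token_to_id: the vocabulary is typically built only from the training reports with a frequency cutoff, so the set of words that survive the threshold, and therefore the embedding shape, depends on which reports land in the training split. The file name, cutoff value, and split function below are assumptions for illustration.

```python
import json
import random
from collections import Counter

MIN_FREQ = 3          # assumed frequency cutoff; the real value may differ
UNK_TOKEN = "<unk>"


def build_vocab(train_reports, min_freq=MIN_FREQ):
    """Build token_to_id from the training reports only."""
    counts = Counter(tok for rep in train_reports for tok in rep.lower().split())
    token_to_id = {UNK_TOKEN: 0}
    for word in sorted(w for w, c in counts.items() if c >= min_freq):
        token_to_id[word] = len(token_to_id)
    return token_to_id


def random_split(reports, seed):
    """70-10-20 random split, as described in the paper."""
    reports = list(reports)
    random.Random(seed).shuffle(reports)
    n = len(reports)
    return reports[: int(0.7 * n)], reports[int(0.7 * n): int(0.8 * n)], reports[int(0.8 * n):]


if __name__ == "__main__":
    # 'reports.json' is a placeholder standing in for the IU X-Ray report texts.
    reports = json.load(open("reports.json"))
    for seed in (0, 1, 2):
        train, _, _ = random_split(reports, seed)
        vocab = build_vocab(train)
        # Different seeds -> different vocab sizes -> different embedding shapes,
        # which is why pretrained weights fail to load with a shape mismatch.
        print(f"seed {seed}: vocab size = {len(vocab)}")
```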