
How is the target vector (program rules sequence) in TreeGen created during training? #14

Open
brando90 opened this issue Jun 22, 2021 · 1 comment


@brando90

Hi Authors,

My understanding is that TreeGen learns by predicting the grammar rules of the target program and computing the cross-entropy loss against the ground-truth rule sequence. Thus, I assume you parse the target program into an AST, and during that parsing you obtain a (padded) sequence indicating which rule was applied at each step. In particular, to do that you need to decide on an ordering for the rules. Did you use DFS, BFS, or something else to create the actual target rule sequence the model learns from? Since there is no unique way to create this label, my assumption is that the model is "biased" toward outputting, say, BFS-generated programs. Is this correct? Where is the code that does this?

Thanks for your time!

@zysszy
Owner

zysszy commented Jun 29, 2021

Did you use DFS, BFS or something else for that to create the actual target rule sequence the model is going to learn from?

We use the preorder traversal sequence as the target rule sequence. During inference, we always generate the leftmost node.

Zeyu
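To illustrate the answer above: a preorder (depth-first, leftmost-first) traversal emits the rule used at each internal node before descending into its children. The following is a minimal sketch, not TreeGen's actual code; the toy AST, node names, and rule notation are illustrative assumptions.

```python
# Each AST node is (nonterminal, [children]); leaves have no children.
# Toy AST for an expression like "x + 1" (names are hypothetical).
ast = ("Expr", [
    ("BinOp", [
        ("Name", []),   # left operand
        ("Add", []),    # operator
        ("Num", []),    # right operand
    ]),
])

def preorder_rules(node):
    """Serialize an AST into a rule sequence: emit the rule expanding
    each internal node before recursing, visiting children left to right."""
    name, children = node
    rules = []
    if children:
        # Rule: parent nonterminal -> sequence of child nonterminals.
        rules.append(f"{name} -> {' '.join(c[0] for c in children)}")
        for child in children:
            rules.extend(preorder_rules(child))
    return rules

print(preorder_rules(ast))
# ['Expr -> BinOp', 'BinOp -> Name Add Num']
```

Because the traversal is deterministic (parent first, then children left to right), every program maps to exactly one rule sequence, which matches the statement that inference always expands the leftmost node next.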
