
How is the target vector (program rules sequence) in TreeGen created during training? #14

Open
brando90 opened this issue Jun 22, 2021 · 1 comment


@brando90

Hi Authors,

My understanding is that TreeGen learns by predicting the grammar rules of the target program and computing the cross-entropy loss against the ground-truth rule sequence. Thus, I assume you parse the target program into an AST, and during that parsing you obtain a (padded) sequence indicating which rule was applied at each step. In particular, to do that you need to decide on an ordering for the rules. Did you use DFS, BFS, or something else to create the actual target rule sequence the model learns from? Since there is no unique way to create this label, my assumption is that the model is "biased" toward outputting, say, BFS-generated programs. Is this correct? Where is the code that does this?

Thanks for your time!

@zysszy
Owner

zysszy commented Jun 29, 2021

Did you use DFS, BFS or something else for that to create the actual target rule sequence the model is going to learn from?

We use the preorder traversal sequence as the target rule sequence. During inference, we always generate the leftmost node.

Zeyu
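To illustrate the answer above: a preorder (depth-first, leftmost-first) traversal emits the rule used at each internal node before descending into its children. The following is a minimal sketch, not TreeGen's actual code; the toy AST, node names, and rule notation are illustrative assumptions.

```python
# Each AST node is (nonterminal, [children]); leaves have no children.
# Toy AST for an expression like "x + 1" (names are hypothetical).
ast = ("Expr", [
    ("BinOp", [
        ("Name", []),   # left operand
        ("Add", []),    # operator
        ("Num", []),    # right operand
    ]),
])

def preorder_rules(node):
    """Serialize an AST into a rule sequence: emit the rule expanding
    each internal node before recursing, visiting children left to right."""
    name, children = node
    rules = []
    if children:
        # Rule: parent nonterminal -> sequence of child nonterminals.
        rules.append(f"{name} -> {' '.join(c[0] for c in children)}")
        for child in children:
            rules.extend(preorder_rules(child))
    return rules

print(preorder_rules(ast))
# ['Expr -> BinOp', 'BinOp -> Name Add Num']
```

Because the traversal is deterministic (parent first, then children left to right), every program maps to exactly one rule sequence, which matches the statement that inference always expands the leftmost node next.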
