Need help understanding the labels of the parser model #104

Open
sujoung opened this issue Sep 20, 2023 · 0 comments

sujoung commented Sep 20, 2023

Hello! First of all, I love this project. It has really helped me explore the syntax of different kinds of text, so thank you very much!

I have a question about tagsets. I am using the Swedish model, and I remember that a few years ago it was based on a Swedish treebank tagset called Mamba. That seems to have changed in the new version (benepar-sv2).

I printed the labels that were used to train the core model and got these results:

>>> parser._parser.label_vocab
{'': 0,
 'AP': 1,
 'AP::AP': 2,
 'AP::XP': 3,
 'AVP': 4,
 'AVP::XP': 5,
 'NP': 6,
 'NP::AP': 7,
 'NP::NP': 8,
 'NP::NP::AP': 9,
 'NP::NP::NP::NP::XP': 10,
 'NP::NP::S': 11,
 'NP::NP::VP': 12,
 'NP::PP': 13,
 'NP::S': 14,
 'NP::XP': 15,
 'NP::XP::NP': 16,
 'NP::XP::S': 17,
 'PP': 18,
 'PP::AVP': 19,
 'PP::AVP::XP': 20,
 'PP::NP': 21,
 'PP::XP': 22,
 'PSEUDO': 23,
 'S': 24,
 'S::AVP': 25,
 'S::NP': 26,
 'S::NP::NP': 27,
 'S::NP::NP::NP::NP': 28,
 'S::NP::S': 29,
 'S::NP::XP': 30,
 'S::NP::XP::S': 31,
 'S::PP': 32,
 'S::PP::NP': 33,
 'S::S': 34,
 'S::S::NP': 35,
 'S::S::NP::NP': 36,
 'S::VP': 37,
 'S::XP': 38,
 'VP': 39,
 'VP::AP': 40,
 'VP::PP': 41,
 'VP::S': 42,
 'VP::VP': 43,
 'VP::XP': 44,
 'XP': 45,
 'XP::AVP': 46,
 'XP::NP': 47,
 'XP::PP': 48,
 'XP::S': 49}
>>> parser._parser.tag_vocab
{'AB': 1,
 'DT': 2,
 'HA': 3,
 'HD': 4,
 'HP': 5,
 'HS': 6,
 'IE': 7,
 'IN': 8,
 'JJ': 9,
 'KN': 10,
 'MAD': 11,
 'MID': 12,
 'NN': 13,
 'P': 14,
 'PAD': 15,
 'PC': 16,
 'PL': 17,
 'PM': 18,
 'PN': 19,
 'PS': 20,
 'RG': 21,
 'RO': 22,
 'SN': 23,
 'UNK': 0,
 'UO': 24,
 'VB': 25}

What is the difference between NP::NP::S and S::NP::NP?
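To make the question concrete, the sketch below splits the two labels on `::`. It assumes, without confirmation from this thread, that `::` joins a chain of nested categories and that the leftmost part is the outermost node. `split_label` is a hypothetical helper, not part of benepar.

```python
def split_label(label):
    """Split a compound label such as 'NP::NP::S' into its parts."""
    # Assumption: '::' separates nested categories, outermost first.
    # The empty label '' (index 0 in the vocabulary) yields no parts.
    return label.split("::") if label else []


a = split_label("NP::NP::S")
b = split_label("S::NP::NP")

print(a)  # ['NP', 'NP', 'S']
print(b)  # ['S', 'NP', 'NP']

# The two labels contain the same categories in opposite order.
print(a == list(reversed(b)))  # True
```

If that reading is right, the two labels would describe different nestings rather than the same structure, which is what I would like to have confirmed.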

[Screenshot: parse tree of the example sentence, 2023-09-20]

In this example (in English: "Hello, I am a banana"), there is an S (simple declarative clause) with two NPs as children. Would this be NP::NP::S or S::NP::NP? And what is happening with AUX? I find it hard to imagine a structure in which an S has only two NPs, since an S needs at least one VP.

A more general question: I saw in #30 that you are using this for training: http://surdeanu.cs.arizona.edu//mihai/teaching/ista555-fall13/readings/PennTreebankConstituents.html Is it the same for the Swedish model and the models for other languages? For example, unlike the English model, the Swedish model has no FRAG among its labels. Is this because of the nature of the language itself, or did you use a different label set for each language?
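This is how I checked for FRAG: a small sketch that reduces the compound labels printed above to their individual categories. The label list is copied from the `label_vocab` output in this post (the empty label at index 0 is left out), and splitting on `::` is my own assumption about how the labels are built.

```python
# Keys of parser._parser.label_vocab as printed above, minus the empty label.
labels = """
AP AP::AP AP::XP AVP AVP::XP
NP NP::AP NP::NP NP::NP::AP NP::NP::NP::NP::XP NP::NP::S NP::NP::VP
NP::PP NP::S NP::XP NP::XP::NP NP::XP::S
PP PP::AVP PP::AVP::XP PP::NP PP::XP PSEUDO
S S::AVP S::NP S::NP::NP S::NP::NP::NP::NP S::NP::S S::NP::XP
S::NP::XP::S S::PP S::PP::NP S::S S::S::NP S::S::NP::NP S::VP S::XP
VP VP::AP VP::PP VP::S VP::VP VP::XP
XP XP::AVP XP::NP XP::PP XP::S
""".split()

# Assumption: each '::'-separated part is one constituent category.
categories = sorted({part for label in labels for part in label.split("::")})

print(len(labels))           # 49
print(categories)            # ['AP', 'AVP', 'NP', 'PP', 'PSEUDO', 'S', 'VP', 'XP']
print("FRAG" in categories)  # False
```

So the 49 labels reduce to eight categories, and FRAG is not one of them.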
