BHS verbforms #260
Comments
I did my own research on this two years ago. BHS, like EWA and KEWA, cites roots in their -ti and -te forms. I'm no fan of that convention, but still.
06.01.2014. http://yadi.sk/d/S2LkEfxDFZtqo "-te" Most of the verbs in Schwarz's list marked as E (=BHS) are sopasarga roots. We do not change that: we do not cut the upasargas off, we leave them as they are. False positives (have to be cleaned out manually before the rules apply): Cutting Rules:
Gana and the terminals. These are the major chopping rules.
Thanks, @drdhaval2785 - I guess it will kill some of the false positives. If we check before apply
@gasyoun In the list I made, these false positives have been pretty thoroughly weeded out. Is the objective of your 'chopping' to know, for example, that 'anucalati' in BHS would correspond to If so, this should be doable by a program that (a) removes the prefixes (e.g. removes 'anu' from Is this the kind of analysis you are interested in?
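A minimal sketch of the prefix-removal step (a) mentioned above, for illustration only. The prefix list `UPASARGAS` and the function `strip_upasarga` are hypothetical names that do not come from the repository, the list is deliberately incomplete, and sandhi at the prefix boundary is ignored, so a real analysis would need more care.

```python
# Hypothetical, incomplete list of upasargas (verbal prefixes); an assumption
# made for this sketch, not the list used by any program in the repository.
UPASARGAS = ["anu", "pra", "upa", "sam", "vi"]


def strip_upasarga(headword, prefixes=UPASARGAS):
    """Split off the longest matching prefix.

    Returns (prefix, remainder), or (None, headword) when no prefix matches.
    Sandhi changes at the boundary are not handled.
    """
    # Try longer prefixes first so a short prefix does not shadow a longer one.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if headword.startswith(prefix) and len(headword) > len(prefix):
            return prefix, headword[len(prefix):]
    return None, headword


print(strip_upasarga("anucalati"))  # ('anu', 'calati')
print(strip_upasarga("calati"))     # (None, 'calati')
```

A purely string-based split like this will also fire on words that merely begin with the same letters as a prefix, which is one way false positives can arise.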
As well, right. Yes, I'm interested in such analysis. For it, maybe even cutting off upasargas and upasarga combinations would not be needed, because MW has all the upasargas in it as part of the word. What would be interesting is to generate the list of PWG verbs with upasargas, because PWG's nest style makes it impossible to know how many forms there actually are that relate to verbs. Is there a list of your false positives?
The program (verbs1.py) generating the list is fairly simple. The false positives occur in two ways:
The program also excludes any headword that does NOT end in 'ati' or 'ate'. If there are verbs in BHS that end in some other way (which I doubt), then these would be silently excluded.
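The ending filter described here could look roughly like the following. This is a sketch, not the code of verbs1.py: the function name `candidate_verb_forms` and the sample headwords are assumptions made for illustration, and the spellings are not claimed to be actual BHS headwords.

```python
def candidate_verb_forms(headwords):
    """Keep only headwords ending in 'ati' or 'ate'.

    Anything with a different ending is dropped without any warning,
    which is the silent exclusion described above.
    """
    # str.endswith accepts a tuple of suffixes and matches any of them.
    return [hw for hw in headwords if hw.endswith(("ati", "ate"))]


# Illustrative input only.
sample = ["anucalati", "buddha", "vartate", "dharma"]
print(candidate_verb_forms(sample))  # ['anucalati', 'vartate']
```

Because the filter looks only at the ending, it would also keep any non-verb headword that happens to end in 'ati' or 'ate'.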
None, I guess. Thanks for the comment, detailed as usual.
A study was made of the headwords of the BHS dictionary to identify verbs and verbforms.
This was motivated by the whitelisting work being done as mentioned in #254.
It was noticed that many of the otherwise unidentified headwords that occur only in the BHS dictionary (and as a headword in no other dictionary) are verb forms, such as the third person singular of some conjugation of the verb. For the whitelisting, checking the spelling of these verb forms should take into account the fact that they are inflected forms.
Of course, there is independent interest in lists of verbs, unrelated to the whitelisting objective.
The program and results are in the dictionaries/BHS/verbs directory of this repository.
A brief description of the files in this directory.