Abbreviations in PWG #197

Open
funderburkjim opened this issue Dec 20, 2017 · 6 comments
Labels
Documentation How TXT , XML work enhancement New website features

Comments

@funderburkjim
Contributor

As mentioned in #190, the abbreviations of PWG have, for the most part, not yet been added to the digitization markup.

Such an addition would certainly be an enhancement.

The first task is to identify the main abbreviations and their expansions in pwg.

funderburkjim added a commit to sanskrit-lexicon/PWG that referenced this issue Dec 20, 2017
Based on similar file for pw dictionary. See
sanskrit-lexicon/COLOGNE#197
@funderburkjim
Contributor Author

Many common abbreviations used in pwg will be similar to those used in PW. This thought prompted me to make a prototype of the master abbreviation file, based on the corresponding file for PW.
This pwg abbreviation starter file is pwgab_prelim.txt.

There are some semantically minor but programmatically significant differences which need to be ironed out even for abbreviations used in both dictionaries. For instance, I think pwg uniformly puts a space in 'v. l.' while pw digitization does not. There are also some abbreviations (like 'a. a. O') which occur often in pwg which are not part of the preliminary list based on pw.
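A minimal sketch of how the spacing difference could be neutralized when comparing the two lists, assuming the only variation is whitespace after internal periods. The function names `abbreviation_key` and `spaced_form` are hypothetical and not part of any existing Cologne script.

```python
import re

def abbreviation_key(abbrev: str) -> str:
    """Return a spacing-insensitive key, so 'v. l.' and 'v.l.' compare equal."""
    return re.sub(r"\s+", "", abbrev)

def spaced_form(abbrev: str) -> str:
    """Put one space after each internal period: 'v.l.' becomes 'v. l.'."""
    # The lookahead skips the final period, which has nothing after it.
    return re.sub(r"\.(?=\S)", ". ", abbreviation_key(abbrev))

if __name__ == "__main__":
    for form in ["v.l.", "v. l.", "a.a.O"]:
        print(repr(form), "->", repr(abbreviation_key(form)), "/", repr(spaced_form(form)))
```

Comparing on the key would let one master entry cover both digitizations, while `spaced_form` gives the form pwg appears to use.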

Once we get a good list of pwg abbreviations (i.e. by modifying the pwgab_prelim file), then we can work on developing a program to accurately mark the abbreviations in the digitization.
There are some difficulties here; I'm thinking of the multiple uses of the 'N.' abbreviation -- sometimes a literary source abbreviation (Nala), sometimes abbreviating 'Nomen' (but there is also 'N. pr.' which maybe should be considered one abbreviation rather than two), and possibly abbreviating 'Note', again within literary source cases. So there is some delicacy required in applying the markup accurately throughout the dictionary.
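One way to treat 'N. pr.' as a single abbreviation is to try longer abbreviations before shorter ones. The sketch below illustrates that idea only; the `ABBREVIATIONS` sample and the function names are hypothetical, and it makes no attempt to tell the different senses of 'N.' apart.

```python
import re

# Hypothetical sample for illustration; not the real pwg abbreviation list.
ABBREVIATIONS = ["N.", "N. pr.", "Nom.", "Nom. ag.", "v. l."]

def build_pattern(abbreviations):
    """Compile an alternation that tries longer abbreviations first."""
    ordered = sorted(abbreviations, key=len, reverse=True)
    alternatives = "|".join(re.escape(a) for a in ordered)
    # (?<!\w) stops 'N.' from matching at the end of a longer word.
    return re.compile(r"(?<!\w)(?:" + alternatives + ")")

def mark_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Wrap each recognized abbreviation in <ab>...</ab>."""
    pattern = build_pattern(abbreviations)
    return pattern.sub(lambda m: "<ab>" + m.group(0) + "</ab>", text)

if __name__ == "__main__":
    print(mark_abbreviations("N. pr. eines Mannes"))
    print(mark_abbreviations("Nom. ag. von X"))
```

A standalone 'N.' is still tagged the same way whether it stands for Nala, Nomen, or Note, so the contextual disambiguation described above would need additional rules or manual review.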

@funderburkjim funderburkjim added the enhancement New website features label Dec 20, 2017
@gasyoun
Member

gasyoun commented Dec 20, 2017

I'm fully yours.

> N.ag.

Nomen agentis = Nom.ag.

> 'N. pr.' which maybe should be considered one abbreviation

Agreed.

> siehe unten?/unter?

siehe unten ("see below"); see https://de.wiktionary.org/wiki/s._u.

@gasyoun
Member

gasyoun commented Apr 18, 2021

Assuming https://github.com/sanskrit-lexicon/PWG/blob/master/pwg_ls1/pwgab_prelim.txt is the latest version (it would be great to have all the latest abbreviation lists collected in one place), the following should be added, @funderburkjim:

  • <lex>Padap.</lex> Padapāṭha
  • <lex>voc.</lex>, in addition to <ab>Voc.</ab>
; the expansions of abbreviation 
; preliminary 10/6/2017. Expansions supplied by Thomas Malten.
; Abbreviation     count     German   English
<ab>Abl.</ab>    1364  Ablativ - ablative (case)
<ab>Absol.</ab>     379 Absolutiv - absolutive (case)
<ab>Acc.</ab>    4719  Accusativ - accusative (case)
<ab>Act.</ab>     618  Activ - active 
<ab>Adv.</ab>    3894  Adverb - adverb
<ab>adv.</ab>      41  adverbial? - adverbial
<ab>Aor.</ab>      75  Aorist - aorist
<ab>Bein.</ab>     271 Beiname - epithet, by name
<ab>Bed.</ab>    1   Bedeutungen(?) - meanings
<ab>Beinn.</ab>     271 Beinamen - epithets, by names
<ab>Caus.</ab>    2418  Causativ - causative
<ab>Comm.</ab>    2219 Commentar? - commentary
<ab>Conj.</ab>       1 Conjunctiv? - subjunctive (mood)
<ab>dass.</ab>      70  dasselbe - the same
<ab>Dat.</ab>     957  Dativ - dative (case)
<ab>Denom.</ab>       1  Denominativ? - denominative
<ab>Desid.</ab>     473  Desiderativ - desiderative
<ab>Du.</ab>     467  Dual - dual 
<ab>Gen.</ab>    1921 Genitiv - genitive (case)
<ab>Inf.</ab>       3  Infinitiv - infinitive
<ab>Infin.</ab>     452  Infinitiv - infinitive
<ab>Instr.</ab>    2163  Instrumental - instrumental (case)
<ab>Intens.</ab>     375 Intensiv - intensive
<ab>Interj.</ab>      97 Interjection - interjection 
<ab>Loc.</ab>    2690  Locativ - locative case
<ab>Med.</ab>    1295  Medizin - medicine
<ab>Metron.</ab>     294 Metronym - metronym
<ab>N.</ab>      70  Namen? - name
<ab>N.ag.</ab>      39   ? - ?
<ab>N.pr.</ab>       1 Nomen proprium? - proper noun
<ab>Nom.</ab>     553 Nomen? - noun
<ab>Nom.abstr.</ab>    2597 Nomen abstractum - abstract noun
<ab>Nom.ag.</ab>     903  Nomen agentis - nomen agentis
<ab>Nomin.</ab>     182  Nominativ - nominative (case)
<ab>Partic.</ab>     621 Participium - participle
<ab>Pass.</ab>       6  Passiv - passive
<ab>Patron.</ab>    1998 Patronym - patronym
<ab>Patronn.</ab>    3 Patronymen - patronyms
<ab>Pl.</ab>    4514  Plural - plural
<ab>Potent.</ab>       1 Potentialis - potential mood
<ab>Präp.</ab>       1  Präposition - preposition
<ab>s.u.</ab>    1171 siehe unten?/unter? - see below/under
<ab>Sg.</ab>     734  Singular - singular
<ab>Subst.</ab>     301 Substantiv - noun
<ab>v.l.</ab>      52  vide licet - that is to say, namely
<ab>Vgl.</ab>     107 Vergleiche - compare
<ab>Voc.</ab>      57 Vocativ - vocative (case)
;<is>gaṇa</is>     603  --
<lex>Adj.</lex>   42801 Adjectiv - adjective
<lex>adj.</lex>    1847  adjectivisch - adjective
<lex>f.</lex>   35219 femininum - feminine
<lex>Indecl.</lex>      81 Indeclinabilum - indeclinable
<lex>m.</lex>   54315 masculinum - masculine
<lex>n.</lex>   29307 neutrum - neuter
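A minimal sketch of reading the listing above into records, assuming every data line has the shape `<tag>abbrev</tag> count German - English` and that lines starting with `;` are comments. `AbbrevEntry` and `parse_line` are hypothetical names, and the layout is inferred only from the lines shown here.

```python
import re
from typing import NamedTuple, Optional

class AbbrevEntry(NamedTuple):
    tag: str       # 'ab' or 'lex'
    abbrev: str    # e.g. 'Abl.'
    count: int     # number of occurrences
    german: str    # German expansion
    english: str   # English gloss

# Assumed layout: <tag>abbrev</tag>, whitespace, count, whitespace, German - English
LINE_RE = re.compile(r"^<(ab|lex)>(.+?)</\1>\s+(\d+)\s+(.*?)\s+-\s+(.*?)\s*$")

def parse_line(line: str) -> Optional[AbbrevEntry]:
    """Parse one listing line; return None for comments and unparseable lines."""
    if line.startswith(";"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    tag, abbrev, count, german, english = match.groups()
    return AbbrevEntry(tag, abbrev, int(count), german, english)

if __name__ == "__main__":
    print(parse_line("<ab>Abl.</ab>    1364  Ablativ - ablative (case)"))
    print(parse_line(";<is>gaṇa</is>     603  --"))
```

Entries whose German or English field is still `?` (such as `N.ag.`) parse normally, which makes it easy to list the expansions that remain to be supplied.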


@funderburkjim
Contributor Author

The current lists of abbreviations are under https://github.com/sanskrit-lexicon/csl-pywork/tree/master/v02/distinctfiles.

For example, the general abbreviations for pwg are in https://github.com/sanskrit-lexicon/csl-pywork/blob/master/v02/distinctfiles/pwg/pywork/pwgab/pwgab_input.txt

And the literary source abbreviations for pwg are in
https://github.com/sanskrit-lexicon/csl-pywork/blob/master/v02/distinctfiles/pwg/pywork/pwgauth/pwgbib_input.txt.

Other dictionary filenames are similar, but not identical.

You could do some aggregation of these files.
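A first step toward such an aggregation might be to locate the general abbreviation file for each dictionary in a local checkout. The sketch below assumes the pwg layout (`<code>/pywork/<code>ab/..._input.txt`) carries over to other dictionaries; since the filenames are only similar, it globs loosely and may miss files that are named differently. `find_abbreviation_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List

def find_abbreviation_files(distinctfiles: Path) -> Dict[str, List[Path]]:
    """Map each dictionary code to the *_input.txt files in its pywork/<code>ab directory."""
    found: Dict[str, List[Path]] = {}
    for dict_dir in sorted(p for p in distinctfiles.iterdir() if p.is_dir()):
        code = dict_dir.name
        # Assumed layout, generalized from distinctfiles/pwg/pywork/pwgab/
        ab_dir = dict_dir / "pywork" / f"{code}ab"
        found[code] = sorted(ab_dir.glob("*_input.txt")) if ab_dir.is_dir() else []
    return found

if __name__ == "__main__":
    # Path to a local clone of csl-pywork; adjust as needed.
    root = Path("csl-pywork/v02/distinctfiles")
    if root.is_dir():
        for code, files in find_abbreviation_files(root).items():
            print(code, [f.name for f in files] or "no abbreviation file found")
```

Under this assumed layout, a dictionary that maps to an empty list has no general abbreviation file.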

@gasyoun
Member

gasyoun commented Apr 23, 2021

> You could do some aggregation of these files.

Sounds like a plan.

So if there is no abbreviation file in https://github.com/sanskrit-lexicon/csl-pywork/tree/master/v02/distinctfiles/yat/pywork there is none, right?

@funderburkjim
Contributor Author

Right!

@gasyoun gasyoun added the Documentation How TXT , XML work label Apr 23, 2021