This code analyses a given Sanskrit text and gives its possible wordform analysis.
e.g. किन्तु
would be analysed as किन्तु(किन्तु-अव्ययम्-क्रियाविशेषणम्)
, शृगालः
would be analysed as शृगालः(शृगाल-प्रथमाविभक्तिः-एकवचनम्-पुंल्लिङ्गम्)
and चिन्तयेत्
would be analysed as चिन्तयेत्(चिन्त्-प्राथमिकः-विधिलिङ्-कर्तरि-एकवचनम्-प्रथमपुरुषः)
.
-
Download ZIP from the repository.
-
Extract the content to your favourite folder.
-
Put the file you want to analyse in
sanskritinput.txt
. -
Open Terminal / cmd.exe.
-
cd to your folder.
-
Type
python sanskritmark.py
and press enter. -
After the execution is over, check
analysedoutput.txt
for analysis of the text.
Currently support for sandhi and samAsas is quite premitive. We are working on its improvement.
If you want to update your database (in case Gerard updates his list), please download data from http://sanskrit.inria.fr/DATA/XML/SL_morph.tar.gz
Put the extracted XML data into the code directory. (SL_parts.xml is bigger than what github allows). If there are any preexisting XML files, overwrite.
-
sanskritmark.py is the curent code under development.
-
SL_adverbs.xml, SL_final.xml, SL_morph.dtd, SL_nouns.xml, SL_parts.xml, SL_preverbs.txt, SL_pronouns.xml and SL_roots.xml are files taken from Gerard's database.
-
suffixentryfile.py is a file for generating data entry template for various parameters in code sanskritmark.py.
-
inriaxmlparser.py was the premitive version of the code. Now abandoned.