Syllabification issues for words with श्च #1

Open
shardulc opened this issue Apr 11, 2018 · 1 comment

Comments

@shardulc

These words don't get syllabified:

> transcribe निश्चित
niɕt͡ɕit̪ə
> transcribe निश्चय
niɕt͡ɕəjə
> transcribe नरश्च गजश्च
nəɹəɕt͡ɕə gəd͡ʑəɕt͡ɕə

But this one does:

> transcribe पुनःश्च
pu.ˈnəhɕ.t͡ɕə

In my opinion, the code should reach this condition but doesn't, which might be a starting point for debugging.

(Note: at first I thought this issue was related to zero-width non-joiners (ZWNJs), but it wasn't. Nevertheless, the presence of ZWNJs in the input makes the script emit a warning. I believe Sanskrit text does not use ZWNJs, but it might be good to explicitly ignore them anyway.)
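
For illustration, explicitly ignoring ZWNJs could be as simple as stripping U+200C from the input before transcription. This is only a sketch in Python, which may not be the project's language; `strip_zwnj` is a hypothetical helper, not something that exists in the script.

```python
# Hypothetical pre-processing step: drop zero-width non-joiners (U+200C)
# from the input before it reaches the transcriber.
ZWNJ = "\u200c"


def strip_zwnj(text: str) -> str:
    """Return `text` with every ZWNJ removed; all other characters are kept."""
    return text.replace(ZWNJ, "")
```

Calling this on each input line before transcription should avoid the warning without changing any visible character.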

@shardulc
Author

Another note: when multiple words are entered and syllabification fails for one of them, the entire input is printed unsyllabified. In my opinion, only the troublesome word should be printed unsyllabified.
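
A rough sketch of the per-word behaviour I have in mind, in Python for illustration only. The names `transcribe_words`, `transcribe` and `syllabify` are hypothetical, and I am assuming the syllabifier can signal failure by returning `None`; the actual script may be structured quite differently.

```python
from typing import Callable, Optional


def transcribe_words(
    text: str,
    transcribe: Callable[[str], str],
    syllabify: Callable[[str], Optional[str]],
) -> str:
    """Transcribe each whitespace-separated word on its own.

    Assumption: `syllabify` returns None when it cannot syllabify a word.
    In that case only that word falls back to its unsyllabified form.
    """
    results = []
    for word in text.split():
        ipa = transcribe(word)          # plain transcription of one word
        syllabified = syllabify(ipa)    # may be None on failure
        results.append(ipa if syllabified is None else syllabified)
    return " ".join(results)
```

With this shape, a failure on one word leaves the syllabification of the other words in the same input untouched.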
