MW display of <shortlong/> markup #353

Open
funderburkjim opened this issue May 9, 2017 · 6 comments

funderburkjim commented May 9, 2017

This issue was raised recently by Geymonat, a frequent contributor of corrections to MW.

Case 23694: 05/04/2017 dict=MW, L= 110540, hw=nizam, user=geymonat
old = caus. -zAmayati LB ind.p. -zAmya 
new = caus. -zAmayati/zamayati [or: -zA/amayati but I think is less clear] LB ind.p. -zAmya/-zamya [
or: -zA/amya but I think is less clear]
comment = The problem is that in the printed edition, when the causative and the absolutive of the 
causative are quoted, we find the double sign of short and long over the a, something that cannot be 
reproduced in HK unless repeating the entries as indicated in the correction.  Because both forms are 
regular I suppose it is important to make the suggested correction, otherwise when the forms with short 
a are found one cannot connect them to the causative meaning "to observe, perceive, hear, learn"

Now there actually is markup (thanks to Peter's foresight) in MW that represents these short-long vowels:

<s>-SA<shortlong/>mayati</s>    [SLP1]

But the display of MW (disp.php) currently ignores this markup, so in this case the word is rendered as
<s>-SAmayati</s>

As Geymonat points out, it is useful information that the short-vowel form <s>-Samayati</s> is also acceptable in this causal inflection.

Thus, it would be a material enhancement for us to come up with a revision to disp.php for MW that renders both short and long forms.

Maybe the easiest approach would be, in effect, to render this as

<s>-Sa(A)mayati</s> 

This replacement of the preceding vowel by S(L) (short vowel(longvowel)) would work in all cases, and would make the short-long forms visible in the displays. Putting the long vowel form in parentheses seems a clearer representation than separating the two vowel forms with a forward slash 'S/L'.

By contrast, I suspect that generating two forms of the whole word-fragment might sometimes give odd results, because of all the varieties of Sanskrit word fragments in MW.
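The S(L) replacement described above can be sketched as follows. This is only an illustration in Python, not the code of disp.php; the names `render_shortlong` and `LONG_TO_SHORT` are hypothetical, and the sketch assumes that `<shortlong/>` always directly follows a long SLP1 vowel, as in the example above.

```python
import re

# SLP1 long vowels and their short counterparts (assumed to be the only
# vowels that can precede <shortlong/>).
LONG_TO_SHORT = {"A": "a", "I": "i", "U": "u", "F": "f", "X": "x"}

# A long vowel immediately followed by the <shortlong/> marker.
SHORTLONG = re.compile(r"([AIUFX])<shortlong/>")


def render_shortlong(slp1: str) -> str:
    """Rewrite 'V<shortlong/>' as 'v(V)': short vowel, then long vowel in parentheses."""
    return SHORTLONG.sub(
        lambda m: f"{LONG_TO_SHORT[m.group(1)]}({m.group(1)})", slp1
    )


print(render_shortlong("-SA<shortlong/>mayati"))  # -Sa(A)mayati
```

Text without the marker passes through unchanged, so the function could be applied to every `<s>` element before transliteration.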


gasyoun commented May 10, 2017

> display of MW (disp.php) currently ignores this markup

If we add our web font, we can show it as in the book, with little coding needed. My Charter font contains the needed signs, and I can add more if needed.

[Image "aiu": vowels with macron and breve, as rendered in the Charter font]

> This replacement of the preceding vowel by S(L) (short vowel(longvowel)) would work in all cases, and would make the short-long forms visible in the displays. Putting the long vowel form in parentheses seems a clearer representation than separating the two vowel forms with a forward slash 'S/L'.

S(L) is better than 'S/L', agreed.

> I suspect that generating two forms of the whole word-fragment might sometimes give odd results

Indeed, it would need manual verification if done that way. What if Geymonat is ready to verify the results?


drdhaval2785 commented May 10, 2017 via email


drdhaval2785 commented May 10, 2017 via email


funderburkjim commented May 11, 2017

आवलि(ली) is better than आवलि(ई)

We have to take into account the multiple output forms for things coded as <s>X</s> in xxx.xml. We want to display appropriately for each output: Devanagari, itrans, hk, slp1, and roman (= IAST).

If we didn't treat IAST output separately, then the IAST output form would be

<s>SA<shortlong/>mayati</s> -> śa(ā)mayati

which is quite readable, though admittedly less elegant than the form @gasyoun shows above.
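A rough Python sketch of that IAST path, assuming the S(L) rewrite is done on the SLP1 text first and the result is then transliterated character by character. The function names and the mapping table are illustrative only and are not taken from disp.php or any existing transcoder.

```python
import re

# SLP1 characters whose IAST form differs from the SLP1 letter itself.
SLP1_TO_IAST = {
    "A": "ā", "I": "ī", "U": "ū", "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "E": "ai", "O": "au", "M": "ṃ", "H": "ḥ",
    "K": "kh", "G": "gh", "N": "ṅ", "C": "ch", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "T": "th", "D": "dh", "P": "ph", "B": "bh", "S": "ś", "z": "ṣ",
}
LONG_TO_SHORT = {"A": "a", "I": "i", "U": "u", "F": "f", "X": "x"}
SHORTLONG = re.compile(r"([AIUFX])<shortlong/>")


def slp1_to_iast(slp1: str) -> str:
    """Transliterate SLP1 to IAST; unknown characters (e.g. parentheses) pass through."""
    return "".join(SLP1_TO_IAST.get(ch, ch) for ch in slp1)


def render_iast(marked: str) -> str:
    """Expand 'V<shortlong/>' to 'v(V)' in SLP1, then transliterate the result."""
    expanded = SHORTLONG.sub(
        lambda m: f"{LONG_TO_SHORT[m.group(1)]}({m.group(1)})", marked
    )
    return slp1_to_iast(expanded)


print(render_iast("SA<shortlong/>mayati"))  # śa(ā)mayati
```

Because SLP1 uses one character per sound, the parentheses survive transliteration untouched, which is why no IAST-specific handling is needed in this sketch.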

I think only Devanagari output would require taking the preceding consonant (or consonant cluster) into account.
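For Devanagari, one way to take the preceding consonant or cluster into account is to repeat it inside the parentheses before transliterating, so that the long vowel has a base to attach to. The sketch below is a simplified Python illustration under that assumption. The tiny transliterator, the function names, and the input `AvalI<shortlong/>` (chosen to reproduce the आवलि(ली) form mentioned above) are all hypothetical and do not reflect the actual display code or the actual MW markup for that word.

```python
import re

CONSONANTS = {
    "k": "क", "K": "ख", "g": "ग", "G": "घ", "N": "ङ",
    "c": "च", "C": "छ", "j": "ज", "J": "झ", "Y": "ञ",
    "w": "ट", "W": "ठ", "q": "ड", "Q": "ढ", "R": "ण",
    "t": "त", "T": "थ", "d": "द", "D": "ध", "n": "न",
    "p": "प", "P": "फ", "b": "ब", "B": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व",
    "S": "श", "z": "ष", "s": "स", "h": "ह",
}
INDEPENDENT = {
    "a": "अ", "A": "आ", "i": "इ", "I": "ई", "u": "उ", "U": "ऊ", "f": "ऋ",
    "F": "ॠ", "x": "ऌ", "X": "ॡ", "e": "ए", "E": "ऐ", "o": "ओ", "O": "औ",
}
MATRA = {
    "a": "", "A": "ा", "i": "ि", "I": "ी", "u": "ु", "U": "ू", "f": "ृ",
    "F": "ॄ", "x": "ॢ", "X": "ॣ", "e": "े", "E": "ै", "o": "ो", "O": "ौ",
}
OTHER = {"M": "ं", "H": "ः"}
VIRAMA = "्"
LONG_TO_SHORT = {"A": "a", "I": "i", "U": "u", "F": "f", "X": "x"}

# Optional consonant cluster, a long vowel, then the marker.
CLUSTER = "[" + re.escape("".join(CONSONANTS)) + "]*"
SHORTLONG = re.compile("(" + CLUSTER + ")([AIUFX])<shortlong/>")


def slp1_to_deva(slp1: str) -> str:
    """Minimal SLP1 -> Devanagari: vowel signs after consonants, virama elsewhere."""
    out, after_consonant = [], False
    for ch in slp1:
        if ch in CONSONANTS:
            if after_consonant:
                out.append(VIRAMA)  # join consonants into a cluster
            out.append(CONSONANTS[ch])
            after_consonant = True
        elif ch in MATRA:
            out.append(MATRA[ch] if after_consonant else INDEPENDENT[ch])
            after_consonant = False
        else:
            if after_consonant:
                out.append(VIRAMA)  # consonant with no following vowel
            out.append(OTHER.get(ch, ch))
            after_consonant = False
    if after_consonant:
        out.append(VIRAMA)
    return "".join(out)


def render_devanagari(marked: str) -> str:
    """Expand 'C V<shortlong/>' to 'C v (C V)' in SLP1, then transliterate."""
    expanded = SHORTLONG.sub(
        lambda m: f"{m.group(1)}{LONG_TO_SHORT[m.group(2)]}({m.group(1)}{m.group(2)})",
        marked,
    )
    return slp1_to_deva(expanded)


print(render_devanagari("AvalI<shortlong/>"))
print(render_devanagari("-SA<shortlong/>mayati"))
```

Repeating the cluster in SLP1 before transliteration keeps the Devanagari step generic: the transliterator never needs to know about `<shortlong/>` at all.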

funderburkjim commented

Question on Charter example.

The way I read the lovely a+macron+breve example shown in the Charter Font comment above is that there is a specific preformed character in this Charter font. Is this so?

And if so, how does one get that preformed character into a text string?

The way to generate a+macron+breve in Unicode that I know of is to use two Unicode characters:

  • [LATIN SMALL LETTER A WITH MACRON (U+0101)][COMBINING BREVE (U+0306)]
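As a small illustration of that two-character sequence, the Python sketch below builds it with the standard `unicodedata` module and checks how it normalizes. The variable names are arbitrary, and how well the stacked marks render depends on the font, which this sketch does not address.

```python
import unicodedata

A_MACRON = "\u0101"         # LATIN SMALL LETTER A WITH MACRON
COMBINING_BREVE = "\u0306"  # COMBINING BREVE
a_macron_breve = A_MACRON + COMBINING_BREVE

# List the code points that make up the string.
for ch in a_macron_breve:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFC cannot merge the pair into a single code point, since Unicode has no
# precomposed a-with-macron-and-breve; NFD splits the macron off as well.
nfc = unicodedata.normalize("NFC", a_macron_breve)
nfd = unicodedata.normalize("NFD", a_macron_breve)
print(len(nfc), len(nfd))
```

The string stays at two code points under NFC, so any text-processing step that assumes one code point per letter would need to allow for the combining mark.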


gasyoun commented May 11, 2017

> a specific preformed character in this Charter font. Is this so?

Yes, in the Unicode Private Use Area. There is no standard Unicode code point for what MW did.

> आवलि(ली) is better than आवलि(ई).

Sure, but MW's base is IAST, so I would want it to remain so. But I understand the idea.
