Links wrongly rendered due to space between the digits #130

Closed
Andhrabharati opened this issue Mar 7, 2022 · 20 comments

@Andhrabharati
Contributor

@funderburkjim

See the entry "akratu", as an example-

akratu

The link goes to RV.x.8,3, which is in no way connected to the word akratu.
The link should go to RV.x,83,5 instead, which can be fixed by removing the space between 8 and 3 in the mw.txt

I noticed ~100 such cases that need correction in the digitisation.

@gasyoun
Member

gasyoun commented Mar 8, 2022

removing the space between 8 and 3 in the mw.txt

Interesting, never noticed before.

@funderburkjim
Contributor

@Andhrabharati If you have a list of these, please provide,
or else provide the search regex(es) you use.
In a first look, I'm only finding 10 or so similar to your example above.

10 matches for "RV. [xvi]+, [0-9]+ [0-9]+, [0-9]" in buffer: mw.txt

@Andhrabharati
Contributor Author

Andhrabharati commented Mar 8, 2022

I looked for "space between digits" [0-9] [0-9], not just for RV link cases.
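
A minimal sketch of this broader search, assuming mw.txt has been read into a list of text lines. The function name `find_digit_space_digit` and the two sample lines are illustrative only and are not taken from the repository.

```python
import re

# A digit, a single space, then another digit, e.g. the "8 3" in "x, 8 3, 5".
DIGIT_SPACE_DIGIT = re.compile(r"[0-9] [0-9]")

def find_digit_space_digit(lines):
    """Return (line_number, line) pairs for lines containing 'digit space digit'."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if DIGIT_SPACE_DIGIT.search(line):
            hits.append((lineno, line))
    return hits

# Illustrative sample lines (hypothetical, modelled on the akratu case).
sample = [
    "<ls>RV. x, 8 3, 5</ls>",  # suspicious: space inside the hymn number
    "<ls>RV. x, 83, 5</ls>",   # well-formed
]
hits = find_digit_space_digit(sample)
```

Because the pattern is not limited to `<ls>` content, it will also flag unrelated lines, so each hit still needs manual review.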

@Andhrabharati
Contributor Author

Andhrabharati commented Mar 9, 2022

Here is the extracted search result-
space between digits.txt

Incidentally there are some <pc> lines as well in this!

@Andhrabharati
Contributor Author

Andhrabharati commented Mar 9, 2022

There are also quite a few cases where a number sits outside the <ls>...</ls> tag but needs to be within it.

I used the regex </ls>[;\.,:] [0-9] to get them.
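
The same regex can be applied in a short script. This is only a sketch under the assumption that the file is processed line by line; `numbers_outside_ls` and the sample lines are hypothetical names and data.

```python
import re

# Closing </ls> tag, one punctuation mark, a space, then a digit.
NUMBER_AFTER_LS = re.compile(r"</ls>[;\.,:] [0-9]")

def numbers_outside_ls(lines):
    """Return (line_number, line) pairs where a number follows a closed <ls> tag."""
    return [
        (lineno, line)
        for lineno, line in enumerate(lines, start=1)
        if NUMBER_AFTER_LS.search(line)
    ]

# Illustrative sample lines (hypothetical).
sample = [
    "<ls>RV. i, 37</ls>, 12",                # number left outside the tag
    "<ls>RV. i, 37, 12</ls>; <ls>AV.</ls>",  # number inside the tag
]
outside = numbers_outside_ls(sample)
```

A hit only shows that a digit follows the tag. Whether the number really belongs to the reference has to be decided case by case.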

@Andhrabharati
Contributor Author

Andhrabharati commented Mar 9, 2022

@funderburkjim

just checked that PWG also has about 90 cases of "space between digits".

@funderburkjim
Contributor

@Andhrabharati Thanks for alerting me of these problems. Will attend to them.

@gasyoun gasyoun added the bug label Mar 10, 2022
@gasyoun
Member

gasyoun commented Jun 16, 2022

citta

Empty source; this is the first time I have seen such an effect.

@gasyoun
Member

gasyoun commented Jun 18, 2022

@funderburkjim can't figure out what's wrong here

rvdd

cxzxzxccc

@Andhrabharati
Contributor Author

He seems to have looked ONLY for the number pattern "Roman, IA, IA".

If the punctuation mark or space is different in that "block", he is not taking it as the 'Rigveda link'.

@funderburkjim
Contributor

The current markup and link logic works for verses only; e.g. <ls>RV. x, 10, 5</ls> for verse 5 of hymn x, 10.

The yama example, by contrast, should be interpreted as a reference to two hymns
(RV. x, 10 and RV. x, 14), with no verse specified.

the supposed author of <ls>RV. x, 10; 14</ls>, of a hymn to <s1 slp1="vizRu">Viṣṇu</s1> and of a law-book;

Perhaps the display program (basicadjust.php) can be extended to generate links for examples like <ls>RV. x, 10</ls>.

The other aspect of this yama example is that two references are implied, and that the semicolon (between 10 and 14) is used, in MW, to separate the two references.

A search for semicolons within RV references in mw.txt yields:

623 matches for "<ls>RV\. [xiv]+,[^<]*;" in buffer: mw.txt

All of these need to be examined and recoded where possible so that multiple links will be available. For instance, changes such as the following are desirable:

OLD
<ls>RV. i, 139, 1; iv, 44, 5.</ls>
NEW
<ls>RV. i, 139, 1</ls>; <ls n="RV">iv, 44, 5.</ls>
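
The OLD/NEW recoding can be sketched as a small script. This is not the procedure actually used for the changes. It is an illustration that only handles the simple case where every part after a semicolon begins with its own mandala numeral; the function name `split_rv_refs` is hypothetical, and the `n="RV"` attribute value follows the NEW line above.

```python
import re

# An <ls> element that starts with "RV." and contains at least one semicolon.
RV_LS = re.compile(r"<ls>RV\. ([^<]*;[^<]*)</ls>")
# A full reference starts with a lowercase Roman numeral (the mandala) and a comma.
STARTS_WITH_MANDALA = re.compile(r"[ivx]+,")

def split_rv_refs(line):
    """Split '<ls>RV. a; b</ls>' into one <ls> element per reference."""
    def repl(match):
        parts = [p.strip() for p in match.group(1).split(";")]
        # Only recode when every later part names its own mandala;
        # anything else (e.g. "x, 10; 14") is left for manual review.
        if not all(STARTS_WITH_MANDALA.match(p) for p in parts[1:]):
            return match.group(0)
        pieces = ["<ls>RV. %s</ls>" % parts[0]]
        pieces += ['<ls n="RV">%s</ls>' % p for p in parts[1:]]
        return "; ".join(pieces)
    return RV_LS.sub(repl, line)

new_line = split_rv_refs("<ls>RV. i, 139, 1; iv, 44, 5.</ls>")
unchanged = split_rv_refs("<ls>RV. x, 10; 14</ls>")
```

Leaving ambiguous references untouched is deliberate: a case like the yama example needs a human decision about whether the second number is a hymn or a verse.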

Work will be carried out with an aim to improve the markup and display in these dimensions.

@gasyoun
Member

gasyoun commented Jun 29, 2022

Work will be carried out with an aim to improve the markup and display in these dimensions.

I give you my thanks.

623

Is a lot and not at the same time. I see them a lot!

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Jul 1, 2022
funderburkjim added a commit that referenced this issue Jul 1, 2022
funderburkjim added a commit that referenced this issue Jul 1, 2022
Apparently, someone made changes to MWS independent of RV MARKUP #130 work.
@funderburkjim
Contributor

The RV ls markup improvements mentioned above have been completed in MW.
The work is done in mwauthorities/ls/20220628-rv.

  • lsextract_mw_0.txt shows counts of all ls markup for MW, before the changes
  • lsextract_mw.txt shows the counts after the changes
    • the main difference is for 'RV.': 16318 instances marked before, and 17754 marked after changes
  • change_1.txt provides the sequence of line changes.
  • lsextract_RV_1.txt shows, by entry, all the RV markup after the changes.
    • a small number (5) are marked as 'ABNORMAL'.

Most of the changes are as predicted in the above comment. But several were
typos with errors in spacing or punctuation.
And a small number of errors involved homonym markup. For instance:

; <L>100473<pc>512,3<k1>Darmakft
; <ls>RV. viii.87, 1.2.</ls>
; <ls>RV. viii.87, 1.2.</ls>    <<< That 2.  should be the homonym number of 'next' entry
338230 old <hom>3.</hom> <s>Da/rma</s> ¦ in <ab>comp.</ab> for <s>°man</s> <ab>q.v.</ab> 2.   <<< DROP the 2.
338230 new <hom>3.</hom> <s>Da/rma</s> ¦ in <ab>comp.</ab> for <s>°man</s> <ab>q.v.</ab>
;  CHANGE the <h> value in next entry
338232 old <L>100473<pc>512,3<k1>Darmakft<k2>Da/rma—kft<h>b<e>3
338232 new <L>100473<pc>512,3<k1>Darmakft<k2>Da/rma—kft<h>2<e>3
; and similarly remark as hom 2.
338233 old <s>Da/rma—kft</s> <hom>b</hom> ¦ <lex>m.</lex> maintainer of order 
(<s1 slp1="indra">Indra</s1>), <ls>RV. viii.87, 1.2.</ls><info lex="m"/>  <<<< ALSO Drop this 2.
; 
338233 new <hom>2.</hom> <s>Da/rma—kft</s> ¦ <lex>m.</lex> maintainer of order (<s1 slp1="indra">Indra</s1>), <ls>RV. viii, 87, 1.</ls><info lex="m"/>
; and do similar change for Darmavat: b change to 2
338235 old <L>100474<pc>512,3<k1>Darmavat<k2>Da/rma—vat<h>b<e>3
338235 new <L>100474<pc>512,3<k1>Darmavat<k2>Da/rma—vat<h>2<e>3
; change hom markup
338236 old <s>Da/rma—vat</s> <hom>b</hom> ¦ (<s>Da/rma</s>) <lex>mfn.</lex> accompanied by <s1 slp1="Darman">Dharman</s1> or the law (<s1 slp1="aSvin">Aśvin</s1>s), <ls>viii, 35, 13.</ls><info lex="m:f:n"/>
338236 new <hom>2.</hom> <s>Da/rma—vat</s> (<s>Da/rma</s>) 
<lex>mfn.</lex> accompanied by <s1 slp1="Darman">Dharman</s1> or the law (<s1 slp1="aSvin">Aśvin</s1>s), <ls n="RV.">viii, 35, 13.</ls><info lex="m:f:n"/>
;; other homonyms of Darmakft and Darmavat
336540 old <L>99961<pc>510,3<k1>Darmakft<k2>Da/rma—kft<h>a<e>3
336540 new <L>99961<pc>510,3<k1>Darmakft<k2>Da/rma—kft<h>1<e>3
;
336541 old <s>Da/rma—kft</s> <hom>a</hom> ¦ <lex>mfn.</lex> 
(2. See under 3. <s>Darma</s>) doing one's duty, virtuous, <ls>MBh.</ls><info lex="m:f:n"/>
336541 new <hom>1.</hom> <s>Da/rma—kft</s> ¦ <lex>mfn.</lex> 
(<hom>2.</hom> See under <hom>3.</hom> <s>Darma</s>) doing one's duty, virtuous, 
<ls>MBh.</ls><info lex="m:f:n"/>
;
337425 old <L>100234<pc>511,3<k1>Darmavat<k2>Da/rma—vat<h>a<e>3
337425 new <L>100234<pc>511,3<k1>Darmavat<k2>Da/rma—vat<h>1<e>3
;
337426 old <s>Da/rma—vat</s> <hom>a</hom> ¦ <lex>mfn.</lex> (2. See under 3. 
<s>Darma</s>) virtuous, pious, just, <ls>L.</ls><info lex="m:f:n"/>
337426 new <hom>1.</hom> <s>Da/rma—vat</s> ¦ <lex>mfn.</lex> (<hom>2.</hom> 
See under <hom>3.</hom> <s>Darma</s>) virtuous, pious, just, <ls>L.</ls><info lex="m:f:n"/>

funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Jul 1, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Jul 1, 2022
@funderburkjim
Contributor

links for 2-parameter references

There are many mentions of hymns, with no verse specified, such as
<ls>RV. viii, 13</ls> in

356994 new <s>nA/rada</s> ¦ <lex>m.</lex> or <s>nArada/</s> <ab>N.</ab> of a 
<s1 slp1="fzi">Ṛṣi</s1> (a <s1 slp1="kARva">Kāṇva</s1> or 
<s1 slp1="kASyapa">Kāśyapa</s1>, 
author of <ls>RV. viii, 13</ls>; <ls n="RV.">ix, 104</ls>; <ls n="RV. ix,">105</ls>; 
<ls>Anukr.</ls>; 

The basicadjust.php component of the displays is now adjusted so that this 2-parameter reference generates a link to the first verse of the hymn.
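
The idea can be sketched in Python as follows. This is not the code in basicadjust.php (which is PHP) but an illustration of the rule "no verse given, so link to verse 1". The names `roman_to_int`, `rv_target` and `rv_target_from_ls` are hypothetical, and treating the `n` attribute as a prefix to be joined with the element text is an assumption drawn from the markup examples above.

```python
import re

ROMAN = {"i": 1, "v": 5, "x": 10}

def roman_to_int(numeral):
    """Convert a lowercase Roman numeral made of i, v, x to an integer."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        total += -value if ROMAN.get(nxt, 0) > value else value
    return total

# Mandala and hymn are required; the verse is optional.
RV_REF = re.compile(r"RV\. ([ivx]+), ([0-9]+)(?:, ([0-9]+))?")

def rv_target(ref):
    """Return (mandala, hymn, verse), defaulting to verse 1 when none is given."""
    m = RV_REF.match(ref)
    if m is None:
        return None
    verse = int(m.group(3)) if m.group(3) else 1
    return (roman_to_int(m.group(1)), int(m.group(2)), verse)

def rv_target_from_ls(n_attr, text):
    """Join an optional n attribute (assumed to be a prefix) with the element text."""
    return rv_target((n_attr + " " + text) if n_attr else text)

first = rv_target_from_ls(None, "RV. viii, 13")
second = rv_target_from_ls("RV.", "ix, 104")
third = rv_target_from_ls("RV. ix,", "105")
```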

@funderburkjim
Contributor

funderburkjim commented Jul 1, 2022

subtle errors still unresolved

Here is an example of a likely markup error, just noticed by accident.
Under pragATa

The markup is

<ls>RV. viii, 1, 2</ls>; 
<ls n="RV. viii, 1,">10</ls>; 
<ls n="RV. viii, 1,">48</ls>; 
<ls n="RV. viii, 1,">51</ls>-
<ls n="RV. viii, 1,">54</ls>

The markup looks consistent with the printed text, but it can't be right,
since there is no verse 48 (or 51 or 54) in hymn 'viii, 1'.
Maybe the markup should be hymns 1, 2, 10, 48, 51, and 54 of mandala viii?

<ls>RV. viii, 1</ls>, <ls n="RV. viii,">2</ls>; 
<ls n="RV. viii,">10</ls>; 
<ls n="RV. viii,">48</ls>; 
<ls n="RV. viii,">51</ls>-
<ls n="RV. viii,">54</ls>

No doubt there are other similar problematic markups to identify and alter.
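
One possible way to find such cases is to compare each verse number with the known length of its hymn. The sketch below assumes references have already been parsed into (mandala, hymn, verse) tuples and that a table of verse counts is available. The function name `out_of_range` is hypothetical, and the count of 34 verses for hymn viii, 1 is an assumed value used for illustration that should be verified against an RV text.

```python
def out_of_range(refs, verse_counts):
    """Return the references whose verse number exceeds the hymn's verse count."""
    bad = []
    for mandala, hymn, verse in refs:
        limit = verse_counts.get((mandala, hymn))
        # Hymns missing from the table are skipped rather than flagged.
        if limit is not None and verse > limit:
            bad.append((mandala, hymn, verse))
    return bad

# ASSUMPTION: 34 verses for hymn viii, 1; verify against an RV text before use.
verse_counts = {(8, 1): 34}
# The pragATa references as currently marked up (all read as verses of viii, 1).
refs = [(8, 1, 2), (8, 1, 10), (8, 1, 48), (8, 1, 51), (8, 1, 54)]
suspicious = out_of_range(refs, verse_counts)
```

A flagged reference is only a candidate. As with pragATa, the fix may be to reinterpret the numbers as hymns instead of verses.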

@funderburkjim
Contributor

Similar review of other links in MW

Other ls abbreviations in MW with link targets (such as AV., P.) should be reviewed in a manner similar to the above review of RV links.

@funderburkjim
Contributor

spacing issues

I think the spacing issues have now been handled.
The specific cases:

  • RV. i, 37, i 2 was corrected to RV. i, 37, 12 (under 'cyu' mentioned here).
  • the error under akratu mentioned at the top of this issue.
  • I did not individually check the 'space.between.digits.txt' list above, but think the
    ls examples therein have been handled.
    • @Andhrabharati mentioned that the file also contains other kinds of errors. Perhaps he would select those remaining items in a separate file for further examination by me.

@gasyoun
Member

gasyoun commented Jul 1, 2022

The basicadjust.php component of the displays is now adjusted so that this 2-parameter reference generates a link to the first verse of the hymn.

Hurray! A badly needed one around all the dictionaries and targets available.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 1, 2022

Quite a few of these RV links to Marcis's version (which is presently being used) would not be helpful, as those links do not give any clue about the meaning/intent in the MW.

All such should be linked to some other source, as I had proposed elsewhere recently.

@gasyoun
Member

gasyoun commented Jul 2, 2022

All such should be linked to some other source, as I had proposed elsewhere recently.

Did not get why.

would not be helpful, as those links do not give any clue about the meaning/intent in the MW.

What do you mean?
