Printed Book Categories Lost in OCR? #1
Comments
They are, @gasyoun! Here are the lines from mci.txt
But they remain only as section (?) names.
So they are and are not at the same time, @Andhrabharati.
The entries in section 1.1 go from Then there is a section Then there is Section 1.2, with entries Probably etc., etc. I guess the suggestion is that for each entry within a section, we add some markup indicating One use of such a tag could be to add some text to the display of each entry.
Exactly.
As in the original book - in bold and above all?
The second one looks good enough, but adding the section number 1. probably makes it "complete". A similar approach could also be applied to "the numbered appendices" in a few works (like IEG, AP90, AP etc.).
We could also use a similar approach for 'VN' (Corrections) in various dictionaries.
To solve sanskrit-lexicon/MWS#12 I need help from abroad.
Are the MCI categories from the book kept in the OCR metadata as well?
akarkara 001-a should belong to "1.1. Names of Serpents"
agastyasya 507-a+ 36 should belong to "1.5A Names of Villages"
2387 tells us nothing, but does 788-a+ 39 tell us enough? Does A stand for vol. 1? What does +39 stand for?
If the data is lost, I can write out the L numbers that belong to each of the MCI printed book categories, if we can fit the data in the metadata. Otherwise, how can I get a list of all the snakes or of all the sages?
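To make the idea of writing out L numbers per category concrete, here is a minimal sketch in Python. It assumes each printed section covers a contiguous run of L numbers, so a category can be recorded by its first L number only. The starting L numbers, the sample L values, and the names `SECTION_STARTS`, `section_for` and `entries_in` are all hypothetical placeholders, not values taken from mci.txt; only the two section titles come from the examples above.

```python
from bisect import bisect_right

# Hypothetical table: (first L number of the section, section title).
# The L numbers are placeholders; real values would have to be read off
# the printed book or mci.txt. The list must be sorted by L number.
SECTION_STARTS = [
    (1, "1.1. Names of Serpents"),
    (2000, "1.5A Names of Villages"),
]


def section_for(l_number):
    """Return the section title whose L range contains l_number, or None."""
    starts = [first for first, _ in SECTION_STARTS]
    index = bisect_right(starts, l_number) - 1
    if index < 0:
        return None  # l_number precedes the first recorded section
    return SECTION_STARTS[index][1]


def entries_in(section, entries):
    """List the headwords of all (L number, headword) pairs in a section."""
    return [key for l_number, key in entries if section_for(l_number) == section]


# Placeholder (L number, headword) pairs, for illustration only.
SAMPLE_ENTRIES = [(1, "akarkara"), (2100, "agastyasya")]

if __name__ == "__main__":
    print(entries_in("1.1. Names of Serpents", SAMPLE_ENTRIES))
```

With a table like this in the metadata, a list of all the snakes is just a filter over the entries, and the same lookup could supply the section title to show above each entry.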