Hyphens and other non-alphabetic characters as legit parts of words treated incorrectly as word boundaries (Bugzilla Bug 2641) #41

Open
albbas opened this issue Nov 16, 2019 · 1 comment
Labels
bug Something isn't working

albbas commented Nov 16, 2019

This issue was created automatically with bugzilla2github

Bugzilla Bug 2641

Date: 2019-11-16T02:44:29+01:00
From: @arppe@ualberta.ca
To: Børre Gaup <<borre.gaup>>

Last updated: 2019-11-16T02:44:29+01:00

albbas commented Nov 16, 2019

Comment 13816

Date: 2019-11-16 02:44:29 +0100
From: @arppe@ualberta.ca

The last time I checked MS Word on Windows (on Sjur's Mac running Windows), the demo crk spell checker (or Word itself) treated hyphens as word boundaries, when in fact hyphens are an integral part of well-written crk words in SRO (Standard Roman Orthography).

Examples of words that should be recognized:

a. Non-hyphenated:
êkota
êwako
ispîhk
kistapinânihk
mistahi
mâna
namôya
nitiskonikanihk
ohci
ohpimê

b. Hyphenated:
kâ-kî-awâsisîwiyân
nikî-nitawi-kiskinwahamâkosin
kâ-kî-nitawi-kiskinwahamâkosiyân
ê-kî-itohtahikawiyân
nikî-kitimâkihikawinân
niwî-âtotên
niwî-âcimâwak
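
To make the reported behavior concrete, here is a minimal Python sketch of the difference between a tokenizer that treats the hyphen as a word boundary and one that allows it inside a word. It is only an illustration: the names `NAIVE_WORD`, `HYPHEN_WORD` and `tokenize` are hypothetical, and this is not how MS Word or the Divvun speller actually segments text.

```python
import re
import unicodedata

# A run of letters: word characters minus digits and underscore.
LETTERS = r"[^\W\d_]+"

# Naive pattern: '-' is not a letter, so it acts as a word boundary.
NAIVE_WORD = re.compile(LETTERS)

# Hyphen-aware pattern: letter runs joined by single internal hyphens.
HYPHEN_WORD = re.compile(LETTERS + r"(?:-" + LETTERS + r")*")


def tokenize(text, pattern):
    # Normalize to NFC so that letters such as 'î' are single code points.
    text = unicodedata.normalize("NFC", text)
    return pattern.findall(text)


sample = "niwî-âtotên mistahi"
print(tokenize(sample, NAIVE_WORD))   # ['niwî', 'âtotên', 'mistahi']
print(tokenize(sample, HYPHEN_WORD))  # ['niwî-âtotên', 'mistahi']
```

With the naive pattern the speller would be handed `niwî` and `âtotên` as two separate words, which matches the incorrect behavior described above.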

While this applies to crk, there are similar issues in e.g. Mohawk, where the colon ':' should be allowed as an integral part of a word (denoting long phonemes).

Sjur tells me that this might have been resolved generally in the Divvun speller engine (?) by using the character set of the speller FST as the basis for defining what a word is. Nevertheless, I'm reporting this as an explicit issue so that the previous incorrect behavior is on record and there are example cases for checking that it has been properly resolved (now and later on).
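
The idea of letting the speller's character set define what a word is can be sketched roughly as follows. This is an assumption-laden illustration, not the Divvun engine's code: the alphabet here is collected from a small word list standing in for the FST's symbol set, and `alphabet_of` and `split_words` are hypothetical helper names.

```python
def alphabet_of(words):
    """Collect every character that occurs in the given word forms."""
    return set("".join(words))


def split_words(text, alphabet):
    """Split text into maximal runs of characters found in the alphabet."""
    tokens, current = [], []
    for ch in text:
        if ch in alphabet:
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# Stand-in for the speller's character set: forms taken from this issue.
lexicon = ["ê-kî-itohtahikawiyân", "niwî-âcimâwak", "ohci"]
alphabet = alphabet_of(lexicon)

print(split_words("niwî-âcimâwak ohci.", alphabet))  # ['niwî-âcimâwak', 'ohci']
```

Because the hyphen occurs in the word forms, it ends up in the alphabet and no longer splits words; a language whose forms contain ':' would get the colon the same way. Note that this simple version also keeps a stray leading or trailing hyphen attached to a token, which a real implementation would probably need to handle.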

albbas transferred this issue from giellalt/bugzilla-dummy Sep 3, 2024