We need to support Unicode character sets. Unfortunately, JavaScript doesn't make it easy to work with Unicode character sets, so we should consider using http://xregexp.com/.
After some initial testing, it appears the XRegExp library fits the job, but there are certain decisions we need to make before properly integrating it.
The spellchecker plugin currently relies on sending a string of words to the back-end service(s). This means we need to strip punctuation from the text before sending it. Determining what counts as punctuation is the difficult part.
The XRegExp library gives us \p{P} (the Unicode punctuation category), but we don't want to strip punctuation that forms part of a word, e.g. "Here's". I can't even begin to decide how to handle this for languages other than English.
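To make the problem concrete, here is a minimal sketch of one possible rule: strip a punctuation character unless it sits directly between two letters. It is only an illustration, not the plugin's code. The helper name `stripEdgePunctuation` is hypothetical, and the sketch assumes a JavaScript engine with native Unicode property escapes and lookbehind (ES2018) rather than XRegExp; the same \p{P} and \p{L} categories are what XRegExp's Unicode support exposes.

```javascript
// Hypothetical helper, for illustration only.
// Assumption: the engine supports \p{...} with the "u" flag and lookbehind.
function stripEdgePunctuation(text) {
  // A punctuation character is removed when it is NOT preceded by a
  // letter/combining mark, or NOT followed by one. Punctuation with a
  // letter on both sides (as in "Here's") is left alone.
  const edgePunctuation = /(?<![\p{L}\p{M}])\p{P}|\p{P}(?![\p{L}\p{M}])/gu;

  return text
    .replace(edgePunctuation, " ") // replace with a space so words stay separated
    .replace(/\s+/g, " ")          // collapse runs of whitespace
    .trim();
}

console.log(stripEdgePunctuation("Here's a test, isn't it?"));
// prints "Here's a test isn't it"
```

This rule is known to be imperfect: it also drops a trailing possessive apostrophe (as in "dogs'"), and it says nothing about scripts that do not separate words with spaces, so it should be read as a starting point for discussion rather than a generic solution.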
I suggest we look at how existing spellcheckers handle this before coming to a decision. The solution has to be generic for all languages.
I'm making changes related to Unicode support on the unicode-support branch.
Any advice or suggestions would be greatly appreciated.