interested in UTR#30 support? #4

Open
jrochkind opened this issue Jan 7, 2013 · 0 comments


There used to be a UTR#30, "Character Foldings", covering things like 'ASCII folding'. Although it has since been withdrawn by Unicode, many people find it useful anyway. For instance, Solr/Lucene still supports it with their ICUFoldingFilterFactory.

Here are what I think are the relevant Unicode ".txt" source files, with the mappings needed to implement UTR#30, from the Lucene source: https://github.com/apache/lucene-solr/tree/trunk/lucene/analysis/icu/src/data/utr30

I note that unicode_utils uses the same kind of Unicode .txt mapping source files as the definitions for the parts of Unicode it does implement.

So that would probably make it pretty feasible to do UTR#30 too. Even though it's no longer part of Unicode, some people (myself included) still find it useful and need it.

Are you interested in unicode_utils supporting UTR#30? I could try to create a pull request, although it would take me a while to figure out how to fit the .txt mapping definition files into unicode_utils properly. If you were interested, it's possible you could do it in only a few minutes.
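
To give a rough idea of what handling the .txt mapping files might involve, here is a minimal Ruby sketch. It assumes the files use a gennorm2-style `source>target` line format (hex code points, optional `first..last` source ranges, `#` comments). That format is an assumption on my part, the sample data is made up for illustration rather than copied from the Lucene files, and `parse_folding_mappings` and `fold` are hypothetical names, not part of unicode_utils or Lucene.

```ruby
# Hypothetical sample data in the ASSUMED "source>target" format.
# These lines are illustrative only; they are not taken from the utr30 files.
SAMPLE_MAPPINGS = <<~TXT
  # comment lines and blank lines are skipped

  00E9>0065
  00DF>0073 0073
  2010..2011>002D
  0301>
TXT

# Parse mapping text into a Hash of
#   { source code point (Integer) => replacement (String) }.
# Lines without ">" are ignored in this sketch.
def parse_folding_mappings(text)
  table = {}
  text.each_line do |line|
    line = line.sub(/#.*/, "").strip        # drop comments and whitespace
    next if line.empty?

    source, target = line.split(">", 2)
    next if target.nil?                      # not a "source>target" line

    # The source is either one code point or a "first..last" range.
    first, last = source.strip.split("..").map { |hex| hex.to_i(16) }
    last ||= first

    # The target is zero or more code points; an empty target removes the character.
    replacement = target.split.map { |hex| hex.to_i(16) }.pack("U*")

    (first..last).each { |cp| table[cp] = replacement }
  end
  table
end

# Apply the table character by character; unmapped characters pass through.
def fold(string, table)
  string.each_char.map { |ch| table.fetch(ch.ord, ch) }.join
end

table = parse_folding_mappings(SAMPLE_MAPPINGS)
puts fold("caf\u00E9", table)   # => "cafe"
```

This only does a single pass of per-character substitution. A real implementation would probably also need to combine the folding tables with Unicode normalization, and would need to fit whatever data-loading scheme unicode_utils already uses for its other mapping files.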
