Create Unicode regex util #5019
Comments
@mlewand / @Reinmar any insights on this could help me close https://github.com/ckeditor/ckeditor5-mention/issues/44 and https://github.com/ckeditor/ckeditor5-mention/issues/44 and probably some part of #1151 for @oleq.
PS: The full list of Unicode characters by category is here: https://www.unicode.org/Public/12.0.0/ucd/extracted/DerivedGeneralCategory.txt.
ps.: There's a
So it does not provide the Unicode category sequences like Ref:
There has been no activity on this issue for the past year. We've marked it as stale and will close it in 30 days. We understand it may be relevant, so if you're interested in the solution, leave a comment or reaction under this issue.
We've closed your issue due to inactivity over the last year. We understand that the issue may still be relevant. If so, feel free to open a new one (and link this issue to it).
Whenever we start digging into RTL support or better support for other languages, the problem with some regexes surfaces again and again.
At the moment we need better regexes to:
All the groups are defined in the Unicode standard.
I'm not sure if we need all categories right now (thus it might be helpful).
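For reference, engines that implement ES2018 Unicode property escapes can reference these categories natively (for example `\p{L}` for letters) when the `u` flag is set. Support cannot be assumed in every target browser, so the sketch below feature-detects it first. The names `isUnicodePropertySupported` and `createLetterRegExp` are hypothetical, and the ASCII fallback is only a placeholder assumption.

```javascript
// Hypothetical helper: checks whether the engine understands \p{...} escapes.
// The RegExp constructor is used so that an unsupported pattern throws at
// runtime (and can be caught) instead of breaking parsing of the whole file.
function isUnicodePropertySupported() {
	try {
		return new RegExp( '\\p{L}', 'u' ).test( 'ć' );
	} catch ( error ) {
		return false;
	}
}

// Hypothetical helper: returns a "single letter" regex, using the native
// Unicode category when available and a plain ASCII placeholder otherwise.
function createLetterRegExp() {
	return isUnicodePropertySupported() ?
		new RegExp( '\\p{L}', 'u' ) :
		/[A-Za-z]/;
}

const letterRegExp = createLetterRegExp();

console.log( letterRegExp.test( 'a' ) ); // true
console.log( letterRegExp.test( '7' ) ); // false
```

Where the native escapes are missing, the fallback has to come from somewhere else, which is what the library and custom util options discussed in this issue are about.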
Now we know that there's already a library called xregexp that adds support for groups (and other features not present in the JS RegExp engine). It already defines those categories. The lib looks useful to me, but it has all the typical downsides of external libraries:
The XRegExp library compiles to native JS RegExp, so only a little overhead is added when creating a regexp.
If we do not use the library, we need to create a util that provides a set of characters that meets our needs and can be used with the RegExp engine:
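A minimal sketch of what such a util could look like, assuming the character sets are stored as character-class fragments that get joined into one native RegExp. The names `characterGroups` and `createGroupRegExp` are hypothetical, and the ranges are small, hand-picked subsets of the Unicode Letter and Punctuation categories, not the complete definitions.

```javascript
// Hypothetical util: names and ranges are illustrative assumptions only.
// Each entry is a fragment that is valid inside a RegExp character class.
const characterGroups = {
	// Incomplete subset of the Letter (L) category:
	// basic Latin, Latin-1 and Latin Extended letters, Greek, Cyrillic, Hebrew, Arabic.
	letter: 'A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02AF' +
		'\\u0391-\\u03A1\\u03A3-\\u03C9\\u0410-\\u044F\\u05D0-\\u05EA\\u0620-\\u064A',
	// Incomplete subset of the Punctuation (P) category:
	// ASCII punctuation, a few Latin-1 marks and General Punctuation U+2010..U+2027.
	punctuation: '!-#%-*,-\\/:;?@\\[-\\]_{}' +
		'\\u00A1\\u00A7\\u00AB\\u00B6\\u00B7\\u00BB\\u00BF\\u2010-\\u2027'
};

// Builds a native RegExp matching one character from the requested groups
// (or one character outside of them when `negate` is set).
function createGroupRegExp( groupNames, { negate = false, flags = '' } = {} ) {
	const ranges = groupNames.map( name => {
		if ( !Object.prototype.hasOwnProperty.call( characterGroups, name ) ) {
			throw new Error( `Unknown character group: ${ name }` );
		}

		return characterGroups[ name ];
	} ).join( '' );

	return new RegExp( `[${ negate ? '^' : '' }${ ranges }]`, flags );
}

const letterOnly = createGroupRegExp( [ 'letter' ] );
const notALetter = createGroupRegExp( [ 'letter' ], { negate: true } );

console.log( letterOnly.test( 'ż' ) ); // true
console.log( letterOnly.test( '!' ) ); // false
console.log( notALetter.test( '7' ) ); // true
```

Because the result is a plain RegExp, the same fragments can also be embedded in larger patterns. The trade-off is that the ranges have to be maintained by hand (or generated from the Unicode data files) instead of coming from a library.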