Step over surrogate pairs on zero-lenth matches (fixes #100134) #100482
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR fixes #100134
Problem lies in fact that regexp with
u
flag will changelastIndex
to the start of surrogate pair if started after high pair:Which ended attempts to step forward for zero-length matches in endless loop.
This PR takes into account surrogate pairs and propagates
lastIndex
accordingly.Endless loop during find matches is fixed in this PR.
Not in this PR:
In some cases (for files with a lot of matches) it's impossible to use Find next button to get to the next match after surrogate pair. Can give a try in this PR if needed, but would prefer to leave it out of scope of this PR
(new fallback logic not triggered in that case, but should use similar strategy - remember last position and advance for zero-length matches, but this should survive
reset
)What do you think?
Inconsistency of handling matches of for surrogate pairs between JS RegExp's with
u
flag and ripgrepI can create issue with more details but simple example will be for file with 1💻 replace using RegExp
1.
- Find in files breaks surrogate, while replace in editor replaces whole emoji