Fix regexp mutation p{Latin} #1234

mbj · 2021-05-03T13:49:19Z

Fixes the `\p{Latin}` regexp construct that used to crash mutant, as reported in #1231.

Also refactors the Mutant::Registry class to allow use-case-specific default behavior, making it easier to address incomplete mappings in the future.

* Before, the default was to return the generic node mutator on lookup misses.
* This is fine for the main mutation registry, but not for the external regexp AST transformations.
* This change allows the default to be instance specific, allowing the transform registry to fail on unknown nodes rather than return the generic mutator that hides the real issue.
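A minimal sketch of the idea of an instance-specific default, for illustration only. The `Registry` class below, its constructor signature, and the node type names are assumptions made for this example and are not taken from mutant's actual `Mutant::Registry` implementation.

```ruby
# Hypothetical sketch: a registry whose behavior on lookup misses is
# chosen per instance. Names here are illustrative, not mutant's API.
class Registry
  RegistryError = Class.new(StandardError)

  # default: a callable invoked with the missing type on lookup misses
  def initialize(default)
    @default = default
    @table   = {}
  end

  # Register a handler for a type, rejecting duplicate registrations
  def register(type, handler)
    raise RegistryError, "Duplicate type registration: #{type.inspect}" if @table.key?(type)

    @table[type] = handler
    self
  end

  # Return the registered handler, or defer to the instance default
  def lookup(type)
    @table.fetch(type) { @default.call(type) }
  end
end

GENERIC = :generic_mutator

# Main mutation registry: unknown node types fall back to a generic handler
mutators = Registry.new(->(_type) { GENERIC })
mutators.register(:send, :send_mutator)

# Transform registry: unknown node types fail loudly instead of being masked
transforms = Registry.new(
  ->(type) { raise Registry::RegistryError, "No entry for: #{type.inspect}" }
)
transforms.register(:regexp_property, :property_transform)
```

With this shape, a missing mapping in the transform registry surfaces as an explicit error at lookup time, while the main registry keeps its permissive fallback.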

- This is up for early review because I'm not sure about the dynamic creation of the table of unicode properties. I tried just creating a list of them, but it was so slow for my editor to process that I couldn't even format the giant lookup table. I suspect that if we want to "bake" these to avoid however long it takes to compute the table, and maybe avoid any unexpected drift, it might make sense to dump to YAML or something like that. I'm not sure of the best approach.
- I'm also guessing there's a better option than just dumping all the regexp node types in the other list of supported regexp nodes.
- We probably should do this for other regex types. We might be missing some of the posix classes, for instance (I have not checked yet).
- Prevents crashes when having an unsupported property type in source.
- Related to #1234 (which was a very partial fix)
- Note that this turns our `\p{Latin}` formatting into `\p{latin}`. We could fix this with some very simple inflection, but I wanted to do the simplest approach first to demonstrate the problem, since this seems to be semantically equivalent. The ruby docs use the uppercase form. I have a text file from the upstream regex toolkit that we could use to confirm inflection rules if we want to.
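The claim that `\p{latin}` and `\p{Latin}` are semantically equivalent can be spot-checked with a short script. This is a sketch that compares the two spellings on a handful of sample characters chosen for illustration. It is not an exhaustive proof, and it assumes Ruby's regexp engine treats property names case-insensitively.

```ruby
# Compare the capitalized and lowercased spellings of the Latin script
# property on a few sample characters (samples are arbitrary picks).
upper = /\p{Latin}/
lower = /\p{latin}/

samples = ["a", "Z", "\u00E9", "\u044F", "1", " "]

# Each row: [character, matches \p{Latin}?, matches \p{latin}?]
results = samples.map { |char| [char, upper.match?(char), lower.match?(char)] }

# True when both spellings agree on every sample
agree = results.all? { |_char, up, low| up == low }

results.each { |char, up, low| puts "#{char.inspect}: Latin=#{up} latin=#{low}" }
puts "spellings agree on all samples: #{agree}"
```

A check like this only shows agreement on the sampled characters. Confirming the inflection rules in general would still need the upstream property list mentioned above.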

mbj force-pushed the fix/regexp-mutation branch from 9769ff0 to ae251bb Compare May 3, 2021 13:49

Fix regexp latin property mapping

e3edac2

mbj force-pushed the fix/regexp-mutation branch from ae251bb to e3edac2 Compare May 3, 2021 13:50

mbj force-pushed the fix/regexp-mutation branch from b1edf4e to 9a1e505 Compare May 3, 2021 14:23

Change version to 0.10.31

58f5455

mbj force-pushed the fix/regexp-mutation branch from 4c4f444 to 58f5455 Compare May 3, 2021 14:35

mbj changed the title ~~Fix regexp mutation~~ Fix regexp mutation p{Latin} May 3, 2021

mbj merged commit db7167f into master May 3, 2021

mbj deleted the fix/regexp-mutation branch May 3, 2021 14:48

dgollahon mentioned this pull request Nov 7, 2021

[WIP] Fix unicode property support #1278

Closed
