Reintroduce Regexp mutations #1166

dgollahon · 2020-12-20T02:54:33Z

This reverts commit 21d3fef.

This was not a clean revert. Note that:

- The version of regexp_parser was 1.3.0; it is now 1.8.2 to accommodate our current rubocop version and because some relevant bugfixes landed between 1.3.x and 1.8.x. We should eventually move to 2.0, but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated, so we may have to open an issue.
- Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage.
- Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: Remove yard docs for private methods #1021
- I had to change the example case for where we are more permissive than regexp_parser because regexp_parser has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: regexp_parser rejects /\xA/ but MRI accepts it ammar/regexp_parser#75. This can be removed once we reach regexp_parser >= 2.0.1.
- Added logic to skip invalid group options until we are on regexp_parser >= 2.0.1. See: Multi-byte named capture groups do not parse ammar/regexp_parser#76
- Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33
- I have marked several dispatch methods as private.
- I have also removed the old YARD doc comments on private methods at @mbj's request.
- Some other minor conflicts and small spec assertion changes were resolved as well.
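
The MRI half of the single hex escape case above can be checked directly. This is a minimal sketch that only exercises Ruby's built-in `Regexp`. It does not call regexp_parser, and the helper name `mri_accepts?` is made up for illustration rather than taken from mutant.

```ruby
# Returns true when MRI itself compiles the pattern source.
# `mri_accepts?` is a hypothetical helper, not part of mutant or regexp_parser.
def mri_accepts?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# The case from the description: a hex escape with a single digit.
SINGLE_HEX_ESCAPE = '\xA'

puts mri_accepts?(SINGLE_HEX_ESCAPE) # MRI compiles /\xA/
puts mri_accepts?('(')               # an unbalanced group is rejected
```

A parity check like the brute-force search described above would compare this result against the parser's verdict for the same source string.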

spec/integrations.yml

dgollahon · 2020-12-20T03:15:18Z

lib/mutant/mutator/node/literal/regex.rb

@@ -63,6 +63,8 @@ def body_expression
          #
          # @return [Array<Parser::AST::Node>]
          def body
+            # TODO: also: kill 0...1 mutation -- how the heck does this happen?
+            # manually inserting it causes unit tests to fail.
            children.slice(0...-1)


@mbj This is the mysterious mutation that fails the unit tests if you actually insert it:

 def body
-  children.slice(0...-1)
+  children.slice(0...1)
 end
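
For reference, the two ranges only differ once there is more than one element before the last child. The sketch below models `children` as a plain array of symbols, assuming body parts followed by a single trailing options node. That layout and the method name `mutated_body` are assumptions for illustration, not code from this PR.

```ruby
# Original method body from the diff: everything except the last child.
def body(children)
  children.slice(0...-1)
end

# The surviving mutation: only the first child.
def mutated_body(children)
  children.slice(0...1)
end

one_part  = %i[part options]
two_parts = %i[part_a part_b options]

# With a single body part the mutation is invisible.
puts body(one_part) == mutated_body(one_part)
# With two or more body parts the results diverge.
puts body(two_parts) == mutated_body(two_parts)
```

So a test that kills this mutation needs a regexp whose body has at least two children, and it has to be among the tests selected for the subject.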

Must be a test selection issue. Very likely the examples from the meta are tagged :regexp, and mutant's test selection overwrite does not catch the various regexp subtypes.

I can look into providing a better selection hint.

@dgollahon Is that still a problem?

Yeah, last I checked. If you have any suggestions here I'd appreciate it.

Oh, and another thing with this method @mbj: It fails with multiple mutations on 2.7 but only one on 2.6. I believe that's because of beginless ranges in 2.7. What's our policy for working around that? I could lift the range to a constant or add an --ignore-subject. That would also resolve this other mutation but I'm not sure if that's what you'd like me to do.

Also, I was wondering: is there a reason we don't check mutation coverage on 2.5?

@dgollahon Let's go with ignore subject for the moment.

I'll produce a version target soon.
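
The alternative that was not chosen in this thread, lifting the range to a constant, might look roughly like the sketch below. The class name `RegexNode` and the constant name `BODY_RANGE` are hypothetical. The idea, as the thread describes it, is that the range literal would no longer sit inside the method that is the mutation subject.

```ruby
# Hypothetical class; only the `body` method mirrors the diff in this thread.
class RegexNode
  # Range lifted out of the method body (assumed name, not from the PR).
  BODY_RANGE = (0...-1).freeze

  def initialize(children)
    @children = children
  end

  # Same behaviour as `children.slice(0...-1)`.
  def body
    @children.slice(BODY_RANGE)
  end
end

puts RegexNode.new(%i[part_a part_b options]).body.inspect
```

The trade-off is that the range is then simply not mutated, which hides the test selection problem instead of fixing it, much like ignoring the subject.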

mbj · 2020-12-20T03:38:01Z

lib/mutant.rb

+require 'mutant/ast/regexp/transformer/quantifier'
+require 'mutant/ast/regexp/transformer/recursive'
+require 'mutant/ast/regexp/transformer/root'
+require 'mutant/ast/regexp/transformer/text'


I'd love if we could get this into upstream. Having an immutable AST is a much better interface than their mutable one.

Yeah, we may want to open an issue and discuss with the maintainers. I suspect we need changes here or upstream before we can use 2.0.

lib/mutant/ast/regexp/transformer.rb

mbj · 2020-12-23T17:25:20Z

@dgollahon The only thing I'm concerned about is:

- rubocop allows 1.8 and 2.0 releases of regexp_parser.
- mutant will require 1.8.

Which means that as soon as rubocop moves to 2.x only, which will happen in the future, mutant has to catch up to 2.x ASAP. rubocop's userbase is much bigger than mutant's.

Do you have any idea about rubocop's plans on 2.x only API usage?

Can you elaborate what the 2.x challenge for regexp_parser is?

dgollahon · 2020-12-23T20:25:46Z

> Which means that as soon as rubocop moves to 2.x only, which will happen in the future, mutant has to catch up to 2.x ASAP. rubocop's userbase is much bigger than mutant's.
>
> Do you have any idea about rubocop's plans on 2.x only API usage?

Right. I don't have any specific intel on this. I doubt it would be immediate, but they will probably want to upgrade in the relatively near future since there are now a few bugfixes in 2.x that I expect they would want as well. That said, they already allow 2.x, so I'm not sure how eager they are to restrict the 1.x series.

> Can you elaborate what the 2.x challenge for regexp_parser is?

I need to go back and look into this in more detail now that I have everything else working (minus that mutation) but several things broke when I upgraded and it all seemed to be around mutability that adamantium was catching. I think there must be some changes to internal state management on some of the APIs we are using. I'm sure we can come up with a workaround or open an issue (if appropriate), but I wanted to focus on getting something working first and not let the perfect become the enemy of the good. I am willing to do the work to get us to 2.x but I am unclear on how much time it will take me to do that since I have not fully debugged what is going wrong yet.

Is upgrading to 2.x a merge blocker? Or should we get this shipped and then try to make our way to 2.x within the next month or two or so? I was thinking more towards the second option but we can wait until I figure out the 2.x challenges if you want--it will just delay the feature (maybe by a small amount or maybe by a significant amount--I won't know until I invest more time exploring it).

mbj · 2020-12-23T21:35:59Z

> Is upgrading to 2.x a merge blocker?

No, I mostly want to be able to quantify the risk. And ideally have issues open in regexp_parser before merging.

dgollahon · 2020-12-23T21:50:40Z

@mbj I figured out what changed and went wrong. See the second commit. I included a monkeypatch to show one possible way it could be fixed (but obviously will remove that). I suppose I will send a PR to regexp_parser and see what they think. Otherwise we will have to not freeze passive groups.
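
As a general illustration of this kind of failure, the sketch below shows how a `#to_s` that writes to an instance variable breaks once the object is frozen. It is not regexp_parser's actual code: `LazyGroup`, `EagerGroup` and `frozen_to_s` are hypothetical names, and the lazy memoization is only an assumed example of internal state being mutated.

```ruby
# Hypothetical node that caches its string form on first use.
class LazyGroup
  def initialize(text)
    @text = text
  end

  # Assigning @rendered mutates the object, which a frozen instance forbids.
  def to_s
    @rendered ||= "(?:#{@text})"
  end
end

# Hypothetical node that computes its string form up front.
class EagerGroup
  def initialize(text)
    @rendered = "(?:#{text})"
  end

  # Reading an instance variable is fine on a frozen object.
  def to_s
    @rendered
  end
end

# Freezes the node and reports either the string or the error class.
def frozen_to_s(group)
  group.freeze.to_s
rescue FrozenError => e
  e.class
end

puts frozen_to_s(LazyGroup.new('a')).inspect
puts frozen_to_s(EagerGroup.new('a')).inspect
```

Either the library avoids writing state in read-style methods, or the caller has to leave such nodes unfrozen, which matches the two options named in the comment above.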

dgollahon · 2020-12-23T22:16:31Z

See: ammar/regexp_parser#77

NOTE: I have removed the commit that attempts to upgrade to regexp_parser 2.x for now pending the above change.

dgollahon · 2020-12-24T21:31:32Z

BTW, just as a bit of context around using 2.x: rubocop upgraded to 2.x and then got some complaints because various other tooling relies on 1.x. After that they relaxed the requirement even though it causes bugs in some instances. I think some of those have since been updated (capybara) but I don't know if that would be a common complaint if we require 2+. It simplifies a little bit of handling if we require 2+ (and we need to blacklist 2.0.0 and 2.0.1 because of the freezing issue) but we could otherwise be compatible with 1.8.x and 2.x if you want us to do that. Otherwise we can require 2.x once my regexp_parser PR lands.
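
The permissive option could be expressed with RubyGems requirement strings roughly as follows. This is a sketch of one possible constraint, not the one in mutant's gemspec. The bounds `'>= 1.8'` and `'< 3'` are an approximation of "1.8.x and 2.x" (they would also admit any 1.9.x release), and `PERMISSIVE` and `permitted?` are hypothetical names.

```ruby
require 'rubygems'

# Assumed constraint: allow 1.8 up to (not including) 3.0,
# but exclude the two releases affected by the freezing issue.
PERMISSIVE = Gem::Requirement.new('>= 1.8', '< 3', '!= 2.0.0', '!= 2.0.1')

# Hypothetical helper to test a version string against the constraint.
def permitted?(version)
  PERMISSIVE.satisfied_by?(Gem::Version.new(version))
end

%w[1.3.0 1.8.2 2.0.0 2.0.1 2.0.2].each do |version|
  puts "#{version}: #{permitted?(version)}"
end
```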

mbj · 2020-12-24T22:46:20Z

@dgollahon It seems that the upstream issue is making good progress. I'd personally try to give it a few days before deciding here. It could be we can depend on 2.x right away.

dgollahon · 2020-12-27T06:24:37Z

@mbj regexp_parser 2.0.2 has been released with the fix to the frozen issue. I have removed the special-casing logic and the special error handling from the regex parsing logic, pinned regexp_parser to 2.x, >= 2.0.2, and rebased this PR.
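
A pin of "2.x, >= 2.0.2" can be written as a pair of RubyGems requirement strings. The exact wording in mutant's gemspec is not shown in this thread, so the strings below are an assumption, and `PINNED` and `pinned?` are hypothetical names.

```ruby
require 'rubygems'

# Assumed form of the pin: any 2.x release starting at 2.0.2.
PINNED = Gem::Requirement.new('~> 2.0', '>= 2.0.2')

# Hypothetical helper to test a version string against the pin.
def pinned?(version)
  PINNED.satisfied_by?(Gem::Version.new(version))
end

%w[1.8.2 2.0.1 2.0.2 2.1.0 3.0.0].each do |version|
  puts "#{version}: #{pinned?(version)}"
end
```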

I believe the main outstanding issue is that unusual test selection for that slice mutation but we may want/need to lift the range to a constant or ignore the subject since it would force us to be incompatible with 2.5/2.6 I believe due to beginless range support.

I also noticed this in the 2.6 run which was odd:

[killfork] /home/runner/work/mutant/mutant/lib/mutant/loader.rb:24:in `eval': > /home/runner/work/mutant/mutant/lib/mutant/meta/example/dsl.rb:47: syntax error, unexpected ':', expecting end (SyntaxError)
[killfork] expected: @expected, location: @Locati...
[killfork] ^

I am wondering if this is due to the new 3.0 keyword argument parsing/mutating?

mbj · 2020-12-30T04:08:44Z

@dgollahon This should come back green after a rebase.

dgollahon force-pushed the restore-regexp-mutations branch from af26771 to 5499ad2 December 20, 2020 02:55

dgollahon changed the title ~~Revert "Remove regexp body mutation support"~~ Reintroduce Regexp mutations Dec 20, 2020

dgollahon force-pushed the restore-regexp-mutations branch 2 times, most recently from 4fd79be to 2d4281d December 20, 2020 02:59

dgollahon commented Dec 20, 2020

spec/integrations.yml

dgollahon requested a review from mbj December 20, 2020 03:14

dgollahon commented Dec 20, 2020

mbj reviewed Dec 20, 2020

lib/mutant/ast/regexp/transformer.rb

dgollahon force-pushed the restore-regexp-mutations branch 2 times, most recently from d049f9d to 89b92f3 December 20, 2020 18:48

dgollahon force-pushed the restore-regexp-mutations branch from 147e608 to 460dcdc December 23, 2020 20:16

dgollahon force-pushed the restore-regexp-mutations branch from 40a7954 to ef82da9 December 23, 2020 21:51

dgollahon mentioned this pull request Dec 23, 2020

Support #to_s on frozen Group::Passive ammar/regexp_parser#77

Merged

dgollahon force-pushed the restore-regexp-mutations branch from ef82da9 to 460dcdc December 24, 2020 02:35

dgollahon marked this pull request as ready for review December 24, 2020 02:36

dgollahon force-pushed the restore-regexp-mutations branch from 460dcdc to 999fbe0 December 27, 2020 06:17

dgollahon force-pushed the restore-regexp-mutations branch 3 times, most recently from 8041f23 to 3b699e6 December 30, 2020 02:25

dgollahon force-pushed the restore-regexp-mutations branch from 80bc1ca to fc3216f December 30, 2020 05:39

dgollahon force-pushed the restore-regexp-mutations branch from fc3216f to 7b95e26 December 30, 2020 05:44

Reintroduce Regexp mutations

a848e48

- Reintroduces regular expression mutations to `mutant` by reverting commit 21d3fef with various improvements and adjustments.

dgollahon force-pushed the restore-regexp-mutations branch from 7b95e26 to a848e48 December 30, 2020 06:25

mbj self-requested a review December 30, 2020 13:33

mbj merged commit 5a7c8f2 into master Dec 30, 2020

mbj deleted the restore-regexp-mutations branch December 30, 2020 13:35
