-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
regexp_parser
rejects /\xA/
but MRI accepts it
#75
Comments
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - Some other minor conflicts and small spec assertion changes were resolved as well.
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 20, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - I have marked several dispatch methods as `private`. - I have also removed the old YARD doc comments on private methods at @mbj's request. - Some other minor conflicts and small spec assertion changes were resolved as well.
@dgollahon thanks for the report, and for going to such lengths to check all kinds of regexps! ❤️ This one was also clearly a bug. We had code to handle hex escapes with just one xdigit since the start, it's just been unreachable for all these 10 years 😄 The fix is included in v2.0.1. |
Fantastic! Thanks for the excellent response @jaynetics. :D |
dgollahon
added a commit
to mbj/mutant
that referenced
this issue
Dec 23, 2020
This reverts commit 21d3fef. This was not a clean revert. Note that: - The version of `regexp_parser` was 1.3.0, now it is 1.8.2 to accomodate our current `rubocop` version and because there were some relevant bugfixes implemented between 1.3.x and 1.8.x. We should eventually move to 2.0 but it is currently incompatible with this integration. There are some issues with the frozen Regexp classes getting mutated so we may have to open an issue. - Since "expected exception" support was removed from the specs, I have had to exclude two files entirely. This seems unfortunate as it reduces our overall coverage. - Since unsupported nodes are no longer explicitly tracked, I removed the code that used to handle that for regular expressions. See: #1021 - I had to change the example case for where we are more permissive than `regexp_parser` because `regexp_parser` has decided to become more permissive and try to match Ruby's semantics. It was actually very hard to find a case that failed--I brute-forced 50 million regexp strings that had perfect parity of being accepted and then stumbled onto the single hex escape case by accident. See: ammar/regexp_parser#75. This can be removed once we reach `regexp_parser` >= 2.0.1. - Added logic to skip invalid group options until we are on `regexp_parser` >= 2.0.1. See: ammar/regexp_parser#76 - Changed an access pattern for regexp mutations which became equivalent based on this: https://github.com/ammar/regexp_parser/blame/4ca7cec03b210e3e00473b7b1a7308f963190c1e/lib/regexp_parser/expression/subexpression.rb#L30-L33 - I have marked several dispatch methods as `private`. - I have also removed the old YARD doc comments on private methods at @mbj's request. - Some other minor conflicts and small spec assertion changes were resolved as well.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
I am working on re-introducing regexp mutation support on mutant and I noticed that since the old integration existed
regexp_parser
seems to have decided to stop rejecting a large % of regexps that ruby would accept (#63) butregexp_parser
did not. I did find one additional case that was not documented anywhere I found (I tried brute-forcing millions of regexps to infer if there were any cases whereregexp_parser
was stricter than MRI and this is the only class of instances I could find).Is this a bug or intended behavior? Either is fine for my purposes since I can just add a special check to ignore errors in this case, but I was curious if this was an intended difference or not. The coverage matrix in the README suggests that hex escapes work but I guess this is a special case that was not highlighted. If it is intentional behavior, it would be helpful to document it (unless I missed where this was done already) or alternatively having parity with MRI would work for me.
Thanks!
The text was updated successfully, but these errors were encountered: