-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent use of characters above \xFF in bitwise string ops #17022
Comments
From @jkeenanpod/perldeprecation.pod contains this entry for perl-5.32: ##### The string bitwise operators, "&", "|", "^", and "~", treat An example of this is "" & "\x{100}" The wide character is not used in the "&" operation because This entry was made in the following commit: ##### Deprecate above \xFF in bitwise string ops This is already a fatal error for operations whose outcome "abc" & "def\x{100}" the wide character doesn't actually need to participate in Make it so. |
From @jkeenanSummary of my perl5 (revision 5 version 31 subversion 0) configuration: Characteristics of this binary (from libperl): |
From @jkeenanOn Sat, 25 May 2019 02:13:08 GMT, jkeenan@pobox.com wrote:
Please review patch attached. Available for smoking in this branch: smoke-me/jkeenan/rt-134140-bitwise Thank you very much. -- |
From @jkeenan0001-Use-of-code-points-over-0xFF-in-string-bitwise-opera.patchFrom 9c60d4d1fc88f4c92b7dd6b09691e8ef5828109d Mon Sep 17 00:00:00 2001
From: James E Keenan <jkeenan@cpan.org>
Date: Mon, 27 May 2019 13:18:10 -0400
Subject: [PATCH] Use of code points over 0xFF in string bitwise operators
Implement complete fatalization. Some instances of these were fatalized
in 5.28. However, in cases where the wide characters did not affect the
end result, no deprecation notice was raised. So they remained legal,
though deprecated. Now, all occurrences are fatal (as of 5.32).
Modify source code in doop.c. Adapt test file. Update perldiag and
perldeprecation.
For: RT 134140
---
doop.c | 27 ++----------
pod/perldeprecation.pod | 4 +-
pod/perldiag.pod | 2 +-
t/op/bop.t | 95 +++++++++++------------------------------
4 files changed, 33 insertions(+), 95 deletions(-)
diff --git a/doop.c b/doop.c
index 54e35f10a6..c348afe292 100644
--- a/doop.c
+++ b/doop.c
@@ -1087,30 +1087,11 @@ Perl_do_vop(pTHX_ I32 optype, SV *sv, SV *left, SV *right)
* on zeros without having to do it. In the case of '&', the result is
* zero, and the dangling portion is simply discarded. For '|' and '^', the
* result is the same as the other operand, so the dangling part is just
- * appended to the final result, unchanged. We currently accept above-FF
- * code points in the dangling portion, as that's how it has long worked,
- * and code depends on it staying that way. But it is now fatal for
- * above-FF to appear in the portion that does get operated on. Hence, any
- * above-FF must come only in the longer operand, and only in its dangling
- * portion. That means that at least one of the operands has to be
- * entirely non-UTF-8, and the length of that operand has to be before the
- * first above-FF in the other */
+ * appended to the final result, unchanged. As of perl-5.32, we no longer
+ * accept above-FF code points in the dangling portion.
+ */
if (left_utf8 || right_utf8) {
- if (left_utf8) {
- if (right_utf8 || rightlen > leftlen) {
- Perl_croak(aTHX_ FATAL_ABOVE_FF_MSG, PL_op_desc[optype]);
- }
- len = rightlen;
- }
- else if (right_utf8) {
- if (leftlen > rightlen) {
- Perl_croak(aTHX_ FATAL_ABOVE_FF_MSG, PL_op_desc[optype]);
- }
- len = leftlen;
- }
-
- Perl_ck_warner_d(aTHX_ packWARN(WARN_DEPRECATED),
- DEPRECATED_ABOVE_FF_MSG, PL_op_desc[optype]);
+ Perl_croak(aTHX_ FATAL_ABOVE_FF_MSG, PL_op_desc[optype]);
}
else { /* Neither is UTF-8 */
len = MIN(leftlen, rightlen);
diff --git a/pod/perldeprecation.pod b/pod/perldeprecation.pod
index 5213eca229..5395e11ac4 100644
--- a/pod/perldeprecation.pod
+++ b/pod/perldeprecation.pod
@@ -65,14 +65,14 @@ nonsensical. Some instances of these have been deprecated since Perl
the wide characters did not affect the end result, no deprecation
notice was raised, and so remain legal. Now, all occurrences either are
fatal or raise a deprecation warning, so that the remaining legal
-occurrences will be fatal in 5.32.
+occurrences become fatal in 5.32.
An example of this is
"" & "\x{100}"
The wide character is not used in the C<&> operation because the left
-operand is shorter. This now warns anyway.
+operand is shorter. This now throws an exception.
=head3 hostname() doesn't accept any arguments
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index f69b1b8367..a7176c342e 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -7338,7 +7338,7 @@ C<~>) on a string containing a code point over 0xFF. The string bitwise
operators treat their operands as strings of bytes, and values beyond
0xFF are nonsensical in this context.
-This became fatal in Perl 5.28.
+Certain instances became fatal in Perl 5.28; others in perl 5.32.
=item Use of strings with code points over 0xFF as arguments to C<vec>
is deprecated. This will be a fatal error in Perl 5.32
diff --git a/t/op/bop.t b/t/op/bop.t
index 411d253a7b..666dfb8114 100644
--- a/t/op/bop.t
+++ b/t/op/bop.t
@@ -18,7 +18,7 @@ BEGIN {
# If you find tests are failing, please try adding names to tests to track
# down where the failure is, and supply your new names as a patch.
# (Just-in-time test naming)
-plan tests => 504;
+plan tests => 491;
# numerics
ok ((0xdead & 0xbeef) == 0x9ead);
@@ -613,73 +613,30 @@ foreach my $op_info ([and => "&"], [or => "|"], [xor => "^"]) {
}
{
- # Since these are temporary, and it was a pain to make them into loops,
- # the code is just rolled out.
- local $SIG{__WARN__} = sub { push @warnings, @_; };
-
- undef @warnings;
- is("abc" & "abc\x{100}", "abc", '"abc" & "abc\x{100}" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise and \(&\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- undef @warnings;
- is("abc" | "abc\x{100}", "abc\x{100}", '"abc" | "abc\x{100}" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise or \(|\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- undef @warnings;
- is("abc" ^ "abc\x{100}", "\0\0\0\x{100}", '"abc" ^ "abc\x{100}" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise xor \(\^\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- undef @warnings;
- is("abc\x{100}" & "abc", "abc", '"abc\x{100}" & "abc" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise and \(&\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- undef @warnings;
- is("abc\x{100}" | "abc", "abc\x{100}", '"abc\x{100}" | "abc" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise or \(|\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- undef @warnings;
- is("abc\x{100}" ^ "abc", "\0\0\0\x{100}", '"abc\x{100}" ^ "abc" works');
- if (! is(@warnings, 1, "... but returned a single warning")) {
- diag join "\n", @warnings;
- }
- like ($warnings[0], qr /^Use of strings with code points over 0xFF as (?#
- )arguments to bitwise xor \(\^\) operator (?#
- )is deprecated/,
- "... which is the expected warning");
- no warnings 'deprecated';
- undef @warnings;
- my $foo = "abc" & "abc\x{100}";
- $foo = "abc" | "abc\x{100}";
- $foo = "abc" ^ "abc\x{100}";
- $foo = "abc\x{100}" & "abc";
- $foo = "abc\x{100}" | "abc";
- $foo = "abc\x{100}" ^ "abc";
- if (! is(@warnings, 0, "... And none of the last 6 main tests warns when 'deprecated' is off")) {
- diag join "\n", @warnings;
+ # RT 134140 fatalizations
+ my %op_pairs = (
+ and => { low => 'and', high => '&', regex => qr/&/ },
+ or => { low => 'or', high => '|', regex => qr/\|/ },
+ xor => { low => 'xor', high => '^', regex => qr/\^/ },
+ );
+ my @combos = (
+ { string => '"abc" & "abc\x{100}"', op_pair => $op_pairs{and} },
+ { string => '"abc" | "abc\x{100}"', op_pair => $op_pairs{or} },
+ { string => '"abc" ^ "abc\x{100}"', op_pair => $op_pairs{xor} },
+ { string => '"abc\x{100}" & "abc"', op_pair => $op_pairs{and} },
+ { string => '"abc\x{100}" | "abc"', op_pair => $op_pairs{or} },
+ { string => '"abc\x{100}" ^ "abc"', op_pair => $op_pairs{xor} },
+
+ );
+
+ # Use of strings with code points over 0xFF as arguments to %s operator is not allowed
+ for my $h (@combos) {
+ my $s1 = "Use of strings with code points over 0xFF as arguments to bitwise";
+ my $s2 = "operator is not allowed";
+ my $expected = qr/$s1 $h->{op_pair}->{low} \($h->{op_pair}->{regex}\) $s2/;
+ my $description = "$s1 $h->{op_pair}->{low} ($h->{op_pair}->{high}) operator is not allowed";
+ local $@;
+ eval $h->{string};
+ like $@, $expected, $description;
}
}
--
2.17.1
|
The RT System itself - Status changed from 'new' to 'open' |
From @khwilliamsonThanks, applied as c8b94fe |
@khwilliamson - Status changed from 'open' to 'pending release' |
Migrated from rt.perl.org#134140 (status was 'pending release')
Searchable as RT134140$
The text was updated successfully, but these errors were encountered: