Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexec.c:880: Perl_re_intuit_start: Assertion `prog->substrs->data[1].max_offset >= 0' failed. #17730

Closed
dur-randir opened this issue Apr 20, 2020 · 13 comments

Comments

@dur-randir
Copy link
Member

This is a bug report for perl from sergey.aleynikov@gmail.com,
generated with the help of perlbug 1.41 running under perl 5.31.10.

[Please describe your issue here]

While fuzzing perl v5.31.9-70-g0c96aa4b7b built with afl and run
under libdislocator, I found the following program

q0=~'(\b*0)?\W0'

to cause an assertion failure

perl: regexec.c:880: Perl_re_intuit_start: Assertion `prog->substrs->data[1].max_offset >= 0' failed.

GDB stack trace is:

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7c24535 in __GI_abort () at abort.c:79
#2 0x00007ffff7c2440f in __assert_fail_base (fmt=0x7ffff7d86ee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=0x555555b7e638 "prog->substrs->data[1].max_offset >= 0", file=0x555555b7c268 "regexec.c", line=880, function=) at assert.c:92
#3 0x00007ffff7c32102 in __GI___assert_fail (assertion=0x555555b7e638 "prog->substrs->data[1].max_offset >= 0", file=0x555555b7c268 "regexec.c", line=880,
function=0x555555baf060 <PRETTY_FUNCTION.19050> "Perl_re_intuit_start") at assert.c:101
#4 0x00005555558b08c7 in Perl_re_intuit_start (rx=0x555555c3aff8, sv=0x555555c3af98, strbeg=0x555555c446f0 "q0", strpos=0x555555c446f0 "q0",
strend=0x555555c446f2 "", flags=97, data=0x0) at regexec.c:880
#5 0x00005555558c147c in Perl_regexec_flags (rx=0x555555c3aff8, stringarg=0x555555c446f0 "q0", strend=0x555555c446f2 "", strbeg=0x555555c446f0 "q0",
minend=0, sv=0x555555c3af98, data=0x0, flags=97) at regexec.c:3390
#6 0x000055555577f1d9 in Perl_pp_match () at pp_hot.c:3060
#7 0x00005555557206b2 in Perl_runops_debug () at dump.c:2571
#8 0x00005555555f083e in S_run_body (oldscope=1) at perl.c:2759
#9 0x00005555555efdbc in perl_run (my_perl=0x555555c15260) at perl.c:2682
#10 0x00005555555a2155 in main (argc=2, argv=0x7fffffffe1b8, env=0x7fffffffe1d0) at perlmain.c:134

This is a regression in blead, bisect points to f6231eb is the first bad commit

commit f6231ebfc0a4a5472c54d7a8d9fb20a2daa9bf37
Author: Karl Williamson <khw@cpan.org>
Date:   Mon Mar 2 10:15:25 2020 -0700

    regcomp.c: Get rid of meaningless test

    Since ea3daa5, parts of this test became nonsensical as max_offset
    cannot be larger than OPTIMIZE_INFIINITY.  (I don't know why compilers
    didn't say that this branch is always false.)

    Hugo van der Sanden suggested something like this commit to keep the
    still valid part of the test.

[Please do not change anything below this line]
Flags:
category=core
severity=high
Site configuration information for perl 5.31.10:

Configured by root at Fri Mar 13 17:15:02 MSK 2020.

Summary of my perl5 (revision 5 version 31 subversion 10) configuration:
Commit id: 0c96aa4
Platform:
osname=linux
osvers=4.19.0-8-amd64
archname=x86_64-linux
uname='linux dorothy 4.19.0-8-amd64 #1 smp debian 4.19.98-1 (2020-01-26) x86_64 gnulinux '
config_args='-de -Dusedevel -Doptimize=-O2'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
optimize='-O2'
cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='8.3.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.28.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.28'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

@inc for perl 5.31.10:
lib
/usr/local/lib/perl5/site_perl/5.31.10/x86_64-linux
/usr/local/lib/perl5/site_perl/5.31.10
/usr/local/lib/perl5/5.31.10/x86_64-linux
/usr/local/lib/perl5/5.31.10

Environment for perl 5.31.10:
HOME=/home/afl
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE=en_US.UTF-8
LC_TIME=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.30.0-dbg/bin:/opt/local/bin:/usr/texbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PERLBREW_HOME=/home/afl/.perlbrew
PERLBREW_MANPATH=/home/afl/perlbrew/perls/perl-5.30.0-dbg/man
PERLBREW_PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.30.0-dbg/bin
PERLBREW_PERL=perl-5.30.0-dbg
PERLBREW_ROOT=/home/afl/perlbrew
PERLBREW_SHELLRC_VERSION=0.88
PERLBREW_VERSION=0.88
PERL_BADLANG (unset)

@khwilliamson khwilliamson added this to the 5.32.0 milestone Apr 20, 2020
@hvds
Copy link
Contributor

hvds commented Apr 20, 2020

I anticipate that this will be due to zero-width constructs such as \b being marked SIMPLE. I was planning to push a smoke-me branch fixing all those once 5.32 was released, but I wonder if I should do it now instead.

@hvds
Copy link
Contributor

hvds commented Apr 21, 2020

I confirmed that branch fixes this issue, I've now pushed it for smoking as https://github.com/Perl/perl5/tree/smoke-me/hv/gh17594.

@dur-randir
Copy link
Member Author

I can still trigger this assertion even after the pull request with the following:

q00=~'(((*ACCEPT)0)*00)?0(?1)'

@hvds
Copy link
Contributor

hvds commented Apr 22, 2020

I can still trigger this assertion even after the pull request with the following:

q00=~'(((*ACCEPT)0)*00)?0(?1)'

This is a quite different case: we're overflowing from maxint to minint in tracking the last place the floating substring can match at regcomp.c:5304. It goes something like:

  • we reach the final EXACT "0" - at which point for some reason is_inf is false despite the ((*ACCEPT)0)* - and at this point pos_min is 0 and pos_delta is maxint; min is updated to 1;
  • we recurse for the GOSUB
  • we reach the initial EXACT "00". Now pos_min is 1, pos_delta is still maxint, and is_inf is still false.

My guess is that the ACCEPT is suppressing the setting of is_inf, which may be an additional bug, but clamping the addition above is enough to fix the assert:

--- a/regcomp.c
+++ b/regcomp.c
@@ -5301,8 +5301,10 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
                   offset, later match for variable offset.  */
                if (data->last_end == -1) { /* Update the start info. */
                    data->last_start_min = data->pos_min;
-                   data->last_start_max = is_inf
-                        ? OPTIMIZE_INFTY : data->pos_min + data->pos_delta;
+                   data->last_start_max =
+                        is_inf ? OPTIMIZE_INFTY
+                        : (data->pos_delta > OPTIMIZE_INFTY - data->pos_min)
+                            ? OPTIMIZE_INFTY : data->pos_min + data->pos_delta;
                }
                sv_catpvn(data->last_found, STRING(scan), bytelen);
                if (UTF)

@hvds
Copy link
Contributor

hvds commented Apr 22, 2020

Ah, we don't continue after ACCEPT, so it is correct that ((*ACCEPT)0)* cannot match infinite width - but there is a disconnect between failing to set is_inf for that case, but still deciding that the min width of the paren is 1, so that the STAR has a {min, delta} of {0, inf}. It is that disconnect that is not catered for by the patched lines, and may be difficult to resolve additionally before making the optimizer more tractable.

@dur-randir
Copy link
Member Author

This is a quite different case: we're overflowing from maxint to minint in tracking the last place the floating substring can match at regcomp.c:5304.

It is in a sense that it's broken by the same commit.

@hvds
Copy link
Contributor

hvds commented Apr 22, 2020

This is a quite different case: we're overflowing from maxint to minint in tracking the last place the floating substring can match at regcomp.c:5304.

It is in a sense that it's broken by the same commit.

Ah right, because the "nonsensical" test removed by that commit was papering over multiple bugs by casting MININT to an unsigned type.

To avoid delaying 5.32 we might want to temporarily restore that in a cleaner form, something like:

--- a/regcomp.c
+++ b/regcomp.c
@@ -1493,6 +1493,7 @@ S_scan_commit(pTHX_ const RExC_state_t *pRExC_state, scan_data_t *data,
                        ? OPTIMIZE_INFTY
                        : (l
                           ? data->last_start_max
+                          : data->pos_delta < 0 ? OPTIMIZE_INFTY
                           : (data->pos_delta > OPTIMIZE_INFTY - data->pos_min
                                         ? OPTIMIZE_INFTY
                                         : data->pos_min + data->pos_delta));

... and then ferret out any remaining such issues in the 5.34 cycle.

@hvds
Copy link
Contributor

hvds commented Apr 23, 2020

Opened PR #17744 with the fix for the *ACCEPT case and the temporary guard.

@hvds
Copy link
Contributor

hvds commented Apr 25, 2020

PR #17744 now merged; I recommend that we remove this as a blocker, but keep it open (maybe with a 5.33.1 block) for removal of the temporary guard and subsequent investigation.

@khwilliamson khwilliamson removed this from the 5.32.0 milestone Apr 25, 2020
@khwilliamson
Copy link
Contributor

So be it. removed from blockers

@dur-randir
Copy link
Member Author

dur-randir commented Apr 25, 2020

The funny thing is, i wouldn't have noticed this if not for #17715, as it can lead to the same stacktrace, so i've blacklisted it before.

@hvds
Copy link
Contributor

hvds commented Oct 8, 2020

I've now merged the fix for #17594, I believe this ticket can also now be closed. @dur-randir do you see any further concerns here?

@khwilliamson
Copy link
Contributor

No response in 1.5 years; closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants