regen/regcharclass.pl generates wrong code #12448

Closed
p5pRT opened this issue Sep 29, 2012 · 5 comments

p5pRT commented Sep 29, 2012

Migrated from rt.perl.org#115078 (status was 'resolved')

Searchable as RT115078$

p5pRT commented Sep 29, 2012

From @khwilliamson

This is a bug report for perl from khw@karl.(none),
generated with the help of perlbug 1.39 running under perl 5.17.5.


The following lines added to regen/regcharclass.pl generate
wrong code:
DEMO: This is wrong code
=> UTF8 :safe
"\x{3B7}\x{342}"
"\x{3B9}\x{308}\x{301}"

==============
The generated code is:
/*
  DEMO: This is wrong code

  "\x{3B7}\x{342}"
  "\x{3B9}\x{308}\x{301}"
*/
/*** GENERATED CODE ***/
#define is_DEMO_utf8_safe(s,e) \
( ( ((e)-(s) > 5) && ( 0xCE == ((U8*)s)[0] ) ) ? ( ( 0xB7 == ((U8*)s)[1] ) ?\
  ( ( 0xCD == ((U8*)s)[2] ) ? \
  ( ( 0x82 == ((U8*)s)[3] ) ? 4 : 0 ) \
  : ( 0xB9 == ((U8*)s)[1] ) ? \
  ( ( ( ( ( 0xCC == ((U8*)s)[2] ) && ( 0x88 == ((U8*)s)[3] ) ) && ( 0xCC == ((U8*)s)[4] ) ) && ( 0x81 == ((U8*)s)[5] ) ) ? 6 : 0 )\
  : 0 ) \
  : 0 ) \
: ((e)-(s) > 3) ? \
  ( ( ( ( ( 0xCE == ((U8*)s)[0] ) && ( 0xB7 == ((U8*)s)[1] ) ) && ( 0xCD == ((U8*)s)[2] ) ) && ( 0x82 == ((U8*)s)[3] ) ) ? 4 : 0 )\
: 0 )

==============
Here is the (correct) code for just the first string:
/*
  DEMO A: This is correct code

  "\x{3B7}\x{342}"
*/
/*** GENERATED CODE ***/
#define is_DEMO A_utf8_safe(s,e) \
( ( ( ( ( ((e)-(s) > 3) && ( 0xCE == ((U8*)s)[0] ) ) && ( 0xB7 == ((U8*)s)[1] ) ) \
  && ( 0xCD == ((U8*)s)[2] ) ) && ( 0x82 == ((U8*)s)[3] ) ) ? 4 : 0 )

=============
Here is the (correct) code for just the second string:
/*
  DEMO B: This is correct code

  "\x{3B9}\x{308}\x{301}"
*/
/*** GENERATED CODE ***/
#define is_DEMO B_utf8_safe(s,e) \
( ( ( ( ( ( ( ((e)-(s) > 5) && ( 0xCE == ((U8*)s)[0] ) ) && ( 0xB9 == ((U8*)s)[1] ) ) \
  && ( 0xCC == ((U8*)s)[2] ) ) && ( 0x88 == ((U8*)s)[3] ) ) \
  && ( 0xCC == ((U8*)s)[4] ) ) && ( 0x81 == ((U8*)s)[5] ) ) ? 6 : 0 )

============
The code is wrong because of a parenthesis grouping problem. The test for
0xB9 in s[1] should be the alternative tried when s[1] isn't 0xB7, like so:
  ( B7 == s[1] )
  ? ( ... )
  : ( B9 == s[1] ? (...) : (...) )
Instead, the generated parentheses place the 0xB9 test inside the branch
where s[1] has already matched 0xB7, so it can never succeed. This is easy
to see in an editor that highlights balanced parentheses.
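
To make the grouping difference concrete, here is a minimal C sketch that
transcribes the two nestings as plain functions. The names
demo_as_generated, demo_intended and demo_self_check are hypothetical and
chosen only for illustration; this is not the output of
regen/regcharclass.pl, and it assumes e points one past the last byte of
the buffer.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char U8;

/* Same nesting as the generated macro: the 0xB9 test sits in the else-arm
 * of the s[2] test, inside the branch where s[1] is already known to be
 * 0xB7, so it can never be true. */
static int demo_as_generated(const U8 *s, const U8 *e)
{
    return ((e - s > 5) && s[0] == 0xCE)
        ? ((s[1] == 0xB7)
            ? ((s[2] == 0xCD)
                ? ((s[3] == 0x82) ? 4 : 0)
                : (s[1] == 0xB9)
                    ? ((s[2] == 0xCC && s[3] == 0x88 &&
                        s[4] == 0xCC && s[5] == 0x81) ? 6 : 0)
                    : 0)
            : 0)
        : (e - s > 3)
            ? ((s[0] == 0xCE && s[1] == 0xB7 &&
                s[2] == 0xCD && s[3] == 0x82) ? 4 : 0)
            : 0;
}

/* Intended nesting: 0xB9 is the alternative tried when s[1] is not 0xB7. */
static int demo_intended(const U8 *s, const U8 *e)
{
    return ((e - s > 5) && s[0] == 0xCE)
        ? ((s[1] == 0xB7)
            ? ((s[2] == 0xCD && s[3] == 0x82) ? 4 : 0)
            : ((s[1] == 0xB9)
                ? ((s[2] == 0xCC && s[3] == 0x88 &&
                    s[4] == 0xCC && s[5] == 0x81) ? 6 : 0)
                : 0))
        : (e - s > 3)
            ? ((s[0] == 0xCE && s[1] == 0xB7 &&
                s[2] == 0xCD && s[3] == 0x82) ? 4 : 0)
            : 0;
}

/* Checks both functions against the UTF-8 bytes of the two DEMO strings. */
static void demo_self_check(void)
{
    /* "\x{3B7}\x{342}" encoded as UTF-8 */
    const U8 eta[] = { 0xCE, 0xB7, 0xCD, 0x82 };
    /* "\x{3B9}\x{308}\x{301}" encoded as UTF-8 */
    const U8 iota[] = { 0xCE, 0xB9, 0xCC, 0x88, 0xCC, 0x81 };

    /* The first string matches under both groupings. */
    assert(demo_as_generated(eta, eta + sizeof eta) == 4);
    assert(demo_intended(eta, eta + sizeof eta) == 4);

    /* The second string is only matched by the intended grouping. */
    assert(demo_as_generated(iota, iota + sizeof iota) == 0);
    assert(demo_intended(iota, iota + sizeof iota) == 6);
}
```

Calling demo_self_check() from a small main() exercises both strings: the
first is matched either way, while the second is only matched once the
0xB9 test is moved to the else-arm of the 0xB7 test.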

I get the same results running this on a version of this utility that
predates the recent changes made to it. This is marked critical severity
because it is needed for fixing a regression in 5.17.



Flags​:
  category=utilities
  severity=critical


Site configuration information for perl 5.17.5​:

Configured by khw at Fri Sep 28 09​:22​:42 MDT 2012.

Summary of my perl5 (revision 5 version 17 subversion 5) configuration​:
  Commit id​: f056f7d
  Platform​:
  osname=linux, osvers=2.6.35-32-generic-pae,
archname=i686-linux-thread-multi-64int-ld
  uname='linux karl 2.6.35-32-generic-pae #67-ubuntu smp mon mar 5
21​:23​:19 utc 2012 i686 gnulinux '
  config_args='-des -Dprefix=/home/khw/blead -Dusedevel
-D'optimize=-ggdb3' -A'optimize=-ggdb3' -A'optimize=-O0' -Dman1dir=none
-Dman3dir=none -DDEBUGGING -Dcc=g++ -Dusemorebits -Dusethreads'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=undef, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='g++', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O0 -ggdb3',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.4.5', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
  ivtype='long long', ivsize=8, nvtype='long double', nvsize=12,
Off_t='off_t', lseeksize=8
  alignbytes=4, prototype=define
  Linker and Libraries​:
  ld='g++', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib
/usr/lib/i686-linux-gnu
  libs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  libc=/lib/../lib/libc.so.6, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.12'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -ggdb3 -ggdb3 -O0
-L/usr/local/lib -fstack-protector'

Locally applied patches​:


@​INC for perl 5.17.5​:

/home/khw/blead/lib/perl5/site_perl/5.17.5/i686-linux-thread-multi-64int-ld
  /home/khw/blead/lib/perl5/site_perl/5.17.5
  /home/khw/blead/lib/perl5/5.17.5/i686-linux-thread-multi-64int-ld
  /home/khw/blead/lib/perl5/5.17.5
  /home/khw/blead/lib/perl5/site_perl
  .


Environment for perl 5.17.5​:
  HOME=/home/khw
  LANG=en_US.UTF-8
  LANGUAGE=en_US​:en
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)

PATH=/home/khw/bin​:/home/khw/print/bin​:/bin​:/usr/local/sbin​:/usr/local/bin​:/usr/sbin​:/usr/bin​:/sbin​:/usr/games​:/home/khw/cxoffice/bin
  PERL5OPT=-w
  PERL_BADLANG (unset)
  SHELL=/bin/ksh


p5pRT commented Sep 29, 2012

From @demerphq

I'll try to take a look, but I'm on holiday so it could be a while.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Sep 29, 2012

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Sep 29, 2012

From @demerphq

Hi Karl, this is fixed (and some additional improvements added) in the
patch series merged in with:

f6abc0d

Please let me know if there are issues.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Sep 29, 2012

@cpansprout - Status changed from 'open' to 'resolved'
