5.5.3 vs 5.6.1: A result of split() changed when split's regexp has capture-()s #4427

Closed
p5pRT opened this issue Sep 20, 2001 · 3 comments

@p5pRT

p5pRT commented Sep 20, 2001

Migrated from rt.perl.org#7709 (status was 'resolved')

Searchable as RT7709$

@p5pRT
Author

p5pRT commented Sep 20, 2001

From hac@subscribe.ru

Description

A result of split() changed when split's regexp has capture-()s

5.005_03 - array elements for unmatched () were undef
5.6.1 - array elements for unmatched () are ''

Test case from real life: parsing HTML &...; character entities.

Test script source and perl -V below.

How to reproduce

./test-script 'aaa &#123; bbb &#x01CF; ccc &nbsp; ddd'

Output:

5.005_03                    5.6.1
---------                   ------

field 'aaa '                field 'aaa '
$1 UN-defined               $1 defined and is ''
$2 defined and is '123'     $2 defined and is '123'
$3 UN-defined               $3 defined and is ''

field ' bbb '               field ' bbb '
$1 UN-defined               $1 defined and is ''
$2 UN-defined               $2 defined and is ''
$3 defined and is '01CF'    $3 defined and is '01CF'

field ' ccc '               field ' ccc '
$1 defined and is 'nbsp'    $1 defined and is 'nbsp'
$2 UN-defined               $2 defined and is ''
$3 UN-defined               $3 defined and is ''

field ' ddd'                field ' ddd'
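
For a quicker check than the full test script, the difference can be made visible in a few lines. The sketch below is not part of the report: the variable names are made up for illustration, and the hex class is written as a-f on the assumption that hex digits were intended. It marks undef explicitly so that it cannot be confused with ''.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern modeled on the report: named, decimal, and hex entities.
# (Hex class written as a-f here; that is an assumption about the intent.)
my $entity = qr/&(?:([a-zA-Z0-9]+)|#([0-9]+)|#[xX]([0-9a-fA-F]+));/;

# One decimal entity, so only the second capture group participates.
my @fields = split $entity, 'aaa &#123; bbb';

# Render each element, showing undef explicitly so it differs from ''.
my @shown = map { defined $_ ? "'$_'" : 'undef' } @fields;
print join(', ', @shown), "\n";
```

On a perl where unmatched groups yield undef (5.005_03 according to this report, and the behavior documented for split in current perls), this prints `'aaa ', undef, '123', undef, ' bbb'`. On 5.6.1, going by the output above, the two undef entries would appear as '' instead.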

Perl -V for 5.005_03


Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
  Platform:
    osname=linux, osvers=2.2.10, archname=i586-linux
    uname='linux fatou 2.2.10 #2 smp thu jul 15 15:03:02 mest 1999 i686 unknown '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
  Compiler:
    cc='cc', optimize='-O2 -pipe', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-Dbool=char -DHAS_BOOL -I/usr/local/include'
    ccflags ='-Dbool=char -DHAS_BOOL -I/usr/local/include'
    stdchar='char', d_stdstdio=undef, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Built under linux
  Compiled at Jul 22 1999 21:20:02
  @INC:
    /usr/lib/perl5/5.00503/i586-linux
    /usr/lib/perl5/5.00503
    /usr/lib/perl5/site_perl/5.005/i586-linux
    /usr/lib/perl5/site_perl/5.005
    .

Perl -V for 5.6.1


Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=freebsd, osvers=4.3-release, archname=i386-freebsd-thread
    uname='freebsd serega.citycat.ru 4.3-release freebsd 4.3-release #0: mon jun 25 12:27:41 msd 2001 root@serega.citycat.ru:usrsrcsyscompileserega i386 '
    config_args='-d -Dusethreads -Duse5005threads'
    hint=previous, useposix=true, d_sigaction=define
    usethreads=define use5005threads=define useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
    optimize='-O',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.3 [FreeBSD] 20010315 (release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-pthread -Wl,-E -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib
    libs=-lm -lc_r -lcrypt -lutil
    perllibs=-lm -lc_r -lcrypt -lutil
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fpic', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: USE_THREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under freebsd
  Compiled at Jul 6 2001 19:15:27
  @INC:
    /usr/local/lib/perl5/5.6.1/i386-freebsd-thread
    /usr/local/lib/perl5/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.1/i386-freebsd-thread
    /usr/local/lib/perl5/site_perl/5.6.1
    /usr/local/lib/perl5/site_perl
    .

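Test script source:


#!/usr/bin/perl

# Split the first argument on HTML entities. The three capture groups hold
# the entity name, its decimal code, or its hex code; only one of them can
# match for any given separator. ($ARGV[0] replaces the slice @ARGV[0],
# the hex class is a-f, and the /go flags, which do nothing useful in
# split, are dropped.)
my @A = split(/&(?:([a-zA-Z0-9]+)|#([0-9]+)|#[xX]([0-9a-fA-F]+));/, $ARGV[0]);

while (@A)
{
    print "field '", shift(@A), "'\n";

    last unless (@A);

    # Each separator contributes exactly three elements, one per group.
    foreach (qw(1 2 3))
    {
        my $capture = shift(@A);
        print '$', $_, ' ';
        if (defined $capture)
        {
            print "defined and is '$capture'\n";
        }
        else
        {
            print "UN-defined\n";
        }
    }
    print "\n";
}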



Pavel Yakovlev
mailto: hac@subscribe.ru ICQ 8085803
PPY-RIPN PY125-RIPE


"When God talks with me it's the miracle. When I talk with God it's the madness"

@p5pRT
Author

p5pRT commented Sep 20, 2001

From [Unknown Contact. See original ticket]

A result of split() changed when split's regexp has capture-()s

5.005_03 - array elements for unmatched () were undef
5.6.1 - array elements for unmatched () are ''

I patched this for bleadperl about a month ago. It will be "fixed"
(that is, it will go back to the unmatched-()-returns-undef behavior)
in the next release of Perl.
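
Code that has to run on both sides of this change can avoid depending on undef versus '' altogether. The following is only a sketch of one way to do that: `captured` is a hypothetical helper, not anything provided by perl, and the pattern is modeled on the one in the report with the hex class written as a-f. It relies on every group in the pattern requiring at least one character, so an empty string can never be a genuine capture.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when a capture group actually matched text.
# Treats undef (5.005_03 and fixed perls) and '' (5.6.1) the same way.
sub captured {
    my ($value) = @_;
    return (defined($value) && length($value)) ? 1 : 0;
}

my $entity = qr/&(?:([a-zA-Z0-9]+)|#([0-9]+)|#[xX]([0-9a-fA-F]+));/;

# One separator gives: text before, three capture slots, text after.
my ($text, $name, $dec, $hex, $rest) = split $entity, 'ccc &nbsp; ddd';

my $kind = captured($name) ? 'named'
         : captured($dec)  ? 'decimal'
         : captured($hex)  ? 'hex'
         :                   'none';
print "$kind\n";
```

Using length() rather than plain truthiness matters here: a decimal entity such as &#0; captures the string '0', which is false in boolean context but still a real match.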

@p5pRT
Author

p5pRT commented Jul 8, 2008

p5p@spam.wizbit.be - Status changed from 'open' to 'resolved'
