use re 'eval' does not capture lexical environment %^H like eval() does, breaking charnames #20950

Open
mauke opened this issue Mar 21, 2023 · 6 comments

Comments

@mauke
Contributor

mauke commented Mar 21, 2023

Module: charnames

Description
You can get charnames to blindly call a stringified coderef as if it were a sub, causing bizarre errors.

Steps to Reproduce

#!/usr/bin/env perl
use strict;
use warnings;
use charnames qw(:full);

use re 'eval';
use overload ();
BEGIN {
    overload::constant qr => sub { 
        qq{(?{ "\\N{EURO SIGN}" })}
    };
}

qr/a/;
__END__

Result:

Undefined subroutine &main::CODE(0x55af6577c700) called at (eval 5) line 1.

Expected behavior
To be honest, I'm not sure. I don't really understand what this code does; it was reduced from Regexp::Grammars. I wasn't expecting an error, though.

Perl configuration

Summary of my perl5 (revision 5 version 36 subversion 0) configuration:
   
  Platform:
    osname=linux
    osvers=5.15.0-53-generic
    archname=x86_64-linux
    uname='linux luum 5.15.0-53-generic #59-ubuntu smp mon oct 17 18:53:30 utc 2022 x86_64 x86_64 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/mauke/perl5/perlbrew/perls/perl-5.36.0 -Duseshrplib -Aeval:scriptdir=/home/mauke/perl5/perlbrew/perls/perl-5.36.0/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2'
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='11.3.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.35'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/home/mauke/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/x86_64-linux/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under linux
  Compiled at Nov 21 2022 09:12:40
  %ENV:
    PERLBREW_BASHRC_VERSION="0.74"
    PERLBREW_HOME="/home/mauke/.perlbrew"
    PERLBREW_MANPATH="/home/mauke/perl5/perlbrew/perls/perl-5.36.0/man"
    PERLBREW_PATH="/home/mauke/perl5/perlbrew/bin:/home/mauke/perl5/perlbrew/perls/perl-5.36.0/bin"
    PERLBREW_PERL="perl-5.36.0"
    PERLBREW_ROOT="/home/mauke/perl5/perlbrew"
    PERLBREW_VERSION="0.94"
    PERLDOC="-oterm"
  @INC:
    /home/mauke/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0/x86_64-linux
    /home/mauke/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0
    /home/mauke/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/x86_64-linux
    /home/mauke/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0
@mauke
Contributor Author

mauke commented Mar 21, 2023

Turns out I don't even need all this overloading business:

$ perl -e 'use re "eval"; use charnames ":full"; my $x = qq<(?{ "\\N{COMMA}" })>; qr/$x/'
Undefined subroutine &main::CODE(0x5576e417d010) called at (eval 5) line 1.

@haarg
Contributor

haarg commented Mar 21, 2023

Refs in the hints hash %^H are normally stringified at the end of the compile phase. charnames uses a subref in %^H to work. So if something needs to evaluate strings at runtime, it won't be able to unless it captures the hints at compile time.
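To make the stringification concrete, here is a minimal sketch. The key `demo/callback` and the helper `hint_seen_by_caller` are made up for this illustration and are not part of charnames; the sketch assumes the documented `(caller($level))[10]` interface from perlpragma for reading compile-time hints at run time.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

our ($compile_time_kind, $run_time_value);

# Hypothetical helper: fetch the hint that was in effect where we were called.
sub hint_seen_by_caller {
    my $hints = (caller(0))[10] || {};
    return $hints->{'demo/callback'};
}

BEGIN {
    # 'demo/callback' is an invented key, used only for this demonstration.
    $^H{'demo/callback'} = sub { 'called' };
    # While compiling, %^H still holds the real code reference.
    $compile_time_kind = ref $^H{'demo/callback'};
}

# At run time only the stringified form of the reference is left.
$run_time_value = hint_seen_by_caller();

print "compile time: $compile_time_kind\n";
print "run time:     $run_time_value\n";
print "still a ref?  ", (ref $run_time_value ? "yes" : "no"), "\n";
```

Anything that later tries to call the run-time value as a sub will be calling a string that merely looks like `CODE(0x...)`.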

eval does capture the hints hash in its original form, so it doesn't suffer from this issue. s///ee also does. But regexes using (?{ }) via interpolation (which only works under use re "eval") do not.
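A side-by-side sketch of the two cases, based on the reproduction above. The string `eval` part should succeed; the `qr/$src/` part is wrapped in a block `eval` because its outcome depends on the perl version (on 5.36.0 it dies with the "Undefined subroutine" error reported here). The variable names are mine.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use charnames ':full';
use re 'eval';

# String eval sees %^H intact, so the charnames translator is still a coderef.
our $from_eval = eval q{ "\N{COMMA}" };
print "eval:      ", (defined $from_eval ? "ok ($from_eval)" : "failed: $@"), "\n";

# The same code, reached through run-time interpolation into a regex.
our $src = q<(?{ "\N{COMMA}" })>;
our $qr_error = eval { my $re = qr/$src/; 1 } ? '' : $@;
print "qr/\$src/: ", ($qr_error eq '' ? "ok\n" : "failed: $qr_error");
```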

@demerphq
Collaborator

> But regexes using (?{ }) via interpolation (which only works under use re "eval") do not.

I still consider that a bug, though; at the very least it should give a better error message.

Although there is a bunch of weirdness with this case. Returning a charnames string from a (?{ ... }) doesn't make a lot of sense to me (maybe for $^R). I assume mauke wanted (??{ ... }) (which still has the same issue). But the other thing is why not just call charnames in the double quoted string?
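For reference, a sketch of what "just call charnames in the double quoted string" could look like: resolve the name in ordinary code, then interpolate the resulting character. `charnames::string_vianame` is the documented run-time lookup; the variable names are only for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use charnames ':full';

# Resolved by the parser while %^H still holds the real translator...
our $comma = "\N{COMMA}";
# ...or looked up at run time through the charnames API.
our $euro = charnames::string_vianame('EURO SIGN');

# Interpolate the character itself; no code block and no re 'eval' needed.
our $re = qr/a\Q$comma\Eb/;
print "matched\n" if 'a,b' =~ $re;
printf "U+%04X\n", ord $euro;
```

This sidesteps the problem rather than fixing it, since it only helps when the name does not have to be resolved inside the interpolated pattern text.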

Anyway, imo at the very least this should produce a better error message.

@mauke
Contributor Author

mauke commented Mar 22, 2023

It looks like use re 'eval' turns qr/$x/ into something like eval "qr{$x}", but unlike real eval, it does not properly capture the state of %^H and instead stringifies its values, which breaks the charnames translator interface (which requires a coderef in %^H).

I think re 'eval' should make all regexes with interpolation act like eval, so they see the same %^H inside.
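Until something like that exists, one possible workaround is to go through a real string `eval`, so the code block is compiled as a literal part of the eval'd source with %^H intact. This is only a sketch: it assumes the pattern text is trusted (string `eval` runs arbitrary code) and that it contains no unbalanced braces that would end `qr{...}` early.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use charnames ':full';

# Pattern text containing a postponed-regex block, as it might arrive at run time.
our $src = q<(??{ "\N{COMMA}" })>;

# Compile it as a literal regex inside a string eval instead of via qr/$src/.
our $re = eval "qr{$src}" or die "pattern failed to compile: $@";

print "matches a comma\n" if 'a,b' =~ $re;
```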

@demerphq
Collaborator

> I think re 'eval' should make all regexes with interpolation act like eval, so they see the same %^H inside.

I can try to take a look, but I suspect this is more appropriate for @iabyn to pick up.

@haarg
Contributor

haarg commented Mar 22, 2023

The code related to this seems to be in the call checker for eval adding a new OP which stores a copy of the hints hash:

perl5/op.c

Lines 12214 to 12225 in 0d292c7

if ((PL_hints & HINT_LOCALIZE_HH) != 0
    && !(o->op_private & OPpEVAL_COPHH) && GvHV(PL_hintgv)) {
    /* Store a copy of %^H that pp_entereval can pick up. */
    HV *hh = hv_copy_hints_hv(GvHV(PL_hintgv));
    OP *hhop;
    STOREFEATUREBITSHH(hh);
    hhop = newSVOP(OP_HINTSEVAL, 0, MUTABLE_SV(hh));
    /* append hhop to only child */
    op_sibling_splice(o, cUNOPo->op_first, 0, hhop);
    o->op_private |= OPpEVAL_HAS_HH;
}

Then the hintseval OP puts that hash on the stack:

perl5/pp_ctl.c

Lines 5007 to 5012 in 0d292c7

PP(pp_hintseval)
{
    dSP;
    mXPUSHs(MUTABLE_SV(hv_copy_hints_hv(MUTABLE_HV(cSVOP_sv))));
    RETURN;
}

And eval knows to take it off the stack based on the OPpEVAL_HAS_HH flag:

perl5/pp_ctl.c

Lines 5052 to 5054 in 0d292c7

if (PL_op->op_private & OPpEVAL_HAS_HH) {
    saved_hh = MUTABLE_HV(SvREFCNT_inc(POPs));
}

@mauke changed the title from "charnames tries to call stringified coderef as subroutine" to "use re 'eval' does not capture lexical environment %^H like eval() does, breaking charnames" on Mar 28, 2023