From fa8e6aa1bd1689b8fbc9a0800439e06571a7de46 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 3 Jun 2025 14:13:43 +0100 Subject: [PATCH 01/42] perlxs.pod: remove section on %v The XS parser supported an extremely obscure bit of functionality which made use of the %v package variable to maintain state between different bits of typemap processing. This was accidentally broken in 5.10.0: refactoring removed the 'use vars "%v"' line, and no one seemed to notice or care. Also, the sole example of its use in the docs seemed to be obscure, confusing and probably wrong. There was a consensus in the discussion at http://nntp.perl.org/group/perl.perl5.porters/267667 that we should stop documenting this feature rather than trying to fix it. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 22 +--------------------- 1 file changed, 1 insertion(+), 21 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 26b8e19a06d3..9af411940c69 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -626,27 +626,7 @@ begins with C<;> or C<+>, then it is performed after all of the input variables have been declared. In the C<;> case the initialization normally supplied by the typemap is not performed. For the C<+> case, the declaration for the variable will include the -initialization from the typemap. A global -variable, C<%v>, is available for the truly rare case where -information from one initialization is needed in another -initialization. - -Here's a truly obscure example: - - bool_t - rpcb_gettime(host,timep) - time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */ - char *host + SvOK($v{timep}) ? SvPVbyte_nolen($arg) : NULL; - OUTPUT: - timep - -The construct C<\$v{timep}=@{[$v{timep}=$arg]}> used in the above -example has a two-fold purpose: first, when this line is processed by -B, the Perl snippet C<$v{timep}=$arg> is evaluated. 
Second, -the text of the evaluated snippet is output into the generated C file -(inside a C comment)! During the processing of C line, -C<$arg> will evaluate to C, and C<$v{timep}> will evaluate to -C. +initialization from the typemap. =head2 Default Parameter Values From 8f8f7f00164187dadd7c6f7c5e076b5f28e8840b Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 3 Jun 2025 15:19:19 +0100 Subject: [PATCH 02/42] perlxs.pod: reindent and reformat code examples The various XS code examples had odd and inconsistent indentation (often with 5 leading spaces) and inconsistent formatting, e.g. foo(a,b) vs foo( a, b ) vs foo(a, b). Fix that, and also remove any tab chars. Whitespace-only change. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 1171 +++++++++++++------------- 1 file changed, 586 insertions(+), 585 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 9af411940c69..a9bdbfe4f49e 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -98,22 +98,22 @@ function is used to demonstrate many features of the XS language. This function has two parameters; the first is an input parameter and the second is an output parameter. The function also returns a status value. - bool_t rpcb_gettime(const char *host, time_t *timep); + bool_t rpcb_gettime(const char *host, time_t *timep); From C this function will be called with the following statements. - #include - bool_t status; - time_t timep; - status = rpcb_gettime( "localhost", &timep ); + #include + bool_t status; + time_t timep; + status = rpcb_gettime("localhost", &timep); If an XSUB is created to offer a direct translation between this function and Perl, then this XSUB will be used from Perl with the following code. The $status and $timep variables will contain the output of the function. 
- use RPC; - $status = rpcb_gettime( "localhost", $timep ); + use RPC; + $status = rpcb_gettime("localhost", $timep); The following XS file shows an XS subroutine, or XSUB, which demonstrates one possible interface to the rpcb_gettime() @@ -128,20 +128,20 @@ should be present to fetch the interpreter context more efficiently, see L for details. - #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" - #include + #define PERL_NO_GET_CONTEXT + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + #include - MODULE = RPC PACKAGE = RPC + MODULE = RPC PACKAGE = RPC - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep Any extension to Perl, including those containing XSUBs, should have a Perl module to serve as the bootstrap which @@ -153,15 +153,15 @@ in this document and should be used from Perl with the C command as shown earlier. Perl modules are explained in more detail later in this document. - package RPC; + package RPC; - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw( rpcb_gettime ); + require Exporter; + require DynaLoader; + @ISA = qw(Exporter DynaLoader); + @EXPORT = qw(rpcb_gettime); - bootstrap RPC; - 1; + bootstrap RPC; + 1; Throughout this document a variety of interfaces to the rpcb_gettime() XSUB will be explored. The XSUBs will take their parameters in different @@ -182,33 +182,33 @@ The following XSUB allows a Perl program to access a C library function called sin(). The XSUB will imitate the C function which takes a single argument and returns a single value. - double - sin(x) - double x + double + sin(x) + double x Optionally, one can merge the description of types and the list of argument names, rewriting this as - double - sin(double x) + double + sin(double x) This makes this XSUB look similar to an ANSI C declaration. 
An optional semicolon is allowed after the argument list, as in - double - sin(double x); + double + sin(double x); Parameters with C pointer types can have different semantic: C functions with similar declarations - bool string_looks_as_a_number(char *s); - bool make_char_uppercase(char *c); + bool string_looks_as_a_number(char *s); + bool make_char_uppercase(char *c); are used in absolutely incompatible manner. Parameters to these functions could be described to B like this: - char * s - char &c + char *s + char &c Both these XS declarations correspond to the C C type, but they have different semantics, see L<"The & Unary Operator">. @@ -221,21 +221,21 @@ for more info about handling qualifiers and unary operators in C types. The function name and the return type must be placed on separate lines and should be flush left-adjusted. - INCORRECT CORRECT + INCORRECT CORRECT - double sin(x) double - double x sin(x) - double x + double sin(x) double + double x sin(x) + double x The rest of the function description may be indented or left-adjusted. The following example shows a function with its body left-adjusted. Most examples in this document will indent the body for better readability. - CORRECT + CORRECT - double - sin(x) - double x + double + sin(x) + double x More complicated XSUBs may contain many other sections. Each section of an XSUB starts with the corresponding keyword, such as INIT: or CLEANUP:. @@ -305,32 +305,32 @@ easier, the typemap file automatically makes C mortal when you're returning an C. Thus, the following two XSUBs are more or less equivalent: - void - alpha() + void + alpha() PPCODE: - ST(0) = newSVpv("Hello World",0); - sv_2mortal(ST(0)); - XSRETURN(1); + ST(0) = newSVpv("Hello World", 0); + sv_2mortal(ST(0)); + XSRETURN(1); - SV * - beta() + SV * + beta() CODE: - RETVAL = newSVpv("Hello World",0); + RETVAL = newSVpv("Hello World", 0); OUTPUT: - RETVAL + RETVAL This is quite useful as it usually improves readability. 
While this works fine for an C, it's unfortunately not as easy to have C or C as a return value. You I be able to write: - AV * - array() + AV * + array() CODE: - RETVAL = newAV(); - /* do something with RETVAL */ + RETVAL = newAV(); + /* do something with RETVAL */ OUTPUT: - RETVAL + RETVAL But due to an unfixable bug (fixing it would break lots of existing CPAN modules) in the typemap file, the reference count of the C @@ -342,21 +342,21 @@ In XS code on perls starting with perl 5.16, you can override the typemaps for any of these types with a version that has proper handling of refcounts. In your C section, do - AV* T_AVREF_REFCOUNT_FIXED + AV* T_AVREF_REFCOUNT_FIXED to get the repaired variant. For backward compatibility with older versions of perl, you can instead decrement the reference count manually when you're returning one of the aforementioned types using C: - AV * - array() + AV * + array() CODE: - RETVAL = newAV(); - sv_2mortal((SV*)RETVAL); - /* do something with RETVAL */ + RETVAL = newAV(); + sv_2mortal((SV*)RETVAL); + /* do something with RETVAL */ OUTPUT: - RETVAL + RETVAL Remember that you don't have to do this for an C. The reference documentation for all core typemaps can be found in L. @@ -375,7 +375,7 @@ constant within the same XS file, though this is not required. The following example will start the XS code and will place all functions in a package named RPC. - MODULE = RPC + MODULE = RPC =head2 The PACKAGE Keyword @@ -383,17 +383,17 @@ When functions within an XS source file must be separated into packages the PACKAGE keyword should be used. This keyword is used with the MODULE keyword and must follow immediately after it when used. 
- MODULE = RPC PACKAGE = RPC + MODULE = RPC PACKAGE = RPC - [ XS code in package RPC ] + [ XS code in package RPC ] - MODULE = RPC PACKAGE = RPCB + MODULE = RPC PACKAGE = RPCB - [ XS code in package RPCB ] + [ XS code in package RPCB ] - MODULE = RPC PACKAGE = RPC + MODULE = RPC PACKAGE = RPC - [ XS code in package RPC ] + [ XS code in package RPC ] The same package name can be used more than once, allowing for non-contiguous code. This is useful if you have a stronger ordering @@ -414,9 +414,9 @@ This keyword should follow the PACKAGE keyword when used. If PACKAGE is not used then PREFIX should follow the MODULE keyword. - MODULE = RPC PREFIX = rpc_ + MODULE = RPC PREFIX = rpc_ - MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ + MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ =head2 The OUTPUT: Keyword @@ -440,23 +440,23 @@ are output variables. This may be necessary when a parameter has been modified within the function and the programmer would like the update to be seen by Perl. - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep The OUTPUT: keyword will also allow an output parameter to be mapped to a matching piece of code rather than to a typemap. - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep sv_setnv(ST(1), (double)timep); + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep sv_setnv(ST(1), (double)timep); B emits an automatic C for all parameters in the OUTPUT section of the XSUB, except RETVAL. This is the usually desired @@ -485,11 +485,11 @@ user-supplied code. It is especially useful to make a function interface more Perl-like, especially when the C return value is just an error condition indicator. 
For example, - NO_OUTPUT int - delete_file(char *name) - POSTCALL: - if (RETVAL != 0) - croak("Error %d while deleting file '%s'", RETVAL, name); + NO_OUTPUT int + delete_file(char *name) + POSTCALL: + if (RETVAL != 0) + croak("Error %d while deleting file '%s'", RETVAL, name); Here the generated XS function returns nothing on success, and will die() with a meaningful error message on error. @@ -504,19 +504,19 @@ in the OUTPUT: section. The following XSUB is for a C function which requires special handling of its parameters. The Perl usage is given first. - $status = rpcb_gettime( "localhost", $timep ); + $status = rpcb_gettime("localhost", $timep); The XSUB follows. - bool_t - rpcb_gettime(host,timep) - char *host - time_t timep - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(host, timep) + char *host + time_t timep + CODE: + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL =head2 The INIT: Keyword @@ -525,26 +525,26 @@ the compiler generates the call to the C function. Unlike the CODE: keyword above, this keyword does not affect the way the compiler handles RETVAL. bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - INIT: - printf("# Host is %s\n", host ); - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t &timep + INIT: + printf("# Host is %s\n", host); + OUTPUT: + timep Another use for the INIT: section is to check for preconditions before making a call to the C function: long long - lldiv(a,b) - long long a - long long b + lldiv(a, b) + long long a + long long b INIT: - if (a == 0 && b == 0) - XSRETURN_UNDEF; - if (b == 0) - croak("lldiv: cannot divide by 0"); + if (a == 0 && b == 0) + XSRETURN_UNDEF; + if (b == 0) + croak("lldiv: cannot divide by 0"); =head2 The NO_INIT Keyword @@ -561,12 +561,12 @@ The following example shows a variation of the rpcb_gettime() function. 
This function uses the timep variable only as an output variable and does not care about its initial contents. - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep = NO_INIT - OUTPUT: - timep + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep = NO_INIT + OUTPUT: + timep =head2 The TYPEMAP: Keyword @@ -578,9 +578,9 @@ default typemap, the embedded typemaps may overwrite previous definitions of TYPEMAP, INPUT, and OUTPUT stanzas. The syntax for embedded typemaps is - TYPEMAP: < keyword must appear in the first column of a new line. @@ -607,12 +607,12 @@ which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] must be protected with backslashes. The variables C<$var>, C<$arg>, and C<$type> can be used as in typemaps. - bool_t - rpcb_gettime(host,timep) - char *host = (char *)SvPVbyte_nolen($arg); - time_t &timep = 0; - OUTPUT: - timep + bool_t + rpcb_gettime(host, timep) + char *host = (char *)SvPVbyte_nolen($arg); + time_t &timep = 0; + OUTPUT: + timep This should not be used to supply default values for parameters. One would normally use this when a function parameter must be processed by @@ -641,23 +641,23 @@ XSUB will then call the real rpcb_gettime() function with the parameters in the correct order. This XSUB can be called from Perl with either of the following statements: - $status = rpcb_gettime( $timep, $host ); + $status = rpcb_gettime($timep, $host); - $status = rpcb_gettime( $timep ); + $status = rpcb_gettime($timep); The XSUB will look like the code which follows. A CODE: block is used to call the real rpcb_gettime() function with the parameters in the correct order for that function. 
- bool_t - rpcb_gettime(timep,host="localhost") - char *host - time_t timep = NO_INIT - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(timep, host="localhost") + char *host + time_t timep = NO_INIT + CODE: + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL =head2 The PREINIT: Keyword @@ -680,43 +680,43 @@ within an XSUB. The following examples are equivalent, but if the code is using complex typemaps then the first example is safer. - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(timep) + time_t timep = NO_INIT + PREINIT: + char *host = "localhost"; + CODE: + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL For this particular case an INIT: keyword would generate the same C code as the PREINIT: keyword. Another correct, but error-prone example: - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - char *host = "localhost"; - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(timep) + time_t timep = NO_INIT + CODE: + char *host = "localhost"; + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL Another way to declare C is to use a C block in the CODE: section: - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - { + bool_t + rpcb_gettime(timep) + time_t timep = NO_INIT + CODE: + { char *host = "localhost"; - RETVAL = rpcb_gettime( host, &timep ); - } - OUTPUT: - timep - RETVAL + RETVAL = rpcb_gettime(host, &timep); + } + OUTPUT: + timep + RETVAL The ability to put additional declarations before the typemap entries are processed is very handy in the cases when typemap conversions manipulate @@ -724,12 +724,12 @@ some global state: MyObject mutate(o) - PREINIT: - MyState st = global_state; - INPUT: - MyObject o; - CLEANUP: - reset_to(global_state, st); + PREINIT: + 
MyState st = global_state; + INPUT: + MyObject o; + CLEANUP: + reset_to(global_state, st); Here we suppose that conversion to C in the INPUT: section and from MyObject when processing RETVAL will modify a global variable C. @@ -742,22 +742,22 @@ a subroutine. Thus the above code for mutate() can be rewritten as MyObject mutate(o) - MyState st = global_state; - MyObject o; - CLEANUP: - reset_to(global_state, st); + MyState st = global_state; + MyObject o; + CLEANUP: + reset_to(global_state, st); and the code for rpcb_gettime() can be rewritten as - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - char *host = "localhost"; - C_ARGS: - host, &timep - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(timep) + time_t timep = NO_INIT + char *host = "localhost"; + C_ARGS: + host, &timep + OUTPUT: + timep + RETVAL =head2 The SCOPE: Keyword @@ -773,19 +773,18 @@ the same paragraph (i.e. no intervening blank lines). For example: void foo() - INPUT: - ... - PREINIT: - ... - SCOPE: ENABLE - CODE: - ... - + INPUT: + ... + PREINIT: + ... + SCOPE: ENABLE + CODE: + ... SCOPE: ENABLE void bar() - ... + ... The first form (within the XSUB body) has been available since perl-5.004, but was broken by perl-5.12.0 (xsubpp v2.21) and fixed in perl-5.44.0 @@ -810,54 +809,54 @@ The following example shows how the input parameter C can be evaluated late, after a PREINIT. bool_t - rpcb_gettime(host,timep) - char *host - PREINIT: - time_t tt; - INPUT: - time_t timep - CODE: - RETVAL = rpcb_gettime( host, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL + rpcb_gettime(host, timep) + char *host + PREINIT: + time_t tt; + INPUT: + time_t timep + CODE: + RETVAL = rpcb_gettime(host, &tt); + timep = tt; + OUTPUT: + timep + RETVAL The next example shows each input parameter evaluated late. 
bool_t - rpcb_gettime(host,timep) - PREINIT: - time_t tt; - INPUT: - char *host - PREINIT: - char *h; - INPUT: - time_t timep - CODE: - h = host; - RETVAL = rpcb_gettime( h, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL + rpcb_gettime(host, timep) + PREINIT: + time_t tt; + INPUT: + char *host + PREINIT: + char *h; + INPUT: + time_t timep + CODE: + h = host; + RETVAL = rpcb_gettime(h, &tt); + timep = tt; + OUTPUT: + timep + RETVAL Since INPUT sections allow declaration of C variables which do not appear in the parameter list of a subroutine, this may be shortened to: bool_t - rpcb_gettime(host,timep) - time_t tt; - char *host; - char *h = host; - time_t timep; - CODE: - RETVAL = rpcb_gettime( h, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL + rpcb_gettime(host, timep) + time_t tt; + char *host; + char *h = host; + time_t timep; + CODE: + RETVAL = rpcb_gettime(h, &tt); + timep = tt; + OUTPUT: + timep + RETVAL (We used our knowledge that input conversion for C is a "simple" one, thus C is initialized on the declaration line, and our assignment @@ -896,25 +895,25 @@ modified to have the values written by the C function. For example, an XSUB - void - day_month(OUTLIST day, IN unix_time, OUTLIST month) - int day - int unix_time - int month + void + day_month(OUTLIST day, IN unix_time, OUTLIST month) + int day + int unix_time + int month should be used from Perl as - my ($day, $month) = day_month(time); + my ($day, $month) = day_month(time); The C signature of the corresponding function should be - void day_month(int *day, int unix_time, int *month); + void day_month(int *day, int unix_time, int *month); The C/C/C/C/C keywords can be mixed with ANSI-style declarations, as in - void - day_month(OUTLIST int day, int unix_time, OUTLIST int month) + void + day_month(OUTLIST int day, int unix_time, OUTLIST int month) (here the optional C keyword is omitted). @@ -931,23 +930,23 @@ being read (and not being given to the C function - which gets some garbage instead). 
For example, the same C function as above can be interfaced with as - void day_month(OUT int day, int unix_time, OUT int month); + void day_month(OUT int day, int unix_time, OUT int month); or - void - day_month(day, unix_time, month) - int &day = NO_INIT - int unix_time - int &month = NO_INIT - OUTPUT: - day - month + void + day_month(day, unix_time, month) + int &day = NO_INIT + int unix_time + int &month = NO_INIT + OUTPUT: + day + month However, the generated Perl function is called in very C-ish style: - my ($day, $month); - day_month($day, time, $month); + my ($day, $month); + day_month($day, time, $month); =head2 The C Keyword @@ -956,19 +955,19 @@ argument C, one can substitute the name of the length-argument by C in the XSUB declaration. This argument must be omitted when the generated Perl function is called. E.g., - void - dump_chars(char *s, short l) - { - short n = 0; - while (n < l) { - printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); - n++; + void + dump_chars(char *s, short l) + { + short n = 0; + while (n < l) { + printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); + n++; + } } - } - MODULE = x PACKAGE = x + MODULE = x PACKAGE = x - void dump_chars(char *s, short length(s)) + void dump_chars(char *s, short length(s)) should be called as C. @@ -988,24 +987,24 @@ optional so the ellipsis can be used to indicate that the XSUB will take a variable number of parameters. Perl should be able to call this XSUB with either of the following statements. - $status = rpcb_gettime( $timep, $host ); + $status = rpcb_gettime($timep, $host); - $status = rpcb_gettime( $timep ); + $status = rpcb_gettime($timep); The XS code, with ellipsis, follows. - bool_t - rpcb_gettime(timep, ...) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - if( items > 1 ) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + bool_t + rpcb_gettime(timep, ...) 
+ time_t timep = NO_INIT + PREINIT: + char *host = "localhost"; + CODE: + if (items > 1) + host = (char *)SvPVbyte_nolen(ST(1)); + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL =head2 The C_ARGS: Keyword @@ -1028,10 +1027,10 @@ To do this, declare the XSUB as symbolic nth_derivative(function, n) - symbolic function - int n + symbolic function + int n C_ARGS: - n, function, default_flags + n, function, default_flags =head2 The PPCODE: Keyword @@ -1067,17 +1066,17 @@ The following XSUB will call the C rpcb_gettime() function and will return its two output values, timep and status, to Perl as a single list. - void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - bool_t status; - PPCODE: - status = rpcb_gettime( host, &timep ); - EXTEND(SP, 2); - PUSHs(sv_2mortal(newSViv(status))); - PUSHs(sv_2mortal(newSViv(timep))); + void + rpcb_gettime(host) + char *host + PREINIT: + time_t timep; + bool_t status; + PPCODE: + status = rpcb_gettime(host, &timep); + EXTEND(SP, 2); + PUSHs(sv_2mortal(newSViv(status))); + PUSHs(sv_2mortal(newSViv(timep))); Notice that the programmer must supply the C code necessary to have the real rpcb_gettime() function called and to have @@ -1098,7 +1097,7 @@ macro. Now the rpcb_gettime() function can be used from Perl with the following statement. - ($status, $timep) = rpcb_gettime("localhost"); + ($status, $timep) = rpcb_gettime("localhost"); When handling output parameters with a PPCODE section, be sure to handle 'set' magic properly. See L for details about 'set' magic. @@ -1113,7 +1112,7 @@ to have it return the time and if it fails we would like to have undef returned. In the following Perl code the value of $timep will either be undef or it will be a valid time. 
- $timep = rpcb_gettime( "localhost" ); + $timep = rpcb_gettime("localhost"); The following XSUB uses the C return type as a mnemonic only, and uses a CODE: block to indicate to the compiler @@ -1121,50 +1120,50 @@ that the programmer has supplied all the necessary code. The sv_newmortal() call will initialize the return value to undef, making that the default return value. - SV * - rpcb_gettime(host) - char * host - PREINIT: - time_t timep; - bool_t x; - CODE: - ST(0) = sv_newmortal(); - if( rpcb_gettime( host, &timep ) ) - sv_setnv( ST(0), (double)timep); + SV * + rpcb_gettime(host) + char *host + PREINIT: + time_t timep; + bool_t x; + CODE: + ST(0) = sv_newmortal(); + if (rpcb_gettime(host, &timep)) + sv_setnv(ST(0), (double)timep); The next example demonstrates how one would place an explicit undef in the return value, should the need arise. - SV * - rpcb_gettime(host) - char * host - PREINIT: - time_t timep; - bool_t x; - CODE: - if( rpcb_gettime( host, &timep ) ){ - ST(0) = sv_newmortal(); - sv_setnv( ST(0), (double)timep); - } - else{ - ST(0) = &PL_sv_undef; - } + SV * + rpcb_gettime(host) + char *host + PREINIT: + time_t timep; + bool_t x; + CODE: + if (rpcb_gettime(host, &timep)) { + ST(0) = sv_newmortal(); + sv_setnv(ST(0), (double)timep); + } + else { + ST(0) = &PL_sv_undef; + } To return an empty list one must use a PPCODE: block and then not push return values on the stack. - void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - PPCODE: - if( rpcb_gettime( host, &timep ) ) - PUSHs(sv_2mortal(newSViv(timep))); - else{ - /* Nothing pushed on stack, so an empty - * list is implicitly returned. */ - } + void + rpcb_gettime(host) + char *host + PREINIT: + time_t timep; + PPCODE: + if (rpcb_gettime(host, &timep)) + PUSHs(sv_2mortal(newSViv(timep))); + else { + /* Nothing pushed on stack, so an empty + * list is implicitly returned. 
*/ + } Some people may be inclined to include an explicit C in the above XSUB, rather than letting control fall through to the end. In those @@ -1175,28 +1174,28 @@ C macros. Since C macros can be used with CODE blocks as well, one can rewrite this example as: - int - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - CODE: - RETVAL = rpcb_gettime( host, &timep ); - if (RETVAL == 0) - XSRETURN_UNDEF; - OUTPUT: - RETVAL + int + rpcb_gettime(host) + char *host + PREINIT: + time_t timep; + CODE: + RETVAL = rpcb_gettime(host, &timep); + if (RETVAL == 0) + XSRETURN_UNDEF; + OUTPUT: + RETVAL In fact, one can put this check into a POSTCALL: section as well. Together with PREINIT: simplifications, this leads to: - int - rpcb_gettime(host) - char *host - time_t timep; - POSTCALL: - if (RETVAL == 0) - XSRETURN_UNDEF; + int + rpcb_gettime(host) + char *host + time_t timep; + POSTCALL: + if (RETVAL == 0) + XSRETURN_UNDEF; =head2 The REQUIRE: Keyword @@ -1205,7 +1204,7 @@ B compiler needed to compile the XS module. An XS module which contains the following statement will compile with only B version 1.922 or greater: - REQUIRE: 1.922 + REQUIRE: 1.922 =head2 The CLEANUP: Keyword @@ -1239,10 +1238,10 @@ This keyword may be used any time after the first MODULE keyword and should appear on a line by itself. The first blank line after the keyword will terminate the code block. - BOOT: - # The following message will be printed when the - # bootstrap function executes. - printf("Hello from the bootstrap!\n"); + BOOT: + # The following message will be printed when the + # bootstrap function executes. + printf("Hello from the bootstrap!\n"); =head2 The VERSIONCHECK: Keyword @@ -1296,17 +1295,17 @@ prototypes. bool_t rpcb_gettime(timep, ...) 
- time_t timep = NO_INIT - PROTOTYPE: $;$ - PREINIT: - char *host = "localhost"; - CODE: - if( items > 1 ) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + time_t timep = NO_INIT + PROTOTYPE: $;$ + PREINIT: + char *host = "localhost"; + CODE: + if (items > 1) + host = (char *)SvPVbyte_nolen(ST(1)); + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL If the prototypes are enabled, you can disable it locally for a given XSUB as in the following example: @@ -1329,16 +1328,16 @@ The following example will create aliases C and C for this function. bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - INIT: - printf("# ix = %d\n", ix ); - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t &timep + ALIAS: + FOO::gettime = 1 + BAR::getit = 2 + INIT: + printf("# ix = %d\n", ix); + OUTPUT: + timep A warning will be produced when you create more than one alias to the same value. This may be worked around in a backwards compatible way by creating @@ -1347,38 +1346,38 @@ of ExtUtils::ParseXS you can use a symbolic alias, which are denoted with a C<< => >> instead of a C<< = >>. For instance you could change the above so that the alias section looked like this: - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime => FOO::gettime + ALIAS: + FOO::gettime = 1 + BAR::getit = 2 + BAZ::gettime => FOO::gettime this would have the same effect as this: - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime = 1 + ALIAS: + FOO::gettime = 1 + BAR::getit = 2 + BAZ::gettime = 1 except that the latter will produce warnings during the build process. 
A mechanism that would work in a backwards compatible way with older versions of our tool chain would be to do this: #define FOO_GETTIME 1 - #define BAR_GETIT 2 + #define BAR_GETIT 2 #define BAZ_GETTIME 1 bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = FOO_GETTIME - BAR::getit = BAR_GETIT - BAZ::gettime = BAZ_GETTIME - INIT: - printf("# ix = %d\n", ix ); - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t &timep + ALIAS: + FOO::gettime = FOO_GETTIME + BAR::getit = BAR_GETIT + BAZ::gettime = BAZ_GETTIME + INIT: + printf("# ix = %d\n", ix); + OUTPUT: + timep =head2 The OVERLOAD: Keyword @@ -1407,11 +1406,11 @@ this: SV * cmp (lobj, robj, swap) - My_Module_obj lobj - My_Module_obj robj - IV swap - OVERLOAD: cmp <=> - { /* function defined here */} + My_Module_obj lobj + My_Module_obj robj + IV swap + OVERLOAD: cmp <=> + { /* function defined here */} In this case, the function will overload both of the three way comparison operators. For all overload operations using non-alpha @@ -1456,11 +1455,11 @@ you can make them all to use the same XSUB using this: symbolic interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 - INTERFACE: - multiply divide - add subtract + symbolic arg1 + symbolic arg2 + INTERFACE: + multiply divide + add subtract (This is the complete XSUB code for 4 Perl functions!) Four generated Perl function share names with corresponding C functions. @@ -1471,7 +1470,7 @@ the same XSUB) knows which C function it should call. Additionally, one can attach an extra function remainder() at runtime by using CV *mycv = newXSproto("Symbolic::remainder", - XS_Symbolic_interface_s_ss, __FILE__, "$$"); + XS_Symbolic_interface_s_ss, __FILE__, "$$"); XSINTERFACE_FUNC_SET(mycv, remainder); say, from another XSUB. (This example supposes that there was no @@ -1496,23 +1495,23 @@ multiply(), divide(), add(), subtract() are kept in a global C array C with offsets being C, C, C, C. 
Then one can use - #define XSINTERFACE_FUNC_BYOFFSET(ret,cv,f) \ - ((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32]) - #define XSINTERFACE_FUNC_BYOFFSET_set(cv,f) \ - CvXSUBANY(cv).any_i32 = CAT2( f, _off ) + #define XSINTERFACE_FUNC_BYOFFSET(ret, cv, f) \ + ((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32]) + #define XSINTERFACE_FUNC_BYOFFSET_set(cv, f) \ + CvXSUBANY(cv).any_i32 = CAT2(f, _off) in C section, symbolic interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 + symbolic arg1 + symbolic arg2 INTERFACE_MACRO: - XSINTERFACE_FUNC_BYOFFSET - XSINTERFACE_FUNC_BYOFFSET_set + XSINTERFACE_FUNC_BYOFFSET + XSINTERFACE_FUNC_BYOFFSET_set INTERFACE: - multiply divide - add subtract + multiply divide + add subtract in XSUB section. @@ -1525,11 +1524,11 @@ generate the XS code to be pulled into the module. The file F contains our C function: bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep The XS module can use INCLUDE: to pull that file into it. @@ -1575,33 +1574,33 @@ but when the function is called as C its parameters are reversed, C<(time_t *timep, char *host)>. long - rpcb_gettime(a,b) + rpcb_gettime(a, b) CASE: ix == 1 - ALIAS: - x_gettime = 1 - INPUT: - # 'a' is timep, 'b' is host - char *b - time_t a = NO_INIT - CODE: - RETVAL = rpcb_gettime( b, &a ); - OUTPUT: - a - RETVAL + ALIAS: + x_gettime = 1 + INPUT: + # 'a' is timep, 'b' is host + char *b + time_t a = NO_INIT + CODE: + RETVAL = rpcb_gettime(b, &a); + OUTPUT: + a + RETVAL CASE: - # 'a' is host, 'b' is timep - char *a - time_t &b = NO_INIT - OUTPUT: - b - RETVAL + # 'a' is host, 'b' is timep + char *a + time_t &b = NO_INIT + OUTPUT: + b + RETVAL That function can be called with either of the following statements. Note the different argument lists. 
- $status = rpcb_gettime( $host, $timep ); + $status = rpcb_gettime($host, $timep); - $status = x_gettime( $timep, $host ); + $status = x_gettime($timep, $host); =head2 The EXPORT_XSUB_SYMBOLS: Keyword @@ -1610,12 +1609,12 @@ In perl versions earlier than 5.16.0, this keyword does nothing. Starting with 5.16, XSUB symbols are no longer exported by default. That is, they are C functions. If you include - EXPORT_XSUB_SYMBOLS: ENABLE + EXPORT_XSUB_SYMBOLS: ENABLE in your XS code, the XSUBs following this line will not be declared C. You can later disable this with - EXPORT_XSUB_SYMBOLS: DISABLE + EXPORT_XSUB_SYMBOLS: DISABLE which, again, is the default that you should probably never change. You cannot use this keyword on versions of perl before 5.16 to make @@ -1637,11 +1636,11 @@ turn this into code which calls C with parameters C<(char parameter to be of type C rather than C. bool_t - rpcb_gettime(host,timep) - char *host - time_t timep - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t timep + OUTPUT: + timep That problem is corrected by using the C<&> operator. The B compiler will now turn this into code which calls C correctly with @@ -1649,11 +1648,11 @@ parameters C<(char *host, time_t *timep)>. It does this by carrying the C<&> through, so the function call looks like C. bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep =head2 Inserting POD, Comments and C Preprocessor Directives @@ -1711,91 +1710,91 @@ the function will be called using the THIS-Emethod() syntax. The next examples will use the following C++ class. 
- class color { - public: - color(); - ~color(); - int blue(); - void set_blue( int ); + class color { + public: + color(); + ~color(); + int blue(); + void set_blue(int); - private: - int c_blue; - }; + private: + int c_blue; + }; The XSUBs for the blue() and set_blue() methods are defined with the class name but the parameter for the object (THIS, or "self") is implicit and is not listed. - int - color::blue() + int + color::blue() - void - color::set_blue( val ) - int val + void + color::set_blue(val) + int val Both Perl functions will expect an object as the first parameter. In the generated C++ code the object is called C, and the method call will be performed on this object. So in the C++ code the blue() and set_blue() methods will be called as this: - RETVAL = THIS->blue(); + RETVAL = THIS->blue(); - THIS->set_blue( val ); + THIS->set_blue(val); You could also write a single get/set method using an optional argument: - int - color::blue( val = NO_INIT ) - int val - PROTOTYPE $;$ - CODE: - if (items > 1) - THIS->set_blue( val ); - RETVAL = THIS->blue(); - OUTPUT: - RETVAL + int + color::blue(val = NO_INIT) + int val + PROTOTYPE $;$ + CODE: + if (items > 1) + THIS->set_blue(val); + RETVAL = THIS->blue(); + OUTPUT: + RETVAL If the function's name is B then the C++ C function will be called and C will be given as its parameter. The generated C++ code for - void - color::DESTROY() + void + color::DESTROY() will look like this: - color *THIS = ...; // Initialized as in typemap + color *THIS = ...; // Initialized as in typemap - delete THIS; + delete THIS; If the function's name is B then the C++ C function will be called to create a dynamic C++ object. The XSUB will expect the class name, which will be kept in a variable called C, to be given as the first argument. - color * - color::new() + color * + color::new() The generated C++ code will call C. 
- RETVAL = new color(); + RETVAL = new color(); The following is an example of a typemap that could be used for this C++ example. TYPEMAP - color * O_OBJECT + color * O_OBJECT OUTPUT # The Perl object is blessed into 'CLASS', which should be a # char* having the name of the package for the blessing. O_OBJECT - sv_setref_pv( $arg, CLASS, (void*)$var ); + sv_setref_pv($arg, CLASS, (void*)$var); INPUT O_OBJECT - if( sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG) ) - $var = ($type)SvIV((SV*)SvRV( $arg )); - else{ + if (sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG)) + $var = ($type)SvIV((SV*)SvRV($arg)); + else { warn(\"${Package}::$func_name() -- \" \"$var is not a blessed SV reference\"); XSRETURN_UNDEF; @@ -1826,7 +1825,7 @@ failure. In more complicated cases use CODE: or PPCODE: sections. If many functions use the same failure indication based on the return value, you may want to create a special typedef to handle this situation. Put - typedef int negative_is_failure; + typedef int negative_is_failure; near the beginning of XS file, and create an OUTPUT typemap entry for C which converts negative values to C, or @@ -1880,7 +1879,7 @@ Destructors in XS can be created by specifying an XSUB function whose name ends with the word B. XS destructors can be used to free memory which may have been malloc'd by another XSUB. - struct netconfig *getnetconfigent(const char *netid); + struct netconfig *getnetconfigent(const char *netid); A C will be created for C. The Perl object will be blessed in a class matching the name of the C @@ -1890,34 +1889,34 @@ destructor will be placed in a class corresponding to the class of the object and the PREFIX keyword will be used to trim the name to the word DESTROY as Perl will expect. 
- typedef struct netconfig Netconfig; + typedef struct netconfig Netconfig; - MODULE = RPC PACKAGE = RPC + MODULE = RPC PACKAGE = RPC - Netconfig * - getnetconfigent(netid) - char *netid + Netconfig * + getnetconfigent(netid) + char *netid - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ + MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("Now in NetconfigPtr::DESTROY\n"); - free( netconf ); + void + rpcb_DESTROY(netconf) + Netconfig *netconf + CODE: + printf("Now in NetconfigPtr::DESTROY\n"); + free(netconf); This example requires the following typemap entry. Consult L for more information about adding new typemaps for an extension. - TYPEMAP - Netconfig * T_PTROBJ + TYPEMAP + Netconfig * T_PTROBJ This example will be used with the following Perl statements. - use RPC; - $netconf = getnetconfigent("udp"); + use RPC; + $netconf = getnetconfigent("udp"); When Perl destroys the object referenced by $netconf it will send the object to the supplied XSUB DESTROY function. Perl cannot determine, and @@ -1958,7 +1957,7 @@ Below is an example module that makes use of the macros. START_MY_CXT - MODULE = BlindMice PACKAGE = BlindMice + MODULE = BlindMice PACKAGE = BlindMice BOOT: { @@ -1970,38 +1969,38 @@ Below is an example module that makes use of the macros. 
} int - newMouse(char * name) - PREINIT: - dMY_CXT; - CODE: - if (MY_CXT.count >= 3) { - warn("Already have 3 blind mice"); - RETVAL = 0; - } - else { - RETVAL = ++ MY_CXT.count; - strcpy(MY_CXT.name[MY_CXT.count - 1], name); - } - OUTPUT: - RETVAL + newMouse(char *name) + PREINIT: + dMY_CXT; + CODE: + if (MY_CXT.count >= 3) { + warn("Already have 3 blind mice"); + RETVAL = 0; + } + else { + RETVAL = ++MY_CXT.count; + strcpy(MY_CXT.name[MY_CXT.count - 1], name); + } + OUTPUT: + RETVAL char * get_mouse_name(index) - int index - PREINIT: - dMY_CXT; - CODE: - if (index > MY_CXT.count) + int index + PREINIT: + dMY_CXT; + CODE: + if (index > MY_CXT.count) croak("There are only 3 blind mice."); - else + else RETVAL = MY_CXT.name[index - 1]; - OUTPUT: - RETVAL + OUTPUT: + RETVAL void CLONE(...) - CODE: - MY_CXT_CLONE; + CODE: + MY_CXT_CLONE; =head3 MY_CXT REFERENCE @@ -2082,13 +2081,13 @@ onto other functions using the C/C macros, eg =for apidoc Amnh||MY_CXT void sub1() { - dMY_CXT; - MY_CXT.index = 1; - sub2(aMY_CXT); + dMY_CXT; + MY_CXT.index = 1; + sub2(aMY_CXT); } void sub2(pMY_CXT) { - MY_CXT.index = 2; + MY_CXT.index = 2; } Analogously to C, there are equivalent forms for when the macro is the @@ -2122,71 +2121,73 @@ than a dMY_CTX in another source file. File C: Interface to some ONC+ RPC bind library functions. 
- #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" + #define PERL_NO_GET_CONTEXT + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + + /* Note: On glibc 2.13 and earlier, this needs be */ + #include - /* Note: On glibc 2.13 and earlier, this needs be */ - #include + typedef struct netconfig Netconfig; - typedef struct netconfig Netconfig; + MODULE = RPC PACKAGE = RPC - MODULE = RPC PACKAGE = RPC + SV * + rpcb_gettime(host = "localhost") + char *host + PREINIT: + time_t timep; + CODE: + ST(0) = sv_newmortal(); + if (rpcb_gettime(host, &timep)) + sv_setnv(ST(0), (double)timep); - SV * - rpcb_gettime(host="localhost") - char *host - PREINIT: - time_t timep; - CODE: - ST(0) = sv_newmortal(); - if( rpcb_gettime( host, &timep ) ) - sv_setnv( ST(0), (double)timep ); - Netconfig * - getnetconfigent(netid="udp") - char *netid + Netconfig * + getnetconfigent(netid="udp") + char *netid - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("NetconfigPtr::DESTROY\n"); - free( netconf ); + MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ + + void + rpcb_DESTROY(netconf) + Netconfig *netconf + CODE: + printf("NetconfigPtr::DESTROY\n"); + free(netconf); File C: Custom typemap for RPC.xs. (cf. L) TYPEMAP - Netconfig * T_PTROBJ + Netconfig * T_PTROBJ File C: Perl module for the RPC extension. - package RPC; + package RPC; - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw(rpcb_gettime getnetconfigent); + require Exporter; + require DynaLoader; + @ISA = qw(Exporter DynaLoader); + @EXPORT = qw(rpcb_gettime getnetconfigent); - bootstrap RPC; - 1; + bootstrap RPC; + 1; File C: Perl test program for the RPC extension. 
- use RPC; + use RPC; - $netconf = getnetconfigent(); - $a = rpcb_gettime(); - print "time = $a\n"; - print "netconf = $netconf\n"; + $netconf = getnetconfigent(); + $a = rpcb_gettime(); + print "time = $a\n"; + print "netconf = $netconf\n"; - $netconf = getnetconfigent("tcp"); - $a = rpcb_gettime("poplar"); - print "time = $a\n"; - print "netconf = $netconf\n"; + $netconf = getnetconfigent("tcp"); + $a = rpcb_gettime("poplar"); + print "time = $a\n"; + print "netconf = $netconf\n"; In Makefile.PL add -ltirpc and -I/usr/include/tirpc. From 9710af7ee98f21a43f5073d32d61ea4833282006 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 4 Jun 2025 14:53:05 +0100 Subject: [PATCH 03/42] perlxs.pod: delete most non-ref sections This commit is a simple cut which deletes several '=head2' sections from perlxs.pod. The next commit will tidy up and fix any broken links etc. These sections are more tutorial-like, and aren't in line with the goal of this branch that perlxs.pod becomes purely a reference manual for XS. Any relevant information from these sections may be incorporated later into new sections in perlxs.pod and/or be included in a future rewrite of perlxstut.pod. 
The sections deleted are: =head2 Introduction =head2 On The Road =head2 The Anatomy of an XSUB =head2 The Argument Stack =head2 The RETVAL Variable =head2 Returning SVs, AVs and HVs through RETVAL =head2 Returning Undef And Empty Lists =head2 Interface Strategy =head2 Perl Objects And C Structures --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 575 --------------------------- 1 file changed, 575 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a9bdbfe4f49e..0cc4fc5c7159 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -4,363 +4,6 @@ perlxs - XS language reference manual =head1 DESCRIPTION -=head2 Introduction - -XS is an interface description file format used to create an extension -interface between Perl and C code (or a C library) which one wishes -to use with Perl. The XS interface is combined with the library to -create a new library which can then be either dynamically loaded -or statically linked into perl. The XS interface description is -written in the XS language and is the core component of the Perl -extension interface. - -This documents the XS language, but it's important to first note that XS -code has full access to system calls including C library functions. It -thus has the capability of interfering with things that the Perl core or -other modules have set up, such as signal handlers or file handles. It -could mess with the memory, or any number of harmful things. Don't. -Further detail is in L, which you should read before actually -writing any production XS. - -An B forms the basic unit of the XS interface. After compilation -by the B compiler, each XSUB amounts to a C function definition -which will provide the glue between Perl calling conventions and C -calling conventions. 
- -The glue code pulls the arguments from the Perl stack, converts these -Perl values to the formats expected by a C function, calls this C function, -and then transfers the return values of the C function back to Perl. -Return values here may be a conventional C return value or any C -function arguments that may serve as output parameters. These return -values may be passed back to Perl either by putting them on the -Perl stack, or by modifying the arguments supplied from the Perl side. - -The above is a somewhat simplified view of what really happens. Since -Perl allows more flexible calling conventions than C, XSUBs may do much -more in practice, such as checking input parameters for validity, -throwing exceptions (or returning undef/empty list) if the return value -from the C function indicates failure, calling different C functions -based on numbers and types of the arguments, providing an object-oriented -interface, etc. - -Of course, one could write such glue code directly in C. However, this -would be a tedious task, especially if one needs to write glue for -multiple C functions, and/or one is not familiar enough with the Perl -stack discipline and other such arcana. XS comes to the rescue here: -instead of writing this glue C code in long-hand, one can write -a more concise short-hand I of what should be done by -the glue, and let the XS compiler B handle the rest. - -The XS language allows one to describe the mapping between how the C -routine is used, and how the corresponding Perl routine is used. It -also allows creation of Perl routines which are directly translated to -C code and which are not related to a pre-existing C function. In cases -when the C interface coincides with the Perl interface, the XSUB -declaration is almost identical to a declaration of a C function (in K&R -style). 
In such circumstances, there is another tool called C -that is able to translate an entire C header file into a corresponding -XS file that will provide glue to the functions/macros described in -the header file. - -The XS compiler is called B. This compiler creates -the constructs necessary to let an XSUB manipulate Perl values, and -creates the glue necessary to let Perl call the XSUB. The compiler -uses B to determine how to map C function parameters -and output values to Perl values and back. The default typemap -(which comes with Perl) handles many common C types. A supplementary -typemap may also be needed to handle any special structures and types -for the library being linked. For more information on typemaps, -see L. - -A file in XS format starts with a C language section which goes until the -first C> directive. Other XS directives and XSUB definitions -may follow this line. The "language" used in this part of the file -is usually referred to as the XS language. B recognizes and -skips POD (see L) in both the C and XS language sections, which -allows the XS file to contain embedded documentation. - -See L for a tutorial on the whole extension creation process. - -Note: For some extensions, Dave Beazley's SWIG system may provide a -significantly more convenient mechanism for creating the extension -glue code. See L for more information. - -For simple bindings to C libraries as well as other machine code libraries, -consider instead using the much simpler -L interface via CPAN modules like -L or L. - -=head2 On The Road - -Many of the examples which follow will concentrate on creating an interface -between Perl and the ONC+ RPC bind library functions. The rpcb_gettime() -function is used to demonstrate many features of the XS language. This -function has two parameters; the first is an input parameter and the second -is an output parameter. The function also returns a status value. 
- - bool_t rpcb_gettime(const char *host, time_t *timep); - -From C this function will be called with the following -statements. - - #include - bool_t status; - time_t timep; - status = rpcb_gettime("localhost", &timep); - -If an XSUB is created to offer a direct translation between this function -and Perl, then this XSUB will be used from Perl with the following code. -The $status and $timep variables will contain the output of the function. - - use RPC; - $status = rpcb_gettime("localhost", $timep); - -The following XS file shows an XS subroutine, or XSUB, which -demonstrates one possible interface to the rpcb_gettime() -function. This XSUB represents a direct translation between -C and Perl and so preserves the interface even from Perl. -This XSUB will be invoked from Perl with the usage shown -above. Note that the first three #include statements, for -C, C, and C, will always be present at the -beginning of an XS file. This approach and others will be -expanded later in this document. A #define for C -should be present to fetch the interpreter context more efficiently, -see L for details. - - #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" - #include - - MODULE = RPC PACKAGE = RPC - - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep - -Any extension to Perl, including those containing XSUBs, -should have a Perl module to serve as the bootstrap which -pulls the extension into Perl. This module will export the -extension's functions and variables to the Perl program and -will cause the extension's XSUBs to be linked into Perl. -The following module will be used for most of the examples -in this document and should be used from Perl with the C -command as shown earlier. Perl modules are explained in -more detail later in this document. 
- - package RPC; - - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw(rpcb_gettime); - - bootstrap RPC; - 1; - -Throughout this document a variety of interfaces to the rpcb_gettime() -XSUB will be explored. The XSUBs will take their parameters in different -orders or will take different numbers of parameters. In each case the -XSUB is an abstraction between Perl and the real C rpcb_gettime() -function, and the XSUB must always ensure that the real rpcb_gettime() -function is called with the correct parameters. This abstraction will -allow the programmer to create a more Perl-like interface to the C -function. - -=head2 The Anatomy of an XSUB - -The simplest XSUBs consist of 3 parts: a description of the return -value, the name of the XSUB routine and the names of its arguments, -and a description of types or formats of the arguments. - -The following XSUB allows a Perl program to access a C library function -called sin(). The XSUB will imitate the C function which takes a single -argument and returns a single value. - - double - sin(x) - double x - -Optionally, one can merge the description of types and the list of -argument names, rewriting this as - - double - sin(double x) - -This makes this XSUB look similar to an ANSI C declaration. An optional -semicolon is allowed after the argument list, as in - - double - sin(double x); - -Parameters with C pointer types can have different semantic: C functions -with similar declarations - - bool string_looks_as_a_number(char *s); - bool make_char_uppercase(char *c); - -are used in absolutely incompatible manner. Parameters to these functions -could be described to B like this: - - char *s - char &c - -Both these XS declarations correspond to the C C type, but they have -different semantics, see L<"The & Unary Operator">. 
- -It is convenient to think that the indirection operator -C<*> should be considered as a part of the type and the address operator C<&> -should be considered part of the variable. See L -for more info about handling qualifiers and unary operators in C types. - -The function name and the return type must be placed on -separate lines and should be flush left-adjusted. - - INCORRECT CORRECT - - double sin(x) double - double x sin(x) - double x - -The rest of the function description may be indented or left-adjusted. The -following example shows a function with its body left-adjusted. Most -examples in this document will indent the body for better readability. - - CORRECT - - double - sin(x) - double x - -More complicated XSUBs may contain many other sections. Each section of -an XSUB starts with the corresponding keyword, such as INIT: or CLEANUP:. -However, the first two lines of an XSUB always contain the same data: -descriptions of the return type and the names of the function and its -parameters. Whatever immediately follows these is considered to be -an INPUT: section unless explicitly marked with another keyword. -(See L.) - -An XSUB section continues until another section-start keyword is found. - -=head2 The Argument Stack - -The Perl argument stack is used to store the values which are -sent as parameters to the XSUB and to store the XSUB's -return value(s). In reality all Perl functions (including non-XSUB -ones) keep their values on this stack all the same time, each limited -to its own range of positions on the stack. In this document the -first position on that stack which belongs to the active -function will be referred to as position 0 for that function. - -XSUBs refer to their stack arguments with the macro B, where I -refers to a position in this XSUB's part of the stack. Position 0 for that -function would be known to the XSUB as ST(0). The XSUB's incoming -parameters and outgoing return values always begin at ST(0). 
For many -simple cases the B compiler will generate the code necessary to -handle the argument stack by embedding code fragments found in the -typemaps. In more complex cases the programmer must supply the code. - -=head2 The RETVAL Variable - -The RETVAL variable is a special C variable that is declared automatically -for you. The C type of RETVAL matches the return type of the C library -function. The B compiler will declare this variable in each XSUB -with non-C return type. By default the generated C function -will use RETVAL to hold the return value of the C library function being -called. In simple cases the value of RETVAL will be placed in ST(0) of -the argument stack where it can be received by Perl as the return value -of the XSUB. - -If the XSUB has a return type of C then the compiler will -not declare a RETVAL variable for that function. When using -a PPCODE: section no manipulation of the RETVAL variable is required, the -section may use direct stack manipulation to place output values on the stack. - -If PPCODE: directive is not used, C return value should be used -only for subroutines which do not return a value, I CODE: -directive is used which sets ST(0) explicitly. - -Older versions of this document recommended to use C return -value in such cases. It was discovered that this could lead to -segfaults in cases when XSUB was I C. This practice is -now deprecated, and may be not supported at some future version. Use -the return value C in such cases. (Currently C contains -some heuristic code which tries to disambiguate between "truly-void" -and "old-practice-declared-as-void" functions. Hence your code is at -mercy of this heuristics unless you use C as return value.) - -=head2 Returning SVs, AVs and HVs through RETVAL - -When you're using RETVAL to return an C, there's some magic -going on behind the scenes that should be mentioned. 
When you're -manipulating the argument stack using the ST(x) macro, for example, -you usually have to pay special attention to reference counts. (For -more about reference counts, see L.) To make your life -easier, the typemap file automatically makes C mortal when -you're returning an C. Thus, the following two XSUBs are more -or less equivalent: - - void - alpha() - PPCODE: - ST(0) = newSVpv("Hello World", 0); - sv_2mortal(ST(0)); - XSRETURN(1); - - SV * - beta() - CODE: - RETVAL = newSVpv("Hello World", 0); - OUTPUT: - RETVAL - -This is quite useful as it usually improves readability. While -this works fine for an C, it's unfortunately not as easy -to have C or C as a return value. You I be -able to write: - - AV * - array() - CODE: - RETVAL = newAV(); - /* do something with RETVAL */ - OUTPUT: - RETVAL - -But due to an unfixable bug (fixing it would break lots of existing -CPAN modules) in the typemap file, the reference count of the C -is not properly decremented. Thus, the above XSUB would leak memory -whenever it is being called. The same problem exists for C, -C, and C (which indicates a scalar reference, not -a general C). -In XS code on perls starting with perl 5.16, you can override the -typemaps for any of these types with a version that has proper -handling of refcounts. In your C section, do - - AV* T_AVREF_REFCOUNT_FIXED - -to get the repaired variant. For backward compatibility with older -versions of perl, you can instead decrement the reference count -manually when you're returning one of the aforementioned -types using C: - - AV * - array() - CODE: - RETVAL = newAV(); - sv_2mortal((SV*)RETVAL); - /* do something with RETVAL */ - OUTPUT: - RETVAL - -Remember that you don't have to do this for an C. The reference -documentation for all core typemaps can be found in L. - =head2 The MODULE Keyword The MODULE keyword is used to start the XS code and to specify the package @@ -1102,101 +745,6 @@ the following statement. 
When handling output parameters with a PPCODE section, be sure to handle 'set' magic properly. See L for details about 'set' magic. -=head2 Returning Undef And Empty Lists - -Occasionally the programmer will want to return simply -C or an empty list if a function fails rather than a -separate status value. The rpcb_gettime() function offers -just this situation. If the function succeeds we would like -to have it return the time and if it fails we would like to -have undef returned. In the following Perl code the value -of $timep will either be undef or it will be a valid time. - - $timep = rpcb_gettime("localhost"); - -The following XSUB uses the C return type as a mnemonic only, -and uses a CODE: block to indicate to the compiler -that the programmer has supplied all the necessary code. The -sv_newmortal() call will initialize the return value to undef, making that -the default return value. - - SV * - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - bool_t x; - CODE: - ST(0) = sv_newmortal(); - if (rpcb_gettime(host, &timep)) - sv_setnv(ST(0), (double)timep); - -The next example demonstrates how one would place an explicit undef in the -return value, should the need arise. - - SV * - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - bool_t x; - CODE: - if (rpcb_gettime(host, &timep)) { - ST(0) = sv_newmortal(); - sv_setnv(ST(0), (double)timep); - } - else { - ST(0) = &PL_sv_undef; - } - -To return an empty list one must use a PPCODE: block and -then not push return values on the stack. - - void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - PPCODE: - if (rpcb_gettime(host, &timep)) - PUSHs(sv_2mortal(newSViv(timep))); - else { - /* Nothing pushed on stack, so an empty - * list is implicitly returned. */ - } - -Some people may be inclined to include an explicit C in the above -XSUB, rather than letting control fall through to the end. In those -situations C should be used, instead. 
This will ensure that -the XSUB stack is properly adjusted. Consult L for other -C macros. - -Since C macros can be used with CODE blocks as well, one can -rewrite this example as: - - int - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - CODE: - RETVAL = rpcb_gettime(host, &timep); - if (RETVAL == 0) - XSRETURN_UNDEF; - OUTPUT: - RETVAL - -In fact, one can put this check into a POSTCALL: section as well. Together -with PREINIT: simplifications, this leads to: - - int - rpcb_gettime(host) - char *host - time_t timep; - POSTCALL: - if (RETVAL == 0) - XSRETURN_UNDEF; - =head2 The REQUIRE: Keyword The REQUIRE: keyword is used to indicate the minimum version of the @@ -1800,129 +1348,6 @@ example. XSRETURN_UNDEF; } -=head2 Interface Strategy - -When designing an interface between Perl and a C library a straight -translation from C to XS (such as created by C) is often sufficient. -However, sometimes the interface will look -very C-like and occasionally nonintuitive, especially when the C function -modifies one of its parameters, or returns failure inband (as in "negative -return values mean failure"). In cases where the programmer wishes to -create a more Perl-like interface the following strategy may help to -identify the more critical parts of the interface. - -Identify the C functions with input/output or output parameters. The XSUBs for -these functions may be able to return lists to Perl. - -Identify the C functions which use some inband info as an indication -of failure. They may be -candidates to return undef or an empty list in case of failure. If the -failure may be detected without a call to the C function, you may want to use -an INIT: section to report the failure. For failures detectable after the C -function returns one may want to use a POSTCALL: section to process the -failure. In more complicated cases use CODE: or PPCODE: sections. 
- -If many functions use the same failure indication based on the return value, -you may want to create a special typedef to handle this situation. Put - - typedef int negative_is_failure; - -near the beginning of XS file, and create an OUTPUT typemap entry -for C which converts negative values to C, or -maybe croak()s. After this the return value of type C -will create more Perl-like interface. - -Identify which values are used by only the C and XSUB functions -themselves, say, when a parameter to a function should be a contents of a -global variable. If Perl does not need to access the contents of the value -then it may not be necessary to provide a translation for that value -from C to Perl. - -Identify the pointers in the C function parameter lists and return -values. Some pointers may be used to implement input/output or -output parameters, they can be handled in XS with the C<&> unary operator, -and, possibly, using the NO_INIT keyword. -Some others will require handling of types like C, and one needs -to decide what a useful Perl translation will do in such a case. When -the semantic is clear, it is advisable to put the translation into a typemap -file. - -Identify the structures used by the C functions. In many -cases it may be helpful to use the T_PTROBJ typemap for -these structures so they can be manipulated by Perl as -blessed objects. (This is handled automatically by C.) - -If the same C type is used in several different contexts which require -different translations, C several new types mapped to this C type, -and create separate F entries for these new types. Use these -types in declarations of return type and parameters to XSUBs. - -=head2 Perl Objects And C Structures - -When dealing with C structures one should select either -B or B for the XS type. Both types are -designed to handle pointers to complex objects. The -T_PTRREF type will allow the Perl object to be unblessed -while the T_PTROBJ type requires that the object be blessed. 
-By using T_PTROBJ one can achieve a form of type-checking -because the XSUB will attempt to verify that the Perl object -is of the expected type. - -The following XS code shows the getnetconfigent() function which is used -with ONC+ TIRPC. The getnetconfigent() function will return a pointer to a -C structure and has the C prototype shown below. The example will -demonstrate how the C pointer will become a Perl reference. Perl will -consider this reference to be a pointer to a blessed object and will -attempt to call a destructor for the object. A destructor will be -provided in the XS source to free the memory used by getnetconfigent(). -Destructors in XS can be created by specifying an XSUB function whose name -ends with the word B. XS destructors can be used to free memory -which may have been malloc'd by another XSUB. - - struct netconfig *getnetconfigent(const char *netid); - -A C will be created for C. The Perl -object will be blessed in a class matching the name of the C -type, with the tag C appended, and the name should not -have embedded spaces if it will be a Perl package name. The -destructor will be placed in a class corresponding to the -class of the object and the PREFIX keyword will be used to -trim the name to the word DESTROY as Perl will expect. - - typedef struct netconfig Netconfig; - - MODULE = RPC PACKAGE = RPC - - Netconfig * - getnetconfigent(netid) - char *netid - - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ - - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("Now in NetconfigPtr::DESTROY\n"); - free(netconf); - -This example requires the following typemap entry. Consult -L for more information about adding new typemaps -for an extension. - - TYPEMAP - Netconfig * T_PTROBJ - -This example will be used with the following Perl statements. - - use RPC; - $netconf = getnetconfigent("udp"); - -When Perl destroys the object referenced by $netconf it will send the -object to the supplied XSUB DESTROY function. 
Perl cannot determine, and -does not care, that this object is a C struct and not a Perl object. In -this sense, there is no difference between the object created by the -getnetconfigent() XSUB and an object created by a normal Perl subroutine. =head2 Safely Storing Static Data in XS From 305e19cdf1f3882e767cefe22ca62d781c2b169e Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 4 Jun 2025 15:02:33 +0100 Subject: [PATCH 04/42] perlxs.pod: fix up links after deleting sections The previous commit deleted several sections from perlxs.pod. This commit fixes things up; done as a separate commit so that the changes aren't drowned out in the diff listing. --- XSUB.h | 2 +- dist/ExtUtils-ParseXS/lib/perlxs.pod | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/XSUB.h b/XSUB.h index 7bb60e5156f4..87893d435106 100644 --- a/XSUB.h +++ b/XSUB.h @@ -30,7 +30,7 @@ C>. =for apidoc Amnu|type|RETVAL Variable which is setup by C to hold the return value for an XSUB. This is always the proper type for the XSUB. See -L. +L. =for apidoc Amnu|type|THIS Variable which is setup by C to designate the object in a C++ diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 0cc4fc5c7159..01e7d6d0cc14 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -119,7 +119,7 @@ indicates that while the C subroutine we provide an interface to has a non-C return type, the return value of this C subroutine should not be returned from the generated Perl subroutine. -With this keyword present L is created, and in the +With this keyword present the C variable is created, and in the generated call to the subroutine this variable is assigned to, but the value of this variable is not going to be used in the auto-generated code. @@ -697,8 +697,8 @@ the XSUB returns back to Perl. 
The generated trailer for a CODE: section ensures that the number of return values Perl will see is either 0 or 1 (depending on the Cness of the -return value of the C function, and heuristics mentioned in -L<"The RETVAL Variable">). The trailer generated for a PPCODE: section +return value of the C function, and heuristics to work around CODE +setting C on a C XSUB. The trailer generated for a PPCODE: section is based on the number of return values and on the number of times C was updated by C<[X]PUSH*()> macros. @@ -769,7 +769,7 @@ executed after the C subroutine call is performed. When the POSTCALL: keyword is used it must precede OUTPUT: and CLEANUP: blocks which are present in the XSUB. -See examples in L<"The NO_OUTPUT Keyword"> and L<"Returning Undef And Empty Lists">. +See examples in L<"The NO_OUTPUT Keyword">. The POSTCALL: block does not make a lot of sense when the C subroutine call is supplied by user by providing either CODE: or PPCODE: section. From 853f692246d2dc6c2ab2ea8c254f459021f4e3f1 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 7 Jun 2025 15:55:26 +0100 Subject: [PATCH 05/42] perlxs.pod: reorder sections This big commit does a series of plain cut+pastes to reorder all the =head2 sections within the file. This changes the order from semi-random into roughly the order the various XS keywords would appear within an XS file, and then within an XSUB declaration/definition. No changes have been made to the text: simply that all lines from a particular '^=head2' up until the next head2 have been cut+paste as a single unit. No attempt has been made yet to make the text consistent with the new ordering; that will be done by the subsequent commits of this branch. 
The previous ordering in this file was:

    =head1 NAME
    =head1 DESCRIPTION
    =head2 The MODULE Keyword
    =head2 The PACKAGE Keyword
    =head2 The PREFIX Keyword
    =head2 The OUTPUT: Keyword
    =head2 The NO_OUTPUT Keyword
    =head2 The CODE: Keyword
    =head2 The INIT: Keyword
    =head2 The NO_INIT Keyword
    =head2 The TYPEMAP: Keyword
    =head2 Initializing Function Parameters
    =head2 Default Parameter Values
    =head2 The PREINIT: Keyword
    =head2 The SCOPE: Keyword
    =head2 The INPUT: Keyword
    =head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
    =head2 The C Keyword
    =head2 Variable-length Parameter Lists
    =head2 The C_ARGS: Keyword
    =head2 The PPCODE: Keyword
    =head2 The REQUIRE: Keyword
    =head2 The CLEANUP: Keyword
    =head2 The POSTCALL: Keyword
    =head2 The BOOT: Keyword
    =head2 The VERSIONCHECK: Keyword
    =head2 The PROTOTYPES: Keyword
    =head2 The PROTOTYPE: Keyword
    =head2 The ALIAS: Keyword
    =head2 The OVERLOAD: Keyword
    =head2 The FALLBACK: Keyword
    =head2 The INTERFACE: Keyword
    =head2 The INTERFACE_MACRO: Keyword
    =head2 The INCLUDE: Keyword
    =head2 The INCLUDE_COMMAND: Keyword
    =head2 The CASE: Keyword
    =head2 The EXPORT_XSUB_SYMBOLS: Keyword
    =head2 The & Unary Operator
    =head2 Inserting POD, Comments and C Preprocessor Directives
    =head2 Using XS With C++
    =head2 Safely Storing Static Data in XS
    =head3 MY_CXT REFERENCE
    =head1 EXAMPLES
    =head1 CAVEATS
    =head2 Use of standard C library functions
    =head2 Event loops and control flow
    =head1 XS VERSION
    =head1 AUTHOR DIAGNOSTICS
    =head1 AUTHOR

and is now:

    =head1 NAME
    =head1 DESCRIPTION
    =head2 The MODULE Keyword
    =head2 The PACKAGE Keyword
    =head2 The PREFIX Keyword
    =head2 Inserting POD, Comments and C Preprocessor Directives
    =head2 The REQUIRE: Keyword
    =head2 The VERSIONCHECK: Keyword
    =head2 The PROTOTYPES: Keyword
    =head2 The EXPORT_XSUB_SYMBOLS: Keyword
    =head2 The INCLUDE: Keyword
    =head2 The INCLUDE_COMMAND: Keyword
    =head2 The TYPEMAP: Keyword
    =head2 The BOOT: Keyword
    =head2 The FALLBACK: Keyword
    =head2 The NO_OUTPUT Keyword
    =head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
    =head2 Default Parameter Values
    =head2 The C Keyword
    =head2 Variable-length Parameter Lists
    =head2 The PREINIT: Keyword
    =head2 The INPUT: Keyword
    =head2 The NO_INIT Keyword
    =head2 Initializing Function Parameters
    =head2 The & Unary Operator
    =head2 The SCOPE: Keyword
    =head2 The INIT: Keyword
    =head2 The C_ARGS: Keyword
    =head2 The CODE: Keyword
    =head2 The PPCODE: Keyword
    =head2 The POSTCALL: Keyword
    =head2 The OUTPUT: Keyword
    =head2 The CLEANUP: Keyword
    =head2 The PROTOTYPE: Keyword
    =head2 The OVERLOAD: Keyword
    =head2 The ALIAS: Keyword
    =head2 The INTERFACE: Keyword
    =head2 The INTERFACE_MACRO: Keyword
    =head2 The CASE: Keyword
    =head2 Using XS With C++
    =head2 Safely Storing Static Data in XS
    =head3 MY_CXT REFERENCE
    =head1 EXAMPLES
    =head1 CAVEATS
    =head2 Use of standard C library functions
    =head2 Event loops and control flow
    =head1 XS VERSION
    =head1 AUTHOR DIAGNOSTICS
    =head1 AUTHOR

--- dist/ExtUtils-ParseXS/lib/perlxs.pod | 1111 +++++++++++++------------- 1 file changed, 556 insertions(+), 555 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 01e7d6d0cc14..0c1feb85a632 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -61,27 +61,120 @@ keyword. MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ -=head2 The OUTPUT: Keyword +=head2 Inserting POD, Comments and C Preprocessor Directives -The OUTPUT: keyword indicates that certain function parameters should be -updated (new values made visible to Perl) when the XSUB terminates or that -certain values should be returned to the calling Perl function. For -simple functions which have no CODE: or PPCODE: section, -such as the sin() function above, the RETVAL variable is -automatically designated as an output value. For more complex functions -the B compiler will need help to determine which variables are output -variables.
+C preprocessor directives are allowed within BOOT:, PREINIT: INIT:, CODE:, +PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. +Comments are allowed anywhere after the MODULE keyword. The compiler will +pass the preprocessor directives through untouched and will remove the +commented lines. POD documentation is allowed at any point, both in the +C and XS language sections. POD must be terminated with a C<=cut> command; +C will exit with an error if it does not. It is very unlikely that +human generated C code will be mistaken for POD, as most indenting styles +result in whitespace in front of any line starting with C<=>. Machine +generated XS files may fall into this trap unless care is taken to +ensure that a space breaks the sequence "\n=". -This keyword will normally be used to complement the CODE: keyword. -The RETVAL variable is not recognized as an output variable when the -CODE: keyword is present. The OUTPUT: keyword is used in this -situation to tell the compiler that RETVAL really is an output -variable. +Comments can be added to XSUBs by placing a C<#> as the first +non-whitespace of a line. Care should be taken to avoid making the +comment look like a C preprocessor directive, lest it be interpreted as +such. The simplest way to prevent this is to put whitespace in front of +the C<#>. -The OUTPUT: keyword can also be used to indicate that function parameters -are output variables. This may be necessary when a parameter has been -modified within the function and the programmer would like the update to -be seen by Perl. +If you use preprocessor directives to choose one of two +versions of a function, use + + #if ... version1 + #else /* ... version2 */ + #endif + +and not + + #if ... version1 + #endif + #if ... version2 + #endif + +because otherwise B will believe that you made a duplicate +definition of the function. Also, put a blank line before the +#else/#endif so it will not be seen as part of the function body. 
+ +=head2 The REQUIRE: Keyword + +The REQUIRE: keyword is used to indicate the minimum version of the +B compiler needed to compile the XS module. An XS module which +contains the following statement will compile with only B version +1.922 or greater: + + REQUIRE: 1.922 + +=head2 The VERSIONCHECK: Keyword + +The VERSIONCHECK: keyword corresponds to B's C<-versioncheck> and +C<-noversioncheck> options. This keyword overrides the command line +options. Version checking is enabled by default. When version checking is +enabled the XS module will attempt to verify that its version matches the +version of the PM module. + +To enable version checking: + + VERSIONCHECK: ENABLE + +To disable version checking: + + VERSIONCHECK: DISABLE + +Note that if the version of the PM module is an NV (a floating point +number), it will be stringified with a possible loss of precision +(currently chopping to nine decimal places) so that it may not match +the version of the XS module anymore. Quoting the $VERSION declaration +to make it a string is recommended if long version numbers are used. + +=head2 The PROTOTYPES: Keyword + +The PROTOTYPES: keyword corresponds to B's C<-prototypes> and +C<-noprototypes> options. This keyword overrides the command line options. +Prototypes are disabled by default. When prototypes are enabled, XSUBs will +be given Perl prototypes. This keyword may be used multiple times in an XS +module to enable and disable prototypes for different parts of the module. +Note that B will nag you if you don't explicitly enable or disable +prototypes, with: + + Please specify prototyping behavior for Foo.xs (see perlxs manual) + +To enable prototypes: + + PROTOTYPES: ENABLE + +To disable prototypes: + + PROTOTYPES: DISABLE + +=head2 The EXPORT_XSUB_SYMBOLS: Keyword + +The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. +In perl versions earlier than 5.16.0, this keyword does nothing. 
Starting +with 5.16, XSUB symbols are no longer exported by default. That is, they +are C functions. If you include + + EXPORT_XSUB_SYMBOLS: ENABLE + +in your XS code, the XSUBs following this line will not be declared C. +You can later disable this with + + EXPORT_XSUB_SYMBOLS: DISABLE + +which, again, is the default that you should probably never change. +You cannot use this keyword on versions of perl before 5.16 to make +XSUBs C. + +=head2 The INCLUDE: Keyword + +This keyword can be used to pull other files into the XS module. The other +files may have XS code. INCLUDE: can also be used to run a command to +generate the XS code to be pulled into the module. + +The file F contains our C function: bool_t rpcb_gettime(host, timep) @@ -90,27 +183,83 @@ be seen by Perl. OUTPUT: timep -The OUTPUT: keyword will also allow an output parameter to -be mapped to a matching piece of code rather than to a -typemap. +The XS module can use INCLUDE: to pull that file into it. - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep sv_setnv(ST(1), (double)timep); + INCLUDE: Rpcb1.xsh -B emits an automatic C for all parameters in the -OUTPUT section of the XSUB, except RETVAL. This is the usually desired -behavior, as it takes care of properly invoking 'set' magic on output -parameters (needed for hash or array element parameters that must be -created if they didn't exist). If for some reason, this behavior is -not desired, the OUTPUT section may contain a C line -to disable it for the remainder of the parameters in the OUTPUT section. -Likewise, C can be used to reenable it for the -remainder of the OUTPUT section. See L for more details -about 'set' magic. +If the parameters to the INCLUDE: keyword are followed by a pipe (C<|>) then +the compiler will interpret the parameters as a command. This feature is +mildly deprecated in favour of the C directive, as documented +below. 
+ + INCLUDE: cat Rpcb1.xsh | + +Do not use this to run perl: C will run the perl that +happens to be the first in your path and not necessarily the same perl that is +used to run C. See L<"The INCLUDE_COMMAND: Keyword">. + +=head2 The INCLUDE_COMMAND: Keyword + +Runs the supplied command and includes its output into the current XS +document. C assigns special meaning to the C<$^X> token +in that it runs the same perl interpreter that is running C: + + INCLUDE_COMMAND: cat Rpcb1.xsh + + INCLUDE_COMMAND: $^X -e ... + +=head2 The TYPEMAP: Keyword + +Starting with Perl 5.16, you can embed typemaps into your XS code +instead of or in addition to typemaps in a separate file. Multiple +such embedded typemaps will be processed in order of appearance in +the XS code and like local typemap files take precedence over the +default typemap, the embedded typemaps may overwrite previous +definitions of TYPEMAP, INPUT, and OUTPUT stanzas. The syntax for +embedded typemaps is + + TYPEMAP: < keyword must appear in the first column of a +new line. + +Refer to L for details on writing typemaps. + +=head2 The BOOT: Keyword + +The BOOT: keyword is used to add code to the extension's bootstrap +function. The bootstrap function is generated by the B compiler and +normally holds the statements necessary to register any XSUBs with Perl. +With the BOOT: keyword the programmer can tell the compiler to add extra +statements to the bootstrap function. + +This keyword may be used any time after the first MODULE keyword and should +appear on a line by itself. The first blank line after the keyword will +terminate the code block. + + BOOT: + # The following message will be printed when the + # bootstrap function executes. 
+ printf("Hello from the bootstrap!\n"); + +=head2 The FALLBACK: Keyword + +In addition to the OVERLOAD keyword, if you need to control how +Perl autogenerates missing overloaded operators, you can set the +FALLBACK keyword in the module header section, like this: + + MODULE = RPC PACKAGE = RPC + + FALLBACK: TRUE + ... + +where FALLBACK can take any of the three values TRUE, FALSE, or +UNDEF. If you do not set any FALLBACK value when using OVERLOAD, +it defaults to UNDEF. FALLBACK is not used except when one or +more functions using OVERLOAD have been defined. Please see +L for more details. =head2 The NO_OUTPUT Keyword @@ -137,139 +286,90 @@ indicator. For example, Here the generated XS function returns nothing on success, and will die() with a meaningful error message on error. -=head2 The CODE: Keyword - -This keyword is used in more complicated XSUBs which require -special handling for the C function. The RETVAL variable is -still declared, but it will not be returned unless it is specified -in the OUTPUT: section. - -The following XSUB is for a C function which requires special handling of -its parameters. The Perl usage is given first. - - $status = rpcb_gettime("localhost", $timep); - -The XSUB follows. +=head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords - bool_t - rpcb_gettime(host, timep) - char *host - time_t timep - CODE: - RETVAL = rpcb_gettime(host, &timep); - OUTPUT: - timep - RETVAL +In the list of parameters for an XSUB, one can precede parameter names +by the C/C/C/C/C keywords. +C keyword is the default, the other keywords indicate how the Perl +interface should differ from the C interface. -=head2 The INIT: Keyword +Parameters preceded by C/C/C/C +keywords are considered to be used by the C subroutine I. C/C keywords indicate that the C subroutine +does not inspect the memory pointed by this parameter, but will write +through this pointer to provide additional return values. 
-The INIT: keyword allows initialization to be inserted into the XSUB before -the compiler generates the call to the C function. Unlike the CODE: keyword -above, this keyword does not affect the way the compiler handles RETVAL. +Parameters preceded by C keyword do not appear in the usage +signature of the generated Perl function. - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - INIT: - printf("# Host is %s\n", host); - OUTPUT: - timep +Parameters preceded by C/C/C I appear as +parameters to the Perl function. With the exception of +C-parameters, these parameters are converted to the corresponding +C type, then pointers to these data are given as arguments to the C +function. It is expected that the C function will write through these +pointers. -Another use for the INIT: section is to check for preconditions before -making a call to the C function: +The return list of the generated Perl function consists of the C return value +from the function (unless the XSUB is of C return type or +C was used) followed by all the C +and C parameters (in the order of appearance). On the +return from the XSUB the C/C Perl parameter will be +modified to have the values written by the C function. - long long - lldiv(a, b) - long long a - long long b - INIT: - if (a == 0 && b == 0) - XSRETURN_UNDEF; - if (b == 0) - croak("lldiv: cannot divide by 0"); +For example, an XSUB -=head2 The NO_INIT Keyword + void + day_month(OUTLIST day, IN unix_time, OUTLIST month) + int day + int unix_time + int month -The NO_INIT keyword is used to indicate that a function -parameter is being used only as an output value. The B -compiler will normally generate code to read the values of -all function parameters from the argument stack and assign -them to C variables upon entry to the function. NO_INIT -will tell the compiler that some parameters will be used for -output rather than for input and that they will be handled -before the function terminates. 
+should be used from Perl as -The following example shows a variation of the rpcb_gettime() function. -This function uses the timep variable only as an output variable and does -not care about its initial contents. + my ($day, $month) = day_month(time); - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep = NO_INIT - OUTPUT: - timep +The C signature of the corresponding function should be -=head2 The TYPEMAP: Keyword + void day_month(int *day, int unix_time, int *month); -Starting with Perl 5.16, you can embed typemaps into your XS code -instead of or in addition to typemaps in a separate file. Multiple -such embedded typemaps will be processed in order of appearance in -the XS code and like local typemap files take precedence over the -default typemap, the embedded typemaps may overwrite previous -definitions of TYPEMAP, INPUT, and OUTPUT stanzas. The syntax for -embedded typemaps is +The C/C/C/C/C keywords can be +mixed with ANSI-style declarations, as in - TYPEMAP: < keyword must appear in the first column of a -new line. +(here the optional C keyword is omitted). -Refer to L for details on writing typemaps. +The C parameters are identical with parameters introduced with +L and put into the C section (see +L). The C parameters are very similar, +the only difference being that the value C function writes through the +pointer would not modify the Perl parameter, but is put in the output +list. -=head2 Initializing Function Parameters +The C/C parameter differ from C/C +parameters only by the initial value of the Perl parameter not +being read (and not being given to the C function - which gets some +garbage instead). For example, the same C function as above can be +interfaced with as -C function parameters are normally initialized with their values from -the argument stack (which in turn contains the parameters that were -passed to the XSUB from Perl). 
The typemaps contain the -code segments which are used to translate the Perl values to -the C parameters. The programmer, however, is allowed to -override the typemaps and supply alternate (or additional) -initialization code. Initialization code starts with the first -C<=>, C<;> or C<+> on a line in the INPUT: section. The only -exception happens if this C<;> terminates the line, then this C<;> -is quietly ignored. + void day_month(OUT int day, int unix_time, OUT int month); -The following code demonstrates how to supply initialization code for -function parameters. The initialization code is eval'ed within double -quotes by the compiler before it is added to the output so anything -which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] -must be protected with backslashes. The variables C<$var>, C<$arg>, -and C<$type> can be used as in typemaps. +or - bool_t - rpcb_gettime(host, timep) - char *host = (char *)SvPVbyte_nolen($arg); - time_t &timep = 0; - OUTPUT: - timep + void + day_month(day, unix_time, month) + int &day = NO_INIT + int unix_time + int &month = NO_INIT + OUTPUT: + day + month -This should not be used to supply default values for parameters. One -would normally use this when a function parameter must be processed by -another library function before it can be used. Default parameters are -covered in the next section. +However, the generated Perl function is called in very C-ish style: -If the initialization begins with C<=>, then it is output in -the declaration for the input variable, replacing the initialization -supplied by the typemap. If the initialization -begins with C<;> or C<+>, then it is performed after -all of the input variables have been declared. In the C<;> -case the initialization normally supplied by the typemap is not performed. -For the C<+> case, the declaration for the variable will include the -initialization from the typemap. 
+ my ($day, $month); + day_month($day, time, $month); =head2 Default Parameter Values @@ -302,6 +402,64 @@ the parameters in the correct order for that function. timep RETVAL +=head2 The C Keyword + +If one of the input arguments to the C function is the length of a string +argument C, one can substitute the name of the length-argument by +C in the XSUB declaration. This argument must be omitted when +the generated Perl function is called. E.g., + + void + dump_chars(char *s, short l) + { + short n = 0; + while (n < l) { + printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); + n++; + } + } + + MODULE = x PACKAGE = x + + void dump_chars(char *s, short length(s)) + +should be called as C. + +This directive is supported with ANSI-type function declarations only. + +=head2 Variable-length Parameter Lists + +XSUBs can have variable-length parameter lists by specifying an ellipsis +C<(...)> in the parameter list. This use of the ellipsis is similar to that +found in ANSI C. The programmer is able to determine the number of +arguments passed to the XSUB by examining the C variable which the +B compiler supplies for all XSUBs. By using this mechanism one can +create an XSUB which accepts a list of parameters of unknown length. + +The I parameter for the rpcb_gettime() XSUB can be +optional so the ellipsis can be used to indicate that the +XSUB will take a variable number of parameters. Perl should +be able to call this XSUB with either of the following statements. + + $status = rpcb_gettime($timep, $host); + + $status = rpcb_gettime($timep); + +The XS code, with ellipsis, follows. + + bool_t + rpcb_gettime(timep, ...) 
+ time_t timep = NO_INIT + PREINIT: + char *host = "localhost"; + CODE: + if (items > 1) + host = (char *)SvPVbyte_nolen(ST(1)); + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL + =head2 The PREINIT: Keyword The PREINIT: keyword allows extra variables to be declared immediately @@ -402,44 +560,6 @@ and the code for rpcb_gettime() can be rewritten as timep RETVAL -=head2 The SCOPE: Keyword - -The SCOPE: keyword allows scoping to be enabled for a particular XSUB. -Its effect is to wrap the main body of the XSUB (i.e. the C or -C or implicit) with an C and C pair. This has the -effect of clearing any accumulated savestack entries at the end of the -code body. It is disabled by default. - -The SCOPE keyword may appear either within the XSUB body (anywhere before -a C could appear), or just before the XSUB declaration, but part of -the same paragraph (i.e. no intervening blank lines). For example: - - void - foo() - INPUT: - ... - PREINIT: - ... - SCOPE: ENABLE - CODE: - ... - - SCOPE: ENABLE - void - bar() - ... - -The first form (within the XSUB body) has been available since perl-5.004, -but was broken by perl-5.12.0 (xsubpp v2.21) and fixed in perl-5.44.0 -(xsubpp v3.58). The second form has been available since perl-5.12.0 . - -Note that to support potentially complex type mappings, if a typemap entry -used by an XSUB contains a comment like C, then scoping will -be automatically enabled for any XSUB which uses that typemap entry for an -C parameter. This currently only works for parameters whose type -is specified in a separate C line rather than any ANSI-style -declaration (C). - =head2 The INPUT: Keyword The XSUB's parameters are usually evaluated immediately after entering the @@ -506,148 +626,168 @@ thus C is initialized on the declaration line, and our assignment C is not performed too early. Otherwise one would need to have the assignment C in a CODE: or INIT: section.) 
-=head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords - -In the list of parameters for an XSUB, one can precede parameter names -by the C/C/C/C/C keywords. -C keyword is the default, the other keywords indicate how the Perl -interface should differ from the C interface. - -Parameters preceded by C/C/C/C -keywords are considered to be used by the C subroutine I. C/C keywords indicate that the C subroutine -does not inspect the memory pointed by this parameter, but will write -through this pointer to provide additional return values. - -Parameters preceded by C keyword do not appear in the usage -signature of the generated Perl function. - -Parameters preceded by C/C/C I appear as -parameters to the Perl function. With the exception of -C-parameters, these parameters are converted to the corresponding -C type, then pointers to these data are given as arguments to the C -function. It is expected that the C function will write through these -pointers. - -The return list of the generated Perl function consists of the C return value -from the function (unless the XSUB is of C return type or -C was used) followed by all the C -and C parameters (in the order of appearance). On the -return from the XSUB the C/C Perl parameter will be -modified to have the values written by the C function. - -For example, an XSUB - - void - day_month(OUTLIST day, IN unix_time, OUTLIST month) - int day - int unix_time - int month - -should be used from Perl as - - my ($day, $month) = day_month(time); +=head2 The NO_INIT Keyword -The C signature of the corresponding function should be +The NO_INIT keyword is used to indicate that a function +parameter is being used only as an output value. The B +compiler will normally generate code to read the values of +all function parameters from the argument stack and assign +them to C variables upon entry to the function. 
NO_INIT +will tell the compiler that some parameters will be used for +output rather than for input and that they will be handled +before the function terminates. - void day_month(int *day, int unix_time, int *month); +The following example shows a variation of the rpcb_gettime() function. +This function uses the timep variable only as an output variable and does +not care about its initial contents. -The C/C/C/C/C keywords can be -mixed with ANSI-style declarations, as in + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep = NO_INIT + OUTPUT: + timep - void - day_month(OUTLIST int day, int unix_time, OUTLIST int month) +=head2 Initializing Function Parameters -(here the optional C keyword is omitted). +C function parameters are normally initialized with their values from +the argument stack (which in turn contains the parameters that were +passed to the XSUB from Perl). The typemaps contain the +code segments which are used to translate the Perl values to +the C parameters. The programmer, however, is allowed to +override the typemaps and supply alternate (or additional) +initialization code. Initialization code starts with the first +C<=>, C<;> or C<+> on a line in the INPUT: section. The only +exception happens if this C<;> terminates the line, then this C<;> +is quietly ignored. -The C parameters are identical with parameters introduced with -L and put into the C section (see -L). The C parameters are very similar, -the only difference being that the value C function writes through the -pointer would not modify the Perl parameter, but is put in the output -list. +The following code demonstrates how to supply initialization code for +function parameters. The initialization code is eval'ed within double +quotes by the compiler before it is added to the output so anything +which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] +must be protected with backslashes. The variables C<$var>, C<$arg>, +and C<$type> can be used as in typemaps. 
-The C/C parameter differ from C/C -parameters only by the initial value of the Perl parameter not -being read (and not being given to the C function - which gets some -garbage instead). For example, the same C function as above can be -interfaced with as + bool_t + rpcb_gettime(host, timep) + char *host = (char *)SvPVbyte_nolen($arg); + time_t &timep = 0; + OUTPUT: + timep - void day_month(OUT int day, int unix_time, OUT int month); +This should not be used to supply default values for parameters. One +would normally use this when a function parameter must be processed by +another library function before it can be used. Default parameters are +covered in the next section. -or +If the initialization begins with C<=>, then it is output in +the declaration for the input variable, replacing the initialization +supplied by the typemap. If the initialization +begins with C<;> or C<+>, then it is performed after +all of the input variables have been declared. In the C<;> +case the initialization normally supplied by the typemap is not performed. +For the C<+> case, the declaration for the variable will include the +initialization from the typemap. - void - day_month(day, unix_time, month) - int &day = NO_INIT - int unix_time - int &month = NO_INIT - OUTPUT: - day - month +=head2 The & Unary Operator -However, the generated Perl function is called in very C-ish style: +The C<&> unary operator in the INPUT: section is used to tell B +that it should convert a Perl value to/from C using the C type to the left +of C<&>, but provide a pointer to this value when the C function is called. - my ($day, $month); - day_month($day, time, $month); +This is useful to avoid a CODE: block for a C function which takes a parameter +by reference. Typically, the parameter should be not a pointer type (an +C or C but not an C or C). -=head2 The C Keyword +The following XSUB will generate incorrect C code. 
The B compiler will +turn this into code which calls C with parameters C<(char +*host, time_t timep)>, but the real C wants the C +parameter to be of type C rather than C. -If one of the input arguments to the C function is the length of a string -argument C, one can substitute the name of the length-argument by -C in the XSUB declaration. This argument must be omitted when -the generated Perl function is called. E.g., + bool_t + rpcb_gettime(host, timep) + char *host + time_t timep + OUTPUT: + timep - void - dump_chars(char *s, short l) - { - short n = 0; - while (n < l) { - printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); - n++; - } - } +That problem is corrected by using the C<&> operator. The B compiler +will now turn this into code which calls C correctly with +parameters C<(char *host, time_t *timep)>. It does this by carrying the +C<&> through, so the function call looks like C. - MODULE = x PACKAGE = x + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep - void dump_chars(char *s, short length(s)) +=head2 The SCOPE: Keyword -should be called as C. +The SCOPE: keyword allows scoping to be enabled for a particular XSUB. +Its effect is to wrap the main body of the XSUB (i.e. the C or +C or implicit) with an C and C pair. This has the +effect of clearing any accumulated savestack entries at the end of the +code body. It is disabled by default. -This directive is supported with ANSI-type function declarations only. +The SCOPE keyword may appear either within the XSUB body (anywhere before +a C could appear), or just before the XSUB declaration, but part of +the same paragraph (i.e. no intervening blank lines). For example: -=head2 Variable-length Parameter Lists + void + foo() + INPUT: + ... + PREINIT: + ... + SCOPE: ENABLE + CODE: + ... -XSUBs can have variable-length parameter lists by specifying an ellipsis -C<(...)> in the parameter list. This use of the ellipsis is similar to that -found in ANSI C. 
The programmer is able to determine the number of -arguments passed to the XSUB by examining the C variable which the -B compiler supplies for all XSUBs. By using this mechanism one can -create an XSUB which accepts a list of parameters of unknown length. + SCOPE: ENABLE + void + bar() + ... -The I parameter for the rpcb_gettime() XSUB can be -optional so the ellipsis can be used to indicate that the -XSUB will take a variable number of parameters. Perl should -be able to call this XSUB with either of the following statements. +The first form (within the XSUB body) has been available since perl-5.004, +but was broken by perl-5.12.0 (xsubpp v2.21) and fixed in perl-5.44.0 +(xsubpp v3.58). The second form has been available since perl-5.12.0 . - $status = rpcb_gettime($timep, $host); +Note that to support potentially complex type mappings, if a typemap entry +used by an XSUB contains a comment like C, then scoping will +be automatically enabled for any XSUB which uses that typemap entry for an +C parameter. This currently only works for parameters whose type +is specified in a separate C line rather than any ANSI-style +declaration (C). - $status = rpcb_gettime($timep); +=head2 The INIT: Keyword -The XS code, with ellipsis, follows. +The INIT: keyword allows initialization to be inserted into the XSUB before +the compiler generates the call to the C function. Unlike the CODE: keyword +above, this keyword does not affect the way the compiler handles RETVAL. bool_t - rpcb_gettime(timep, ...) 
- time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - if (items > 1) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime(host, &timep); + rpcb_gettime(host, timep) + char *host + time_t &timep + INIT: + printf("# Host is %s\n", host); OUTPUT: timep - RETVAL + +Another use for the INIT: section is to check for preconditions before +making a call to the C function: + + long long + lldiv(a, b) + long long a + long long b + INIT: + if (a == 0 && b == 0) + XSRETURN_UNDEF; + if (b == 0) + croak("lldiv: cannot divide by 0"); =head2 The C_ARGS: Keyword @@ -675,6 +815,30 @@ To do this, declare the XSUB as C_ARGS: n, function, default_flags +=head2 The CODE: Keyword + +This keyword is used in more complicated XSUBs which require +special handling for the C function. The RETVAL variable is +still declared, but it will not be returned unless it is specified +in the OUTPUT: section. + +The following XSUB is for a C function which requires special handling of +its parameters. The Perl usage is given first. + + $status = rpcb_gettime("localhost", $timep); + +The XSUB follows. + + bool_t + rpcb_gettime(host, timep) + char *host + time_t timep + CODE: + RETVAL = rpcb_gettime(host, &timep); + OUTPUT: + timep + RETVAL + =head2 The PPCODE: Keyword The PPCODE: keyword is an alternate form of the CODE: keyword and is used @@ -745,23 +909,6 @@ the following statement. When handling output parameters with a PPCODE section, be sure to handle 'set' magic properly. See L for details about 'set' magic. -=head2 The REQUIRE: Keyword - -The REQUIRE: keyword is used to indicate the minimum version of the -B compiler needed to compile the XS module. An XS module which -contains the following statement will compile with only B version -1.922 or greater: - - REQUIRE: 1.922 - -=head2 The CLEANUP: Keyword - -This keyword can be used when an XSUB requires special cleanup procedures -before it terminates. 
When the CLEANUP: keyword is used it must follow -any CODE:, or OUTPUT: blocks which are present in the XSUB. The code -specified for the cleanup block will be added as the last statements in -the XSUB. - =head2 The POSTCALL: Keyword This keyword can be used when an XSUB requires special procedures @@ -774,64 +921,65 @@ See examples in L<"The NO_OUTPUT Keyword">. The POSTCALL: block does not make a lot of sense when the C subroutine call is supplied by user by providing either CODE: or PPCODE: section. -=head2 The BOOT: Keyword - -The BOOT: keyword is used to add code to the extension's bootstrap -function. The bootstrap function is generated by the B<xsubpp> compiler and -normally holds the statements necessary to register any XSUBs with Perl. -With the BOOT: keyword the programmer can tell the compiler to add extra -statements to the bootstrap function. - -This keyword may be used any time after the first MODULE keyword and should -appear on a line by itself. The first blank line after the keyword will -terminate the code block. - - BOOT: - # The following message will be printed when the - # bootstrap function executes. - printf("Hello from the bootstrap!\n"); - -=head2 The VERSIONCHECK: Keyword - -The VERSIONCHECK: keyword corresponds to B<xsubpp>'s C<-versioncheck> and -C<-noversioncheck> options. This keyword overrides the command line -options. Version checking is enabled by default. When version checking is -enabled the XS module will attempt to verify that its version matches the -version of the PM module. - -To enable version checking: - - VERSIONCHECK: ENABLE -To disable version checking: +=head2 The OUTPUT: Keyword - VERSIONCHECK: DISABLE +The OUTPUT: keyword indicates that certain function parameters should be +updated (new values made visible to Perl) when the XSUB terminates or that +certain values should be returned to the calling Perl function.
For +simple functions which have no CODE: or PPCODE: section, +such as the sin() function above, the RETVAL variable is +automatically designated as an output value. For more complex functions +the B<xsubpp> compiler will need help to determine which variables are output +variables. -Note that if the version of the PM module is an NV (a floating point -number), it will be stringified with a possible loss of precision -(currently chopping to nine decimal places) so that it may not match -the version of the XS module anymore. Quoting the $VERSION declaration -to make it a string is recommended if long version numbers are used. +This keyword will normally be used to complement the CODE: keyword. +The RETVAL variable is not recognized as an output variable when the +CODE: keyword is present. The OUTPUT: keyword is used in this +situation to tell the compiler that RETVAL really is an output +variable. -=head2 The PROTOTYPES: Keyword +The OUTPUT: keyword can also be used to indicate that function parameters +are output variables. This may be necessary when a parameter has been +modified within the function and the programmer would like the update to +be seen by Perl. -The PROTOTYPES: keyword corresponds to B<xsubpp>'s C<-prototypes> and -C<-noprototypes> options. This keyword overrides the command line options. -Prototypes are disabled by default. When prototypes are enabled, XSUBs will -be given Perl prototypes. This keyword may be used multiple times in an XS -module to enable and disable prototypes for different parts of the module. -Note that B<xsubpp> will nag you if you don't explicitly enable or disable -prototypes, with: + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep - Please specify prototyping behavior for Foo.xs (see perlxs manual) +The OUTPUT: keyword will also allow an output parameter to +be mapped to a matching piece of code rather than to a +typemap.
-To enable prototypes: + bool_t + rpcb_gettime(host, timep) + char *host + time_t &timep + OUTPUT: + timep sv_setnv(ST(1), (double)timep); - PROTOTYPES: ENABLE +B emits an automatic C for all parameters in the +OUTPUT section of the XSUB, except RETVAL. This is the usually desired +behavior, as it takes care of properly invoking 'set' magic on output +parameters (needed for hash or array element parameters that must be +created if they didn't exist). If for some reason, this behavior is +not desired, the OUTPUT section may contain a C line +to disable it for the remainder of the parameters in the OUTPUT section. +Likewise, C can be used to reenable it for the +remainder of the OUTPUT section. See L for more details +about 'set' magic. -To disable prototypes: +=head2 The CLEANUP: Keyword - PROTOTYPES: DISABLE +This keyword can be used when an XSUB requires special cleanup procedures +before it terminates. When the CLEANUP: keyword is used it must follow +any CODE:, or OUTPUT: blocks which are present in the XSUB. The code +specified for the cleanup block will be added as the last statements in +the XSUB. =head2 The PROTOTYPE: Keyword @@ -863,6 +1011,49 @@ XSUB as in the following example: PROTOTYPE: DISABLE ... +=head2 The OVERLOAD: Keyword + +Instead of writing an overloaded interface using pure Perl, you +can also use the OVERLOAD keyword to define additional Perl names +for your functions (like the ALIAS: keyword above). However, the +overloaded functions must be defined in such a way as to accept the number +of parameters supplied by perl's overload system. For most overload +methods, it will be three parameters; for the C function it will +be four. However, the bitwise operators C<&>, C<|>, C<^>, and C<~> may be +called with three I five arguments (see L). + +If any +function has the OVERLOAD: keyword, several additional lines +will be defined in the c file generated by xsubpp in order to +register with the overload magic. 
+ +Since blessed objects are actually stored as RV's, it is useful +to use the typemap features to preprocess parameters and extract +the actual SV stored within the blessed RV. See the sample for +T_PTROBJ_SPECIAL in L. + +To use the OVERLOAD: keyword, create an XS function which takes +three input parameters (or use the C-style '...' definition) like +this: + + SV * + cmp (lobj, robj, swap) + My_Module_obj lobj + My_Module_obj robj + IV swap + OVERLOAD: cmp <=> + { /* function defined here */} + +In this case, the function will overload both of the three way +comparison operators. For all overload operations using non-alpha +characters, you must type the parameter without quoting, separating +multiple overloads with whitespace. Note that "" (the stringify +overload) should be entered as \"\" (i.e. escaped). + +Since, as mentioned above, bitwise operators may take extra arguments, you +may want to use something like C<(lobj, robj, swap, ...)> (with +literal C<...>) as your parameter list. + =head2 The ALIAS: Keyword The ALIAS: keyword allows an XSUB to have two or more unique Perl names @@ -927,66 +1118,6 @@ versions of our tool chain would be to do this: OUTPUT: timep -=head2 The OVERLOAD: Keyword - -Instead of writing an overloaded interface using pure Perl, you -can also use the OVERLOAD keyword to define additional Perl names -for your functions (like the ALIAS: keyword above). However, the -overloaded functions must be defined in such a way as to accept the number -of parameters supplied by perl's overload system. For most overload -methods, it will be three parameters; for the C function it will -be four. However, the bitwise operators C<&>, C<|>, C<^>, and C<~> may be -called with three I five arguments (see L). - -If any -function has the OVERLOAD: keyword, several additional lines -will be defined in the c file generated by xsubpp in order to -register with the overload magic. 
- -Since blessed objects are actually stored as RV's, it is useful -to use the typemap features to preprocess parameters and extract -the actual SV stored within the blessed RV. See the sample for -T_PTROBJ_SPECIAL in L. - -To use the OVERLOAD: keyword, create an XS function which takes -three input parameters (or use the C-style '...' definition) like -this: - - SV * - cmp (lobj, robj, swap) - My_Module_obj lobj - My_Module_obj robj - IV swap - OVERLOAD: cmp <=> - { /* function defined here */} - -In this case, the function will overload both of the three way -comparison operators. For all overload operations using non-alpha -characters, you must type the parameter without quoting, separating -multiple overloads with whitespace. Note that "" (the stringify -overload) should be entered as \"\" (i.e. escaped). - -Since, as mentioned above, bitwise operators may take extra arguments, you -may want to use something like C<(lobj, robj, swap, ...)> (with -literal C<...>) as your parameter list. - -=head2 The FALLBACK: Keyword - -In addition to the OVERLOAD keyword, if you need to control how -Perl autogenerates missing overloaded operators, you can set the -FALLBACK keyword in the module header section, like this: - - MODULE = RPC PACKAGE = RPC - - FALLBACK: TRUE - ... - -where FALLBACK can take any of the three values TRUE, FALSE, or -UNDEF. If you do not set any FALLBACK value when using OVERLOAD, -it defaults to UNDEF. FALLBACK is not used except when one or -more functions using OVERLOAD have been defined. Please see -L for more details. - =head2 The INTERFACE: Keyword This keyword declares the current XSUB as a keeper of the given @@ -1063,46 +1194,6 @@ in C section, in XSUB section. -=head2 The INCLUDE: Keyword - -This keyword can be used to pull other files into the XS module. The other -files may have XS code. INCLUDE: can also be used to run a command to -generate the XS code to be pulled into the module. 
- -The file F contains our C function: - - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep - -The XS module can use INCLUDE: to pull that file into it. - - INCLUDE: Rpcb1.xsh - -If the parameters to the INCLUDE: keyword are followed by a pipe (C<|>) then -the compiler will interpret the parameters as a command. This feature is -mildly deprecated in favour of the C directive, as documented -below. - - INCLUDE: cat Rpcb1.xsh | - -Do not use this to run perl: C will run the perl that -happens to be the first in your path and not necessarily the same perl that is -used to run C. See L<"The INCLUDE_COMMAND: Keyword">. - -=head2 The INCLUDE_COMMAND: Keyword - -Runs the supplied command and includes its output into the current XS -document. C assigns special meaning to the C<$^X> token -in that it runs the same perl interpreter that is running C: - - INCLUDE_COMMAND: cat Rpcb1.xsh - - INCLUDE_COMMAND: $^X -e ... - =head2 The CASE: Keyword The CASE: keyword allows an XSUB to have multiple distinct parts with each @@ -1150,96 +1241,6 @@ the different argument lists. $status = x_gettime($timep, $host); -=head2 The EXPORT_XSUB_SYMBOLS: Keyword - -The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. -In perl versions earlier than 5.16.0, this keyword does nothing. Starting -with 5.16, XSUB symbols are no longer exported by default. That is, they -are C functions. If you include - - EXPORT_XSUB_SYMBOLS: ENABLE - -in your XS code, the XSUBs following this line will not be declared C. -You can later disable this with - - EXPORT_XSUB_SYMBOLS: DISABLE - -which, again, is the default that you should probably never change. -You cannot use this keyword on versions of perl before 5.16 to make -XSUBs C. 
- -=head2 The & Unary Operator - -The C<&> unary operator in the INPUT: section is used to tell B -that it should convert a Perl value to/from C using the C type to the left -of C<&>, but provide a pointer to this value when the C function is called. - -This is useful to avoid a CODE: block for a C function which takes a parameter -by reference. Typically, the parameter should be not a pointer type (an -C or C but not an C or C). - -The following XSUB will generate incorrect C code. The B compiler will -turn this into code which calls C with parameters C<(char -*host, time_t timep)>, but the real C wants the C -parameter to be of type C rather than C. - - bool_t - rpcb_gettime(host, timep) - char *host - time_t timep - OUTPUT: - timep - -That problem is corrected by using the C<&> operator. The B compiler -will now turn this into code which calls C correctly with -parameters C<(char *host, time_t *timep)>. It does this by carrying the -C<&> through, so the function call looks like C. - - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep - -=head2 Inserting POD, Comments and C Preprocessor Directives - -C preprocessor directives are allowed within BOOT:, PREINIT: INIT:, CODE:, -PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. -Comments are allowed anywhere after the MODULE keyword. The compiler will -pass the preprocessor directives through untouched and will remove the -commented lines. POD documentation is allowed at any point, both in the -C and XS language sections. POD must be terminated with a C<=cut> command; -C will exit with an error if it does not. It is very unlikely that -human generated C code will be mistaken for POD, as most indenting styles -result in whitespace in front of any line starting with C<=>. Machine -generated XS files may fall into this trap unless care is taken to -ensure that a space breaks the sequence "\n=". 
- -Comments can be added to XSUBs by placing a C<#> as the first -non-whitespace of a line. Care should be taken to avoid making the -comment look like a C preprocessor directive, lest it be interpreted as -such. The simplest way to prevent this is to put whitespace in front of -the C<#>. - -If you use preprocessor directives to choose one of two -versions of a function, use - - #if ... version1 - #else /* ... version2 */ - #endif - -and not - - #if ... version1 - #endif - #if ... version2 - #endif - -because otherwise B<xsubpp> will believe that you made a duplicate -definition of the function. Also, put a blank line before the -#else/#endif so it will not be seen as part of the function body. - =head2 Using XS With C++ If an XSUB name contains C<::>, it is considered to be a C++ method. From 21ecdb84f03015acbacccf22d005813a3f63f9a1 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 7 Jun 2025 16:40:23 +0100 Subject: [PATCH 06/42] perlxs.pod: add group headers Following the previous commit's reordering of the all the =head2 sections, demote most of the =head2 headers to =head3, and add some new =head2 headers which group together related headers. Also add some =head3's for a few missing keywords. Subsequent commits will flesh out the new sections. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 159 +++++++++++++++++++++------ 1 file changed, 123 insertions(+), 36 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 0c1feb85a632..9cb901058d76 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -20,7 +20,7 @@ all functions in a package named RPC. MODULE = RPC -=head2 The PACKAGE Keyword +=head3 The PACKAGE Keyword When functions within an XS source file must be separated into packages the PACKAGE keyword should be used.
This keyword is used with the MODULE @@ -46,7 +46,7 @@ Although this keyword is optional and in some cases provides redundant information it should always be used. This keyword will ensure that the XSUBs appear in the desired package. -=head2 The PREFIX Keyword +=head3 The PREFIX Keyword The PREFIX keyword designates prefixes which should be removed from the Perl function names. If the C function is @@ -61,7 +61,14 @@ keyword. MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ -=head2 Inserting POD, Comments and C Preprocessor Directives +=head2 File-scoped XS Keywords and Directives + +XXX TBC + +XXX NB: L can technically be a file-scoped +keyword too, but is dealt with elsewhere. + +=head3 Inserting POD, Comments and C Preprocessor Directives C preprocessor directives are allowed within BOOT:, PREINIT: INIT:, CODE:, PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. @@ -99,7 +106,7 @@ because otherwise B will believe that you made a duplicate definition of the function. Also, put a blank line before the #else/#endif so it will not be seen as part of the function body. -=head2 The REQUIRE: Keyword +=head3 The REQUIRE: Keyword The REQUIRE: keyword is used to indicate the minimum version of the B compiler needed to compile the XS module. An XS module which @@ -108,7 +115,7 @@ contains the following statement will compile with only B version REQUIRE: 1.922 -=head2 The VERSIONCHECK: Keyword +=head3 The VERSIONCHECK: Keyword The VERSIONCHECK: keyword corresponds to B's C<-versioncheck> and C<-noversioncheck> options. This keyword overrides the command line @@ -130,7 +137,7 @@ number), it will be stringified with a possible loss of precision the version of the XS module anymore. Quoting the $VERSION declaration to make it a string is recommended if long version numbers are used. -=head2 The PROTOTYPES: Keyword +=head3 The PROTOTYPES: Keyword The PROTOTYPES: keyword corresponds to B's C<-prototypes> and C<-noprototypes> options. 
This keyword overrides the command line options. @@ -150,7 +157,7 @@ To disable prototypes: PROTOTYPES: DISABLE -=head2 The EXPORT_XSUB_SYMBOLS: Keyword +=head3 The EXPORT_XSUB_SYMBOLS: Keyword The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. In perl versions earlier than 5.16.0, this keyword does nothing. Starting @@ -168,7 +175,7 @@ which, again, is the default that you should probably never change. You cannot use this keyword on versions of perl before 5.16 to make XSUBs C. -=head2 The INCLUDE: Keyword +=head3 The INCLUDE: Keyword This keyword can be used to pull other files into the XS module. The other files may have XS code. INCLUDE: can also be used to run a command to @@ -198,7 +205,7 @@ Do not use this to run perl: C will run the perl that happens to be the first in your path and not necessarily the same perl that is used to run C. See L<"The INCLUDE_COMMAND: Keyword">. -=head2 The INCLUDE_COMMAND: Keyword +=head3 The INCLUDE_COMMAND: Keyword Runs the supplied command and includes its output into the current XS document. C assigns special meaning to the C<$^X> token @@ -208,7 +215,7 @@ in that it runs the same perl interpreter that is running C: INCLUDE_COMMAND: $^X -e ... -=head2 The TYPEMAP: Keyword +=head3 The TYPEMAP: Keyword Starting with Perl 5.16, you can embed typemaps into your XS code instead of or in addition to typemaps in a separate file. Multiple @@ -227,7 +234,7 @@ new line. Refer to L for details on writing typemaps. -=head2 The BOOT: Keyword +=head3 The BOOT: Keyword The BOOT: keyword is used to add code to the extension's bootstrap function. The bootstrap function is generated by the B compiler and @@ -244,7 +251,7 @@ terminate the code block. # bootstrap function executes. 
printf("Hello from the bootstrap!\n"); -=head2 The FALLBACK: Keyword +=head3 The FALLBACK: Keyword In addition to the OVERLOAD keyword, if you need to control how Perl autogenerates missing overloaded operators, you can set the @@ -261,7 +268,15 @@ it defaults to UNDEF. FALLBACK is not used except when one or more functions using OVERLOAD have been defined. Please see L for more details. -=head2 The NO_OUTPUT Keyword +=head2 The Structure of an XSUB + +XXX TBC + +=head2 An XSUB Declaration + +XXX TBC + +=head3 The NO_OUTPUT Keyword The NO_OUTPUT can be placed as the first token of the XSUB. This keyword indicates that while the C subroutine we provide an interface to has @@ -286,7 +301,9 @@ indicator. For example, Here the generated XS function returns nothing on success, and will die() with a meaningful error message on error. -=head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords +=head2 XSUB Parameters + +=head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords In the list of parameters for an XSUB, one can precede parameter names by the C/C/C/C/C keywords. @@ -371,7 +388,7 @@ However, the generated Perl function is called in very C-ish style: my ($day, $month); day_month($day, time, $month); -=head2 Default Parameter Values +=head3 Default Parameter Values Default values for XSUB arguments can be specified by placing an assignment statement in the parameter list. The default value may @@ -402,7 +419,7 @@ the parameters in the correct order for that function. timep RETVAL -=head2 The C Keyword +=head3 The C Keyword If one of the input arguments to the C function is the length of a string argument C, one can substitute the name of the length-argument by @@ -427,7 +444,7 @@ should be called as C. This directive is supported with ANSI-type function declarations only. -=head2 Variable-length Parameter Lists +=head3 Variable-length Parameter Lists XSUBs can have variable-length parameter lists by specifying an ellipsis C<(...)> in the parameter list. 
This use of the ellipsis is similar to that @@ -460,7 +477,15 @@ The XS code, with ellipsis, follows. timep RETVAL -=head2 The PREINIT: Keyword +=head2 The XSUB Input Part + +XXX TBC + +XXX NB: the keywords described in L and L may also appear in this part, plus C and +C. + +=head3 The PREINIT: Keyword The PREINIT: keyword allows extra variables to be declared immediately before or after the declarations of the parameters from the INPUT: section @@ -560,7 +585,7 @@ and the code for rpcb_gettime() can be rewritten as timep RETVAL -=head2 The INPUT: Keyword +=head3 The INPUT: Keyword The XSUB's parameters are usually evaluated immediately after entering the XSUB. The INPUT: keyword can be used to force those parameters to be @@ -626,7 +651,7 @@ thus C is initialized on the declaration line, and our assignment C is not performed too early. Otherwise one would need to have the assignment C in a CODE: or INIT: section.) -=head2 The NO_INIT Keyword +=head4 The NO_INIT Keyword The NO_INIT keyword is used to indicate that a function parameter is being used only as an output value. The B @@ -648,7 +673,7 @@ not care about its initial contents. OUTPUT: timep -=head2 Initializing Function Parameters +=head4 Initializing Function Parameters C function parameters are normally initialized with their values from the argument stack (which in turn contains the parameters that were @@ -689,7 +714,7 @@ case the initialization normally supplied by the typemap is not performed. For the C<+> case, the declaration for the variable will include the initialization from the typemap. -=head2 The & Unary Operator +=head4 The & Unary Operator The C<&> unary operator in the INPUT: section is used to tell B that it should convert a Perl value to/from C using the C type to the left @@ -723,7 +748,7 @@ C<&> through, so the function call looks like C. OUTPUT: timep -=head2 The SCOPE: Keyword +=head3 The SCOPE: Keyword The SCOPE: keyword allows scoping to be enabled for a particular XSUB. 
Its effect is to wrap the main body of the XSUB (i.e. the C or @@ -761,7 +786,15 @@ C parameter. This currently only works for parameters whose type is specified in a separate C line rather than any ANSI-style declaration (C). -=head2 The INIT: Keyword +=head2 The XSUB Init Part + +XXX TBC + +XXX NB: the keywords described in L and L may also appear in this part, plus C C and +C. + +=head3 The INIT: Keyword The INIT: keyword allows initialization to be inserted into the XSUB before the compiler generates the call to the C function. Unlike the CODE: keyword @@ -789,7 +822,18 @@ making a call to the C function: if (b == 0) croak("lldiv: cannot divide by 0"); -=head2 The C_ARGS: Keyword +=head2 The XSUB Code Part + +XXX TBC + +XXX NB: the keywords described in L and L may also appear in this part. + +=head3 Auto-calling a C function + +XXX TBC + +=head4 The C_ARGS: Keyword The C_ARGS: keyword allows creating of XSUBS which have different calling sequence from Perl than from C, without a need to write @@ -815,7 +859,7 @@ To do this, declare the XSUB as C_ARGS: n, function, default_flags -=head2 The CODE: Keyword +=head3 The CODE: Keyword This keyword is used in more complicated XSUBs which require special handling for the C function. The RETVAL variable is @@ -839,7 +883,7 @@ The XSUB follows. timep RETVAL -=head2 The PPCODE: Keyword +=head3 The PPCODE: Keyword The PPCODE: keyword is an alternate form of the CODE: keyword and is used to tell the B compiler that the programmer is supplying the code to @@ -909,7 +953,18 @@ the following statement. When handling output parameters with a PPCODE section, be sure to handle 'set' magic properly. See L for details about 'set' magic. -=head2 The POSTCALL: Keyword +=head3 NOT_IMPLEMENTED_YET + +XXX TBC + +=head2 The XSUB Output Part + +XXX TBC + +XXX NB: the keywords described in L and L may also appear in this part. 
+ +=head3 The POSTCALL: Keyword This keyword can be used when an XSUB requires special procedures executed after the C subroutine call is performed. When the POSTCALL: @@ -922,7 +977,7 @@ The POSTCALL: block does not make a lot of sense when the C subroutine call is supplied by user by providing either CODE: or PPCODE: section. -=head2 The OUTPUT: Keyword +=head3 The OUTPUT: Keyword The OUTPUT: keyword indicates that certain function parameters should be updated (new values made visible to Perl) when the XSUB terminates or that @@ -973,7 +1028,14 @@ Likewise, C can be used to reenable it for the remainder of the OUTPUT section. See L for more details about 'set' magic. -=head2 The CLEANUP: Keyword +=head2 The XSUB Cleanup Part + +XXX TBC + +XXX NB: the keywords described in L and L may also appear in this part. + +=head3 The CLEANUP: Keyword This keyword can be used when an XSUB requires special cleanup procedures before it terminates. When the CLEANUP: keyword is used it must follow @@ -981,7 +1043,13 @@ any CODE:, or OUTPUT: blocks which are present in the XSUB. The code specified for the cleanup block will be added as the last statements in the XSUB. -=head2 The PROTOTYPE: Keyword +=head2 XSUB Generic Keywords + +XXX TBC + +These keywords can appear anywhere within the body of an XSUB. + +=head3 The PROTOTYPE: Keyword This keyword is similar to the PROTOTYPES: keyword above but can be used to force B to use a specific prototype for the XSUB. This keyword @@ -1011,7 +1079,7 @@ XSUB as in the following example: PROTOTYPE: DISABLE ... -=head2 The OVERLOAD: Keyword +=head3 The OVERLOAD: Keyword Instead of writing an overloaded interface using pure Perl, you can also use the OVERLOAD keyword to define additional Perl names @@ -1054,7 +1122,17 @@ Since, as mentioned above, bitwise operators may take extra arguments, you may want to use something like C<(lobj, robj, swap, ...)> (with literal C<...>) as your parameter list. 
-=head2 The ALIAS: Keyword +=head3 The ATTRS: Keyword + +XXX TBC + +=head2 Sharing XSUB bodies + +XXX TBC + +=head3 The ALIAS: Keyword + +XXX this keyword can appear anywhere within the body of an XSUB. The ALIAS: keyword allows an XSUB to have two or more unique Perl names and to know which of those names was used when it was invoked. The Perl @@ -1118,7 +1196,9 @@ versions of our tool chain would be to do this: OUTPUT: timep -=head2 The INTERFACE: Keyword +=head3 The INTERFACE: Keyword + +XXX this keyword can appear anywhere within the init part of an XSUB. This keyword declares the current XSUB as a keeper of the given calling signature. If some text follows this keyword, it is @@ -1156,7 +1236,10 @@ say, from another XSUB. (This example supposes that there was no INTERFACE_MACRO: section, otherwise one needs to use something else instead of C, see the next section.) -=head2 The INTERFACE_MACRO: Keyword +=head3 The INTERFACE_MACRO: Keyword + +XXX this keyword can appear anywhere within the input and init parts of an +XSUB. This keyword allows one to define an INTERFACE using a different way to extract a function pointer from an XSUB. The text which follows @@ -1194,7 +1277,7 @@ in C section, in XSUB section. -=head2 The CASE: Keyword +=head3 The CASE: Keyword The CASE: keyword allows an XSUB to have multiple distinct parts with each part acting as a virtual XSUB. CASE: is greedy and if it is used then all @@ -1241,6 +1324,10 @@ the different argument lists. $status = x_gettime($timep, $host); +=head2 Using Typemaps + +XXX TBC + =head2 Using XS With C++ If an XSUB name contains C<::>, it is considered to be a C++ method. From 3769e97658b11528fdd05366a36380898bc7f932 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 30 Jun 2025 14:27:18 +0100 Subject: [PATCH 07/42] perlxs.pod: add a new introductory part Four commits ago, I removed most of the general text sections in perlxs (i.e. the ones not specifically about a particular keyword). 
Now this commit adds a completely new introductory part to perlxs, about 1200 lines long. It represents an attempt to write a background to what XS and XSUBs, SVs, typemaps etc are, in a complete and modern way. The existing reference section for each keyword follows it. I tried to avoid getting too tutorial-like (that's what perlxstut is for), but I may have crossed the line in various places. In particular it has a new section which could have been titled "all the bits of perlguts you need to know in order to write non-trivial XSUBs without having to actually read perlguts". --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 1347 +++++++++++++++++++++++++- 1 file changed, 1346 insertions(+), 1 deletion(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 9cb901058d76..f75dfce415f2 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1,9 +1,1354 @@ =head1 NAME -perlxs - XS language reference manual +perlxs - the XS Language Reference Manual + +=head1 SYNOPSIS + + /* This is a simple example of an XS file. The first half of an XS + * file is uninterpreted C code; all lines are passed through + * unprocessed. */ + + =pod + Except that any POD is stripped. + =cut + + /* Standard boilerplate: */ + + /* For efficiency, always define PERL_NO_GET_CONTEXT: not enabled by + * default for backwards compatibility. For details, see + * L + */ + #define PERL_NO_GET_CONTEXT + + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + #include "ppport.h" + + /* Any general C code here; for example: */ + + #define FOO 1 + static int + my_helper_function(int i) { /* do stuff */ } + + /* The first MODULE line starts the XS half of the file: */ + + MODULE = Foo::Bar PACKAGE = Foo::Bar + + # Indented '#' are XS code comments. 
+ # C preprocessor directives are still allowed and are passed + # through: + #define BAR 2 + + # File-scoped XS directives + PROTOTYPES: DISABLE + + # A simple XSUB: generate a wrapper for the strlen() C library + # function. + + int + strlen(char *s) + + =pod + A more complex example: + C: do a 16-bit multiply + =cut + + unsigned int + multi16(unsigned int i, \ + unsigned int j) + CODE: + i = i & 0xFFFF; + j = j & 0xFFFF; + RETVAL = (i * j) & 0xFFFF; + OUTPUT: + RETVAL =head1 DESCRIPTION +This is the reference manual for the XS language. This is a type of +template language from which is generated a C code source file that +contains functions written in C, but which can be called from Perl, and +which behave just like Perl subs. These are known as I or +I subs, or XSUBs for short. + +Note that this POD file was heavily rewritten and modernised in 2025. +Various old practices, such as "K&R" XSUB function signature declarations, +are no longer encouraged; but much old code will still be using them, so +be cautious when using old code as examples for writing new code. + +The syntax described in this document is valid for Perl installations back +to 5.8.0 unless otherwise specified. + +=head1 THE FORMAL SYNTAX OF AN XS FILE + +XXX TBC + +=head1 OVERVIEW OF XS AND XSUBS + +=head2 Initial and Further Reading + +This document is structured on the assumption that you are already +familiar with the very basics of XS and XSUBs; in particular, the code +examples may make use of common keywords that are only described later in +the file. But once you have that basic familiarity, then this document +may be read through in order. + +It is in two main parts. First there is a long overview part, which +explains (in great detail) what XS and XSUBs are, how the perl interpreter +calls XSUBs, and how data is passed to and from an XSUB. There is much +more detail here than is strictly necessary for writing simple XSUBs, but +this document is intended to be comprehensive. 
Then comes the reference +manual proper, which has a section for each keyword and other parts of an +XSUB declaration and definition, plus a few more general topics, such as +using typemaps and storing static data. + +If necessary, read L first for a gentler tutorial introduction. +In addition, you may find the following Perl documents useful. + +=over + +=item * + +L: this describes typemap files and how to create new +typemaps. These are code templates which are used by the XS compiler to +automatically generate code which converts between Perl and C data types. +Creating an interface to a C library is sometimes mainly a case of adding +new typemap entries to handle the new data types which the library uses. + +=item * + +L: this contains details of selected parts of the internals of +the Perl interpreter. A better understanding of that may help when writing +more complex XSUBs, or when debugging. + +Note that much of the L part of this document is just a summary of +the parts of perlguts which are most relevant to writing XS code, +possibly saving you from having to actually read that document. + +=item * + +L: this describes how and when to use the basic C and OS library +functions from XS. Often, the Perl API contains functions which you should +use I of the standard C library ones, e.g. using C +instead of C. + +=item * + +L: this describes how to call Perl functions and do the +equivalent of C from C. + +=item * + +L: this describes how to embed a complete Perl interpreter +within another application. + +=back + +=head2 An Introduction to XS and XSUBs + +Formally, an XSUB is a compiled function, typically written in C or C++, +which can be called from Perl as if it was a Perl function. A collection +of them is compiled into a C<.so> or C<.dll> library file and is usually +dynamically loaded at C time (but can in principle be +statically linked into the perl interpreter). + +From Perl, an XSUB looks just like any other sub and is called in the same +way.
In the most general case, an XSUB can be passed and return arbitrary +lists of values. More commonly, such as when XSUBs are being used as thin +wrappers to call existing C library functions, they might take a fixed list +of arguments and return a single result (or zero items for a C +function). + +An XS file is a template file format which contains a mixture of C code +and XSUB declarations. It is used to generate XSUBs where most boilerplate +code is handled automatically: e.g. converting argument values between C +and Perl. + +Note that this document refers to both the thing in the XS file, and to +the C function generated from it, as an XSUB. It should be clear from the +context which is being referred to. + +When XSUBs are being used as a thin wrapper between Perl and the functions +in a particular C library, the XSUB definitions in the XS file are often +just a couple of lines, consisting of a declaration of the name, +parameters and return type. The XS parser will do almost all the heavy +lifting for you. + +XS is optional; in principle you can write your own C code directly, or +use other systems such as C or SWIG. For creating simple +bindings to existing compiled libraries, there is also the +L interface via CPAN modules like +L or L. Note that creating XS may initially take +more effort than those, but it is lightweight in terms of dependencies. + +XSUBs have three main roles. They can be used as thin wrappers for C +library functions, e.g. L. They can be used to write +functions which are faster than pure Perl or easier to do in C, e.g. +L. Or they can be used to extend the Perl interpreter itself, +e.g. L. + +XS has extensive support for the first role, and makes writing the second +need less boilerplate code. This document doesn't cover the third role, +which often requires extensive knowledge of the Perl interpreter's +internals.
+ +The F utility bundled with Perl can in principle be used to generate +an initial XS file from a C header file, which (with possibly only minor +edits) can be used to wrap an entire C API. But note that this utility is +rather old and may not handle more modern C header code. + +F (as well as other tools) can also be used to generate an initial +"empty" skeleton distribution even when not deriving from a header file +(see L for more details). + +Typemaps are sets of rules which map C types such as C to logical XS +types such as C, and from there to C and C templates +such as C<$var = ($type)SvIV($arg)> and C +which, after variable expansion, generate C code which converts back and +forth between Perl arguments and C auto variables. + +There is a standard system typemap file which contains rules for common C +and Perl types, but you can add your own typemap file in addition, and +from perl 5.16.0 onwards you can also add typemap declarations inline +within the XS file. You can either just add mappings from new C types to +existing XS types to make use of existing templates, or you can add new +templates too. As an example of the former, if you're using a C header +file which has: + + typedef int my_int; + +then adding this typemap entry: + + my_int T_IV + +is sufficient for the XS parser to know to use the existing C +templates when processing an XSUB which has a C parameter type. +See L and L for more information. + +An XS file is parsed by L, or by the F utility +(which is a thin wrapper over the module), and generates a C<.c> file. +F is typically called at build time from the Makefile of a +distribution (as generated by L); or +L can be used directly, e.g. by L. The C +file is then compiled into a C<.so> or C<.dll>, again at module build and +install time. + +=head2 The Structure of an XS File + +An XS file has two parts, which are parsed and treated completely +differently: the C half and the XS half. 
+ +Anything before the first L directive line +is treated as pure C (except for any sections of POD, which are +discarded). All such lines, including C preprocessor directives and C code +comments, are passed through unprocessed into the destination C file. XS +comments (as described below) aren't recognised by the XS parser, and are +just passed through unprocessed. + +It is possible that machine-generated C code inserted in this section +could include an equal sign character in column one, which would be +misinterpreted as POD; if this is a risk, make sure that this hypothetical +code generator includes a leading space character. + +This half of the file is the place to put things which will be of use to +the XSUB code further down: such as C<#include>, C<#define>, C, +and static C functions. Note that you should (in general) avoid declaring +static I in XS files; see L for +details and workarounds. + +After the first C line, the rest of the file is interpreted as XS +syntax. Further C keywords may appear where needed to change the +current package (in a similar fashion to a single Perl F file +having multiple C statements). + +This second half consists mostly of a series of XSUB definitions. Between +these XSUBs, there can be a few file-scope keywords (including further +C lines), POD, C preprocessor directives, XS (C<#>) comments, and +blank lines. See L for more +details. + +The XS half of the file can be thought of as being parsed in two stages. +In the initial processing step, the XS parser does the following basic +text processing actions. + +=over + +=item * + +A trailing backslash (i.e. C) within the XS part of the file is +treated as a line continuation: any such series of lines is concatenated. +The result is then treated as a single line by the main (line-orientated) XS +parser. This means that the next line loses any special significance; for +example, it may not be recognised as a keyword or end-of-XSUB blank line.
+ +Note however that the two characters C<"\\\n"> are kept in the +concatenated line, to be subsequently interpreted by the main part of the +XS parser. Mostly these two extra characters will just confuse the parser; +but where such lines are just passed-through to the output C file as-is +(such as for C blocks and C preprocessor directives), this results +in the backslash and newline appearing in the C code. + +The main exception to leaving the continuation characters in the line is +in an XSUB's signature, where any trailing backslashes are stripped away +before parsing. + +=item * + +The parser discards any POD lines. Any sequence of lines matching +C is considered POD. + +=item * + +It discards any XS comment lines. Any line starting with C which +is not recognised by the XS parser as a valid C preprocessor directive, +is treated as an XS comment line. + +It is recommended to include at least one space before an XS C<#> comment +to avoid any possible confusion with C preprocessor directives. + +If an XS comment line ends with a backslash, then the line following it is +treated as part of that comment line and is also discarded. + +=back + +Once that basic textual preprocessing has been performed, the main XS +parsing takes place. XS syntax is very line-orientated. XS lines and +sections mostly start with a keyword of the form: + + /^\s*[A-Z_]+:/ + +It is best to position file-scoped keywords at column one, while +XSUB-scoped keywords are best indented. This may avoid surprises with edge +cases in the XS parser. + +Keywords can be either single line, e.g. C, or +multi-line. The latter consume lines until the next keyword, or until the +possible start of a new XSUB (C), or to EOF. Multi-line keywords +treat the rest of the text on the line which follows the keyword as the +first line of data. The exception to this is keywords which introduce a +block of code, such as C or C, which silently ignore the +rest of the first line. (Yes, this is an implementation flaw.)
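As a rough illustration of the keyword pattern quoted above, the following plain-C sketch tests whether a line has the general shape /^\s*[A-Z_]+:/. The function name xs_keyword_len is hypothetical and is not part of the XS parser (which is written in Perl); the sketch only checks the shape of the line and makes no attempt to decide whether the name is a keyword the parser actually knows.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * If `line` matches /^\s*[A-Z_]+:/ return the length of the keyword
 * name (excluding the colon), otherwise return 0. When `start` is
 * non-NULL and there is a match, it receives the offset of the name. */
size_t
xs_keyword_len(const char *line, size_t *start)
{
    size_t i = 0;
    size_t begin;

    assert(line != NULL);

    /* \s* : skip optional leading whitespace */
    while (line[i] != '\0' && isspace((unsigned char)line[i]))
        i++;
    begin = i;

    /* [A-Z_]+ : one or more upper-case letters or underscores */
    while ((line[i] >= 'A' && line[i] <= 'Z') || line[i] == '_')
        i++;

    /* the name must be non-empty and followed directly by a colon */
    if (i == begin || line[i] != ':')
        return 0;

    if (start != NULL)
        *start = begin;
    return i - begin;
}
```

Under this sketch an indented "    CODE:" line is keyword-shaped, while a C statement such as "  RETVAL = 1;" is not, because the upper-case name is not followed directly by a colon.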
+ +It is best to include a blank line between each file-scoped item, and +before the start of each XSUB. While some items I processed correctly +if they are on the line immediately preceding the start of an XSUB, the +parser is inconsistent in their handling. + +An XSUB ends when C is encountered: i.e. a blank line followed +by something on column one. (This is why it's recommended to indent +XSUB-scoped keywords.) If the thing at column one matches any of the items +which can appear in between XSUBs (such as file-scoped keywords) then it, +and any subsequent lines, are processed as such. Anything starting on +column one which isn't otherwise recognised, is interpreted as the first +line of the next XSUB definition. In particular it is interpreted as the +return type of the XSUB: this can lead to weird errors when something is +unexpectedly interpreted as the start of a new XSUB, such as C, +which isn't valid in the XS half of the file apart from within code +blocks. + +Some multi-line keywords, such as C, are treated as just a single +uninterpreted multi-line string. Others, such as C, have a +specific per-line syntax, where each line within the section is parsed. +Finally, code blocks such as C are just copied as-is to the output C +file (possibly sandwiched between C<#line> directives to ensure that +compiler error messages report from the correct location). + +The XS parser doesn't recognise C comments, so don't use them apart from +in C code (e.g. not in an XSUB signature). More generally, the XS parser +doesn't understand C syntax or semantics; it just uses crude regexes to +parse the XS file. For example in an XSUB declaration like + + int + foo(int a, char *b = "),") + +the parser just extracts out everything between (...) and splits on +commas, with just enough intelligence to ignore commas etc within matching +pairs of double-quotes. 
The parser doesn't understand C type declaration +syntax; for example it typically just extracts everything before what +appears to be a parameter name, and assumes that it must be the type. That +"type" will later be looked up in a typemap, and if no entry is found, +will only then raise an error. + +In addition, the XS parser has historically been very permissive, even to +the point of accepting nonsense as input. Since Perl 5.44, more things are +likely to warn or raise errors during XS parsing, rather than silently +generating a non-compilable C code file. + +As mentioned earlier, an XSUB definition typically starts with C +and continues until the next C. The XSUB definition consists of +a declaration, followed by an optional body. The declaration gives the +function's name, parameters and return type, and is intended to mimic a C +function declaration. It is usually two lines long. + +The XSUB's body consists of a series of keywords. The main C code of an +XSUB is specified by a C or C section. In the absence of +this, a short body is generated automatically, which consists of a call to +a C function with the same name and arguments as the XSUB. In this way, +the XSUB becomes a short wrapper function between Perl and the C library +function, with the wrapper handling the conversion between Perl and C +arguments. This is referred to in this document as I. + +Other keywords can be used to modify the code generated for the XSUB, or +to alter how it is registered with the interpreter (e.g. adding +attributes). + +So that is the basic structure of an XSUB. What a real XSUB looks like +will be covered later in L, but first a slight +digression follows. + +=head2 Overview of how data is passed to and from an XSUB + +This section contains a basic background on how XSUBs are invoked, what +their arguments consist of, and how XSUB arguments are passed to and from +Perl.
It is essentially a summary of some relevant sections within +L; see that document for a more detailed exploration. + +Note that most of the information in this section isn't needed to create +basic XSUBs; but for more complex needs or for debugging, it helps to +understand what's happening behind the scenes. + +=head3 Perl OPs + +An C is a data structure within the perl interpreter. It is used to +hold the nodes within a tree structure created when the perl source is +compiled. It usually represents a single operation within the perl source, +such as an add, or a function call. The structure has various flags and +data, and a pointer to a C function (called the PP function) which is used +to implement the actions of that OP. The main loop of the perl interpreter +consists of calling the PP function associated with the current OP +(C) and then updating it, typically to C<< PL_op->op_next >>. In +particular, the C performs (or at least starts) a function +call. + +=head3 SVs and the Perl interpreter's argument stack + +Almost all runtime data within the Perl interpreter, including all Perl +variables, are stored in an SV structure. These SVs can hold data of +many different types, including integers (IV - integer value), strings (PV +- pointer value), references (RV), arrays (AV), elements of arrays, +subroutines (CV - code value) etc. These will be discussed in more detail +below. + +Perl has an argument stack, which is a C array of SV pointers. Most of +the run-time actions of the PP functions consist of pushing SV pointers +onto the stack or popping them off and processing them. There is a +companion mark stack, which is an array of integers which are argument +stack offsets. These marks serve to delineate the stack into frames.
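The relationship between the two stacks described above can be sketched with a toy model in plain C. Everything here (the toy_stacks type and the toy_* functions) is hypothetical and deliberately simplified; it is not the Perl API and does not use the interpreter's real data structures. It only shows how a mark, recorded as an offset into the argument stack, lets the callee work out where its frame starts and how many items it holds.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 64

/* Toy stand-ins for the interpreter's argument and mark stacks. */
typedef struct {
    const void *args[TOY_STACK_MAX]; /* plays the role of the SV pointers */
    int sp;                          /* index of the topmost item, -1 if empty */
    int marks[TOY_STACK_MAX];        /* each mark is an offset into args[] */
    int mark_sp;                     /* index of the topmost mark, -1 if empty */
} toy_stacks;

void
toy_init(toy_stacks *s)
{
    s->sp = -1;
    s->mark_sp = -1;
}

/* Start a new frame: remember where the argument stack currently ends. */
void
toy_pushmark(toy_stacks *s)
{
    assert(s->mark_sp + 1 < TOY_STACK_MAX);
    s->marks[++s->mark_sp] = s->sp;
}

void
toy_push(toy_stacks *s, const void *item)
{
    assert(s->sp + 1 < TOY_STACK_MAX);
    s->args[++s->sp] = item;
}

const void *
toy_pop(toy_stacks *s)
{
    assert(s->sp >= 0);
    return s->args[s->sp--];
}

/* Pop the topmost mark. Returns the number of items in the frame it
 * delimited; *first receives the index of the frame's first item. */
int
toy_popmark(toy_stacks *s, int *first)
{
    int mark;

    assert(s->mark_sp >= 0);
    mark = s->marks[s->mark_sp--];
    *first = mark + 1;
    return s->sp - mark;
}
```

In this model, a call with two arguments would push a mark, push the two argument pointers and then the callee; the caller pops the callee off again, after which popping the mark yields a frame of two items.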
+ +Consider this subroutine call: + + @a = foo(1, $x); + +The various OPs executed by the Perl interpreter up until the function is +called will: push a mark indicating the start of a new argument stack +frame; push an SV containing the integer value 1; push the SV currently +bound to the variable C<$x>; and push the C<*foo> typeglob. Then the +PP function C associated with the C will pop +that typeglob, extract the C<&foo> CV from it, and see whether it is a +normal CV or an XSUB CV. + +For a normal Perl subroutine call, C will then: pop the +topmost mark off the mark stack; pop the SV pointers between that mark and +the top of the stack and store them in C<@_>; then set C to the +first OP pointed to by the C<&foo> CV. Those OPs will then be run by the +main loop, until the OPs associated with the last statement of the +function (or an explicit return) will leave any return values as SV +pointers on the stack. + +For an XSUB sub, C will instead note the value of the +topmost mark (but not pop it) and call the C function pointed to from the +CV; this is the XSUB which has been generated by the XS parser. The XSUB +itself is responsible for popping the mark stack, doing any processing of +its arguments on the stack, and then pushing return values. But note that +for straightforward XSUBs, this is usually all done by boilerplate code +generated by the XS parser. Exactly what is done automatically and what +can be overridden and handled manually if needed, is one of the themes of +this document. Finally, C will do any post-processing of +the returned values; for example discarding all but the top-most stack +item if the function call was in scalar context. + +=head3 An SV's reference count + +Perl uses reference counting as its garbage collection method. One of the +always-present fields in an SV is its reference count, accessible as +C. 
+ +Usually an SV's reference count is incremented each time a pointer to the +SV is stored somewhere, and decremented any time such a pointer is +removed. When the reference count reaches zero, any destructor associated +with that SV is called, then the SV is freed. Mismanaging reference counts +can lead to SVs leaking or being prematurely freed. + +When relying on XS to generate all the boilerplate code, reference count +bookkeeping is usually handled for you automatically. Once you start +handling this yourself, then there are some specific considerations. + +Functions which create a new SV, such as C, return an SV +that has an initial C of I. This is actually one too +high, since there are not yet any pointers to this SV stored anywhere. The +expectation is that the SV will shortly be embedded somewhere - such as +stored in an array - which will take "ownership" of that one count. If the +program calls C or similar before the new SV has been embedded, +then it will leak. Note that C can be trapped by C, so +it's possible that C could be called many times, leaking each +time. Note also that many things may indirectly trigger a C. For +example accessing the value of an SV associated with a tied variable may +trigger a call to its C method, which could call C. So a new +SV needs to be embedded quickly. + +Since such new SVs already have a reference count of one, they should be +embedded in a way which doesn't increase their reference +count. For example, this modifies C to be a reference to a +newly-created SV holding an integer value, i.e. the perl equivalent of +C<$sv =\99>: + + sv_setrv_noinc(sv, newSViv(99)); + +The C<_noinc> variant is used here as it doesn't increment the reference +count of the integer-valued SV when creating a reference to it. + +Where appropriate, reference counts can be adjusted with C +and C and their variants. + +An exception to this system is the argument stack.
Pointers on the +argument stack to SVs do I contribute to the reference count of that +SV. The code typically generated by XS takes advantage of this. For +example when ready to return a single value, the XSUB just stores a new SV +pointer at the base of the current stack frame, overwriting the old value, +then resets the argument stack pointer to the base of the frame plus one, +and returns. All the original values on the stack are discarded, without +adjusting any reference counts. + +This can be a problem if the XSUB is returning a new SV. Since this SV +isn't embedded anywhere apart from on the stack (which doesn't hold a +reference count to it), then if the code croaks, the SV on the stack will +leak. To avoid this, there is a separate I stack in the Perl +interpreter. Items on this stack I reference counted. Typically the +temps stack is reset at start of each statement, back to some particular +level. Each SV above this level has its reference count decremented. +Putting an SV on the temps stack is referred to as I it. It +is common to create a new SV and mortalise it at the same time: here are +some examples: + + SV *sv_99 = sv_2mortal(newSViv(99)); + SV *sv_abc = newSVpvn_flags("abc", 3, SVs_TEMP); + +Many OPs have an SV attached to them called a I. This SV has a +lifetime which is the same as the sub which the OP is a part of, and +usually has a reference count of one. It is used by many OPs to avoid +having to create (and later free) a temporary SV to return a value. For +example the C op in C<$a + $b> typically extracts the integer values +of its two arguments, calculates its sum, sets its C to that value +and pushes it onto the stack. The C which typically invokes +an XSUB usually has a C attached to it, and when returning a +value, the XSUB's boilerplate code generated by XS will usually try to use +it to return the value, rather than creating a fresh mortal SV on each +call. 
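The interaction between reference counts and the temps stack described above can be modelled with a small toy in plain C. All names here (toy_sv, toy_new, toy_mortal and so on) are hypothetical and are not the Perl API; the model assumes only what the text states: a new value starts with a count of one, mortalising hands that one count to the temps stack, and resetting the temps stack decrements the count of each item above the chosen level.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy value with a reference count; not a real SV. */
typedef struct {
    int refcnt;
    long value;
} toy_sv;

int toy_live = 0;                /* number of toy values currently allocated */

#define TOY_TMPS_MAX 32
toy_sv *toy_tmps[TOY_TMPS_MAX];  /* the toy "temps stack" */
int toy_tmps_ix = -1;            /* index of its topmost item */

/* A new value starts with a reference count of one. */
toy_sv *
toy_new(long value)
{
    toy_sv *sv = malloc(sizeof *sv);

    assert(sv != NULL);
    sv->refcnt = 1;
    sv->value = value;
    toy_live++;
    return sv;
}

toy_sv *
toy_inc(toy_sv *sv)
{
    sv->refcnt++;
    return sv;
}

/* Drop one reference; free the value when the count reaches zero. */
void
toy_dec(toy_sv *sv)
{
    assert(sv->refcnt > 0);
    if (--sv->refcnt == 0) {
        free(sv);
        toy_live--;
    }
}

/* "Mortalise": the temps stack takes ownership of one reference. */
toy_sv *
toy_mortal(toy_sv *sv)
{
    assert(toy_tmps_ix + 1 < TOY_TMPS_MAX);
    toy_tmps[++toy_tmps_ix] = sv;
    return sv;
}

/* Reset the temps stack to `base`, releasing everything above it. */
void
toy_free_tmps(int base)
{
    while (toy_tmps_ix > base)
        toy_dec(toy_tmps[toy_tmps_ix--]);
}
```

A mortalised value that nothing else refers to is freed when the temps stack is reset, while one that has been embedded elsewhere in the meantime (modelled by toy_inc) survives with the count owned by its new holder.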
+ +Note that there is a I perl interpreter build option, +C, under which the argument stack is reference counted, +but that is currently beyond the scope of this document. + +=head3 The IV, NV etc types + +An C (Integer Value) is a typedef in the perl interpreter's header +files that maps to a C integer. The exact integer type and size will +depend on the build configuration of the interpreter. It is guaranteed to +be large enough to hold a pointer. A C is the same but unsigned. An +C (numeric value) is a floating-point value; usually a C. +These types are used widely within the perl interpreter. + +A PV (pointer value) "type" is often used informally within documentation +and within the names of structure fields etc to refer to a string pointer +(C), but it is not actually a declared type. Similarly, RV +(reference value) is informally a pointer to another SV. + +There are also C and C, which are large enough to hold +signed and unsigned integer values representing the number of items in a C +array. C is used specifically for variables which store the number +of characters in a string (it is typically just an alias for C). + +=head3 The SV scalar value structure + +As mentioned above, almost all runtime data within the perl interpreter is +stored in an SV (scalar value) structure. The head of an SV structure +consists of three or four words: a reference count; a type and flags; a +pointer to a body; and since perl 5.10.0, a general payload field. There +are around 17 types, and the type indicates what body (if any) is pointed +to from the SV's head. The body type only indicates what I of data +the body is capable of holding; the actual "type" of the SV (IV, NV, PV, +RV etc) is mostly indicated by what flags are set. + +Simple SVs may not have a body. Undefined values typically don't have one.
+Also, some IV, NV, and RV values are stored directly in the payload field, +with the body pointer pointing back to the head with a suitable offset so +that the SV appears to have a body with an IV or whatever at a suitable +offset. + +For SVs which have a body, the payload field in the head is usually used +to store one common value which would otherwise have to be stored in the +body and require a further pointer indirection to access. For example, the +C pointer of a perl string SV is stored in the head, while the +length is stored in the body. + +The fields of an SV (both in the head and in the body) are usually +accessed via macros, which has allowed various rearrangements of the head +and body fields over the years while maintaining backwards compatibility. +Always use the macros. For example, C directly accesses the IV +field of the SV (which may be in the head or body depending on the SV's +type). If the SV has a valid integer value, then the C flag will +be set, which can be tested with the macro C. + +The body of an SV may be upgraded to a "bigger" one during the SV's +lifetime, but it is not usually downgraded. For example, during the course +of executing this perl code: + + my $x; + $x = "1"; + $u = $x + 1; + undef $x; + +Initially the SV has no body and none of the C, C, +C, nor C flags are set, indicating that it has neither +an IV, NV, PV or RV value. The complete lack of those flags indicates an +undefined value. After the string is assigned to it, its body type is set +to C, and it is given the corresponding body. The string pointer +and length are stored in the body (or perhaps the pointer in the payload +word), and the C flag is set, indicating that the SV holds a +valid string value. + +When Perl wants to use that SV as an integer, it uses a macro like +C to return the integer value. Unlike the direct C +macro, this first checks C, and if not true, calls a function +which calculates the integer value from its current string value. 
The +effect of this call is to update the SV's type and body to C +which is capable of holding both a string I and integer value, and +then to set the C flag in addition to the C. + +Finally, the C frees the string and turns off the C and +C flags, but leaves the body type as C. (Hence why an +SV's current Perl-level type should be determined by its flags, not its +body type.) + +Note that you should never directly access fields using macros like +C (the C implies direct) unless you have just tested for +C. In general, always use macros such as C, which will do +any checking and conversion for you. + +There is a further complication with SVs: they can have one or more items +of I attached to them. These are small payloads, along with a +pointer to a jump table of pointers to functions with get/set etc actions. +They are used to implement things like C<$1>, C<$.> and tied variables. +The idea is that macros like C will first check whether the SV has +I magic (using C); and if so call its get method +first. For example, for a tied variable, this C-level get function will +call the perl-level C method and assign the return value of that +to the SV. Only then will C do its C check. + +When presented with an unknown SV, it should always have its magic checked +before examining the values of the SV's flags. + +In total, the C macro does roughly the equivalent of: + + if (SvGMAGICAL(sv)) + mg_get(sv); /* do FETCH() etc; update the SV's value / flags */ + if (!SvIOK(sv)) + sv_2iv(sv); /* convert undef to 0, "1" to 1 etc */ + return SvIVX(sv); /* use the raw value */ + +You will see soon that XS's typemap templates mostly use high-level macros +like C, so this is usually all handled automatically for you. Only +if you start to do your own type conversions will you need to worry about +these details.
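The three steps in the pseudo-code above (run any get magic, convert if there is no valid integer yet, then read the raw field) can be mimicked with a toy structure in plain C. This is only a sketch: toy_sv, TOY_IOK, TOY_POK, toy_iv and toy_iv_nomg are hypothetical names, not Perl API, and the string-to-integer conversion is reduced to a strtol() call, which is an assumption made for brevity.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_IOK 0x1   /* the iv field holds a valid integer */
#define TOY_POK 0x2   /* the pv field holds a valid string  */

typedef struct toy_sv toy_sv;
struct toy_sv {
    unsigned flags;
    long iv;                /* only meaningful when TOY_IOK is set */
    const char *pv;         /* only meaningful when TOY_POK is set */
    void (*get)(toy_sv *);  /* non-NULL stands in for "get" magic  */
    int conversions;        /* how often a conversion was needed   */
};

/* Integer value without triggering the get hook. */
long
toy_iv_nomg(toy_sv *sv)
{
    if (!(sv->flags & TOY_IOK)) {
        /* no cached integer: derive one (undefined becomes 0) */
        sv->iv = (sv->flags & TOY_POK) ? strtol(sv->pv, NULL, 10) : 0;
        sv->flags |= TOY_IOK;
        sv->conversions++;
    }
    return sv->iv;  /* now safe to read the raw field */
}

/* Integer value, running the get hook first if there is one. */
long
toy_iv(toy_sv *sv)
{
    if (sv->get != NULL)
        sv->get(sv);
    return toy_iv_nomg(sv);
}

/* Example get hook: behaves like a tied variable whose fetch
 * always produces the string "42". */
void
toy_fetch_42(toy_sv *sv)
{
    sv->pv = "42";
    sv->flags = TOY_POK;
}
```

In this model the first integer read of a string value performs the conversion and caches it by setting the integer flag alongside the string flag; later reads use the cached field. Reading the iv field directly on a value without TOY_IOK would return whatever happened to be there, which is the toy analogue of the hazard the text warns about.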
+ +Forgetting to test for, and to call, C magic will typically appear to +work fine until the first time someone passes a tied variable or similar +to your XSUB, and C doesn't get called. Accessing fields with +C etc without testing for C first may access a field in +a body which doesn't exist and possibly trigger a SEGV. + +Magic should only be called once per "use"; for example if a tied scalar +is passed as an argument to your XSUB, you would expect C to only +be called once. Normally this is easy because you (or the typemap code) +does a single C call. Occasionally you may have explicitly called +C first, perhaps in order to check some flags; if so, you can +skip a second magic call with variants like C. For example: + + SvGETMAGIC(sv); /* this calls mg_get() if SvGMAGICAL() */ + if (SvNOK(sv)) + /* special-case: do something with a floating-point value */ + else { + IV i = SvIV_nomg(sv); + /* fall-back to treating it as an integer value */ + } + + +A Perl reference is just another type of scalar. It is indicated by +C being true, and the pointer to the referent SV is accessed +using C. + +The equivalent of C for strings is + + STRLEN len; + char *pv = SvPV(sv, len); + +(and variants) which both retrieves a string pointer and sets C to +its length. Note that there is no guarantee that after this call +C is true, nor that C. For example, C may +be a reference to a blessed object with an overloaded stringify (C<"">) +method. In which case, behind the scenes there may be a temporary SV +containing the result of the call to the method, with C pointing to +I SV's string buffer. C remains a reference. Similarly, a +non-overloaded reference to an array may return a temporary string like +C<"ARRAY(0x12345678)">. + +If you need to coerce an SV to a string (e.g. before modifying its string +value) then use C or one of its variants. 
For example if +used on an array reference, the SV will be converted from a reference into +a plain string SV with an C value of C<"ARRAY(0x12345678)">, and +the array's reference count decremented. + +Once an SV has been coerced into a PV (C is true), then +C represents the size of the allocated buffer, while +C represents the current length (in bytes) of the string. Note +that with Unicode, C may not necessarily equal the value +returned by the Perl built-in C, which is the length in +I. That can be obtained using the C function. +See L below for more details. + +The SV structure can also be used to store things which aren't simple +scalar values: in particular, arrays, hashes and code values. There are +typedefs for AV, HV and CV structures (plus a few others). These +structures are identical to SVs and can generally be used interchangeably +with suitable casting, e.g. C. The main feature of +these non-scalar SVs is that the value of the type field in these cases, +C, C, C etc, actually I indicate the +Perl type, rather than just indicating what sort of body they have. + +An important thing to note is that AVs and HVs are never directly pushed +onto the stack when calling and returning from subroutines and XSUBs. +Instead where necessary, references to them are pushed. You will likely +first spot such an error when you start getting "Bizarre copy of ..." +error messages. + +=head3 Unicode and UTF-8 + +A simple Perl string SV uses what is sometimes referred to as byte +encoding: each character is represented using a single byte. But when a +Perl string contains code points >= 0x100, each character of the string is +stored as a variable number of bytes using the UTF-8 encoding scheme, with +the C flag being set to indicate this. Other strings may or +may not be using UTF-8 encoding, depending on the history of the string. 
+For example, with: + + my $s = "A\x80"; + $s .= "\x{100}"; + chop $s; + +the string starts off in byte encoding, with C and with each byte representing one character. When +the extra character is appended, the string gets upgraded to UTF-8, with +C and the second and third +characters each using two bytes of storage. Once the third character is +removed, the string stays in UTF-8 encoding, with C and the second character using two bytes. So such a +string SV when passed to an XSUB has two possible representations; and +which will be used is somewhat unpredictable. + +Unfortunately XS currently has no support for UTF-8. All the standard +typemap entries, such as C, assume that the buffer of a string SV +is just an array of bytes to be manipulated by the XSUB or passed on +uninterpreted to a C function. If it is necessary for the XSUB to control +the UTF-8 status of an argument, then it is best to declare the parameter +as type C and do your own manipulation of it. Similarly for returning +string values. + +An SV's string representation can be forced to bytes using C +and variants; if the string contains any characters not representable in +a single byte, then that call croaks with a C error. +Conversely, C and variants will force the string to UTF-8. + +See L for more details. + +=head2 The Anatomy of an XSUB + +The previous section has explained how arguments are pushed onto the +stack, what those arguments look like, and how XSUBs are called. We will +now look at what happens I an XSUB function once called; in +particular, how it retrieves values from its arguments on the stack and +later returns a value or values on the stack; and how XS and typemaps +automate most of this. + +This section will provide both an overview of what an XSUB looks like in +XS, I what sort of C code is generated for it. The majority of the +rest of this document will then describe in more detail the various parts +of an XSUB mentioned here. 
Note that the various keywords within an XSUB's +definition usually correspond closely (and in the same order) to what C +code is generated for the XSUB. Most of the boilerplate code generated for +an XSUB is concerned with getting argument values off the stack at the +start, then returning zero or one result values on the stack at the end. + +A typical XSUB definition might look like: + + MODULE = Foo::Bar PACKAGE = Foo::Bar + + short + baz(int a, char *b = "") + PREINIT: + long z = ...; + CODE: + ... do stuff ...; + RETVAL = some_function(a, b, z); + OUTPUT: + RETVAL + +The first two lines of an XSUB are its declaration, which must be preceded +by a blank line. It gives the XSUB's return type, its name, and its +parameters (including any default values). While it is modelled on C +syntax, it is actually XS syntax (so for example C isn't +recognised). The return type and name must both start on column one, +although the XS parser actually allows both to be on the same line, such +as + + short baz(...) + +This XSUB definition will be translated into a C function whose start may +look something like this (the exact details may vary across XS parser +releases): + + void + XS_Foo__Bar_baz(pTHX_ CV* cv) + { + dVAR; dXSARGS; + if (items < 1 || items > 2) + croak_xs_usage(cv, "a, b= \"\""); + +Note that the first line of the function is actually specified using a +macro such as C, but for explanatory purposes, what is +shown above is one possible expansion of that macro, depending on the Perl +version and XS configuration. + +The important thing to note is that the XSUB's arguments are I<not> passed +as arguments of the C function; they are still on the Perl argument stack. +Nor is the XSUB's return value returned by the C function. + +The C function's name is based on the XSUB's name plus the current XS +package (with C). Apart from debugging, you don't generally need +to know this name. + +The function's parameters are the CV associated with this XSUB (i.e.
+C<&Foo::Bar::baz>) and, on MULTIPLICITY/threaded builds, a pointer to the +current Perl interpreter context. You won't need to directly use these +most of the time. + +The first few lines of code in the C function are standard boilerplate +added to all XSUBs. Note that the naming convention for Perl +interpreter macros is that ones starting with a C are declarations; they +go in places where a variable can be declared, and typically declare one +or more variables and possibly their initialisations. + +C pops one index off the mark stack and sets up some auto +variables to allow the arguments on the stack to be accessed: +specifically, the variable C is declared, which indicates how many +arguments were passed, and some hidden variables are also declared which +are used by the macro C to retrieve a pointer to argument C from +the stack (counting from 0). The stack pointer is not actually decremented +yet. + +For a generic list-processing XSUB, these argument-accessing variables and +macros may be used directly. But more commonly, for an XSUB which has a +fixed signature (as in the example above), the parser will declare an auto +C variable for each parameter, and (using the system or a user typemap) +assign them values extracted from C etc. It will also declare a +variable called C with the XSUB's return type (unless that is +C), which is typically assigned to by the coder and then whose value +is automatically returned. Continuing the example above, the generated +code for the input part of the XSUB is similar to: + + { + long z = ...; + short RETVAL; + int a = (int)SvIV(ST(0)); + char *b; + + if (items < 2) + b = ""; + else + b = (char *)SvPV_nolen(ST(1)); + +This consists of declarations for C, C, C and C, plus +code to initialise them. The part of the code which extracts a value from +an SV on the stack, such as C<(int)SvIV(ST(0))>, is derived from a typemap +entry.
For a simple entry such as the one for C, the code may be added as +part of the declaration of the variable itself; otherwise the +initialisation may be done as a separate statement after all the variable +declarations (such as for C). + +Variable declarations appear in the order they appear in C and +C blocks, followed by C and then any parameters defined +completely within the signature (i.e. which don't use an C section +to specify their type). + +Perls before 5.36 used C89 compiler semantics, which didn't allow variable +declarations after statements. To work round this, the C keyword +allows you to inject additional variable declaration code. + +Following on from the input part, the main body of the function is output; +this is copied exactly as-is from the C or C section, if +present. If neither is present, the parser will assume that this XSUB is +just wrapping a C library function of the same name as the XSUB, and will +automatically generate some code like the following: + + RETVAL = baz(a, b); + +The C and C keywords may be used to add code just before +and after the main code; typically only useful for autocall. + +C is the same as C except that after argument processing, +the stack pointer is reset to the base of the frame, and the coder becomes +responsible for pushing any return values onto the stack. No further +keywords can follow C. This is typically used for XSUBs which +need to return a list or have other complex requirements beyond just +returning a single value. + +For C and autocall, unless the return type is void, the parser will +generate code to return the value of C. This is automatic in the +case of autocall, but for C you have to ask the parser to do so +with C. The code generated in either case may look +something like + + { + SV *RETVALSV = sv_newmortal(); + sv_setiv(RETVALSV, (IV)RETVAL); + ST(0) = RETVALSV; + } + +A temporary SV will be created, set to the value of C (again, +using a typemap template), then placed on the stack.
In practice, various +optimisations may be used; in particular, the C target SV which is +attached to the calling C may be used instead of allocating +and freeing an SV for each call, as explained earlier. + +XSUB parameters declared as C or C will cause additional +output code to be generated which respectively: updates the value of one +of the passed arguments; or pushes the value of that parameter onto the +stack (in addition to C). + +Finally, (apart from C), a macro like this is added to the end of +the C function: + + XSRETURN(1); + +This resets the stack pointer to one above the base of the frame (so the +top item on the stack is C), then does C. + +For a C XSUB, C is used instead. + +=head2 Returning Values from an XSUB + +An XSUB's declared return type is typically a C type such as C or +C. XS is very good at automating this common case of returning a +single C-ish value: behind the scenes it creates a temporary SV; then, +using an appropriate typemap template, sets that SV to the value of +C and returns that SV on the stack. + +But sometimes you want to return a Perl-ish value rather than a C-ish +value, for example, Perl's undef value or a Perl array reference. Or you +may want to return multiple values, or update one of the passed +arguments. The following subsections describe various such cases. + +Note that XSUBs are somewhat like Perl lvalue subs, in that they return +the actual SV to the caller, while normal Perl subs return a temporary +copy of each return value. When returning a C value like C this +doesn't matter, since the XSUB is returning a temporary SV anyway; but +when returning your own SV, it could in theory make a visible difference. +For example, + + sub foo { $_[0]++ } + foo(an_xsub_which_returns_element_0_of_an_array(\@a)); + +would increment C<$a[0]>. + +=head3 Returning undef / TRUE / FALSE / empty list + +Sometimes you need to return an undefined value, e.g. to indicate failure. 
+It's possible to return early from a CODE block with an undefined value, +bypassing the normal creation of a temporary SV and the setting of its +value. For example: + + int + file_size(char *filename) + CODE: + RETVAL = file_size(filename); + if (RETVAL == -1) + XSRETURN_UNDEF; + OUTPUT: + RETVAL + +The C macro causes the address of the special Perl SV +C to be stored at C (this is the same value that the +Perl function C returns), and then to immediately return. + +If using autocall, then you can instead return early in a C +section: + + int + file_size(char *filename) + POSTCALL: + if (RETVAL == -1) + XSRETURN_UNDEF; + +There are similar macros + + XSRETURN_YES + XSRETURN_NO + XSRETURN_EMPTY + +which allow you to return Perl's true and false values, or to return +an empty list. + +If your XSUB will always explicitly return a special SV and won't ever +require typemap conversions (e.g. it always returns via C or +C), then just declare the return type as C. + +Note that any early return from an XSUB should always be via one of the +C macros and not directly via C; the former will do any +bookkeeping associated with the argument stack. + +=head3 Returning an SV* + +More generally, you may want to create and return an SV yourself, rather +than relying on the boilerplate XSUB code to generate a temporary SV and +set it to a C-ish value. Here you would declare the return type as C. +For example: + + SV* + abc(bool uc) + CODE: + RETVAL = newSVpv(uc ? "ABC" : "abc", 3); + OUTPUT: + RETVAL + +There is some special processing which happens when using a return type +such as C. First, consider that for a C return type like C, the +typemap template which sets the temporary SV's value may look something +like: + + sv_setiv($arg, (IV)$var); + +which after expansion may look like: + + sv_setiv(RETVALSV, (IV)RETVAL); + +where the temporary SV has previously been assigned to C. 
+ +Now, if you declare an XSUB with a return type of C, you I<might> +expect the typemap template to look something like: + + sv_setsv($arg, (SV*)$var); + +This Perl library function copies the value of one SV to another (the +XS user's equivalent of the Perl C<$a = $b>). + +However, the design decision was made that for the C type in +particular, the typemap template would be + + $arg = $var; + +Here is where the special processing comes in. The XS compiler, in the +case of an output template beginning C<$arg = ...>, skips creating a +temporary SV, and just returns the SV in C directly. So the +typemap template would be expanded to + + ST(0) = RETVAL; + +This is faster than copying. + +But in addition, for I<any> C<$arg = ...> template (not just the template +for C), the XS compiler makes one further assumption: that the +expression to the right of the assign evaluates to an SV with a reference +count I<one too high>, and so in addition, the XS compiler emits: + + sv_2mortal(RETVAL); + +or similar, which causes the reference count of the SV to be decremented +by one at (typically) the start of the next statement. This makes sense if +the SV is newly created with one of the C family of functions: +see the discussion on this in L + +However, if the SV comes from elsewhere, for example via a Perl array +lookup, then its reference count doesn't need to be adjusted, and so the +mortalising will cause it to be prematurely freed. In this case, you need +to artificially increase the SV's reference count. + +The previous example showed creating a new SV using C; here's an +example where the SV pre-exists in an array: + + SV* + lookup(int i) + CODE: + { + SV** svp = av_fetch(some_array_AV, i, 0); + if (!svp) + XSRETURN_UNDEF; + /* compensate for the implicit mortalisation */ + RETVAL = SvREFCNT_inc(*svp); + } + OUTPUT: + RETVAL + +Finally, note that some very old (pre-1996) XS documentation suggested +that you could return your own SV using code like: + + void + foo(...)
+ CODE: + ST(0) = some_SV; + +This is very wrong, as the C declaration tells the XS code to expect +to return I<zero> items on the stack. There is still some code like this +in the wild, and to work around it, the XS compiler does a very special +and ugly hack for a C XSUB when it sees C being assigned to +within a C block: it pretends that the XSUB was actually declared as +returning C and so emits C rather than +C. But don't rely on this: it is likely to warn +eventually. If your XSUB is doing its own setting of C, then always +declare the return type as C. + +The mark stack isn't used when returning arguments; instead, the caller of +the XSUB (usually the C) notes the offset of the base of +the argument stack frame before calling the XSUB and the offset of the +stack pointer on return, and can deduce the number of returned arguments +from that. + +=head3 Returning AV* etc refs + +Sometimes you want to return a non-scalar SV, such as an AV, HV or CV. +However, these aren't allowed directly on the argument stack. You are +supposed instead to return a I<reference> to the AV: a bit like a Perl sub +returning C<\@foo>. + +The standard typemaps can create this reference for you automatically. So +for example an XSUB with a return type of C will actually create and +return an RV scalar which references the AV in C. So the XS +equivalent of Perl's C might be: + + AV * + array89() + CODE: + RETVAL = newAV(); + /* see text below for why this line is needed */ + sv_2mortal((SV*)RETVAL); + av_store(RETVAL, 0, newSViv(8)); + av_store(RETVAL, 1, newSViv(9)); + OUTPUT: + RETVAL + +Note that the C variable is declared as type C, but what is +actually returned to the caller is a temporary SV which is a reference to +C. The standard output typemap template for the C type looks +like: + + $arg = newRV((SV*)$var); + +This means it creates a new RV which refers to the AV. Because of the +rule for C<$arg = ...> typemaps, the RV will be correctly mortalised +before being returned.
However, the C function increments the +reference count of the thing being referred to (the C AV in this +case). Since the AV has just been created by C with a reference +count one too high, it will leak. This is why the C is +required. Conversely for a pre-existing AV, the mortalisation isn't +required. + +Since perl 5.16, there are a set of alternative XS types which can be used +for AVs etc which I<don't> increment the reference count of the AV when +being pointed to from the new RV. These can be enabled by mapping the +C etc C types to these new XS types: + + TYPEMAP: < and +handle the RV generation yourself: + + SV * + create_array_ref() + CODE: + RETVAL = newRV_noinc((SV*)newAV()); + OUTPUT: + RETVAL + +If instead you want to return a flattened array (the equivalent of Perl's +C) then you would have to push the elements of the array +individually onto the stack in a C block. See L below. + +Finally, the C C type in the standard typemap is a way of creating +and returning a reference to a scalar (as opposed to the C type, +which just returns a scalar). In this case you have to tell the C compiler +that C is just another name for C: + + typedef SV *SVREF; + +Then in an XSUB like + + SVREF + foo() + CODE: + RETVAL = newSViv(9); + sv_2mortal(RETVAL); + OUTPUT: + RETVAL + +C will be declared with type C, and the XSUB will return a +reference to an integer: the perl equivalent of C. + +=head3 Updating arguments and returning multiple values. + +By using the C and similar parameter modifiers, XS provides +limited support for returning extra values in addition to (or instead of) +C, either by updating the values of passed arguments (C), or +by returning some of the parameters (and pseudo-parameters) as extra +return values (C). For returning an arbitrary list of values, see +the next section.
+ +Here are a couple of simple XS examples with their approximate perl +equivalents: + + # Update a passed argument + + void sub inc9 { + inc9(IN_OUT int i) my $i = $_[0]; + CODE: $i += 9; + i += 9; $_[0] = $i; + } + + # Return (2*$i, 3*$i) + + void sub mul23 { + mul23(int i, \ my $i = $_[0]; + OUTLIST int x, \ my ($x, $y); + OUTLIST int y) $x = $i * 2; + CODE: $y = $i * 3; + x = i * 2; return $x, $y; + y = i * 3; } + +See L +for the full details. + +=head3 Returning a list + +If you want to return a list, i.e. an arbitrary number of items on the +stack, you generally have to forgo the convenience of some of the +boilerplate code generated by XS, which is biased towards returning a +single value. Instead you will have to create and push the SVs yourself. +The L keyword is specifically intended for +this purpose. Here is a simple example which does the same as the +Perl-level C: + + void + one_to_n(int n) + PPCODE: + { + int i; + if (n < 1) + Perl_croak_nocontext( + "one_to_n(): argument %d must be >= 1", n); + EXTEND(SP, n); + for (i = 1; i <= n; i++) + mPUSHi(i); + } + +The C keyword causes the argument stack pointer to be initially +reset to the base of the frame (discarding any passed arguments), and +suppresses any automatic return code generation. The return type of the +XSUB is ignored, except that declaring it C suppresses the +declaration of a C variable. + +The C macro makes sure that there are at least that many free +slots on the stack (its first argument should always be C). The +C macro creates a new SV, mortalises it, sets its value to the +integer C, and pushes it on the stack. + +Here's another example, which flattens the array passed as an argument: +the equivalent of this Perl: + + sub flatten { my $aref = $_[0]; @$aref } + +In this example, the SVs being pushed aren't freshly created with a +reference count one too high, so don't need mortalising.
+ + void + flatten(AV *av) + PPCODE: + { + int i; + int max_ix = AvFILL(av); + SV **svp; + EXTEND(SP, max_ix + 1); + for (i = 0; i <= max_ix; i++) { + svp = av_fetch(av, i, 0); + PUSHs(svp ? *svp : &PL_sv_undef); + } + } + +This function actually expects to be passed a I<reference> to an array: +the input typemap entry for C automatically takes care of +dereferencing the argument and croaking if it's not actually a reference. +The C macro simply pushes an SV onto the stack, without any +mortalising or copying. Any "holes" in the array are filled with undefs. + +=head2 Bootstrapping + +In addition to the C C function generated for each XSUB +declaration, a C C function is also automatically +generated, one for each XS file. This XSUB function is called once when +the module is first loaded. For each declared XSUB in the file, a line +similar to the following is added to the boot function: + + newXS("Foo::Bar::baz", XS_Foo__Bar_baz); + +(the exact details of the code will vary across releases and +configurations). This call creates a CV, flags it as being an XSUB, adds +a pointer from it to C, then adds the CV to the +C<*Foo::Bar::baz> typeglob in the Perl interpreter's symbol table. It is +the XS equivalent of the Perl-level + + *Foo::Bar::baz = sub { ... } + +For some XSUBs, additional lines may be added by the parser to the boot +XSUB to handle things like aliases or overloading. + +You can add your own additional lines to the boot XSUB using the C +keyword. + +A typical Perl module like F should have code in it similar +to: + + package Foo::Bar; + our $VERSION = '1.01'; + require XSLoader; + XSLoader::load(__PACKAGE__, $VERSION); + +This causes the F or F file to be dynamically linked in +and then the C function called. This boilerplate code is +typically created automatically with F when you first create the +skeleton of a new distribution. See L for more details. + +=head1 REFERENCE MANUAL + +This part of the document explains what each XS keyword does.
They are +arranged in the approximate order in which they might appear within an XS +file, and then might appear within an XSUB declaration. Related keywords +are grouped together. + =head2 The MODULE Keyword The MODULE keyword is used to start the XS code and to specify the package From ca7bdbd591c0bbaeda3b30fdd6d9a038faa83b19 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 7 Jul 2025 12:21:04 +0100 Subject: [PATCH 08/42] perlxs.pod: add BNF definition section Add a section which semi-formally tries to define the syntax and structure of an XS file, using a BNF-like format. See http://nntp.perl.org/group/perl.perl5.porters/268701 for the discussion of this part. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 212 ++++++++++++++++++++++++++- 1 file changed, 211 insertions(+), 1 deletion(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index f75dfce415f2..a062a6b4ce67 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -83,7 +83,217 @@ to 5.8.0 unless otherwise specified. =head1 THE FORMAL SYNTAX OF AN XS FILE -XXX TBC +This is a BNF-like description of the syntax of an XS file. It is intended +to be human-readable rather than machine-readable, and doesn't try to +accurately specify where line breaks can occur. + + Key: + + foo BNF token. + "bar" Literal terminal symbol. + /.../ Terminal symbol defined by a pattern. + [Foo::Bar] Terminal symbol defined by way of an example. + * + ? | ( ) These have their usual regex-style meanings. + // ... BNF Comments. + + + XS_file = C_file_part ( module_decl XS_file_part )+ + + C_file_part = ( + // Lines of C code (including /* ... */), + // which are all passed through uninterpreted. + | + pod // These are stripped. + )* + + pod = /^=/ .. /^=cut\s*$/ + + module_decl = blank_line + // NB: all on one line: + "MODULE =" [Foo::Bar] "PACKAGE =" [Foo::Bar] + ( "PREFIX =" [foo_] )?
+ + blank_line = /^\s*$/ + + XS_file_part = ( file_scoped_decls* xsub )* + + file_scoped_decls = + blank_line + // Any valid CPP directive: these are passed through: + | "#if" | "# if" | "#define" | // etc + | #comment // anything not recognised as CPP directive + | pod + | "SCOPE:" enable + | "EXPORT_XSUB_SYMBOLS:" enable + | "PROTOTYPES:" enable + | "VERSIONCHECK:" enable + | "FALLBACK:" ("TRUE" | "FALSE" | "UNDEF") + | "INCLUDE:" [foo.xs] + | "INCLUDE_COMMAND:" [... some command line ...] + | "REQUIRE:" [1.23] // min xsubpp version + | "BOOT:" + code_block + | "TYPEMAP: <<"[EOF] + // Heredoc with typemap declarations. + [EOF] + + enable = ( "ENABLE" | "DISABLE" ) + + code_block = // Lines of C and/or blank lines terminated by the + // next keyword or XSUB start. POD is stripped. + + xsub = blank_line // not *always* necessary + xsub_decl + ( cases | xbody ) + + xsub_decl = return_type + xsub_name "(" parameters ")" "const" ? + + return_type = "NO_OUTPUT" ? "extern \"C\"" ? "static" ? C_type + + C_type = [const char *] // etc: any valid C type + + C_expression = [foo(ix) + 1] // etc: any valid C expression + + xsub_name = [foo] | [X::Y::foo] // simple name or C++ name + + parameters = empty + | parameter ( "," parameter )* + + empty = /\s*/ + + parameter = ( + in_out_decl ? + C_type ? + /\w+/ // variable name + // Default or optional value: + ( "=" ( C_expression | "NO_INIT" ) )? + + // Pseudo-param: foo must match another param name: + | C_type "length(" [foo] ")" + | "..." + ) + + in_out_decl = "IN" | "OUT" | "IN_OUT" | "OUTLIST" | "IN_OUTLIST" + + cases = ( + "CASE:" ( C_expression | empty ) + xbody + )+ + + xbody = implicit_input ? + xbody_input_part * + xbody_init_part * + xbody_code_part + xbody_output_part * // Not after PPCODE. + xbody_cleanup_part * // Not after PPCODE. 
+ + implicit_input = ( blank_line | input_line )+ + + xbody_input_part = + "INPUT:" ( blank_line | input_line )* + | "PREINIT:" + code_block + | xbody_generic_key + | c_args + | interface_macro + | "SCOPE:" enable // Only in perl 5.44.0 onwards. + + input_line = C_type + "&" ? + /\w+/ // variable name + // Optional initialiser: + ( + ( "=" | ";" ) "NO_INIT" + | + // Override or add to the default typemap. + // The expression is eval()ed as a + // double-quotish string. + "=" [ a_typemap_override($arg) ] + | + ("+" | ";") [ a_deferred_initialiser($arg) ] + )? + ";" ? + + xbody_init_part = "INIT:" + code_block + | xbody_generic_key + | c_args + | interface + | interface_macro + + xbody_code_part = + autocall + | "CODE:" + code_block + | "PPCODE:" + code_block + | // Only recognised if immediately following + // an INPUT section: + "NOT_IMPLEMENTED_YET:" + + // Implicit call to wrapped library function. + autocall = empty + + xbody_output_part = + xbody_postcall * + xbody_output * + + xbody_postcall = "POSTCALL:" + code_block + | xbody_generic_key + + xbody_output = "OUTPUT:" + ( blank_line + | output_line + | "SETMAGIC:" enable + )* + | xbody_generic_key + + // Variable name with optional expression which + // overrides the typemap + output_line = /\w+/ ( [ sv_setfoo(ST[0], RETVAL) ] )? + + xbody_cleanup_part = "CLEANUP:" + code_block + | xbody_generic_key + + // Text to use as the arguments for an autocall; + // may be spread over multiple lines: + c_args = "C_ARGS:" [foo, bar, baz] + + // Comma-separated list of Perl subroutine names + // which use the XSUB, over one or more lines: + interface = "INTERFACE:" [foo, bar, Bar::baz] + + interface_macro = + "INTERFACE_MACRO:" + [GET_MACRO_NAME] + [SET_MACRO_NAME] ? + + + // These can appear anywhere in an XSUB. 
+ xbody_generic_key = pod + | alias + | "PROTOTYPE:" ( enable | [$$@] ) + + // Whitespace-separated list of overload types, + // over one or more lines: + | "OVERLOAD:" [ cmp eq <=> etc ] + + // Whitespace-separated list of attribute names, + // over one or more lines: + | "ATTRS:" [foo bar baz] + + + alias = "ALIAS:" + // One or more lines; each with zero or more + // {alias_name, op, index} triplets: + ( + [bar] "=" [5] + | [Foo::baz] "=" [A_CPP_DEFINE] + | [Bar::boz] "=>" [Foo::baz] + )* =head1 OVERVIEW OF XS AND XSUBS From f596116359fb57a425bb45a097206d8dc7f01362 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 7 Jul 2025 17:55:20 +0100 Subject: [PATCH 09/42] perlxs.pod: update MODULE/PACKAGE/PREFIX Rewrite the POD for these three keywords, and in particular, treat them as one declaration, rather than three unrelated keywords. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 106 +++++++++++++-------------- 1 file changed, 50 insertions(+), 56 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a062a6b4ce67..7fb2bc79e2b2 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1559,62 +1559,56 @@ arranged in the approximate order in which they might appear within an XS file, and then might appear within an XSUB declaration. Related keywords are grouped together. -=head2 The MODULE Keyword - -The MODULE keyword is used to start the XS code and to specify the package -of the functions which are being defined. All text preceding the first -MODULE keyword is considered C code and is passed through to the output with -POD stripped, but otherwise untouched. Every XS module will have a -bootstrap function which is used to hook the XSUBs into Perl. The package -name of this bootstrap function will match the value of the last MODULE -statement in the XS source files. The value of MODULE should always remain -constant within the same XS file, though this is not required. 
- -The following example will start the XS code and will place -all functions in a package named RPC. - - MODULE = RPC - -=head3 The PACKAGE Keyword - -When functions within an XS source file must be separated into packages -the PACKAGE keyword should be used. This keyword is used with the MODULE -keyword and must follow immediately after it when used. - - MODULE = RPC PACKAGE = RPC - - [ XS code in package RPC ] - - MODULE = RPC PACKAGE = RPCB - - [ XS code in package RPCB ] - - MODULE = RPC PACKAGE = RPC - - [ XS code in package RPC ] - -The same package name can be used more than once, allowing for -non-contiguous code. This is useful if you have a stronger ordering -principle than package names. - -Although this keyword is optional and in some cases provides redundant -information it should always be used. This keyword will ensure that the -XSUBs appear in the desired package. - -=head3 The PREFIX Keyword - -The PREFIX keyword designates prefixes which should be -removed from the Perl function names. If the C function is -C and the PREFIX value is C then Perl will -see this function as C. - -This keyword should follow the PACKAGE keyword when used. -If PACKAGE is not used then PREFIX should follow the MODULE -keyword. - - MODULE = RPC PREFIX = rpc_ - - MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ +=head2 The MODULE Declaration + + MODULE = Foo::Bar PACKAGE = Foo::Bar + MODULE = Foo::Bar PACKAGE = Foo::Bar::Baz + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ + +The C keyword is used to start the XS half of the file, and to +specify the package of the functions which are being defined. The +C keyword must start on column one. All text preceding the first +C keyword is considered C code and is passed through to the output +with POD stripped, but otherwise untouched. + +It is usually necessary to include a blank line before each MODULE +declaration. + +For the first such declaration, the C and C values are +typically the same. 
In subsequent entries, the C value varies, +while the C value is kept unchanged. In fact, only the C +value from the I such declaration is used, and specifies the name of +the boot XSUB which is called when the module is loaded (typically via +C). + +The value of the C keyword is analogous to the Perl C +keyword, and determines which package any subsequent XSUBs will be created +in. It is permissible to have the same C value appear more than +once, again similarly to Perl. + +In theory the C keyword is optional, and defaults to C<''>. This +means that any subsequent XSUBs will be placed in the C package. +In practice, you should always specify the package. + +The optional C value is stripped from the XSUB's name when +generating the XSUB's Perl name. It is typically used to simplify creating +autocall XSUBs. It addresses the issue that while Perl has package names, +C only has function name prefixes. Consider a C library called C, +which has functions such as C and C. We +want to make these accessible from a Perl module called C. In +the presence of C, any such prefix of each XSUB +name will be stripped off when determining the XSUB's Perl name. For +example: + + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ + + char* foobar_read(int n) + + int foobar_write(char *text, int n) + +This will insert two XSUBs into the Perl namespace, called +C and C, which when called, will +themselves call the C functions C and C. =head2 File-scoped XS Keywords and Directives From e338dfc025a1f3269d03508abf50b2ae7699a994 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 14 Jul 2025 09:05:21 +0100 Subject: [PATCH 10/42] perlxs.pod: update file-scoped directive text Populate the new =head2 File-scoped XS Keywords and Directives section, partially by cannibalising (and then deleting) the old =head3 Inserting POD, Comments and C Preprocessor Directives subsection. This commit only adds text about directives; subsequent commits will update the various file-scoped keywords. 
--- dist/ExtUtils-ParseXS/lib/perlxs.pod | 95 ++++++++++++++++++---------- 1 file changed, 60 insertions(+), 35 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 7fb2bc79e2b2..a797a0845063 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1612,48 +1612,71 @@ themselves call the C functions C and C. =head2 File-scoped XS Keywords and Directives -XXX TBC +After the first C keyword, everything else in the file consists of +XSUB definitions, plus anything that comes between the XSUBs. The XSUBs +will be explained further down, but this section addresses the in-between +stuff, which can consist of any of the following. + +=over + +=item * + +A few file-scoped keywords (including further MODULE declarations), whose +effects usually last for the rest of the file. These keywords will be +detailed further down in this section. + +=item * + +POD, which is stripped out. It must be terminated with C<=cut>. + +=item * + +Blank lines, which are discarded. + +=item * + +Known C C preprocessor directives, which are passed through as-is. +Conditional ones, such as C<#if> and C<#else>, have some basic analysis +performed on them which, in particular, allows two variants of the same +XSUB to be declared without raising a "duplicate XSUB" warning. This +warning suppression only works for the if/else/endif form. For example +this works: + + #ifdef USE_2ARG + + int foo(int a, int b) + + #else + + int foo(int a) -XXX NB: L can technically be a file-scoped -keyword too, but is dealt with elsewhere. - -=head3 Inserting POD, Comments and C Preprocessor Directives - -C preprocessor directives are allowed within BOOT:, PREINIT: INIT:, CODE:, -PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. -Comments are allowed anywhere after the MODULE keyword. The compiler will -pass the preprocessor directives through untouched and will remove the -commented lines. 
POD documentation is allowed at any point, both in the -C and XS language sections. POD must be terminated with a C<=cut> command; -C will exit with an error if it does not. It is very unlikely that -human generated C code will be mistaken for POD, as most indenting styles -result in whitespace in front of any line starting with C<=>. Machine -generated XS files may fall into this trap unless care is taken to -ensure that a space breaks the sequence "\n=". - -Comments can be added to XSUBs by placing a C<#> as the first -non-whitespace of a line. Care should be taken to avoid making the -comment look like a C preprocessor directive, lest it be interpreted as -such. The simplest way to prevent this is to put whitespace in front of -the C<#>. - -If you use preprocessor directives to choose one of two -versions of a function, use - - #if ... version1 - #else /* ... version2 */ #endif -and not +while this form will still raise warnings: - #if ... version1 + #ifdef USE_2ARG + ... #endif - #if ... version2 + + #ifndef USE_2ARG + ... #endif -because otherwise B will believe that you made a duplicate -definition of the function. Also, put a blank line before the -#else/#endif so it will not be seen as part of the function body. +=item * + +XS comment lines, which are stripped out; either a C which isn't +recognised as a C preprocessor directive, or C/. + +=item * + +Anything else is an error, unless it starts on column one, in which case +it will be treated as the start of a new XSUB. + +=back + +The following file-scoped keywords are supported. Note that the +L can technically be a file-scoped keyword too, +but is described further down as an XSUB keyword. =head3 The REQUIRE: Keyword @@ -1821,6 +1844,8 @@ L for more details. 
XXX TBC +XXX mention that XS comments and POD can appear between keywords + =head2 An XSUB Declaration XXX TBC From df519363d5c711092397df0e7985d3501165f056 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 11:28:02 +0100 Subject: [PATCH 11/42] perlxs.pod: update REQUIRE, VERSIONCHECK keywords --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 52 ++++++++++++++++++---------- 1 file changed, 33 insertions(+), 19 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a797a0845063..91b4586fad53 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1680,34 +1680,48 @@ but is described further down as an XSUB keyword. =head3 The REQUIRE: Keyword -The REQUIRE: keyword is used to indicate the minimum version of the -B compiler needed to compile the XS module. An XS module which -contains the following statement will compile with only B version -1.922 or greater: + REQUIRE: 3.58 - REQUIRE: 1.922 +The C keyword is used to indicate the minimum version of the +C XS compiler (and its F wrapper) needed to +compile the XS module. It is expected to be a floating-point number of the +form C<\d+\.\d+/>. It is analogous to the perl C. =head3 The VERSIONCHECK: Keyword -The VERSIONCHECK: keyword corresponds to B's C<-versioncheck> and -C<-noversioncheck> options. This keyword overrides the command line -options. Version checking is enabled by default. When version checking is -enabled the XS module will attempt to verify that its version matches the -version of the PM module. 
+ VERSIONCHECK: DISABLE | ENABLE -To enable version checking: +Version checking (enabled by default) checks that the version compiled +into the C<.so> or C<.dll> file matches the C<.pm> file's C<$VERSION> +value, and if not, dies with an error message like: - VERSIONCHECK: ENABLE + Foo::Bar object version 1.03 does not match bootstrap parameter 1.04 -To disable version checking: +Typically, when a module is built for the first time, the value of the +C<$VERSION> variable in the C<.pm> file is copied to the generated +C as C, and from there, via a C<-DXS_VERSION=...> +compiler option, is baked into the boot XSUB. When the module is loaded +and the boot code called, the versions are compared, and it croaks if +there's a mismatch. This usually indicates that the C<.so> and C<.pm> +files are from different installs: for example someone copied over a more +recent version of the C<.pm> file but forgot to copy or rebuild the +C<.so>. - VERSIONCHECK: DISABLE +If the version of the PM module is a floating point number, it will be +stringified before the comparison, with a possible loss of precision +(currently chopping to nine decimal places), so it may not match the +version of the XS module any more. Quoting the C<$VERSION> declaration to +make it a string is recommended if long version numbers are used. -Note that if the version of the PM module is an NV (a floating point -number), it will be stringified with a possible loss of precision -(currently chopping to nine decimal places) so that it may not match -the version of the XS module anymore. Quoting the $VERSION declaration -to make it a string is recommended if long version numbers are used. +There is rarely any good reason to disable this check. + +Note that this module version checking is completely unrelated to the +C keyword, which is a check against the version of the I. + +The C keyword corresponds to F's C<-versioncheck> +and C<-noversioncheck> options. This keyword overrides the command line +options. 
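The check described above can be modelled with a minimal C sketch. This is not the code perl actually runs: the real check lives in the perl core and compares version objects rather than raw strings. The function name `version_bootcheck` and all of its parameters are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not a perl API).
 * xs_version: the version string baked into the .so/.dll at build time.
 * pm_version: the stringified $VERSION from the .pm file.
 * Returns 0 if they match; otherwise writes an error message into buf
 * and returns -1. */
static int
version_bootcheck(const char *module, const char *xs_version,
                  const char *pm_version, char *buf, size_t buflen)
{
    /* Simplification: a plain string comparison. */
    if (strcmp(xs_version, pm_version) == 0)
        return 0;

    snprintf(buf, buflen,
             "%s object version %s does not match bootstrap parameter %s",
             module, xs_version, pm_version);
    return -1;
}
```

Because this sketch compares strings, it also hints at why a long floating-point `$VERSION` that loses digits when stringified can stop matching, and why quoting the version is the safer choice.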
=head3 The PROTOTYPES: Keyword From 0312e37bb7229ea25e1eb3439e1c3f83735eb0ae Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 13:10:23 +0100 Subject: [PATCH 12/42] perlxs.pod: update PROTOTYPES: keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 52 ++++++++++++++++++++++------ 1 file changed, 42 insertions(+), 10 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 91b4586fad53..6a401f28bfd2 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1725,24 +1725,56 @@ options. =head3 The PROTOTYPES: Keyword -The PROTOTYPES: keyword corresponds to B's C<-prototypes> and -C<-noprototypes> options. This keyword overrides the command line options. -Prototypes are disabled by default. When prototypes are enabled, XSUBs will -be given Perl prototypes. This keyword may be used multiple times in an XS -module to enable and disable prototypes for different parts of the module. -Note that B will nag you if you don't explicitly enable or disable -prototypes, with: + PROTOTYPES: DISABLE | ENABLE - Please specify prototyping behavior for Foo.xs (see perlxs manual) +When prototypes are enabled (they are disabled by default), any +subsequent XSUBs will be given a Perl prototype. The prototype string is +usually generated from the XSUB's parameter list. This keyword may be used +multiple times in an XS module to enable and disable prototypes for +different parts of the module. + +For example, these two XSUB declarations: -To enable prototypes: + int add1(int a, int b) PROTOTYPES: ENABLE -To disable prototypes: + int add2(int a, int b) + +behave similarly to the perl-level: + + sub add1 { ... } + sub add2($$) { ... } + +Note also that prototypes can be overridden on a per-XSUB basis with the +XSUB-level L keyword. 
+ +In general, XSUB prototypes (similarly to perl sub prototypes) are of very +limited use and are typically only used to mimic the behaviour of Perl +builtins. For example there is no way to implement a C +style function without a way of telling the Perl interpreter not to +flatten C<@a>. Outside of these narrow uses, it is generally a mistake to +use prototypes. + +In the early days of XS it was thought that using prototypes was probably +a Good Thing, and prototypes were enabled by default. This was soon +changed to disabled by default, and a warning was added if you haven't +explicitly indicated your preference: so in the absence of any +C keyword, you will get this nagging warning: + + Please specify prototyping behavior for Foo.xs (see perlxs manual) + +So 99% of the time you will want to add PROTOTYPES: DISABLE +to the start of the XS half of your C<.xs> file. + +The C keyword corresponds to F's C<-prototypes> and +C<-noprototypes> options. + +See L for more information about Perl prototypes. + =head3 The EXPORT_XSUB_SYMBOLS: Keyword The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. From ac07e97fd466ecbaf6069cd0e29aca965c6dfdd0 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 14:03:46 +0100 Subject: [PATCH 13/42] perlxs.pod: update EXPORT_XSUB_SYMBOLS, INCLUDE(_COMMAND) --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 67 +++++++++++----------------- 1 file changed, 25 insertions(+), 42 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 6a401f28bfd2..66410c2d3942 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1777,61 +1777,44 @@ See L for more information about Perl prototypes. =head3 The EXPORT_XSUB_SYMBOLS: Keyword -The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. -In perl versions earlier than 5.16.0, this keyword does nothing. 
Starting -with 5.16, XSUB symbols are no longer exported by default. That is, they -are C functions. If you include + EXPORT_XSUB_SYMBOLS: ENABLE | DISABLE - EXPORT_XSUB_SYMBOLS: ENABLE +This keyword is present since 5.16.0, and its value is disabled by +default. -in your XS code, the XSUBs following this line will not be declared C. -You can later disable this with - - EXPORT_XSUB_SYMBOLS: DISABLE - -which, again, is the default that you should probably never change. -You cannot use this keyword on versions of perl before 5.16 to make -XSUBs C. +Before 5.16.0, the C function which implemented an XSUB was exported. +Since 5.16.0, it is declared C. The old behaviour can be restored +by enabling it. You are very unlikely to have a need for this keyword. =head3 The INCLUDE: Keyword -This keyword can be used to pull other files into the XS module. The other -files may have XS code. INCLUDE: can also be used to run a command to -generate the XS code to be pulled into the module. + INCLUDE: const-xs.inc + INCLUDE: some_command | -The file F contains our C function: - - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep - -The XS module can use INCLUDE: to pull that file into it. - - INCLUDE: Rpcb1.xsh - -If the parameters to the INCLUDE: keyword are followed by a pipe (C<|>) then -the compiler will interpret the parameters as a command. This feature is -mildly deprecated in favour of the C directive, as documented -below. +This keyword can be used to pull in the contents of another file to the +"XS" part of an XS file. Unlike a top-level XS file, included files don't +have a "C" first half, and the entire contents of the file are treated as +XS, as if it had all been inserted at that line. - INCLUDE: cat Rpcb1.xsh | +One common use of C is to include constant definitions generated +by F. -Do not use this to run perl: C will run the perl that -happens to be the first in your path and not necessarily the same perl that is -used to run C. 
See L<"The INCLUDE_COMMAND: Keyword">. +If the parameters to the C keyword are followed by a pipe (C<|>) +then the XS parser will interpret the parameters as a command. This +feature is mildly deprecated in favour of the C +directive, as documented below. The latter can be used to ensure that the +perl (if any) used in the command is the same as the one running the XS +parser. =head3 The INCLUDE_COMMAND: Keyword -Runs the supplied command and includes its output into the current XS -document. C assigns special meaning to the C<$^X> token -in that it runs the same perl interpreter that is running C: + INCLUDE_COMMAND: $^X -e '...' - INCLUDE_COMMAND: cat Rpcb1.xsh +Since 5.14.0. - INCLUDE_COMMAND: $^X -e ... +Similar to C except that the C<|> is implicit, and +it converts the special token C<$^X>, if present, to the path of the perl +interpreter which is running the XS parser. =head3 The TYPEMAP: Keyword From 09e774da3ea277e591e1beda42f5b0ccc306ce78 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 14:37:39 +0100 Subject: [PATCH 14/42] perlxs.pod: update TYPEMAP: keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 71 +++++++++++++++++++++------- 1 file changed, 55 insertions(+), 16 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 66410c2d3942..bf4e47f5cfa1 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1818,22 +1818,61 @@ interpreter which is running the XS parser. =head3 The TYPEMAP: Keyword -Starting with Perl 5.16, you can embed typemaps into your XS code -instead of or in addition to typemaps in a separate file. Multiple -such embedded typemaps will be processed in order of appearance in -the XS code and like local typemap files take precedence over the -default typemap, the embedded typemaps may overwrite previous -definitions of TYPEMAP, INPUT, and OUTPUT stanzas. 
The syntax for -embedded typemaps is - - TYPEMAP: < keyword must appear in the first column of a -new line. - -Refer to L for details on writing typemaps. + TYPEMAP: < keyword can be used to embed typemap declarations +directly into your XS code, instead of (or in addition to) typemaps in a +separate file. Multiple such embedded typemaps will be processed in order +of appearance in the XS code. Typemaps are processed in the order: + +=over + +=item * + +The system typemap file. + +=item * + +A local typemap file, typically specified by C +in the F. + +=item * + +C entries, in order. + +=back + +The most recently-applied entries take precedence, so for example you can +use C to individually override specific C, C, or +C entries in the system typemap. In general, typemap changes +affect any subsequent XSUBs within the file, until further updates. Note +however that due to a quirk in parsing, it is possible for a C +entry immediately I an XSUB to affect that XSUB. + +The C keyword syntax is intended to mimic Perl's "heredoc" +syntax, and the keyword must be followed by one of these three forms: + + << FOO + << 'FOO' + << "FOO" + +where C can be just about any sequence of characters, which must be +matched at the start of a subsequent line. + +See L and L for more details on writing +typemaps. =head3 The BOOT: Keyword From 560391d9278fafdfb186ca9c2645193b75c39004 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 15:54:44 +0100 Subject: [PATCH 15/42] perlxs.pod: update BOOT: keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index bf4e47f5cfa1..9b7f37ce53b2 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1876,20 +1876,19 @@ typemaps. =head3 The BOOT: Keyword -The BOOT: keyword is used to add code to the extension's bootstrap -function.
The bootstrap function is generated by the B compiler and -normally holds the statements necessary to register any XSUBs with Perl. -With the BOOT: keyword the programmer can tell the compiler to add extra -statements to the bootstrap function. - -This keyword may be used any time after the first MODULE keyword and should -appear on a line by itself. The first blank line after the keyword will -terminate the code block. - BOOT: - # The following message will be printed when the - # bootstrap function executes. - printf("Hello from the bootstrap!\n"); + # Print a message when the module is loaded + printf("Hello from the bootstrap!\n"); + +The C keyword is used to add code to the extension's bootstrap +function. This function is generated by the XS parser and normally holds +the statements necessary to register any XSUBs with Perl. It is usually +called once, at C time. + +This keyword should appear on a line by itself. All subsequent lines will +be interpreted as lines of C code to pass through, including C +preprocessor directives, but excluding POD and C<#> comments; until the +next keyword or possible start of a new XSUB (C). =head3 The FALLBACK: Keyword From 9af5c7406b138d7b5aa95431bd78df7f17b826f1 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 11 Jul 2025 16:31:44 +0100 Subject: [PATCH 16/42] perlxs.pod: update FALLBACK: keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 9b7f37ce53b2..6b8081e5ead6 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1892,20 +1892,22 @@ next keyword or possible start of a new XSUB (C). 
=head3 The FALLBACK: Keyword -In addition to the OVERLOAD keyword, if you need to control how -Perl autogenerates missing overloaded operators, you can set the -FALLBACK keyword in the module header section, like this: + MODULE = Foo PACKAGE = Foo::Bar - MODULE = RPC PACKAGE = RPC + FALLBACK: TRUE | FALSE | UNDEF - FALLBACK: TRUE - ... +Since 5.8.1. + +It defaults to C for each package. It sets the default fallback +handling behaviour for overloaded methods in the current package (i.e. +C in the example above). It is analogous to the Perl-level: + + package Foo::Bar; + use overload "fallback" => 1 | 0 | undef; -where FALLBACK can take any of the three values TRUE, FALSE, or -UNDEF. If you do not set any FALLBACK value when using OVERLOAD, -it defaults to UNDEF. FALLBACK is not used except when one or -more functions using OVERLOAD have been defined. Please see -L for more details. +It only has any effect if there ends up being at least one XSUB in the +current package with the L keyword +present. See L for more details. =head2 The Structure of an XSUB From e5346fd194e14f59ed02a5d38630fd8c2e870770 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 12 Jul 2025 09:54:55 +0100 Subject: [PATCH 17/42] perlxs.pod: update XSUB Structure + Declaration Populate the new =head2 The Structure of an XSUB =head2 An XSUB Declaration sections --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 177 ++++++++++++++++++++++++--- 1 file changed, 157 insertions(+), 20 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 6b8081e5ead6..f322b7f864d8 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1911,40 +1911,177 @@ present. See L for more details. =head2 The Structure of an XSUB -XXX TBC +Following any file-scoped XS keywords and directives, an XSUB may appear. 
+The start of an XSUB is usually indicated by a blank line followed by +something starting on column one which isn't otherwise recognised as an +XSUB keyword or file-scoped directive. + +An XSUB definition consists of a declaration (typically two lines), +followed by an optional body. The declaration specifies the XSUB's name, +parameters and return type. The body consists of sections started by +keywords, which may specify how its parameters and any return value +should be processed, and what the main C code body of the XSUB consists +of. Other keywords can change the behaviour of the XSUB, or affect how it +is registered with Perl, e.g. with extra named aliases. In the absence of +an explicit main C code body specified by the C or C +keywords, the parser will generate a body automatically; this is referred +to as L in this document. + +Nothing can appear between keyword sections apart from POD, XS comments, +and trailing blank lines, all of which are stripped out before the main +parsing takes place. Anything else will either raise an error, or be +interpreted as the start of a new XSUB. + +An XSUB's body can be thought of as having up to five parts. These are, in +order of appearance, the L, L, L, L and L parts. There is no +formal syntax to define this structure; it's just an understanding that +certain keywords may only appear in certain parts and thus may only appear +after certain other keywords etc. -XXX mention that XS comments and POD can appear between keywords =head2 An XSUB Declaration -XXX TBC + # A simple declaration: + + int + foo1(int i, char *s) + + # All on one line; plus a default parameter value: + + int foo2(int i, char *s = "") + + # Complex parameters; plus variable argument count: + + int + foo3(OUT int i, IN_OUTLIST char *s, STRLEN length(s), ...) + + # No automatic argument processing: + + void + foo4(...)
+ PPCODE: -=head3 The NO_OUTPUT Keyword + # C++ method; plus various return type qualifiers: -The NO_OUTPUT can be placed as the first token of the XSUB. This keyword -indicates that while the C subroutine we provide an interface to has -a non-C return type, the return value of this C subroutine should not -be returned from the generated Perl subroutine. + NO_OUTPUT extern "C" static int + X::Y::foo5(int i, char *s) const -With this keyword present the C variable is created, and in the -generated call to the subroutine this variable is assigned to, but the value -of this variable is not going to be used in the auto-generated code. -This keyword makes sense only if C is going to be accessed by the -user-supplied code. It is especially useful to make a function interface -more Perl-like, especially when the C return value is just an error condition -indicator. For example, +An XSUB declaration consists of a return type, name, parameters, and +optional C, C, C and C keywords. + +=head3 An XSUB's return type and the NO_OUTPUT keyword + +The return type can be any valid C type, including C. When non-void, +it serves two purposes. First, it causes a C auto variable of that type +to be declared, called C. Second, it (usually) makes the XSUB +return a single SV whose value is set to C's value at the time of +return. In addition, a non-void autocall XSUB will call the underlying C +library function and assign its return value to C. + +If the return type is prefixed with the C keyword, then the +C variable is still declared, but code to return its value is +suppressed. It is typically useful when making an autocall function +interface more Perl-like, especially when the C return value is just an +error condition indicator. 
For example, NO_OUTPUT int delete_file(char *name) + # implicit autocall code here: RETVAL = delete_file(name); POSTCALL: if (RETVAL != 0) croak("Error %d while deleting file '%s'", RETVAL, name); -Here the generated XS function returns nothing on success, and will die() -with a meaningful error message on error. +Here the generated XS function returns nothing on success, and will +C with a meaningful error message on error. The XSUB's return type +of C is only meaningful for declaring C and for doing the +autocall. + +The return type can also include the C and C +modifiers, which if present must be in that order, and come between any +C keyword and the return type. The C declaration must +be written exactly as shown, i.e. with a single space and with double +quotes around the C. These two modifiers are mainly of use for XSUBs +written in C++. A C++ XSUB declaration is also allowed to have a trailing +C keyword, which mimics the C++ syntax. See L +for more details. + +=head3 An XSUB's name + +The name of the XSUB is usually put on the line following the type, in +which case it must be on column one. It is permissible for both the return +type and name to be on the same line. + +The name can be any valid Perl subroutine name. The C value from +the most recent C declaration is used to give the XSUB its +fully-qualified Perl name. + +If the name includes the package separator, C<::>, then it is treated as +a C++ method declaration, and various extra bits of processing take +place, such as declaring an implicit C parameter. The XSUB's I +package name is still determined by the current XS package, and not the +C++ class name. See L for more details. + +=head3 An XSUB's parameter list + +Following the XSUB's name, there is a comma-separated list of parameters +within parentheses. Although this looks superficially the same as a C +function declaration, it is different.
In particular, it is parsed by the +XS compiler, which is a simple regex-based text processor and which +doesn't understand the full C type syntax; nor does it recognise C-style +comments. + +In fact all it does is extract the text between the C<(...)> and split on +commas, while having enough intelligence to ignore commas and a closing +parenthesis within a double-quoted string. Once each parameter declaration +is extracted, it is processed, as described below in +L. + +Each parameter declaration usually generates a C auto variable declaration +of the same name, along with initialisation code which assigns the value +of the corresponding passed argument to that variable. Under some +circumstances code can also be generated to return the value too. + +Note that the original XS syntax required the type for each parameter to +be specified separately in one or more INPUT sections, mimicking pre-C89 +"K&R" C syntax. To support this, directly after the declaration there is an +implicit INPUT section, without a need to include the actual keyword. You +will see this pattern very frequently in older XS code. + +Old style with an implicit INPUT keyword (a common pattern): + + int + foo(a, b) + long a + char *b + CODE: + ... + +Old style with explicit INPUT keyword (unusual): + + int + foo(a, b) + INPUT: + long a + char *b + CODE: + ... + +New style (recommended for new code): + + int + foo(long a, char *b) + CODE: + ... + +Generally there is no reason to use the old style any more, apart from a few +obscure features that can be specified on an INPUT line but not in the +signature. + -=head2 XSUB Parameters +=head2 An XSUB Parameter =head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords @@ -1971,7 +2108,7 @@ pointers. The return list of the generated Perl function consists of the C return value from the function (unless the XSUB is of C return type or -C was used) followed by all the C +C was used) followed by all the C and C parameters (in the order of appearance).
On the return from the XSUB the C/C Perl parameter will be modified to have the values written by the C function. @@ -2614,7 +2751,7 @@ executed after the C subroutine call is performed. When the POSTCALL: keyword is used it must precede OUTPUT: and CLEANUP: blocks which are present in the XSUB. -See examples in L<"The NO_OUTPUT Keyword">. +See an example in L<"An XSUB Declaration">. The POSTCALL: block does not make a lot of sense when the C subroutine call is supplied by user by providing either CODE: or PPCODE: section. From 955ad90794a6cad24897c2d5c2b4c98b6b14d280 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 18 Jul 2025 21:19:45 +0100 Subject: [PATCH 18/42] perlxs.pod: update section 'An XSUB Parameter' Add some initial text for this new section, and also add a new subsection "XSUB Parameter Placeholders". --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 97 ++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index f322b7f864d8..5eecf170fa03 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2083,6 +2083,103 @@ signature. =head2 An XSUB Parameter +Some examples of valid XSUB parameter declarations: + + char *foo # parameter with type + char *foo = "abc" # default value + char *foo = NO_INIT # doesn't complain if arg missing + OUT char *foo # caller's arg gets updated + IN_OUTLIST char *foo # parameter value gets returned + int length(foo) # pseudo-parameter that gets the length of foo + foo # placeholder, or parameter without type + SV* # placeholder + ... # ellipsis: zero or more further arguments + +The most straightforward type of declaration in an XSUB's parameter list +consists of just a C type followed by a parameter name, such as C. This has two main effects. 
First, it causes a C auto variable of +that name to be declared; and second, the variable is initialised to the +value of the passed argument which corresponds to that parameter. For +example, + + void + foo(int i, char *s) + +is roughly equivalent to the Perl: + + sub foo { + my $i = int($_[0]); + my $s = "$_[1]"; + ... + } + +and the generated C code may look something like: + + if (items != 2) + croak_xs_usage(cv, "i, s"); + + { + int i = (int)SvIV(ST(0)); + char *s = (char *)SvPV_nolen(ST(1)); + foo(i, s); /* autocall */ + ... + } + +In addition to the variable declaration and initialisation, the name of +the parameter will usually be used in the usage message and in any +autocall, as shown above. These variables are accessible for any user code +in a C block or similar. Their values aren't normally returned. + +There are several variations on this basic pattern, which are explained in +the following subsections. + +=head3 XSUB Parameter Placeholders + +Sometimes you want to skip an argument. There are two supported techniques +for efficiently declaring a placeholder. Both of these will completely +skip any declaration and initialisation of a C auto variable, but will +still consume an argument. + +A parameter is treated as a placeholder if it has a name but no +type specified: neither in the signature, nor in any following C +section. For example: + + void + foo(int a, b, char *c) + CODE: + ... + +is roughly equivalent to the Perl: + + sub foo { + my $a = int($_[0]); + my $c = "$_[2]"; + ... + } + +A parameter containing just the specific type C<SV*> and no name is +treated specially. A bug in the XS parser meant that it used to skip any +parameter declaration which wasn't parsable. This inadvertently made many +things de facto placeholder declarations. A common usage was C<SV*>, which +is now officially treated as a placeholder for backwards compatibility. +Any other bare types without a parameter name are errors since +C 3.57.
Note that the C text will appear in any +C error message. For example, + + void + foo(int a, SV*, char *c) + +may croak with: + + Usage: Foo::Bar::foo(a, SV*, c) at ... + +Placeholders can't be used with autocall unless you use C to +override the missing argument. For example: + + void + foo(int a, b, char *c) + C_ARGS: a, c + =head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords In the list of parameters for an XSUB, one can precede parameter names From 9717f2521d0d5f70f432d128a78d9776b94d47f5 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 21 Jul 2025 16:23:26 +0100 Subject: [PATCH 19/42] perlxs.pod: update IN_OUT etc section --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 219 ++++++++++++++++++--------- 1 file changed, 151 insertions(+), 68 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 5eecf170fa03..3c624daf6668 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2180,90 +2180,173 @@ override the missing argument. For example: foo(int a, b, char *c) C_ARGS: a, c -=head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords - -In the list of parameters for an XSUB, one can precede parameter names -by the C/C/C/C/C keywords. -C keyword is the default, the other keywords indicate how the Perl -interface should differ from the C interface. - -Parameters preceded by C/C/C/C -keywords are considered to be used by the C subroutine I. C/C keywords indicate that the C subroutine -does not inspect the memory pointed by this parameter, but will write -through this pointer to provide additional return values. - -Parameters preceded by C keyword do not appear in the usage -signature of the generated Perl function. - -Parameters preceded by C/C/C I appear as -parameters to the Perl function. With the exception of -C-parameters, these parameters are converted to the corresponding -C type, then pointers to these data are given as arguments to the C -function. 
It is expected that the C function will write through these -pointers. - -The return list of the generated Perl function consists of the C return value -from the function (unless the XSUB is of C return type or -C was used) followed by all the C -and C parameters (in the order of appearance). On the -return from the XSUB the C/C Perl parameter will be -modified to have the values written by the C function. - -For example, an XSUB +=head3 Updating and returning parameter values: the IN_OUT etc keywords - void - day_month(OUTLIST day, IN unix_time, OUTLIST month) - int day - int unix_time - int month + int i + IN int i + IN_OUT int i + IN_OUTLIST int i + OUT int i + OUTLIST int i -should be used from Perl as +Normally a parameter declaration causes a C auto variable of the same name +to be declared and initialised to the value of the corresponding passed +argument. These modifiers can make parameters also update or return +values, and can also cause the initialisation to be skipped. They come at +the start of a parameter declaration. - my ($day, $month) = day_month(time); +These modifiers address the issue that, because a simple C function takes +a fixed number of I parameters and returns a I value, +the basic XSUB syntax has been designed to reflect that pattern. -The C signature of the corresponding function should be +The usual way to make a more complex C function API is to pass pointers to +variables, which the C function will use to set or update the variables. 
+For example, a couple of hypothetical C functions might be called as: - void day_month(int *day, int unix_time, int *month); + int time = ....; // an integer in the range 0..86399 + int hour, min, sec; + parse_time(time, &hour, &min, &sec); // set hour, min, sec + increment_time(&hour, &min, &sec); // update hour, min, sec -The C/C/C/C/C keywords can be -mixed with ANSI-style declarations, as in +The XS C etc modifiers allow you to write XSUBs which can wrap +such functions with autocall, and in general update passed arguments or +return multiple values. - void - day_month(OUTLIST int day, int unix_time, OUTLIST int month) +The rules of these parameter modifiers are: + +=over + +=item * + +All such parameters, regardless of the modifier, cause a C auto variable +of the same name to be declared. + +=item * + +In the absence of a modifier, it defaults to C. + +=item * + +The text of the modifier has up to two parts, separated by an underscore. +The input part comes first and can have the value C<''> or C, while +the output part can be one of C<''>, C, or C. + +=item * + +An input part of C (the default) causes that variable to be +initialised to the value of the corresponding passed argument. Otherwise +the initialisation is skipped: in particular, this means that an autocall +function will be passed a pointer to an uninitialised value: the +assumption being that the library function will set, but not use, that +value. + +=item * + +An output part of C<''> (the default) means nothing is done to return the +value of the auto variable. + +=item * + +An output part of C or C causes the value to be returned in +some fashion, and in addition, any autocall code will prefix such +variables with C<&> when calling the wrapped C library function. -(here the optional C keyword is omitted). +=item * + +An output part of C causes the corresponding passed argument to be +I with the value of the variable. 
+ +=item * + +An output part of C causes the value of the variable to be +returned as an extra SV after the C value, if any. They are +returned in the order they appear in the XSUB's parameter list. + +=item * + +For the specific case of the modifier being C, it is a +pseudo-parameter, and I. It doesn't form part of +the signature of the XSUB, although it I used for any autocall. So for +example: + + int + foo(int a, OUTLIST int b, int c) -The C parameters are identical with parameters introduced with -L and put into the C section (see -L). The C parameters are very similar, -the only difference being that the value C function writes through the -pointer would not modify the Perl parameter, but is put in the output -list. +gets converted to roughly this C code: -The C/C parameter differ from C/C -parameters only by the initial value of the Perl parameter not -being read (and not being given to the C function - which gets some -garbage instead). For example, the same C function as above can be -interfaced with as + if (items != 2) + croak_xs_usage(cv, "a, c"); + { + int RETVAL; + int a = (int)SvIV(ST(0)); + int b; + int c = (int)SvIV(ST(1)); + RETVAL = foo(a, &b, c); + } + ... push the values of RETVAL and b onto the stack and return ... - void day_month(OUT int day, int unix_time, OUT int month); +=back -or +The approximate perl equivalents for these modifiers are given in the +examples below, where the perl code C stands in for the C +autocall C. 
+ + + int i sub foo { + IN int i my $i = $_[N]; + real_foo($i); + } + + IN_OUT int i sub foo { + my $i = $_[N]; + real_foo(\$i); + $_[N] = $i; + } + + IN_OUTLIST int i sub foo { + my $i = $_[N]; + real_foo(\$i); + return ..., $i, ...; + } + + OUT int i sub foo { + my $i; + real_foo(\$i); + $_[N] = $i; + + OUTLIST int i sub foo { + my $i; # NB $_[N] is not consumed + real_foo(\$i); + return ..., $i, ...; + } + +Together, they allow you to wrap C functions which use pointers to return +extra values; either preserving the C-ish API in perl (C), or +providing a more Perl-like API (C). For example, wrapping the +C function from the example above could be done using +C: void - day_month(day, unix_time, month) - int &day = NO_INIT - int unix_time - int &month = NO_INIT - OUTPUT: - day - month + parse_time(int time, \ + OUT int hour, OUT int min, OUT int sec) + +which could be called from perl as: + + my ($hour, $min, $sec); + # set ($hour, $min, $sec) to (23,59,59): + parse_time(86399, $hour, $min, $sec); + +Or by using C: + + void + parse_time(int time, \ + OUTLIST int hour, OUTLIST int min, OUTLIST int sec) + +which could be called from perl as: -However, the generated Perl function is called in very C-ish style: + # set ($hour, $min, $sec) to (23,59,59): + my ($hour, $min, $sec) = parse_time(86399); - my ($day, $month); - day_month($day, time, $month); =head3 Default Parameter Values From d8dcfe71824ed40d3b6f1bce721e574e01329aff Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sun, 27 Jul 2025 11:28:56 +0100 Subject: [PATCH 20/42] perlxs.pod: update default, length, ellipis params Rewrite (and retitle) these three subsections: =head3 Default Parameter Values =head3 The C Keyword =head3 Variable-length Parameter Lists --- XSUB.h | 2 +- dist/ExtUtils-ParseXS/lib/perlxs.pod | 207 +++++++++++++++++++-------- 2 files changed, 145 insertions(+), 64 deletions(-) diff --git a/XSUB.h b/XSUB.h index 87893d435106..1fc27cd3ff12 100644 --- a/XSUB.h +++ b/XSUB.h @@ -44,7 +44,7 @@ 
must be called prior to setup the C variable. =for apidoc Amn|Stack_off_t|items Variable which is setup by C to indicate the number of -items on the stack. See L. +items on the stack. See L. =for apidoc Amn|I32|ix Variable which is setup by C to indicate which of an diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 3c624daf6668..a038ae04c073 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2350,93 +2350,174 @@ which could be called from perl as: =head3 Default Parameter Values -Default values for XSUB arguments can be specified by placing an -assignment statement in the parameter list. The default value may -be a number, a string or the special string C. Defaults should -always be used on the right-most parameters only. + int + foo(int i, char *s = "abc") -To allow the XSUB for rpcb_gettime() to have a default host -value the parameters to the XSUB could be rearranged. The -XSUB will then call the real rpcb_gettime() function with -the parameters in the correct order. This XSUB can be called -from Perl with either of the following statements: + int + bar(int i, int j = i + ')', char *s = "abc,)") - $status = rpcb_gettime($timep, $host); + int + baz(int i, char *s = NO_INIT) + +Optional parameters can be indicated by appending C<= C_expression> to the +parameter declaration. The C expression will be evaluated if not enough +arguments are supplied. Parameters with default values should come after +any mandatory parameters (although this is currently not enforced by the +XS compiler). The value can be any valid compile-time or run-time C +expression (but see below), including the values of any parameters +declared to its left. The special value C indicates that the +parameter is kept uninitialised if there isn't a corresponding argument. + +The XS parser's handling of default expressions is rather simplistic. 
It +just wants to extract parameter declarations (including any optional +trailing default value) from a comma-separated list, but it doesn't +understand C syntax. It can handle commas and closing parentheses within a +quoted string, but currently not an escaped quote such as C<'\''> or +C<"\"">. Neither can it handle balanced parentheses such as C. + +Due to an implementation flaw, default value expressions are currently +evalled in double-quoted context during parsing, in a similar fashion to +typemap templates. So for example C is expanded to +C or similar. This behaviour may be fixed at some +point; in the meantime, it is best to avoid the C<$> and C<@> characters +within default value expressions. + +=head3 The C pseudo-parameter - $status = rpcb_gettime($timep); + int + foo(char *s, int length(s)) -The XSUB will look like the code which follows. A CODE: -block is used to call the real rpcb_gettime() function with -the parameters in the correct order for that function. +It is common for a C function to take a string pointer and length as two +arguments, while in Perl, string-valued SVs combine both the string and +length in a single value. To simplify generating the autocall code in such +situations, the C pseudo-parameter acts as the length of the +parameter C. It doesn't consume an argument or appear in the XSUB's +usage message, but it I passed to the autocalled C function. For +example, this XS: - bool_t - rpcb_gettime(timep, host="localhost") - char *host - time_t timep = NO_INIT - CODE: - RETVAL = rpcb_gettime(host, &timep); - OUTPUT: - timep - RETVAL + void + foo(char *s, short length(s), int t) -=head3 The C Keyword +translates to something similar to this C code: -If one of the input arguments to the C function is the length of a string -argument C, one can substitute the name of the length-argument by -C in the XSUB declaration. This argument must be omitted when -the generated Perl function is called. 
E.g., + if (items != 2) + croak_xs_usage(cv, "s, t"); - void - dump_chars(char *s, short l) { - short n = 0; - while (n < l) { - printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); - n++; - } + STRLEN STRLEN_length_of_s; + short XSauto_length_of_s; + char * s = (char *)SvPV(ST(0), STRLEN_length_of_s); + int t = (int)SvIV(ST(1)); + + XSauto_length_of_s = STRLEN_length_of_s; + foo(s, XSauto_length_of_s, t); } - MODULE = x PACKAGE = x +and might be called from Perl as: - void dump_chars(char *s, short length(s)) + foo("abcd", 9999); -should be called as C. +The exact C code generated will vary over releases, but the important +things to note are: -This directive is supported with ANSI-type function declarations only. +=over -=head3 Variable-length Parameter Lists +=item * + +The auto variable C will be declared with the +specified type and will be passed to any autocall function, but it won't +appear in the usage message. This variable is available for use in C +blocks and similar. -XSUBs can have variable-length parameter lists by specifying an ellipsis -C<(...)> in the parameter list. This use of the ellipsis is similar to that -found in ANSI C. The programmer is able to determine the number of -arguments passed to the XSUB by examining the C variable which the -B compiler supplies for all XSUBs. By using this mechanism one can -create an XSUB which accepts a list of parameters of unknown length. +=item * -The I parameter for the rpcb_gettime() XSUB can be -optional so the ellipsis can be used to indicate that the -XSUB will take a variable number of parameters. Perl should -be able to call this XSUB with either of the following statements. +The auto variable C is used in addition to allow +conversion between the type expected by C and the type declared +for the length pseudo-parameter. 
- $status = rpcb_gettime($timep, $host); +=item * - $status = rpcb_gettime($timep); +A length parameter can appear anywhere in the signature, even before the +string parameter of the same name; but its position in any autocall +matches its position in the signature. -The XS code, with ellipsis, follows. +=item * - bool_t - rpcb_gettime(timep, ...) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; +Each length parameter must match another parameter of the same name. That +parameter must be a string type (something which maps to the C +typemap type). + +=back + +=head3 Ellipsis: variable-length parameter lists + + int + foo(char *s, ...) + +An XSUB can have a variable-length parameter list by specifying an +ellipsis as the last parameter, similar to C function declarations. Its +main effect is to disable the error check for too many parameters. Any +declared parameters will still be processed as normal, but the programmer +will have to access any extra arguments manually, making use of the +C macro to access the nth item on the stack (counting from 0), and +the C variable, which indicates the total number of passed +arguments, including any fixed arguments. + +Note that currently XS doesn't provide any mechanism to autocall +variable-length C functions, so the ellipsis should only be used on XSUBs +which have a body. + +For example, consider this Perl subroutine which returns the sum of all +of its arguments which are within a specified range: + + sub minmax_sum { + my $min = shift; + my $max = shift; + my $RETVAL = 0; + $RETVAL += $_ for grep { $min <= $_ && $_ <= $max } @_; + return $RETVAL; + } + +This XSUB provides equivalent functionality: + + int + minmax_sum(int min, int max, ...) 
CODE: - if (items > 1) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime(host, &timep); + { + int i = 2; /* skip the two fixed arguments */ + RETVAL = 0; + + for (; i < items; i++) { + int val = (int)SvIV(ST(i)); + if (min <= val && val <= max) + RETVAL += val; + } + } OUTPUT: - timep RETVAL +It is possible to write an XSUB which both accepts and returns a list. For +example, this XSUB does the equivalent of the Perl C + + void + triple(...) + PPCODE: + SP += items; + { + int i; + for (i = 0; i < items; i++) { + int val = (int)SvIV(ST(i)); + ST(i) = sv_2mortal(newSViv(val*3)); + } + } + +Note that the L keyword, in comparison to +C, resets the local copy of the argument stack pointer, and relies +on the coder to place any return values on the stack. The example above +reclaims the passed arguments by setting C back to the top of the +stack, then replaces the items on the stack one by one. + =head2 The XSUB Input Part XXX TBC @@ -3247,7 +3328,7 @@ included in that case. A CASE: might switch via a parameter of the XSUB, via the C ALIAS: variable (see L<"The ALIAS: Keyword">), or maybe via the C variable -(see L<"Variable-length Parameter Lists">). The last CASE: becomes the +(see L<"Ellipsis: variable-length parameter lists">). The last CASE: becomes the B case if it is not associated with a conditional. The following example shows CASE switched via C with a function C having an alias C. When the function is called as From 34f73e100b0f43b4809dec927aaa684eed1b58f2 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 30 Jul 2025 14:32:29 +0100 Subject: [PATCH 21/42] perlxs.pod: update: Input part, PREINIT sections Add text for the new '=head2 The XSUB Input Part' section, and rewrite the existing entry for the PREINIT keyword. 
--- dist/ExtUtils-ParseXS/lib/perlxs.pod | 157 +++++++++++---------------- 1 file changed, 62 insertions(+), 95 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a038ae04c073..b0d9b2c1b6f8 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2520,111 +2520,78 @@ stack, then replaces the items on the stack one by one. =head2 The XSUB Input Part -XXX TBC - -XXX NB: the keywords described in L and L may also appear in this part, plus C and -C. +Following an XSUB's declaration part, the body of the XSUB follows. The +first part of the body is the I part, and is mainly concerned with +declaring auto variables and assigning to them values extracted from the +passed parameters. The two main keywords associated with this activity are +L and L. The +first allows you to inject extra variable declaration lines, while the +latter used to be needed to specify the type of each parameter, but is now +mainly of historical interest. This is also the place for the rarely-used +L keyword. + +Note that the keywords described in L and L may also appear in this part, plus the L and L keywords. =head3 The PREINIT: Keyword -The PREINIT: keyword allows extra variables to be declared immediately -before or after the declarations of the parameters from the INPUT: section -are emitted. - -If a variable is declared inside a CODE: section it will follow any typemap -code that is emitted for the input parameters. This may result in the -declaration ending up after C code, which is C syntax error. Similar -errors may happen with an explicit C<;>-type or C<+>-type initialization of -parameters is used (see L<"Initializing Function Parameters">). Declaring -these variables in an INIT: section will not help. - -In such cases, to force an additional variable to be declared together -with declarations of other variables, place the declaration into a -PREINIT: section. 
The PREINIT: keyword may be used one or more times -within an XSUB. - -The following examples are equivalent, but if the code is using complex -typemaps then the first example is safer. + PREINIT: + int i; + char *prog_name = get_prog_name(); + +This keyword allows extra variables to be declared and possibly +initialised immediately before the declarations of auto variables +generated from any parameter declarations or C lines. Any lines +following C until the next keyword (except POD and XS comments) +are copied out as-is to the C code file. Multiple C keywords are +allowed. + +It is sometimes needed because in traditional C, all variable +declarations must come before any statements. While this is no longer a +restriction in the perl interpreter source since 5.36.0, the C compiler +flags used when compiling XS code may be different and so, depending on +the compiler, it may still be necessary to preserve the correct ordering. + +Any variable declarations generated by C and lines from C +are output in the same order they appear in the XS source, followed by any +variable declarations generated from the XSUB's parameter declarations. +These may be followed by statements to initialise those variables. +Thus, any variable declarations in a later C or C block may be +flagged as a declaration-after-statement. + +C code shouldn't assume that any variables declared earlier have +already been initialised; initialisation is deferred if the initialisation +code (typically obtained from a typemap) isn't of the simple C form, or has a default value. - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - RETVAL = rpcb_gettime(host, &timep); - OUTPUT: - timep - RETVAL +For example: -For this particular case an INIT: keyword would generate the -same C code as the PREINIT: keyword.
Another correct, but error-prone example: + void + foo(int i = 0) + PREINIT: + int j = 1; + CODE: + bar(i, j); - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - char *host = "localhost"; - RETVAL = rpcb_gettime(host, &timep); - OUTPUT: - timep - RETVAL +might be translated into C code similar to: -Another way to declare C is to use a C block in the CODE: section: + { + int j = 1; + int i; - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - { - char *host = "localhost"; - RETVAL = rpcb_gettime(host, &timep); + if (items < 1) + i = 0; + else { + i = (int)SvIV(ST(0)); } - OUTPUT: - timep - RETVAL - -The ability to put additional declarations before the typemap entries are -processed is very handy in the cases when typemap conversions manipulate -some global state: - - MyObject - mutate(o) - PREINIT: - MyState st = global_state; - INPUT: - MyObject o; - CLEANUP: - reset_to(global_state, st); - -Here we suppose that conversion to C in the INPUT: section and from -MyObject when processing RETVAL will modify a global variable C. -After these conversions are performed, we restore the old value of -C (to avoid memory leaks, for example). - -There is another way to trade clarity for compactness: INPUT sections allow -declaration of C variables which do not appear in the parameter list of -a subroutine. Thus the above code for mutate() can be rewritten as - - MyObject - mutate(o) - MyState st = global_state; - MyObject o; - CLEANUP: - reset_to(global_state, st); - -and the code for rpcb_gettime() can be rewritten as + bar(i, j); + } - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - char *host = "localhost"; - C_ARGS: - host, &timep - OUTPUT: - timep - RETVAL +Usually you could dispense with C by just wrapping the code in +C blocks in braces, but it may be necessary if the ordering of the +variable initialisations is sensitive, e.g. if it is affected by some +changing global state.
=head3 The INPUT: Keyword From 89660c7fcbde239a1a5147f163a91444fd8d40a8 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 1 Aug 2025 10:01:44 +0100 Subject: [PATCH 22/42] perlxs.pod: update 'The INPUT: Keyword' section This commit completely rewrites this section and subsections: =head3 The INPUT: Keyword =head4 The NO_INIT Keyword =head4 Initializing Function Parameters =head4 The & Unary Operator It de-emphasises the INPUT keyword and suggests using ANSI XS signatures etc instead. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 250 ++++++++++++--------------- 1 file changed, 114 insertions(+), 136 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index b0d9b2c1b6f8..420c86f62ff7 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2595,166 +2595,144 @@ changing global state. =head3 The INPUT: Keyword -The XSUB's parameters are usually evaluated immediately after entering the -XSUB. The INPUT: keyword can be used to force those parameters to be -evaluated a little later. The INPUT: keyword can be used multiple times -within an XSUB and can be used to list one or more input variables. This -keyword is used with the PREINIT: keyword. + void + foo(a, b, c, d, e, int f) + # implicit INPUT section + int a + # explicit INPUT section + INPUT: + long &b + int c = ($type)MySvIV($arg) + int d = NO_INIT + int e + if (some_condition) { $var += 1 } + ... -The following example shows how the input parameter C can be -evaluated late, after a PREINIT. +Immediately following an XSUB's declaration, there is an implicit C +section, i.e. the parser behaves as if there was a literal "C +line injected before the first line of the body. This can be followed by +zero or more explicit C sections, possibly interleaved with other +keywords and sections such as C. 
- bool_t - rpcb_gettime(host, timep) - char *host - PREINIT: - time_t tt; - INPUT: - time_t timep - CODE: - RETVAL = rpcb_gettime(host, &tt); - timep = tt; - OUTPUT: - timep - RETVAL +When XS was first created, it was modelled on the syntax of pre-ANSI C, +which required the types of parameters to be separately specified. It was +later updated to allow parameter types to be specified in the parameter +list, like ANSI C. Thus there is rarely any good reason to use C +sections now; but you will often encounter them in older code. -The next example shows each input parameter evaluated late. +Each C line, at a minimum, specifies the type of a parameter listed +in the XSUB's signature, e.g. - bool_t - rpcb_gettime(host, timep) - PREINIT: - time_t tt; - INPUT: - char *host + char *s + +In addition, the variable name may be prefixed with C<&> to indicate that +a pointer to the variable should be passed to any autocall function; and +may have a postfix initialisation modifier starting with one of the three +characters C<= + ;>. + +Note that if a variable name doesn't match any of the declared parameters, +then it I be treated as an auto variable declaration (depending on +the perl version and on whether it has an initialisation override). This +misfeature may be deprecated at some point in the future, so don't rely on +it: use a C section if necessary. 
These two examples are mostly +equivalent, with the first form being preferred: + + void + foo(int a) PREINIT: - char *h; - INPUT: - time_t timep - CODE: - h = host; - RETVAL = rpcb_gettime(h, &tt); - timep = tt; - OUTPUT: - timep - RETVAL + short b = 1; -Since INPUT sections allow declaration of C variables which do not appear -in the parameter list of a subroutine, this may be shortened to: + void + foo(a) + int a + short b = 1; - bool_t - rpcb_gettime(host, timep) - time_t tt; - char *host; - char *h = host; - time_t timep; - CODE: - RETVAL = rpcb_gettime(h, &tt); - timep = tt; +=head4 The & variable modifier in INPUT + +The C<&> variable modifier has the single effect that the corresponding +argument passed to an autocall function will have the variable name +prefixed with C<&>. Combined with C, this allows the +I
of a variable to be passed to a wrapped function, which updates +that variable's value; on return, the XSUB updates the caller's arg with +that value. The modern equivalent of this is to declare the parameter as +C. These two XSUBs are equivalent: + + void + foo(IN_OUT int i) + + void + foo(i) + int &i OUTPUT: - timep - RETVAL + i -(We used our knowledge that input conversion for C is a "simple" one, -thus C is initialized on the declaration line, and our assignment -C is not performed too early. Otherwise one would need to have the -assignment C in a CODE: or INIT: section.) +and they both wrap a C function called foo() that takes a single C +argument which (presumably) updates the integer pointed to. They both +generate C code similar to this: -=head4 The NO_INIT Keyword + int i = (int)SvIV(ST(0)); + foo(&i); + sv_setiv(ST(0), (IV)i); -The NO_INIT keyword is used to indicate that a function -parameter is being used only as an output value. The B -compiler will normally generate code to read the values of -all function parameters from the argument stack and assign -them to C variables upon entry to the function. NO_INIT -will tell the compiler that some parameters will be used for -output rather than for input and that they will be handled -before the function terminates. -The following example shows a variation of the rpcb_gettime() function. -This function uses the timep variable only as an output variable and does -not care about its initial contents. +=head4 Altering variable initialisation in INPUT - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep = NO_INIT - OUTPUT: - timep +Normally each declared parameter causes a C auto variable of the same name +to be declared, and for code to be planted which initialises that variable +to the value of the corresponding passed argument. The initialisation code +is usually obtained by expanding the typemap template corresponding to +the parameter's type. 
It is possible to override, augment, or skip that +initialisation code, by appending one of the three characters C<= + ;> and +an initialiser expression, to the C line. -=head4 Initializing Function Parameters - -C function parameters are normally initialized with their values from -the argument stack (which in turn contains the parameters that were -passed to the XSUB from Perl). The typemaps contain the -code segments which are used to translate the Perl values to -the C parameters. The programmer, however, is allowed to -override the typemaps and supply alternate (or additional) -initialization code. Initialization code starts with the first -C<=>, C<;> or C<+> on a line in the INPUT: section. The only -exception happens if this C<;> terminates the line, then this C<;> -is quietly ignored. - -The following code demonstrates how to supply initialization code for -function parameters. The initialization code is eval'ed within double -quotes by the compiler before it is added to the output so anything -which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] -must be protected with backslashes. The variables C<$var>, C<$arg>, -and C<$type> can be used as in typemaps. + void + foo(a,b,c,d,e,f,g) - bool_t - rpcb_gettime(host, timep) - char *host = (char *)SvPVbyte_nolen($arg); - time_t &timep = 0; - OUTPUT: - timep + # Use the standard typemap entry: + int a + # and with optional trailing semicolon + int b; -This should not be used to supply default values for parameters. One -would normally use this when a function parameter must be processed by -another library function before it can be used. Default parameters are -covered in the next section. + # Override the typemap entry: + int c = ($type)MySvIV($arg) -If the initialization begins with C<=>, then it is output in -the declaration for the input variable, replacing the initialization -supplied by the typemap. 
If the initialization -begins with C<;> or C<+>, then it is performed after -all of the input variables have been declared. In the C<;> -case the initialization normally supplied by the typemap is not performed. -For the C<+> case, the declaration for the variable will include the -initialization from the typemap. + # Skip the initialisation entirely: + int d = NO_INIT + int e ; NO_INIT -=head4 The & Unary Operator + # Add deferred initialisation code + # *in addition* to the standard init: + int f + if (some_condition) { $var += 1 } -The C<&> unary operator in the INPUT: section is used to tell B -that it should convert a Perl value to/from C using the C type to the left -of C<&>, but provide a pointer to this value when the C function is called. + # Add deferred initialisation code + # *instead of* the standard init: + int g ; if (some_condition) { $var += 1 } -This is useful to avoid a CODE: block for a C function which takes a parameter -by reference. Typically, the parameter should be not a pointer type (an -C or C but not an C or C). +Any override code is passed through template expansion in the same way +that typemap templates are, with C<$var>, C<$arg>, C<$type> etc being +expanded. Deferred initialisation code is placed after all variable +declarations. -The following XSUB will generate incorrect C code. The B compiler will -turn this into code which calls C with parameters C<(char -*host, time_t timep)>, but the real C wants the C -parameter to be of type C rather than C. +In modern XS where C is not often used, some of these initialiser +effects can be achieved in other ways: - bool_t - rpcb_gettime(host, timep) - char *host - time_t timep - OUTPUT: - timep +=over -That problem is corrected by using the C<&> operator. The B compiler -will now turn this into code which calls C correctly with -parameters C<(char *host, time_t *timep)>. It does this by carrying the -C<&> through, so the function call looks like C. 
+=item * - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - OUTPUT: - timep +an overridden typemap entry could be specified by using C +to add a template for the type of this variable; + +=item * + +skipping initialisation can be achieved using the C and C +parameter declaration modifiers; + +=item * + +adding deferred initialisation code may be achievable via C or +C blocks. + +=back =head3 The SCOPE: Keyword From fd4933d54960a9ba07915a4d71b4011dc38b61ae Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 1 Aug 2025 15:39:16 +0100 Subject: [PATCH 23/42] perlxs.pod: update 'SCOPE: Keyword' section --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 78 +++++++++++++++++----------- 1 file changed, 48 insertions(+), 30 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 420c86f62ff7..f8f78c7a73fa 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1675,8 +1675,8 @@ it will be treated as the start of a new XSUB. =back The following file-scoped keywords are supported. Note that the -L can technically be a file-scoped keyword too, -but is described further down as an XSUB keyword. +L can technically be a +file-scoped keyword too, but is described further down as an XSUB keyword. =head3 The REQUIRE: Keyword @@ -2528,7 +2528,7 @@ L and L. The first allows you to inject extra variable declaration lines, while the latter used to be needed to specify the type of each parameter, but is now mainly of historical interest. This is also the place for the rarely-used -L keyword. +L keyword. Note that the keywords described in L and L may also appear in this part, plus the L blocks. =back -=head3 The SCOPE: Keyword - -The SCOPE: keyword allows scoping to be enabled for a particular XSUB. -Its effect is to wrap the main body of the XSUB (i.e. the C or -C or implicit) with an C and C pair. 
This has the -effect of clearing any accumulated savestack entries at the end of the -code body. It is disabled by default. - -The SCOPE keyword may appear either within the XSUB body (anywhere before -a C could appear), or just before the XSUB declaration, but part of -the same paragraph (i.e. no intervening blank lines). For example: +=head3 The SCOPE: Keyword and typemap entry + # XSUB-scoped void - foo() - INPUT: - ... - PREINIT: - ... + foo(int i) SCOPE: ENABLE CODE: ... + # file-scoped SCOPE: ENABLE void - bar() - ... + bar(int i) + CODE: + ... -The first form (within the XSUB body) has been available since perl-5.004, -but was broken by perl-5.12.0 (xsubpp v2.21) and fixed in perl-5.44.0 -(xsubpp v3.58). The second form has been available since perl-5.12.0 . + # typemap entry + TYPEMAP: <, then scoping will -be automatically enabled for any XSUB which uses that typemap entry for an -C parameter. This currently only works for parameters whose type -is specified in a separate C line rather than any ANSI-style -declaration (C). +The C keyword can be used to enable scoping for a particular XSUB +(disabled by default). Its effect is to wrap the main body of the XSUB +(including most parameter and return value processing) within an C<{ +ENTER;> and C pair. This has the effect of clearing any +accumulated savestack entries at the end of the code body. If disabled, +then the savestack will usually be cleared by the caller anyway, so this +is a rarely-used keyword. + +The SCOPE keyword may be either XSUB-scoped or file-scoped (this refers to +the scope of the keyword within the XS file, not to the scope generated by +the keyword). For the first, it may appear anywhere in the input part of +the XSUB. For the latter, it may appear anywhere in file scope, but due to +a long-standing parser bug, the keyword's state is reset at the start of +each XSUB, so it will only have any effect if it appears just before an XSUB +declaration and as part of the same paragraph (i.e. 
with no intervening +blank lines), such as in the example above. It will only affect the single +following XSUB. + +The XSUB-scoped form has been available since perl-5.004, but was broken +by perl-5.12.0 (F v2.21) and fixed in perl-5.44.0 (F +v3.58). The file-scoped form has been available since perl-5.12.0 . + +To support potentially complex type mappings, if an C typemap entry +contains a code comment like C, then scoping will be +automatically enabled for any XSUB which uses that typemap entry. This +currently only works for parameters whose type is specified using +old-style C lines rather than an ANSI-style declaration, i.e. not +for C. In fact, the XS parser, when looking for a SCOPE +comment in a typemap, is currently very lax: it's actually a +case-insensitive match of any code comment which contains the text "scope" +plus anything else. But you shouldn't rely on this; always use the form +shown here. Even better, just don't use it at all. =head2 The XSUB Init Part From 7fd59c0c00df4a27daa99b0e9d0c37d033a58d9a Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 4 Aug 2025 18:44:11 +0100 Subject: [PATCH 24/42] perlxs.pod: update: init part, INIT sections Add text for the new '=head2 The XSUB Init Part' section, and rewrite the existing entry for the INIT keyword. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 50 +++++++++++++--------------- 1 file changed, 24 insertions(+), 26 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index f8f78c7a73fa..9c39e77f4918 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2792,39 +2792,37 @@ shown here. Even better, just don't use it at all. =head2 The XSUB Init Part -XXX TBC - -XXX NB: the keywords described in L and L may also appear in this part, plus C C and -C. +Following an XSUB's input part, an optional init part follows. 
This +consists solely of the C<INIT> keyword described below, plus the keywords +described in L and L, plus the +L, L and +L keywords. =head3 The INIT: Keyword -The INIT: keyword allows initialization to be inserted into the XSUB before -the compiler generates the call to the C function. Unlike the CODE: keyword -above, this keyword does not affect the way the compiler handles RETVAL. +The C<INIT> keyword allows arbitrary initialisation code to be inserted after +any variable declarations (and their initialisations), but before the main +body of code. It is primarily intended for use when the main body is an +autocall to a C function. For example these two XSUBs are equivalent: - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep + int + foo(int i) INIT: - printf("# Host is %s\n", host); - OUTPUT: - timep - -Another use for the INIT: section is to check for preconditions before -making a call to the C function: + if (i < 0) + XSRETURN_UNDEF; - long long - lldiv(a, b) - long long a - long long b - INIT: - if (a == 0 && b == 0) + int + foo(int i) + CODE: + if (i < 0) XSRETURN_UNDEF; - if (b == 0) - croak("lldiv: cannot divide by 0"); + RETVAL = foo(i); + OUTPUT: + RETVAL + +Any lines following C<INIT> until the next keyword (except POD and XS +comments) are copied out as-is to the C code file. Multiple C<INIT> +keywords are allowed. 
=head2 The XSUB Code Part From 8e0e4dec38a8646173db58ab41cb2e2b6c3f1ab8 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 6 Aug 2025 13:04:47 +0100 Subject: [PATCH 25/42] perlxs.pod: update: code part, autocall, C_ARGS Add text to the new =head2 The XSUB Code Part =head3 Auto-calling a C function sections, and rewrite the existing =head4 The C_ARGS: Keyword section --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 151 +++++++++++++++++++++++---- 1 file changed, 128 insertions(+), 23 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 9c39e77f4918..2074e741029e 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2826,40 +2826,145 @@ keywords are allowed. =head2 The XSUB Code Part -XXX TBC - -XXX NB: the keywords described in L and L may also appear in this part. +Following an XSUB's optional init part, an optional code part follows. This +consists mainly of the C<CODE> or C<PPCODE> keywords, which provide the +code block for the main body of the XSUB. These two keywords are similar, +except that C<PPCODE> can be thought of as acting at a lower level; it +resets the stack pointer to the base of the stack frame and then relies on +the programmer to push any return values; whereas C<CODE> will (with +prompting) automatically generate code to return the value of C<RETVAL>. + +There is also a rarely-used C<NOT_IMPLEMENTED_YET> keyword which generates +a body which croaks. + +Only one of these keywords may appear in this part, and at most once; and +no other keywords are recognised in this part (although such keywords +could instead be processed in the tail or head of the preceding and +following init and output parts). + +In the absence of any of those three keywords, the XS compiler will +generate an autocall: a call to the C function of the same name as the +XSUB. 
=head3 Auto-calling a C function -XXX TBC +In the absence of any explicit main body code via C or C, +the XS parser will generate a body for you automatically (this is referred +to as C in this document). In its most basic form, the parser +assumes that the XSUB will be a simple wrapper for a C function of the +same name, with the same parameters and return type as the XSUB. So for +example, these two XSUB definitions are equivalent, but the first is an +autocall with less boilerplate needed: -=head4 The C_ARGS: Keyword + int + foo(char *s, short flags) -The C_ARGS: keyword allows creating of XSUBS which have different -calling sequence from Perl than from C, without a need to write -CODE: or PPCODE: section. The contents of the C_ARGS: paragraph is -put as the argument to the called C function without any change. + int + foo(char *s, short flags) + CODE: + RETVAL = foo(s, flags); + OUTPUT: + RETVAL -For example, suppose that a C function is declared as +Note that the XSUB C function and the wrapped C function are two +different entities; the first will have a name like C; +when Perl code calls the 'Perl' function C, behind the +scenes the Perl interpreter calls C, which extracts the +string and short int values from the two passed argument SVs, calls +C, then stuffs its return value into an SV and returns that to the +Perl caller. - symbolic nth_derivative(int n, symbolic function, int flags); +The two basic types of generated autocall code are: -and that the default flags are kept in a global C variable -C. Suppose that you want to create an interface which -is called as + foo(a, b, c); - $second_deriv = $function->nth_derivative(2); + RETVAL = foo(a, b, c); -To do this, declare the XSUB as +depending on whether the XSUB is declared C or not. The variables +passed to the function are usually just the names of the XSUB's +parameters, in the same order. Parameters with default values are +included, while ellipses are ignored. 
So for example - symbolic - nth_derivative(function, n) - symbolic function - int n - C_ARGS: - n, function, default_flags + int + foo(int a, int b = 0, ...) + +generates this autocall code: + + RETVAL = foo(a, b); + +There are various keywords which can be used to modify the basic behaviour +of an autocall. + +=over + +=item * + +The L keyword, which allows wrapped C functions +which share a common prefix in their names to be mapped to perl functions +whose names don't have that prefix. + +=item * + +The +L +etc parameter modifiers, which cause that parameter to be passed to the +autocalled function with a C<&> prefix, on the assumption that the +wrapped function expects a pointer and will update the location pointed +to. + +=item * + +The L pseudo-parameter"> +pseudo-parameter, which allows the length of another parameter to be +passed as a separate argument to the wrapped function, even though it +isn't a parameter of the Perl function. + +=item * + +The L keyword, which allows the arguments +passed to the wrapped function to be completely overridden: handy when +arguments need to be skipped or reordered compared with the perl +function. + +=item * + +The L keyword, which allows code to be added +directly before the autocall. + +=item * + +The L keyword, which allows code to be +added directly after the autocall. + +=item * + +Support for L XSUBs, which can (among other +things) modify the autocall into a C++ method call, e.g. +C<< THIS->foo(s,flags) >>. + +=back + + +=head4 The C_ARGS: Keyword + + void foo1(int a, int b, int c) + C_ARGS: b, a + + void foo2(int a, int b) + C_ARGS: a < 0 ? 0 : a, + b, + 0 + +Normally the arguments for an autocall are generated automatically, based +on the XSUB's parameter declarations. The C keyword allows you to +override this and manually specify the text that will be placed between +the parentheses in the autocall. 
This is useful when the ordering and +nature of parameters varies between Perl and C, without a need to write a +C or C section. + +The C section consists of all lines of text until the next keyword +or to the end of the XSUB, and is used without modification (except that +any POD or XS comments will be stripped). =head3 The CODE: Keyword From 4e8a05c11876b80f8dda57de855841343b4bedeb Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 9 Aug 2025 17:32:42 +0100 Subject: [PATCH 26/42] perlxs.pod: update: CODE, PPCODE Rewrite these sections: =head3 The CODE: Keyword =head3 The PPCODE: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 272 +++++++++++++++++++-------- 1 file changed, 189 insertions(+), 83 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 2074e741029e..8965d913352d 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2968,97 +2968,203 @@ any POD or XS comments will be stripped). =head3 The CODE: Keyword -This keyword is used in more complicated XSUBs which require -special handling for the C function. The RETVAL variable is -still declared, but it will not be returned unless it is specified -in the OUTPUT: section. - -The following XSUB is for a C function which requires special handling of -its parameters. The Perl usage is given first. - - $status = rpcb_gettime("localhost", $timep); - -The XSUB follows. - - bool_t - rpcb_gettime(host, timep) - char *host - time_t timep + int + abs_double(int i) CODE: - RETVAL = rpcb_gettime(host, &timep); + if (i < 0) + i = -i; + RETVAL = i * 2; OUTPUT: - timep - RETVAL + RETVAL + +The C keyword is the usual mechanism for providing your own code as +the main body of the XSUB. It is typically used when the XSUB, rather than +wrapping a library function, is providing general functionality which can +be more easily or efficiently implemented in C than in Perl. 
+Alternatively, it can still be used to wrap a library function for cases +which are too complex for autocall to handle. + +Note that on entry to the C block of code, the values of any passed +arguments will have been assigned to auto variables, but the original SVs +will still be on the stack and accessible via C if necessary. + +Similarly to autocall XSUBs, a C variable is declared if the +return value of the XSUB is not C. Unlike autocall, you have to +explicitly tell the XS compiler to generate code to return the value of +C, by using the L keyword. +(Requiring this was probably a bad design decision, but we're stuck with +it now.) Newer XS parsers will warn if C is seen in the C +section without a corresponding C section. + +A C XSUB will typically return just the C value (or possibly +more items with the C parameter modifiers). To take complete +control over returning values, you can use the C keyword instead. +Note that it is possible for a C section to do this too, by doing its +own stack manipulation and then doing an C to return directly +while indicating that there are C items on the stack. This bypasses the +normal C etc that the XS parser will have planted after the +C lines. But it is usually cleaner to use C instead. + +Any lines following C until the next keyword (except POD and XS +comments) are copied out as-is to the C code file. Multiple C +keywords are not allowed. =head3 The PPCODE: Keyword -The PPCODE: keyword is an alternate form of the CODE: keyword and is used -to tell the B compiler that the programmer is supplying the code to -control the argument stack for the XSUBs return values. Occasionally one -will want an XSUB to return a list of values rather than a single value. -In these cases one must use PPCODE: and then explicitly push the list of -values on the stack. The PPCODE: and CODE: keywords should not be used -together within the same XSUB. 
- -The actual difference between PPCODE: and CODE: sections is in the -initialization of C macro (which stands for the I Perl -stack pointer), and in the handling of data on the stack when returning -from an XSUB. In CODE: sections SP preserves the value which was on -entry to the XSUB: SP is on the function pointer (which follows the -last parameter). In PPCODE: sections SP is moved backward to the -beginning of the parameter list, which allows C macros -to place output values in the place Perl expects them to be when -the XSUB returns back to Perl. - -The generated trailer for a CODE: section ensures that the number of return -values Perl will see is either 0 or 1 (depending on the Cness of the -return value of the C function, and heuristics to work around CODE -setting C on a C XSUB. The trailer generated for a PPCODE: section -is based on the number of return values and on the number of times -C was updated by C<[X]PUSH*()> macros. - -Note that macros C, C and C work equally -well in CODE: sections and PPCODE: sections. - -The following XSUB will call the C rpcb_gettime() function -and will return its two output values, timep and status, to -Perl as a single list. + # XS equivalent of: sub one_to_n { my $n = $_[0]; 1..$n } void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - bool_t status; + one_to_n(int n) PPCODE: - status = rpcb_gettime(host, &timep); - EXTEND(SP, 2); - PUSHs(sv_2mortal(newSViv(status))); - PUSHs(sv_2mortal(newSViv(timep))); - -Notice that the programmer must supply the C code necessary -to have the real rpcb_gettime() function called and to have -the return values properly placed on the argument stack. - -The C return type for this function tells the B compiler that -the RETVAL variable is not needed or used and that it should not be created. -In most scenarios the void return type should be used with the PPCODE: -directive. - -The EXTEND() macro is used to make room on the argument -stack for 2 return values. 
The PPCODE: directive causes the -B compiler to create a stack pointer available as C, and it -is this pointer which is being used in the EXTEND() macro. -The values are then pushed onto the stack with the PUSHs() -macro. - -Now the rpcb_gettime() function can be used from Perl with -the following statement. - - ($status, $timep) = rpcb_gettime("localhost"); - -When handling output parameters with a PPCODE section, be sure to handle -'set' magic properly. See L for details about 'set' magic. + { + int i; + if (n < 1) + Perl_croak_nocontext( + "one_to_n(): argument %d must be >= 1", n); + EXTEND(SP, n); + for (i = 1; i <= n; i++) + mPUSHi(i); + } + +The C keyword is similar to the C keyword, except that on +entry it resets the stack pointer to the base of the current stack frame, +and it doesn't generate any code to return C or similar: pushing +return values onto the stack is left to the programmer. In this way it can +be viewed as a lower-level alternative to C, when you want to take +full control of manipulating the argument stack. The "PP" in its name +stands for "PUSH/PULL", reflecting the low-level stack manipulation. +C is typically used when you want to return several values or even +an arbitrary list, compared with C, which normally returns just the +value of C. + +The C keyword must be the last keyword in the XSUB. Any lines +following C until the end of the XSUB (except POD and XS comments) +are copied out as-is to the C code file. Multiple C keywords are +not allowed. + +Typically you declare a C XSUB with a return type of C; any +other return type will cause a C auto variable of that type to be +declared, which will be otherwise unused. + +On entry to the C block of code, the values of any declared +parameters will have already been assigned to auto variables, +but the original SVs will still be on the stack and initially accessible +via C if necessary. 
But the default assumption for a C +block is that you have already finished processing any supplied arguments, +and that you want to push a number of return values onto the stack. The +simple C example shown above is based on that assumption. But +more complex strategies are possible. + +There are basically two ways to access and manipulate the stack in a +C block. First, by using the C macro, to get, modify, or +replace the Ith item in the current stack frame, and secondly to push +(usually temporary) return values onto the stack. The first uses the +hidden C variable, which is set on entry to the XSUB, and is the index +of the base of the current stack frame. This remains unchanged throughout +execution of the XSUB. The second approach uses the local stack pointer, +C (more on that below), which on entry to the C block points +to the base of the stack frame. Macros like C store a temporary +SV at that location, then increment C. On return from a C +XSUB, the current value of C is used to indicate to the caller how +many values are being returned. + +In general these two ways of accessing the stack should not be mixed, or +confusion is likely to arise. The PUSH strategy is most useful when you +have no further use for the passed arguments, and just want to generate +and return a list of values, as in the C example above. The +C strategy is better when you still need to access the passed +arguments. In the example below, + + # XS equivalent of: sub triple { map { $_ * 3} @_ } + + void + triple(...) + PPCODE: + SP += items; + { + int i; + for (i = 0; i < items; i++) { + int val = (int)SvIV(ST(i)); + ST(i) = sv_2mortal(newSViv(val*3)); + } + } + +C is first incremented to reclaim the passed arguments which are still +on the stack; then one by one, each passed argument is retrieved, and then +each stack slot is replaced with a new mortal value. 
When the loop is +finished, the current stack frame contains a list of mortals, which is +then returned to the caller, with C indicating how many items are +returned. + +Before pushing return values onto the stack (or storing values at C +locations higher than the number of passed arguments), it is necessary to +ensure there is sufficient space on the stack. This can be achieved either +through the C macro as shown in the C example +above, or by using the 'X' variants of the push macros, such as +C, which can be used to check and extend the stack by one each +time. Doing a single C in advance is more efficient. C +will ensure that there is at least enough space on the stack for n further +items to be pushed. + +If using the PUSH strategy, it is useful to understand in more detail how +pushing and the local stack pointer, C, are implemented. The generated +C file will have access to (among others) the following macro definitions +or similar: + + #define dSP SV **sp = PL_stack_sp + #define SP sp + #define PUSHs(s) *++sp = (s) + #define mPUSHi(i) sv_setiv(PUSHs(sv_newmortal()), (IV)(i)) + #define PUTBACK PL_stack_sp = sp + #define SPAGAIN sp = PL_stack_sp + #define dXSARGS dSP; .... + +The global (or per-interpreter) variable C is a pointer to +the current top-most entry on the stack, equal initially to +C<&ST(items-1)>. On entry to the XSUB, the C at its top will +cause the C variable to be declared and initialised. This becomes a +I<local> copy of the argument stack pointer. The standard stack +manipulation macros such as C all use this local copy. + +The XS parser will usually emit two lines of C code similar to these +around the PP code block lines: + + SP -= items; + ... PP lines ... + PUTBACK; return; + +This has the effect of resetting the local copy of the stack pointer (but +I<not> the stack pointer itself) back to the base of the current stack +frame, discarding any passed arguments. The original arguments are still +on the stack. 
C etc will, starting at the base of the stack +frame, progressively overwrite any original arguments. Finally, the +C sets the real stack pointer to the copy, making the changes +permanent, and also allowing the caller to determine how many arguments +were returned. + +Any functions called from the XSUB will only see the value of +C and not C. So when calling out to a function which +manipulates the stack, you may need to resynchronise the two; for example: + + PUTBACK; + push_contents_of_array(av); + SPAGAIN; + +The C and C macros will update both +C and C if the extending causes the stack to be +reallocated. + +Note that there are several C macros, which generally create a +temporary SV, set its value to the argument, and push it onto the stack. +These are: + + mPUSHs(sv) mortalise and push an SV + mPUSHi(iv) create+push mortal and set to the integer val + mPUSHu(uv) create+push mortal and set to the unsigned val + mPUSHn(n) create+push mortal and set to the num (float) val + mPUSHp(str, len) create+push mortal and set to the string+length + mPUSHpvs("string") create+push mortal and set to the literal string + (perl 5.38.0 onwards) =head3 NOT_IMPLEMENTED_YET From 5ab6a04f5809f2024a87210acad9fea7ac00c261 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 12 Aug 2025 16:54:35 +0100 Subject: [PATCH 27/42] perlxs.pod: update NOT_IMPLEMENTED_YET: keyword This keyword formerly wasn't documented. The docs now say "this is what it is, but don't use it". 
--- dist/ExtUtils-ParseXS/lib/perlxs.pod | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 8965d913352d..6fd22252aafd 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3166,9 +3166,20 @@ These are: mPUSHpvs("string") create+push mortal and set to the literal string (perl 5.38.0 onwards) -=head3 NOT_IMPLEMENTED_YET +=head3 The NOT_IMPLEMENTED_YET: Keyword -XXX TBC + void + foo(int a) + NOT_IMPLEMENTED_YET: + +This keyword, as a fourth alternative to C, C and autocall, +generates a main body for the XSUB consisting solely of the C code: + + Perl_croak(aTHX_ "Foo::Bar::foo: not implemented yet"); + +The current implementation is quite buggy in terms of parsing and where +the keyword can appear within an XSUB, so it's generally better to avoid +it. It is documented here for completeness. =head2 The XSUB Output Part From 62648568f84d62e354ef058433561b9c115c7699 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 13 Aug 2025 15:06:34 +0100 Subject: [PATCH 28/42] perlxs.pod: update: output part Add text to the new =head2 The XSUB Output Part section, and rewrite the text in these existing sections: =head3 The POSTCALL: Keyword =head3 The OUTPUT: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 186 +++++++++++++++++++-------- 1 file changed, 133 insertions(+), 53 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 6fd22252aafd..dfc8ebb3613e 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3183,74 +3183,154 @@ it. It is documented here for completeness. =head2 The XSUB Output Part -XXX TBC +Following an XSUB's code part, any results may be post-processed and +returned. 
Two keywords in particular support this: L, which allows for a block of code to be added after +any autocall in order to post-process return values from the call, and +L, which tells the parser to generate code to +return the value of C or to update the values of one or more +passed arguments. + +These two optional keywords should each only be used once at most, and in +that order; but due to a parsing bug (kept for backwards compatibility), +they can appear in either order any number of times. But don't do that. -XXX NB: the keywords described in L and L and L may also appear in this part. =head3 The POSTCALL: Keyword -This keyword can be used when an XSUB requires special procedures -executed after the C subroutine call is performed. When the POSTCALL: -keyword is used it must precede OUTPUT: and CLEANUP: blocks which are -present in the XSUB. +The C keyword allows a block of code to be inserted directly after +any autocall or C/C code block (although it's really only of +use with autocall). It's typically used for cleaning up the return value +from the autocall. For example these two XSUBs are equivalent: -See an example in L<"An XSUB Declaration">. - -The POSTCALL: block does not make a lot of sense when the C subroutine -call is supplied by user by providing either CODE: or PPCODE: section. + int + foo(int a) + POSTCALL: + if (RETVAL < 0) + RETVAL = 0; + int + foo(int a) + CODE: + RETVAL = foo(a); + if (RETVAL < 0) + RETVAL = 0; + OUTPUT: + RETVAL =head3 The OUTPUT: Keyword -The OUTPUT: keyword indicates that certain function parameters should be -updated (new values made visible to Perl) when the XSUB terminates or that -certain values should be returned to the calling Perl function. For -simple functions which have no CODE: or PPCODE: section, -such as the sin() function above, the RETVAL variable is -automatically designated as an output value. For more complex functions -the B compiler will need help to determine which variables are output -variables. 
- -This keyword will normally be used to complement the CODE: keyword. -The RETVAL variable is not recognized as an output variable when the -CODE: keyword is present. The OUTPUT: keyword is used in this -situation to tell the compiler that RETVAL really is an output -variable. - -The OUTPUT: keyword can also be used to indicate that function parameters -are output variables. This may be necessary when a parameter has been -modified within the function and the programmer would like the update to -be seen by Perl. + # Common usage: - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep + OUTPUT: + RETVAL + + # Rare usage: + + OUTPUT: + arg0 + SETMAGIC: DISABLE + arg1 + SETMAGIC: ENABLE + arg2 sv_setfoo(ST(2), arg2) + + +The C keyword can be used to indicate that the value of RETVAL +should be returned to the caller on the stack, and/or that the values of +certain passed Perl arguments should be updated with the current values of +the corresponding parameter variables. Each non-blank line of the +C block should contain the name of one variable, with optional +setting code, or a C keyword with a value of C or +C. + +The common usage is to list just the C variable: + + int + foo() + CODE: + RETVAL = ...; + OUTPUT: + RETVAL + +It is needed for XSUBs containing a C block to tell the XS compiler +to generate C code which will return the value of C to the caller. +For autocall XSUBs, this is done automatically without the need for the +C keyword. + +The second usage of C is to specify parameters to be updated; this +usage has been almost completely replaced by using the +L +parameter modifier.
For example these two XSUBs have identical behaviours, +but the second is the preferred form: + + int + foo1(a) + INPUT: + int &a + OUTPUT: + a + + int + foo2(IN_OUT int a) + +They both cause output C code similar to this to be planted (with the +first part derived from a typemap): + + sv_setiv(ST(0), (IV)a); + SvSETMAGIC(ST(0)); + +which updates the value of the passed SV with the current value of C, +and then calls the SV's I magic, if any: which will, for example, +cause a tied variable to have its C method called. + +You can skip the planting of the C magic call with +C; in the example at the start of this section, C +and C will have set magic, while C won't. The C +setting remains in force until another C, or notionally until +the end of the current C block. In fact the current setting will +carry over into any further C declarations within the same +XSUB, or since Perl 5.40.0, only into any declarations within the same +case C branch. + +The current setting of C is ignored for C, which is +usually setting the value of a fresh temporary SV which won't have any +attached magic anyway. + +Finally, it is possible to override the typemap entry used to set the +value of the temporary SV or passed argument from the C or other +variables. Normally, in an XSUB like: + + int + foo(int abc) OUTPUT: - timep + abc -The OUTPUT: keyword will also allow an output parameter to -be mapped to a matching piece of code rather than to a -typemap. - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep +the C type (via a two-stage lookup in the system typemap) will yield +this output typemap entry: + + sv_setiv($arg, (IV)$var); + +which, after variable expansion, may yield + + sv_setiv(ST(0), (IV)abc); + +or similar. This can be overridden; for example + + int + foo(int abc) OUTPUT: - timep sv_setnv(ST(1), (double)timep); - -B emits an automatic C for all parameters in the -OUTPUT section of the XSUB, except RETVAL.
This is the usually desired -behavior, as it takes care of properly invoking 'set' magic on output -parameters (needed for hash or array element parameters that must be -created if they didn't exist). If for some reason, this behavior is -not desired, the OUTPUT section may contain a C line -to disable it for the remainder of the parameters in the OUTPUT section. -Likewise, C can be used to reenable it for the -remainder of the OUTPUT section. See L for more details -about 'set' magic. + abc my_setiv(ST(0), (IV)abc); + +But importantly, unlike the similar syntax in C lines, the override +text is I variable expanded. It is thus tricky to ensure that the +right arguments are used (such as C). Basically this feature has a +design flaw and should probably be avoided. Since 5.16.0 it's been +possible to have locally defined typemaps using the L keyword which is probably a better way to modify how +values are returned. =head2 The XSUB Cleanup Part From 08a12248fae0dfbf72777163cf4584d6842ecb58 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 13 Aug 2025 19:53:52 +0100 Subject: [PATCH 29/42] perlxs.pod: update: cleanup part Add text to the new =head2 The XSUB Cleanup Part section, and rewrite the text in this existing section: =head3 The CLEANUP: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index dfc8ebb3613e..53c9af010a3b 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3334,18 +3334,31 @@ values are returned. =head2 The XSUB Cleanup Part -XXX TBC +Following an XSUB's output part, where code will have been planted to +return the value of C and C/C parameters, it's +possible to inject some final clean-up code by using the C +keyword. -XXX NB: the keywords described in L and L and L may also appear in this part. 
=head3 The CLEANUP: Keyword -This keyword can be used when an XSUB requires special cleanup procedures -before it terminates. When the CLEANUP: keyword is used it must follow -any CODE:, or OUTPUT: blocks which are present in the XSUB. The code -specified for the cleanup block will be added as the last statements in -the XSUB. + char * + foo(int a) + CODE: + RETVAL = get_foo(a); + OUTPUT: + RETVAL + CLEANUP: + free(RETVAL); /* assuming get_foo() returns a malloced buffer */ + +The C keyword allows a block of code to be inserted directly +after any output code which has been generated automatically or via the +C keyword. It can be used when an XSUB requires special clean-up +procedures before it terminates. The code specified for the clean-up block +will be added as the last statements in the XSUB before the final +C or similar. =head2 XSUB Generic Keywords From cc816224f266c0923f635ff2cb642bd3ddea9136 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Wed, 13 Aug 2025 20:29:07 +0100 Subject: [PATCH 30/42] perlxs.pod: update generic intro, PROTOTYPE Add text to the new =head2 XSUB Generic Keywords section, and rewrite the text in this existing section: =head3 The PROTOTYPE: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 60 ++++++++++++++++------------ 1 file changed, 34 insertions(+), 26 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 53c9af010a3b..77c7501dbb56 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3362,39 +3362,47 @@ C or similar. =head2 XSUB Generic Keywords -XXX TBC +There are a few per-XSUB keywords which can appear anywhere within the +body of an XSUB. This is because they affect how the XSUB is registered +with the Perl interpreter, rather than affecting how the C code of the +XSUB itself is generated. These are described in the following +subsections. In addition there are a few more generic keywords which are +described later under L. 
-These keywords can appear anywhere within the body of an XSUB. +On aesthetic grounds, it is best to use these keywords near the start of +the XSUB. =head3 The PROTOTYPE: Keyword -This keyword is similar to the PROTOTYPES: keyword above but can be used to -force B to use a specific prototype for the XSUB. This keyword -overrides all other prototype options and keywords but affects only the -current XSUB. Consult L for information about Perl -prototypes. + int + foo1(int a, int b = 0) + # this XSUB gets an auto-generated '$;$' prototype + PROTOTYPE: ENABLE - bool_t - rpcb_gettime(timep, ...) - time_t timep = NO_INIT - PROTOTYPE: $;$ - PREINIT: - char *host = "localhost"; - CODE: - if (items > 1) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime(host, &timep); - OUTPUT: - timep - RETVAL + int + foo2(int a, int b) + # this XSUB doesn't get a prototype + PROTOTYPE: DISABLE -If the prototypes are enabled, you can disable it locally for a given -XSUB as in the following example: + int + foo3(SV* a, int b) + # this XSUB gets the specified prototype: + PROTOTYPE: \@$ - void - rpcb_gettime_noproto() - PROTOTYPE: DISABLE - ... + int + foo4(int a, int b) + # this XSUB gets a blank () prototype + PROTOTYPE: + +While the file-scoped C keyword turns automatic prototype +generation on or off for all subsequent XSUBs, the per-XSUB C +keyword overrides the setting for just the current XSUB. See the +L section for details of what a +prototype is, and why you rarely need one. + +This keyword's value can be either one of C/C to turn on +or off automatic prototype generation, or it can specify an explicit +prototype string, including the empty prototype. 
=head3 The OVERLOAD: Keyword From b26efebc0394bf15db9fde7e48be15db6ff266c7 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 16 Aug 2025 18:59:52 +0100 Subject: [PATCH 31/42] perlxs.pod: mention package name types Explain that a 'C' parameter type in an XSUB declaration can actually be a Perl package name or similar, e.g. Foo::Bar f(Foo::Bar obj, char *s) --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 30 ++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 77c7501dbb56..2e6e7f9277ce 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1981,6 +1981,9 @@ return a single SV whose value is set to C's value at the time of return. In addition, a non-void autocall XSUB will call the underlying C library function and assign its return value to C. +In addition the return type can be a Perl package name; see +L for details. + If the return type is prefixed with the C keyword, then the C variable is still declared, but code to return its value is suppressed. It is typically useful when making an autocall function @@ -2086,6 +2089,7 @@ signature. Some examples of valid XSUB parameter declarations: char *foo # parameter with type + Foo::Bar foo # parameter with Perl package type char *foo = "abc" # default value char *foo = NO_INIT # doesn't complain if arg missing OUT char *foo # caller's arg gets updated @@ -2133,6 +2137,32 @@ in a C block or similar. Their values aren't normally returned. There are several variations on this basic pattern, which are explained in the following subsections. +=head3 Fully-qualified type names and Perl objects + + Foo::Bar + foo(Foo::Bar self, ...) + +Normally the type of an XSUB's parameter or return value is a valid C type, +such as C<"char *">. However you can also use Perl package names.
When a +type name includes a colon, it undergoes some extra processing; in +particular, the actual type as emitted into the C file is transformed +using C (unless F has been invoked with C<-hiertype>), so +that a legal C type is present. The complete effects for a type of +C are as follows. + +The type string C is looked up in the typemap I to find +the logical XS type; then the C and C typemap templates are +expanded with the C<$ntype> variable set to C<"Foo::Bar"> and the C<$type> +variable set to C<"Foo__Bar">. The declaration of the corresponding auto +variables uses the modified type string, so the example above might +result in these declarations in the C code: + + Foo__Bar RETVAL; + Foo__Bar self = ...; + +With the appropriate XS typemap entries and C typedefs, this can be used +to assist in declaring XSUBs which are passed and return Perl objects. + =head3 XSUB Parameter Placeholders Sometimes you want to skip an argument. There are two supported techniques From d11731b4e64973a23c5516949ef05eba40482fbe Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 16 Aug 2025 09:54:39 +0100 Subject: [PATCH 32/42] perlxs.pod: update OVERLOAD, add T_PTROBJ First, add a new subsection =head3 T_PTROBJ and opaque handles to the TYPEMAPs section explaining how this typemap can be used to map between Perl objects and C library handles. It provides a fully-worked example of wrapping a simple arithmetic library. Then completely rewrite the =head3 The OVERLOAD: Keyword section. In particular, it now refers to the new T_PTROBJ example and shows how it can be extended to use overloading. 
--- dist/ExtUtils-ParseXS/lib/perlxs.pod | 342 ++++++++++++++++++++++++--- 1 file changed, 303 insertions(+), 39 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 2e6e7f9277ce..148bce1a2c8a 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -2162,6 +2162,8 @@ result in these declarations in the C code: With the appropriate XS typemap entries and C typedefs, this can be used to assist in declaring XSUBs which are passed and return Perl objects. +See L for an example of this using +the common C typemap type. =head3 XSUB Parameter Placeholders @@ -3436,46 +3438,161 @@ prototype string, including the empty prototype. =head3 The OVERLOAD: Keyword -Instead of writing an overloaded interface using pure Perl, you -can also use the OVERLOAD keyword to define additional Perl names -for your functions (like the ALIAS: keyword above). However, the -overloaded functions must be defined in such a way as to accept the number -of parameters supplied by perl's overload system. For most overload -methods, it will be three parameters; for the C function it will -be four. However, the bitwise operators C<&>, C<|>, C<^>, and C<~> may be -called with three I five arguments (see L). - -If any -function has the OVERLOAD: keyword, several additional lines -will be defined in the c file generated by xsubpp in order to -register with the overload magic. - -Since blessed objects are actually stored as RV's, it is useful -to use the typemap features to preprocess parameters and extract -the actual SV stored within the blessed RV. See the sample for -T_PTROBJ_SPECIAL in L. - -To use the OVERLOAD: keyword, create an XS function which takes -three input parameters (or use the C-style '...' 
definition) like -this: + MODULE = Foo PACKAGE = Foo::Bar - SV * - cmp (lobj, robj, swap) - My_Module_obj lobj - My_Module_obj robj - IV swap - OVERLOAD: cmp <=> - { /* function defined here */} - -In this case, the function will overload both of the three way -comparison operators. For all overload operations using non-alpha -characters, you must type the parameter without quoting, separating -multiple overloads with whitespace. Note that "" (the stringify -overload) should be entered as \"\" (i.e. escaped). - -Since, as mentioned above, bitwise operators may take extra arguments, you -may want to use something like C<(lobj, robj, swap, ...)> (with -literal C<...>) as your parameter list. + SV* + subtract(SV* a, SV* b, bool swap) + OVERLOAD: - -= + CODE: + ... + +The C keyword allows you to declare that this XSUB acts as an +overload method for the specified operators in the current package. The +example above is approximately equivalent to this Perl code: + + package Foo::Bar; + + sub subtract { ... } + + use overload + '-' => \&subtract, + '-=' => \&subtract; + +The rest of the line following the keyword, plus any further lines until +the next keyword, are interpreted as a space-separated list of overloaded +operators. There is no check that they are valid operator names. The names +and symbols will eventually end up within double-quoted strings +in the C file, so double-quotes need to be escaped; in particular: + + OVERLOAD: \"\" + +This could be regarded as a bug. + +XSUBs used for overload methods are invoked with the same arguments as +Perl subroutines would be: for example, an overloaded binary operator will +trigger a call to the XSUB method with the first argument being an +overloaded object representing one of the two operands of the binary +operator; the second being the other operand (which may or may not be an +object); and third, a swap flag. See L for the full details +of how these functions will be called, with what arguments. 
Note that +C can in fact be undef in addition to false, to indicate an assign +overload such as C<+=>. + +Bitwise operator methods sometimes take extra arguments: in +particular under C. So you may want to use an +ellipsis (something like C<(lobj, robj, swap, ...)>) to skip them. + +The net effect of the C keyword is to add some extra code to +the boot XSUB to register this XSUB as the handler for the specified +overload actions, in the same way that C does for Perl +methods. + +See also the file-scoped L keyword for +details of how to set the fallback behaviour for the current package. + +Note that C shouldn't be mixed with the L keyword; the value of C will be undefined for any overload +method call. + +The L section contains a fully-worked +example of using the C typemap to wrap a simple arithmetic +library. The result of that wrapper allows you to write Perl code such as: + + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); + + my $x = $i13->add($i7)->divide($i2); + printf "val=%d\n", $x->val(); + +Using overloading, we would like to be able to write those last two lines +more simply as: + + my $x = ($i13 + $i7)/$i2; + printf "val=%d\n", $x; + +The following additions and modifications to that example XS code show how +to add overloading: + + FALLBACK: UNDEF + + int + mynum_val(My::Num x, ...) + OVERLOAD: 0+ + + My::Num + mynum_add(My::Num x, My::Num y, bool swap) + OVERLOAD: + + C_ARGS: x, y + INIT: + if (swap) { + mynum* tmp = x; x = y; y = tmp; + } + + # ... and three similar XSUBs for + # mynum_subtract, mynum_multiply, mynum_divide ... + +The C line isn't actually necessary as this is the default +anyway, but is included to remind you that the keyword can be used. + +Overloading is added to the C method so that it automatically +returns the value of an object when used in a numeric context (such as for +the C above). The ellipsis is added to ignore the extra two +arguments passed to an overload method. 
+ +The original C method which, via aliasing, handled all four +of the arithmetic operations, is now split into four separate XSUBs, since +C and C don't mix. + +The main change to each arithmetic XSUB part from adding the C +keyword, is that there is an extra C parameter. There's no real need +to use it for addition and multiplication, but it is important for the +non-commutative subtraction and division operations. + +That example uses the C typemap to process the second argument, +which in the most general usage may not be an object. For example the +second and third of these lines will croak with an C error: + + $i13 + My::Num->new(7); + $i13 + 7; + $i13 + "7"; + +If it is necessary to handle this, then you may need to create your own +typemap: for example, something similar to C, but with an INPUT +template along the lines of: + + T_MYNUM + SV *sv = $arg; + SvGETMAGIC(sv); + if (!SvROK(sv)) { + sv = sv_newmortal(); + sv_setref_pv(sv, "$ntype", mynum_new(SvIV($arg))); + } + .... + +Finally, although not directly related to XS, the following could be added +to F to allow integer literals to be used directly: + + sub import { + overload::constant integer => + sub { + my $str = shift; + return My::Num->new($str); + }; + } + +which then allows these lines: + + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); + +to be rewritten more cleanly as: + + my $i2 = 2; + my $i7 = 7; + my $i13 = 13; =head3 The ATTRS: Keyword @@ -3683,6 +3800,153 @@ the different argument lists. XXX TBC +=head3 T_PTROBJ and opaque handles + +A common interface arrangement for C libraries is that some sort of +I function creates and returns a handle, which is a pointer to +some opaque data. Other function calls are then passed that handle as an +argument, until finally some sort of destroy function frees the handle and +its data. The C typemap is one common method for mapping Perl +objects to such C library handles.
Behind the scenes, it uses blessed +scalar objects with the scalar's integer value set to the address of the +handle. The C code template of the C typemap retrieves the +pointer from the scalar object referred to by a passed RV argument, while +the C template creates a new blessed RV-to-SV with the handle +address stored in it. + +For the purposes of an example, we'll create here a minimal example C +library called C, which we'll then proceed to wrap using XS. This +library just stores an integer in its opaque data. In real life you would +be wrapping an existing library which stores something more interesting, +such as a complex number or a multiple precision integer. + +The following sample library code might go in the initial 'C' part of the +XS file: + + typedef struct { int i; } mynum; + + mynum* mynum_new(int i) + { + mynum* x = (mynum*)malloc(sizeof(mynum)); + x->i = i; + return x; + } + + void mynum_destroy (mynum *x) + { free((void*)x); } + + int mynum_val (mynum *x) + { return x->i; } + + mynum* mynum_add (mynum *x, mynum *y) + { return mynum_new(x->i + y->i); } + + mynum* mynum_subtract (mynum *x, mynum *y) + { return mynum_new(x->i - y->i); } + + mynum* mynum_multiply (mynum *x, mynum *y) + { return mynum_new(x->i * y->i); } + + mynum* mynum_divide (mynum *x, mynum *y) + { return mynum_new(x->i / y->i); } + +The C struct holds the opaque handle data. The C +function creates a numeric value and returns a handle to it. The other +functions then take such handles as arguments, including a destroy +function to free a handle's data. 
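As a rough illustration of the handle life cycle described above, the sample library can be exercised directly from C before any XS wrapping is added. This is only a sketch: the library functions are repeated so that it compiles on its own, the `mynum_demo()` driver is a hypothetical name that is not part of the patch, and error handling for a failed `malloc()` is deliberately minimal.

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as the sample library; repeated so the sketch stands alone. */
typedef struct { int i; } mynum;

static mynum *mynum_new(int i)
{
    mynum *x = (mynum *)malloc(sizeof(mynum));
    assert(x != NULL);          /* sketch only: abort on allocation failure */
    x->i = i;
    return x;
}

static void   mynum_destroy(mynum *x)         { free((void *)x); }
static int    mynum_val(mynum *x)             { return x->i; }
static mynum *mynum_add(mynum *x, mynum *y)    { return mynum_new(x->i + y->i); }
static mynum *mynum_divide(mynum *x, mynum *y) { return mynum_new(x->i / y->i); }

/* Hypothetical driver: computes (13 + 7) / 2 using handles only. */
int mynum_demo(void)
{
    mynum *i2   = mynum_new(2);
    mynum *i7   = mynum_new(7);
    mynum *i13  = mynum_new(13);
    mynum *sum  = mynum_add(i13, i7);     /* new handle holding 20 */
    mynum *quot = mynum_divide(sum, i2);  /* new handle holding 10 */
    int result  = mynum_val(quot);

    /* Every handle returned by the library must be destroyed explicitly. */
    mynum_destroy(quot);
    mynum_destroy(sum);
    mynum_destroy(i13);
    mynum_destroy(i7);
    mynum_destroy(i2);
    return result;
}
```

The explicit `mynum_destroy()` calls are the part that an XS wrapper would normally move into a `DESTROY` method, so that Perl's reference counting decides when each handle is freed.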
+ +The following XS code shows an example of how this library might be +wrapped and be made accessible from Perl via C objects: + + typedef mynum *My__Num; + + MODULE = My::Num PACKAGE = My::Num PREFIX = mynum_ + + PROTOTYPES: DISABLE + + TYPEMAP: < which, as explained in L, is looked up as-is in the typemap, but has C +applied to the type name to convert it to the C C type when used +in the declaration of the XSUB's auto variables. + +Going through this code in order: while still in the 'C' half of the XS +file, we add a typedef which says that the C C type is equivalent +to a pointer to a handle from that arithmetic library. + +Next, the C line includes a C prefix, which means that +the names of the XSUBs in the Perl namespace will be C etc +rather than C. + +Then a C declaration is used to map the C pseudo-type to +the C XS type. + +Next comes the C class method. This will be called from perl as +C<< My::Num->new(99); >> for example. Its first parameter will be the +class name, which we don't use here, and the second parameter is the value +to initialise the object to. The XSUB autocalls the library C +function with just the C value. This returns a handle, which the +C C map converts into a blessed scalar ref containing +the handle. + +Next, the C method is just a thin wrapper around +C, while C returns the integer value of the +object. + +Finally, four binary functions are defined, sharing the same XSUB body via +aliases. + +This XS module might be accessed from Perl using code like this: + + use My::Num; + + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); + + my $x = $i13->add($i7)->divide($i2); + printf "val=%d\n", $x->val(); # prints "val=10" + +See L for an example of how to extend this using +overloading so that the expression could be written more simply as +C<($i13 + $i7)/$i2>. 
+ +Note that, as a very special case, the XS compiler translates the XS +typemap name using C when looking up INPUT typemap entries +for an XSUB named C. So for such subs, the C typemap +entry will be used instead. + =head2 Using XS With C++ If an XSUB name contains C<::>, it is considered to be a C++ method. From 4d48858434789bd1da0fac46b9f1d09c5951cad5 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 19 Aug 2025 09:36:56 +0100 Subject: [PATCH 33/42] perlxs.pod: document ATTRS This keyword was undocumented, even though it had been added 25 years ago. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 48 +++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 148bce1a2c8a..0d3f49bd4eed 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3596,7 +3596,53 @@ to be rewritten more cleanly as: =head3 The ATTRS: Keyword -XXX TBC + MODULE = Foo::Bar PACKAGE = Foo::Bar + + SV* + debug() + ATTRS: lvalue + PPCODE: + # return $Foo::Bar::DEBUG, creating it if not already present: + PUSHs(GvSV(gv_fetchpvs("Foo::Bar::DEBUG", GV_ADD, SVt_IV))); + +The C keyword allows you to apply subroutine attributes to an XSUB +in a similar fashion to Perl subroutines. The XSUB in the example above is +equivalent to this Perl: + + sub debug :lvalue { return $Foo::Bar::DEBUG } + +and both can be called like this: + + use Foo::Bar; + Foo::Bar::debug() = 99; + print "$Foo::Bar::DEBUG\n"; # prints 99 + +This keyword consumes all lines until the next keyword. The contents of +each line are interpreted as space-separated attributes. The attributes +are applied at the time the XS module is loaded. This: + + void + foo(...) 
+ ATTRS: aaa + bbb(x,y) ccc + +is approximately equivalent to: + + use attributes Foo::Bar, \&foo, 'aaa'; + use attributes Foo::Bar, \&foo, 'bbb(x,y)'; + use attributes Foo::Bar, \&foo, 'ccc'; + +User-defined attributes, just like with Perl subs, will trigger a call to +C, as described in L. + +Note that not all built-in subroutine attributes necessarily make sense +applied to XSUBs. + +Currently the parsing of white-space is crude: C is +misinterpreted as two separate attributes, C<'bbb(x,'> and C<'y)'>. + +The C keyword can't currently be used in conjunction with C +or C; in this case, the attributes are just silently ignored. =head2 Sharing XSUB bodies From f66fd494287d33453e4cc0068666eec18c7a84d7 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 19 Aug 2025 11:32:37 +0100 Subject: [PATCH 34/42] perlxs.pod: add "Sharing XSUB bodies" section Populate the introduction to this new section. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 0d3f49bd4eed..8a8dc2c60a22 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3646,7 +3646,32 @@ or C; in this case, the attributes are just silently ignored. =head2 Sharing XSUB bodies -XXX TBC +Sometimes you want to write several XSUBs which are very similar: they +all have the same signature, have the same generated code to convert +arguments and return values between Perl and C, and may only differ in a +few lines in the main body or in which C library function they wrap. It is +in fact possible to share the same XSUB function among multiple Perl CVs. +For example, C<&Foo::Bar::add> and C<&Foo::Bar::subtract> could be two +separate CVs in the Perl namespace which both point to the same XSUB, +C say. 
But each CV holds some sort of unique +identifier which can be accessed by the XSUB so that it can determine +whether it should behave as C or C. + +Both the C and C keywords (described below) allow +multiple CVs to share the same XSUB. The difference between them is that +C is intended for when you supply the main body of the XSUB +yourself (e.g. using C): it sets an integer variable, C (derived +from the passed CV), which you can use in a C statement or +similar. Conversely, C is intended for use with autocall; +information stored in the CV indicates which C library function should be +autocalled. + +Finally, there is the C keyword, which allows the whole body of an +XSUB (not just the C part) to have alternate cases. It can be +thought of as a C analogue which works at the top-most XS level +rather than at the C level. The value the C acts on could be +C for example, or it could be used in conjunction with the C +keyword and switch on the value of C. =head3 The ALIAS: Keyword From 815fe4e9c14a30b2e0557eba08271b1c2305459d Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 19 Aug 2025 13:27:58 +0100 Subject: [PATCH 35/42] perlxs.pod: update: ALIAS Rewrite this section: =head3 The ALIAS: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 111 +++++++++++++++------------ 1 file changed, 60 insertions(+), 51 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 8a8dc2c60a22..57f07bf45399 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3675,69 +3675,78 @@ keyword and switch on the value of C. =head3 The ALIAS: Keyword -XXX this keyword can appear anywhere within the body of an XSUB. + int add(int x, int y) + ALIAS: + # implicit: add = 0 + subtract = 1 + multiply = 2 divide = 3 + CODE: + switch (ix) { ... } -The ALIAS: keyword allows an XSUB to have two or more unique Perl names -and to know which of those names was used when it was invoked. 
The Perl -names may be fully-qualified with package names. Each alias is given an -index. The compiler will setup a variable called C which contain the -index of the alias which was used. When the XSUB is called with its -declared name C will be 0. +Note that this keyword can appear anywhere within the body of an XSUB. -The following example will create aliases C and -C for this function. +The C keyword allows a single XSUB to have two or more Perl names +and to know which of those names was used when it was invoked. Each alias +is given an integer index value, with the main name of the XSUB being +index 0. This index is accessible via the variable C which is +initialised based on which CV (i.e. which Perl subroutine) was called. - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - INIT: - printf("# ix = %d\n", ix); - OUTPUT: - timep +Note that an XSUB may be shared by multiple CVs, and each CV may have +multiple names. Given the C XSUB definition above, and given this +Perl code: + + use Foo::Bar; + BEGIN { *addition = *add } + +Then in the C namespace, the entries C and C +point to the same CV, which has index 0 stored in it; while C +points to a second CV with index 1, and so on. All four CVs point to the +same C function, C. + +The alias name can be either a simple function name or can include a +package name. The alias value to the right of the C<=> may be either a +literal positive integer or a word (which is expected to be a CPP define). -A warning will be produced when you create more than one alias to the same -value. This may be worked around in a backwards compatible way by creating -multiple defines which resolve to the same value, or with a modern version -of ExtUtils::ParseXS you can use a symbolic alias, which are denoted with -a C<< => >> instead of a C<< = >>. 
For instance you could change the above -so that the alias section looked like this: +The rest of the line following the C keyword, plus any further +lines until the next keyword, are assumed to contain zero or more alias +name and value pairs. + +A warning will be produced if you create more than one alias to the same +index value. If you want multiple aliases with the same value, then a +backwards-compatible way of achieving this is via separate CPP defines to +the same value, e.g. + + #define DIVIDE 3 + #define DIVISION 3 ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime => FOO::gettime + divide = DIVIDE + division = DIVISION -this would have the same effect as this: +Since Perl 5.38.0 or C 3.51, alias values may refer to +other alias names (or to the main function name) by using C<< => >> rather +than the C<=> symbol: ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime = 1 + divide = 3 + division => divide -except that the latter will produce warnings during the build process. A -mechanism that would work in a backwards compatible way with older -versions of our tool chain would be to do this: +Both alias names and C<< => >> values may be fully-qualified: - #define FOO_GETTIME 1 - #define BAR_GETIT 2 - #define BAZ_GETTIME 1 + ALIAS: + red = 1 + COLOR::red => red + COLOUR::red => COLOR::red - bool_t - rpcb_gettime(host, timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = FOO_GETTIME - BAR::getit = BAR_GETIT - BAZ::gettime = BAZ_GETTIME - INIT: - printf("# ix = %d\n", ix); - OUTPUT: - timep +Note that any L is applied to the main +name of the XSUB, but not to any aliases. + +See L for a fully-worked example using +aliases. + +See L below for an alternative to +C which is more suited for autocall. Note that C should not +be used together with either of C or C. 
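The index-based dispatch described above can be pictured with a plain C analogy. This is a sketch of the idea only, not of Perl's internals: `struct alias_cv`, `alias_table`, `shared_arith()` and `alias_call()` are hypothetical names, standing in for the CVs, the stored index, and the single shared XSUB body that switches on `ix`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a CV: a Perl-visible name plus its alias index. */
struct alias_cv {
    const char *name;
    int         ix;
};

/* Mirrors the ALIAS: block: add = 0 (implicit), subtract = 1, and so on. */
static const struct alias_cv alias_table[] = {
    { "add",      0 },
    { "subtract", 1 },
    { "multiply", 2 },
    { "divide",   3 },
};

/* One shared body; ix selects the behaviour, as in "switch (ix) { ... }". */
static int shared_arith(int ix, int x, int y)
{
    switch (ix) {
    case 0:  return x + y;
    case 1:  return x - y;
    case 2:  return x * y;
    case 3:  return x / y;
    default: return 0;          /* unreachable for the table above */
    }
}

/* Look up the "CV" by name, then enter the shared body with its index. */
int alias_call(const char *name, int x, int y)
{
    size_t n = sizeof(alias_table) / sizeof(alias_table[0]);
    for (size_t k = 0; k < n; k++) {
        if (strcmp(alias_table[k].name, name) == 0)
            return shared_arith(alias_table[k].ix, x, y);
    }
    assert(!"unknown alias name");
    return 0;
}
```

The point of the analogy is that all four names reach the same function, and only the stored integer differs between them.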
=head3 The INTERFACE: Keyword From 7188b4a0960113261f6c55bdcbd374f45ed27a83 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 26 Aug 2025 13:28:46 +0100 Subject: [PATCH 36/42] perlxs.pod: update INTERFACE, INTERFACE_MACRO Rewrite these sections: =head3 The INTERFACE: Keyword =head3 The INTERFACE_MACRO: Keyword also demote the second to be a head4 child of the first. Then expand the T_PTROBJ example to use INTERFACE as an alternative to ALIAS. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 190 +++++++++++++++++---------- 1 file changed, 119 insertions(+), 71 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 57f07bf45399..1139380bc421 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3750,84 +3750,119 @@ be used together with either of C or C. =head3 The INTERFACE: Keyword -XXX this keyword can appear anywhere within the init part of an XSUB. + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ -This keyword declares the current XSUB as a keeper of the given -calling signature. If some text follows this keyword, it is -considered as a list of functions which have this signature, and -should be attached to the current XSUB. - -For example, if you have 4 C functions multiply(), divide(), add(), -subtract() all having the signature: - - symbolic f(symbolic, symbolic); - -you can make them all to use the same XSUB using this: - - symbolic - interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 - INTERFACE: - multiply divide - add subtract - -(This is the complete XSUB code for 4 Perl functions!) Four generated -Perl function share names with corresponding C functions. - -The advantage of this approach comparing to ALIAS: keyword is that there -is no need to code a switch statement, each Perl function (which shares -the same XSUB) knows which C function it should call. 
Additionally, one -can attach an extra function remainder() at runtime by using + int + arith(int a, int b) + INTERFACE: foobar_add foobar_subtract + foobar_divide foobar_multiply + +This keyword can appear anywhere within the L of an XSUB. + +This keyword provides similar functionality to C, but is intended +for XSUBs which use autocall. It allows a single XSUB to have multiple +names in the Perl namespace which, when invoked, will call the correct +wrapped C library function. + +In the example above there is a single C XSUB function created (called +C), plus four CVs in the Perl namespace called +C etc. Calling C from Perl invokes +C with some indication of which C function to call, +which is then autocalled. C achieves this by storing an index value +in each CV and making it available via the C variable, while +C currently achieves this by storing a C function pointer in +each CV. So the C CV holds a pointer to the +C C function. The action of the XSUB is to extract the +parameter values from the passed arguments and the function pointer from +the CV, then call the underlying C function. + +Note that storing a function pointer in the CV is an implementation detail +which could change in the future. See L for +details of how to customise the setting and retrieving of this value in +the CV. + +The rest of the line following the C keyword, plus any further +lines until the next keyword, are assumed to contain zero or more +interface names, separated by white space (or commas). + +An interface name is always used as-is for the name of the wrapped C +function. If the name contains a package separator, then it will be +used as-is to generate the Perl name; otherwise any prefix is stripped and +the current package name is prepended. 
The following shows how a few such +interface names would be processed (assuming the current PACKAGE and +PREFIX are C and C): + + Interface name Perl function name C function name + -------------- ------------------ ---------------- + abc Foo::Bar::abc abc + foobar_abc Foo::Bar::abc foobar_abc + X::Y::foobar_def X::Y::foobar_def X::Y::foobar_def + +Unlike C, the XSUB name is used only as the name of the generated C +function; in the example above, it doesn't cause a Perl function called +C to be created. + +See L for a complete example using +C with the C typemap. But note that before Perl +5.44.0 (F 3.60), C would not work properly +on XSUBs used with Perlish return types (as used by C), such as - CV *mycv = newXSproto("Symbolic::remainder", - XS_Symbolic_interface_s_ss, __FILE__, "$$"); - XSINTERFACE_FUNC_SET(mycv, remainder); + Foo::Bar + foo(...) + .... -say, from another XSUB. (This example supposes that there was no -INTERFACE_MACRO: section, otherwise one needs to use something else instead of -C, see the next section.) +This has mostly been fixed in 5.44.0 onwards, but may generate invalid C +code (in particular, invalid function pointer casts) for XSUBs having a +C keyword, unless the value of C is a simple list of +parameter names. -=head3 The INTERFACE_MACRO: Keyword +Note that C should not be used together with either of C +or C. -XXX this keyword can appear anywhere within the input and init parts of an -XSUB. +=head4 The INTERFACE_MACRO: Keyword -This keyword allows one to define an INTERFACE using a different way -to extract a function pointer from an XSUB. The text which follows -this keyword should give the name of macros which would extract/set a -function pointer. The extractor macro is given return type, C, -and C for this C. The setter macro is given cv, -and the function pointer. - -The default value is C and C. -An INTERFACE keyword with an empty list of functions can be omitted if -INTERFACE_MACRO keyword is used. 
- -Suppose that in the previous example functions pointers for -multiply(), divide(), add(), subtract() are kept in a global C array -C with offsets being C, C, C, -C. Then one can use - - #define XSINTERFACE_FUNC_BYOFFSET(ret, cv, f) \ - ((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32]) - #define XSINTERFACE_FUNC_BYOFFSET_set(cv, f) \ + int + arith(int a, int b) + INTERFACE: add subtract divide multiply + INTERFACE_MACRO: MY_FUNC_GET + MY_FUNC_SET + +Note that this keyword is deprecated since it assumes a particular +implementation for the C keyword, which might change in future. + +This keyword can appear anywhere within the L +or L parts of an XSUB. + +By default, the C code generated by the C keyword plants calls +to two macros, C and C, which are +used respectively to set (at boot time) a field in the CV to the address +of the C function pointer to use, and to retrieve (at run time) that value +from the CV. + +The C macro allows you to override the names of the two +macros to be used for this purpose. The rest of the line following the +C keyword, plus any further lines until the next keyword, +should contain (in total) two words which are taken to be macro names. + +The get macro takes three parameters: the return type of the function, the +CV which holds the function's pointer value, and the field within the CV +which has the pointer value. It should return a C function pointer. The +setter macro has two parameters: the CV, and the function pointer. + +Suppose that in the example above, pointers to the C, +C, C and C functions are kept in a global C +array called C with offsets specified by the enum values +C, C, C and C. 
Then one +could use: + + #define MY_FUNC_GET(ret, cv, f) \ + ((XSINTERFACE_CVT_ANON(ret))arith_ptrs[CvXSUBANY(cv).any_i32]) + #define MY_FUNC_SET(cv, f) \ CvXSUBANY(cv).any_i32 = CAT2(f, _off) -in C section, - - symbolic - interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 - INTERFACE_MACRO: - XSINTERFACE_FUNC_BYOFFSET - XSINTERFACE_FUNC_BYOFFSET_set - INTERFACE: - multiply divide - add subtract - -in XSUB section. +to store an array index in the CV, rather than storing the actual function +pointer. =head3 The CASE: Keyword @@ -4005,7 +4040,20 @@ C, while C returns the integer value of the object. Finally, four binary functions are defined, sharing the same XSUB body via -aliases. +aliases. As an alternative, the code for the main XSUB could be simplified +using the L keyword rather than +using aliasing: + + My::Num + arithmetic_interface(My::Num x, My::Num y) + INTERFACE: + mynum_add + mynum_subtract + mynum_multiply + mynum_divide + +but note that C only supports Perlish return types such +as C from Perl 5.44.0 (F 3.60) onwards. This XS module might be accessed from Perl using code like this: From 17cf16874e21b902874074588d7eef6402d86368 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Fri, 29 Aug 2025 20:33:37 +0100 Subject: [PATCH 37/42] perlxs.pod: update: CASE Rewrite this section: =head3 The CASE: Keyword --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 134 ++++++++++++++++++--------- 1 file changed, 90 insertions(+), 44 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 1139380bc421..a03a11380166 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3866,50 +3866,96 @@ pointer. =head3 The CASE: Keyword -The CASE: keyword allows an XSUB to have multiple distinct parts with each -part acting as a virtual XSUB. CASE: is greedy and if it is used then all -other XS keywords must be contained within a CASE:. 
This means nothing may -precede the first CASE: in the XSUB and anything following the last CASE: is -included in that case. - -A CASE: might switch via a parameter of the XSUB, via the C ALIAS: -variable (see L<"The ALIAS: Keyword">), or maybe via the C variable -(see L<"Ellipsis: variable-length parameter lists">). The last CASE: becomes the -B case if it is not associated with a conditional. The following -example shows CASE switched via C with a function C -having an alias C. When the function is called as -C its parameters are the usual C<(char *host, time_t *timep)>, -but when the function is called as C its parameters are -reversed, C<(time_t *timep, char *host)>. - - long - rpcb_gettime(a, b) - CASE: ix == 1 - ALIAS: - x_gettime = 1 - INPUT: - # 'a' is timep, 'b' is host - char *b - time_t a = NO_INIT - CODE: - RETVAL = rpcb_gettime(b, &a); - OUTPUT: - a - RETVAL - CASE: - # 'a' is host, 'b' is timep - char *a - time_t &b = NO_INIT - OUTPUT: - b - RETVAL - -That function can be called with either of the following statements. Note -the different argument lists. - - $status = rpcb_gettime($host, $timep); - - $status = x_gettime($timep, $host); + int + foo(int a, int b = NO_INIT, int c = NO_INIT) + CASE: items == 1 + C_ARGS: 0, a + CASE: items == 2 + C_ARGS: b, a + CASE: + CODE: + RETVAL = b > c ? foo(b, a) : bar(b, a); + OUTPUT: + RETVAL + + +The C keyword allows an XSUB to effectively have multiple bodies, +but with only a single Perl name (unlike C, which has multiple +names). Which body is run depends on which CASE expression is the first to +evaluate to true. Unlike C's C<switch> keyword, execution doesn't fall +through to the next branch, so there is no XS equivalent of the C<break> +keyword. The expression for the last CASE is optional, and if not present, +acts as a default branch. 
+ +The example above translates to approximately this C code: + + if (items < 1 || items > 3) { croak("..."); } + + if (items == 1) { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = foo(0, a); + /* ... return RETVAL as ST(0) ... */ + } + else if (items == 2) { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = foo(b, a); + /* ... return RETVAL as ST(0) ... */ + } + else { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = b > c ? foo(b, a) : bar(b, a); + /* ... return RETVAL as ST(0) ... */ + } + + XSRETURN(1); + +Each C keyword precedes an entire normal XSUB body, including all +keywords from C to C. Generic XSUB keywords can be +placed within any C body. The code generated for each C/C +branch includes nearly all the code that would usually be generated for a +complete XSUB body, including argument processing and return value +stack processing. + +Note that the CASE expressions are outside of the scope of any parameter +variable declarations, so those values can't be used. Typical values which +I<are> in scope and might be used are the C<items> variable which +indicates how many arguments were passed (see L<"Ellipsis: variable-length +parameter lists">) and, in the presence of C<ALIAS>, the C<ix> variable. + +Here's another example, this time in conjunction with C to wrap the +same C function as two separate Perl functions, the second of which +(perhaps for backwards compatibility reasons) takes its arguments in the +reverse order. 
This is a somewhat contrived example, but +demonstrates how the C keyword must be within one of the C +branches (it doesn't matter which), as C must always appear in the +outermost scope of the XSUB's body: + + int + foo(int a, int b) + CASE: ix == 0 + CASE: ix == 1 + ALIAS: foo_rev = 1 + C_ARGS: b, a + +Note that using old-style parameter declarations in conjunction with +C allows the types of the parameters to vary between branches: + + int + foo(a, int b = 0) + CASE: items == 1 + INPUT: + short a + CASE: items == 2 + INPUT: + long a + +In practice, C produces bloated code with all the argument and +return value processing duplicated within each branch, is not often all +that useful, and can often be better written just by using a C +statement within a C block. =head2 Using Typemaps From 35fc3f0a28eb133e88a2a805efe5ca3bd4140b0c Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sat, 30 Aug 2025 15:30:05 +0100 Subject: [PATCH 38/42] perlxs.pod: add "Using Typemaps" section Populate this new section (except for the T_PTROBJ subsection, which had already been added by an earlier commit within this branch). Note that the "Common typemaps" subsection could probably benefit from some further expansion by someone familiar with which built-in T_FOO entries are useful. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 168 ++++++++++++++++++++++++++- 1 file changed, 167 insertions(+), 1 deletion(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a03a11380166..cc13299727b3 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -3959,7 +3959,173 @@ statement within a C block. =head2 Using Typemaps -XXX TBC +This section describes the basic facts about using typemaps. For full +information on creating your own typemaps plus a comprehensive list of +what standard typemaps are available, see the L document. 
+ +Typemaps are sets of rules which map C types such as C to logical XS +types such as C, and from there to C and C templates +such as C<$var = ($type)SvIV($arg)> and C which, +after variable expansion, generate C code to convert back and forth +between Perl arguments and C auto variables. + +There is a standard system typemap file bundled with Perl for common C and +Perl types, but in addition, you can add your own typemap file. From Perl +5.16.0 onwards you can also include extra typemap declarations in-line +within the XS file. + +=head3 Locations and ordering of typemap processing + +Typemap definitions are processed in order, with more recent entries +overriding any earlier ones. Definitions are read in first from files and +then from L sections in the XS file. + +When considering how files are located and read in, note that the XS +parser will initially change directory to the directory containing the +F file that is about to be processed, which will affect any +subsequent relative paths. Then any typemap files are located and read in. +The files come from two sources: standard and explicit. + +Standard typemap files are always called C and are searched for +in a standard set of locations (relative to C<@INC> and to the current +directory), and any matched files are read in. These paths are, in order +of processing: + + "$_/ExtUtils/typemap" for reverse @INC + + ../../../../lib/ExtUtils/typemap + ../../../../typemap + ../../../lib/ExtUtils/typemap + ../../../typemap + ../../lib/ExtUtils/typemap + ../../typemap + ../lib/ExtUtils/typemap + ../typemap + typemap + +Note that searching C<@INC> in reverse order means that typemap files +found earlier in C<@INC> are processed later, and thus have higher +priority. 
+ +Explicit typemap files are specified either via C +command line switches, or programmatically by an array passed as: + + ExtUtils::ParseXS::process_file(..., typemap => ['foo',...]); + +These files are read in order, and the parser dies if any explicitly +listed file is not found. + +Prior to Perl 5.10.0 and Perl 5.8.9, C<@INC> wasn't searched, and standard +files were searched for and processed I any explicit ones. From +Perl 5.10.0 onwards, standard files were processed I any explicit +ones. From Perl 5.44.0 (F 3.60) onwards, explicit files +are again processed last, and thus take priority over standard files. +In Perl 5.16.0 onwards, C sections are then processed in order +after all files have been processed. + +Note also that F usually invokes F with two +C<-typemap> arguments: the first being the system typemap and the second +being the module's typemap file, if any. This compensates for older Perls +not searching C<@INC>. + +For a typical distribution, all this complication usually results in the +typemap file bundled with Perl being read in first, then the typemap file +included with the distribution adding to (and overriding) any standard +definitions, then any C entries in the XS file overriding +everything. + +=head3 Reusing, redefining and adding typemap entries + +Both typemap files and C blocks can have up to three sections: +C (which is implicit at the start of the file or block) and +C and C. There is no requirement for all three sections to +be present. Whatever I present is added to the global state for that +section, either adding a new entry or redefining an existing entry. 
+ +Probably the simplest use of an additional typemap entry is to map a new C +type to an I XS type; for example, given this C type: + + typedef enum { red, green, blue } colors; + +then adding the following C-to-XS type-mapping entry to the typemap would +be sufficient if you just want to treat such enums as simple integers when +used as parameter and return types: + + colors T_IV + +Or you could override just an existing INPUT or OUTPUT template; for +example: + + OUTPUT + T_IV + my_sv_setiv($arg, (IV)$var); + +For a completely novel type you might want to add an entry to all three +sections: + + foo T_FOO + + INPUT + T_FOO + $var = ($type)get_foo_from_sv($arg); + + OUTPUT + T_FOO + set_sv_to_foo($arg, $var); + +=head3 Common typemaps + +This section gives an overview of what common typemap entries are +available for use. See the L document for a complete list, +or examine the F file which is bundled with the Perl +distribution. Also, see L for a detailed +dive into one particular typemap which is particularly useful for mapping +between Perl objects and C handles. See L +for a general discussion about returning one or more values from an XSUB, +where typemaps can sometimes be of use (and sometimes aren't). + +Standard signed C int types such as C, C and C, are all +mapped to the C XS type. Integer-like Perl types such as C and +C are also mapped to this. If a parameter is declared as something +mapping to C, then the C value of the passed SV will be +extracted (perhaps first converting a string value like C<"123"> to an +IV), then that value will be cast to the final C type, with the usual C +rules for casting between integer types. Conversely, when returning a +value, the C value is first cast to C, then the SV is set to that +IV value. + +Similarly, common C and Perl unsigned types map to C, and values +are converted back and forth via C<(UV)> casts. 
A few unsigned types such +as C and C are instead mapped to C and C XS +types, but these have the same effect as C. + +The C type is treated similarly to other C types, but +C is treated as a string rather than an integer. A C parameter +will treat its passed argument as a string and set the auto variable to +the first I of that string (which may produce weird results with +UTF-8 strings). Returning a C value will return a one-character +string to the Perl caller. + +The C type and its common variants are mapped to C. Passed +parameters will (via C or similar) return a string buffer +representing that SV. This buffer may be part of the SV if that SV has a +string value (or if it can be converted to a string value), or it may be a +temporary buffer otherwise. For example, an SV holding a reference to an +array might return a temporary string buffer with the value +C<"ARRAY(0x12345678)">. When an XSUB has a return type which maps to +C, the temporary SV which is to be returned gets assigned the +current value of C, with the string's length being determined by +C or its equivalent. + +See L for the difficulties associated with handling +UTF-8 strings. + +The C, C and C types map to C, C and +C XS types, which all operate by converting to and from an SV via +C and C with suitable casting. + +The C type maps to C, which basically does no processing and +allows you to access the actual passed SV argument. =head3 T_PTROBJ and opaque handles From 95a623b18bf412621576b3d7873045357ddbb768 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Sun, 28 Sep 2025 14:42:42 +0100 Subject: [PATCH 39/42] perlxs.pod: update "Using XS With C++" section Rewrite this section: =head2 Using XS With C++ Disclaimer: I've never written a proper C++ program. I had to (literally) dust off my 34-year old copy of Stroustrup(*) and also do some Googling. Hopefully what I've written is sane. 
(*) This was bought back in the days when people used to learn things by buying books, and when I thought that I ought to know something about this newfangled C++ thing. I never got round to reading all of it: I discovered Perl around the same time, which looked to be a lot more fun. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 296 ++++++++++++++++++++------- t/porting/known_pod_issues.dat | 1 + 2 files changed, 218 insertions(+), 79 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index cc13299727b3..242e7f784c6b 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -4289,112 +4289,250 @@ entry will be used instead. =head2 Using XS With C++ -If an XSUB name contains C<::>, it is considered to be a C++ method. -The generated Perl function will assume that -its first argument is an object pointer. The object pointer -will be stored in a variable called THIS. The object should -have been created by C++ with the new() function and should -be blessed by Perl with the sv_setref_pv() macro. The -blessing of the object by Perl can be handled by a typemap. An example -typemap is shown at the end of this section. - -If the return type of the XSUB includes C, the method is considered -to be a static method. It will call the C++ -function using the class::method() syntax. If the method is not static -the function will be called using the THIS-E<gt>method() syntax. - -The next examples will use the following C++ class. - - class color { - public: - color(); - ~color(); - int blue(); - void set_blue(int); + MODULE = Foo::Bar PACKAGE = Foo::Bar - private: - int c_blue; - }; + # Class methods -The XSUBs for the blue() and set_blue() methods are defined with the class -name but the parameter for the object (THIS, or "self") is implicit and is -not listed. 
+ + int + X::Y::new(int i) + + static int + X::Y::foo(int i) + + # Object methods + + int + X::Y::bar(int i) int - color::blue() + X::Y::bar2(int i) const void - color::set_blue(val) - int val + X::Y::DESTROY() + + # C-linkage function + + extern "C" int + baz(int i) + +XS provides limited support for generating C++ (as opposed to C) output +files. Any XSUB whose name includes C<::> is treated as a C++ method. This +triggers two main changes in the way the XSUB's code is generated: + +=over + +=item * + +An implicit first argument is added. For class methods, this will be +called C and will be of type C. For object methods, it will +be called C and be of type C (where C is the prefix +of the XSUB's name). XSUBs are treated as class methods if their name is +C or their return type has the C prefix. + +=item * -Both Perl functions will expect an object as the first parameter. In the -generated C++ code the object is called C, and the method call will -be performed on this object. So in the C++ code the blue() and set_blue() -methods will be called as this: +Any autocall will generate an appropriate C++ method call rather than a C +function call. In particular, based on the examples above: - RETVAL = THIS->blue(); + new: RETVAL = new X::Y(i); + static foo: RETVAL = X::Y::foo(i); + bar (and bar2): RETVAL = THIS->bar(i); + DESTROY: delete THIS; - THIS->set_blue(val); +=item * + +In addition, if the XSUB declaration has a trailing C, then the +type of C will be declared as C. -You could also write a single get/set method using an optional argument: +=back + +This is mostly just syntactic sugar. 
The C<bar> XSUB declaration above +could be written longhand as: int - color::blue(val = NO_INIT) - int val - PROTOTYPE $;$ - CODE: - if (items > 1) - THIS->set_blue(val); - RETVAL = THIS->blue(); - OUTPUT: - RETVAL + bar(X::Y* THIS, int i) + CODE: + RETVAL = THIS->bar(i); + OUTPUT: + RETVAL -If the function's name is B then the C++ C function will be -called and C will be given as its parameter. The generated C++ code for +Note that the type of C (and, since Perl 5.42, C) can be +overridden with a line in an C section: - void - color::DESTROY() + int + X::Y::bar(int i) + X::Y::Z *THIS -will look like this: +Finally, a plain C XSUB declaration can be prefixed with C to +give that XSUB C linkage. - color *THIS = ...; // Initialized as in typemap +Some of the methods above might be called from Perl using code like this: - delete THIS; + { + my $obj = Foo::Bar->new(1); + $obj->bar(2); + # implicit $obj->DESTROY(); + } -If the function's name is B then the C++ C function will be called -to create a dynamic C++ object. The XSUB will expect the class name, which -will be kept in a variable called C, to be given as the first -argument. +This example uses C rather than C to emphasise that the +name of the C++ class needn't follow the Perl package name. - color * - color::new() +The call to C will pass the string C<"Foo::Bar"> as the first +argument, which can be used to allow multiple Perl classes to share the +same C method. In the simple worked example below, the package name +is hard-coded and that parameter is unused. The C method is +expected to return a Perl object which in some way has a pointer to the +underlying C++ object embedded within it. This is similar to the +L example of wrapping a C library which +uses a handle, although with a subtle difference, as explained below. -The generated C++ code will call C. +Calling C passes this Perl object as the first argument, which the +typemap will use to extract the C++ object pointer and assign to the +C auto variable. 
- RETVAL = new color(); +=head3 A complete C++ example -The following is an example of a typemap that could be used for this C++ -example. +First, you need to tell MakeMaker or similar that the generated file +should be compiled using a C++ compiler. For basic experimentation you may +be able to get by with just adding these two lines to the +C method call in F: - TYPEMAP - color * O_OBJECT + CC => 'c++', + LD => '$(CC)', - OUTPUT - # The Perl object is blessed into 'CLASS', which should be a - # char* having the name of the package for the blessing. - O_OBJECT - sv_setref_pv($arg, CLASS, (void*)$var); +but for portability in production use, you may want to use something like +L to automatically generate the correct options for +L or L based on which C++ compiler +is available. + +Then create a C<.xs> file like this: + + #define PERL_NO_GET_CONTEXT + + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + #include "ppport.h" + + namespace Paint { + class color { + int c_R; + int c_G; + int c_B; + public: + color(int r, int g, int b) { c_R = r; c_G = g; c_B = b; } + ~color() { printf("destructor called\n"); } + int blue() { return c_B; } + void set_blue(int b) { c_B = b; }; + // and similar for red, green + }; + } + + typedef Paint::color Paint__color; + + MODULE = Foo::Bar PACKAGE = Foo::Bar + + PROTOTYPES: DISABLE + + TYPEMAP: <. The example includes a +namespace to make it clearer when something is a namespace, class name or +Perl package. The Perl package is called C rather than +C to again distinguish it. You could however call the Perl +package C if you desired. + +A single typedef follows to allow for XS-mangled class names, as explained +in L. + +Then the C line starts the XS part of the file. + +Then there follows a full definition of a new typemap called C. +This is actually a direct copy of the C typemap found in the +system typemap file, except that all occurrences of C<$ntype> have been +replaced with C<$Package>. 
It serves the same basic purpose as +C: embedding a pointer within a new blessed Perl object, +and later, retrieving that pointer from the object. The difference is in +terms of what package the object is blessed into. C expects the +type name (C) to already be a pointer type, but with a C++ +XSUB, the implicit C argument is automatically declared to be of +type C (so C itself isn't necessarily a +pointer type). In addition, when the Perl and C++ class names differ we +want the object to be blessed using the Perl package name, not the C++ +class name. In this example, the actual values of the two variables when +the typemap template is being evalled, are: + + $ntype = "Paint::colorPtr"; + $Package = "Foo::Bar"; + +The typemap also includes an INPUT definition for C, which is +an I copy of C. This is needed because, as an +optimisation, the XS parser automatically renames an INPUT typemap using +C if the name of the XSUB is C, on the grounds that +it's not necessary to check that the object is the right class. + +Finally the XS file includes a few XSUBs which are wrappers around the +class's methods. 
+ +This class might be used like this: + + use Foo::Bar; + my $color = Foo::Bar->new(0x10, 0x20, 0xff); + printf "blue=%d\n", $color->blue(); # prints 255 + $color->set_blue(0x80); + printf "blue=%d\n", $color->blue(); # prints 128 =head2 Safely Storing Static Data in XS diff --git a/t/porting/known_pod_issues.dat b/t/porting/known_pod_issues.dat index 8213094a3b05..995e79f873a9 100644 --- a/t/porting/known_pod_issues.dat +++ b/t/porting/known_pod_issues.dat @@ -123,6 +123,7 @@ exit(3) Expect Exporter::Easy ExtUtils::Constant::ProxySubs +ExtUtils::CppGuess fchdir(2) fchmod(2) fchown(2) From ae9183e8f38f382bb0c84439b029aa728bf3e99b Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 29 Sep 2025 11:10:28 +0100 Subject: [PATCH 40/42] perlxs.pod: update MY_CXT section Revise the text in this section: =head2 Safely Storing Static Data in XS --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 160 ++++++++++++++++++--------- 1 file changed, 110 insertions(+), 50 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 242e7f784c6b..a1cd24f65410 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -4536,25 +4536,41 @@ This class might be used like this: =head2 Safely Storing Static Data in XS -Starting with Perl 5.8, a macro framework has been defined to allow -static data to be safely stored in XS modules that will be accessed from -a multi-threaded Perl. +You should generally avoid declaring static variables and data within an +XS file. The Perl interpreter binary is commonly configured to allow +multiple interpreter structures, with a complete set of interpreter state +per interpreter struct. In this case, you usually need your "static" data +to be per-interpreter rather than a single shared per-process value. -Although primarily designed for use with multi-threaded Perl, the macros -have been designed so that they will work with non-threaded Perl as well. 
+This becomes more important in the presence of multiple threads; either +via C or where the Perl interpreter is embedded within +another application (such as a web server) which may manage its own +threads and allocate interpreters to threads as it sees fit. -It is therefore strongly recommended that these macros be used by all -XS modules that make use of static data. +A macro framework is available to XS code to allow a single C struct to be +declared and safely accessed. Behind the scenes, the struct will be +allocated per interpreter or thread; on non-threaded Perl interpreter +builds, the macros gracefully degrade to a single global instance. These +macros have C ("my context") as part of their names. -The easiest way to get a template set of macros to use is by specifying -the C<-g> (C<--global>) option with h2xs (see L). +It is therefore strongly recommended that these macros be used by all XS +modules that make use of static data. -Below is an example module that makes use of the macros. +When creating a new skeleton F file, you can use the C<--global> +option of F to also include a skeleton set of macros, e.g. + + h2xs -A --global -n Foo::Bar + +Below is a complete example module that makes use of the macros. It tracks +the names of up to three blind mice. #define PERL_NO_GET_CONTEXT #include "EXTERN.h" #include "perl.h" #include "XSUB.h" + #include "ppport.h" + + #define MAX_NAME_LEN 100 /* Global Data */ @@ -4562,27 +4578,29 @@ Below is an example module that makes use of the macros. 
typedef struct { int count; - char name[3][100]; + char name[3][MAX_NAME_LEN+1]; } my_cxt_t; START_MY_CXT MODULE = BlindMice PACKAGE = BlindMice + PROTOTYPES: DISABLE + BOOT: { MY_CXT_INIT; MY_CXT.count = 0; - strcpy(MY_CXT.name[0], "None"); - strcpy(MY_CXT.name[1], "None"); - strcpy(MY_CXT.name[2], "None"); } int - newMouse(char *name) + AddMouse(char *name) PREINIT: dMY_CXT; CODE: + if (strlen(name) > MAX_NAME_LEN) + croak("Mouse name too long\n"); + if (MY_CXT.count >= 3) { warn("Already have 3 blind mice"); RETVAL = 0; @@ -4595,13 +4613,12 @@ Below is an example module that makes use of the macros. RETVAL char * - get_mouse_name(index) - int index + get_mouse_name(int index) PREINIT: dMY_CXT; CODE: if (index > MY_CXT.count) - croak("There are only 3 blind mice."); + croak("There are only %d blind mice.", MY_CXT.count); else RETVAL = MY_CXT.name[index - 1]; OUTPUT: @@ -4612,43 +4629,78 @@ Below is an example module that makes use of the macros. CODE: MY_CXT_CLONE; -=head3 MY_CXT REFERENCE +The main points from this example are: + +=over + +=item * + +The C struct will hold all your "static" data. + +=item * + +The C and C are boilerplate to make the macro +system work. The former is a string which should be unique to your module. + +=item * + +The C in the C section allocates the struct when the +module is loaded. You can add further boot code which does any +initialisation you require (such as setting C). C is called +at most once per interpreter, when code in that interpreter instance first +does C. + +=item * + +Each XSUB includes a C declaration, which retrieves a pointer to +the struct associated with the current interpreter and saves it in a +hidden auto variable; C allows you to access fields within this +structure. + +=item * + +C creates a byte-for-byte copy of the current struct. This +is called from the special C XSUB, to ensure that each new thread +gets its own copy of the data which is otherwise shared by default. 
+ +=back + +=head3 MY_CXT macros reference =over 5 =item MY_CXT_KEY -This macro is used to define a unique key to refer to the static data -for an XS module. The suggested naming scheme, as used by h2xs, is to -use a string that consists of the module name, the string "::_guts" -and the module version number. +This macro is used to define a unique key to refer to the static data for +an XS module. The suggested naming scheme, as used by F, is to use a +string that consists of a concatenation of the module name, the string +C<::_guts> and the module version number: #define MY_CXT_KEY "MyModule::_guts" XS_VERSION -=item typedef my_cxt_t - -This struct typedef I always be called C. The other -C macros assume the existence of the C typedef name. +=item my_cxt_t -Declare a typedef named C that is a structure that contains -all the data that needs to be interpreter-local. +The "static" values should be stored within a struct typedef which I +always be called C. The other C<*MY_CXT*> macros assume the +existence of the C typedef name. For example: typedef struct { int some_value; + int some_other_value; } my_cxt_t; =item START_MY_CXT -Always place the START_MY_CXT macro directly after the declaration -of C. +This macro contains hidden boilerplate code. Always place the +C macro directly after the declaration of C. =for apidoc Amnh||START_MY_CXT =item MY_CXT_INIT -The MY_CXT_INIT macro initializes storage for the C struct. +The C macro initializes storage for the C struct. -It I be called exactly once, typically in a BOOT: section. If you +It must be called I, typically in a BOOT: section. If you are maintaining multiple interpreters, it should be called once in each interpreter instance, except for interpreters cloned from existing ones. (But see L below.) @@ -4657,21 +4709,21 @@ interpreter instance, except for interpreters cloned from existing ones. =item dMY_CXT -Use the dMY_CXT macro (a declaration) in all the functions that access -MY_CXT. 
+Use the C macro (a declaration) at the start of all the XSUBs +(and other functions) that access C. =for apidoc Amnh||dMY_CXT =item MY_CXT -Use the MY_CXT macro to access members of the C struct. For +Use the C macro to access members of the C struct. For example, if C is typedef struct { int index; } my_cxt_t; -then use this to access the C member +then use this to access the C member: dMY_CXT; MY_CXT.index = 2; @@ -4680,7 +4732,8 @@ then use this to access the C member C may be quite expensive to calculate, and to avoid the overhead of invoking it in each function it is possible to pass the declaration -onto other functions using the C/C macros, eg +onto other functions using the argument/parameter C/C +macros, e.g.: =for apidoc Amnh||_aMY_CXT =for apidoc Amnh||aMY_CXT @@ -4700,17 +4753,24 @@ onto other functions using the C/C macros, eg MY_CXT.index = 2; } -Analogously to C, there are equivalent forms for when the macro is the -first or last in multiple arguments, where an underscore represents a -comma, i.e. C<_aMY_CXT>, C, C<_pMY_CXT> and C. +Analogously to C, there are equivalent forms for when the macro is +the first or last in multiple arguments, where an underscore is expanded +to a comma where appropriate, i.e. C<_aMY_CXT>, C, C<_pMY_CXT> +and C. These allow for the possibility that those macros might +optimise away any actual argument without leaving a stray comma. =item MY_CXT_CLONE -By default, when a new interpreter is created as a copy of an existing one -(eg via C<< threads->create() >>), both interpreters share the same physical -my_cxt_t structure. Calling C (typically via the package's -C function), causes a byte-for-byte copy of the structure to be -taken, and any future dMY_CXT will cause the copy to be accessed instead. +When a new interpreter is created as a copy of an existing one (e.g. via +C<< threads->create() >>), then by default, both interpreters share the +same physical my_cxt_t structure. 
Calling C (typically via +the package's C function), causes a byte-for-byte copy of the +structure to be taken (but not a deep copy) and any future C will +cause the copy to be accessed instead. + +This is typically used within the C method which is called each +time an interpreter is copied (usually when creating a new thread). Other +code can be added to C to deep copy items within the structure. =for apidoc Amnh||MY_CXT_CLONE @@ -4718,14 +4778,14 @@ taken, and any future dMY_CXT will cause the copy to be accessed instead. =item dMY_CXT_INTERP(my_perl) -These are versions of the macros which take an explicit interpreter as an -argument. +These are variants of the C and C macros which take +an explicit perl interpreter as an argument. =back Note that these macros will only work together within the I source -file; that is, a dMY_CTX in one source file will access a different structure -than a dMY_CTX in another source file. +file; that is, a C in one source file will access a different +structure than a C in another source file. =head1 EXAMPLES From f3f436aa510e0ee79efef0c73fee04eaa089af37 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Mon, 29 Sep 2025 11:31:55 +0100 Subject: [PATCH 41/42] perlxs.pod: update EXAMPLES section Rewrite this section: =head1 EXAMPLES Basically, delete the one big example in this section and instead provide links to various other examples already present in this document instead. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 79 ++++++++-------------------- 1 file changed, 23 insertions(+), 56 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index a1cd24f65410..1157dd3fd97e 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -4789,77 +4789,44 @@ structure than a C in another source file. =head1 EXAMPLES -File C: Interface to some ONC+ RPC bind library functions. 
+Fairly complete examples of XS files can be found elsewhere in this +document: - #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" - - /* Note: On glibc 2.13 and earlier, this needs be */ - #include - - typedef struct netconfig Netconfig; - - MODULE = RPC PACKAGE = RPC - - SV * - rpcb_gettime(host = "localhost") - char *host - PREINIT: - time_t timep; - CODE: - ST(0) = sv_newmortal(); - if (rpcb_gettime(host, &timep)) - sv_setnv(ST(0), (double)timep); - - - Netconfig * - getnetconfigent(netid="udp") - char *netid +=over +=item * - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ +L - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("NetconfigPtr::DESTROY\n"); - free(netconf); +=item * -File C: Custom typemap for RPC.xs. (cf. L) +L - TYPEMAP - Netconfig * T_PTROBJ +=item * -File C: Perl module for the RPC extension. +L - package RPC; +=back - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw(rpcb_gettime getnetconfigent); +while L contains an overview of an XS file and L +contains various worked examples. - bootstrap RPC; - 1; +You can of course look at existing XS distributions on CPAN for +inspiration, although bear in mind that many of these will have been +created before this document was rewritten in 2025, and so may not follow +current best practices. -File C: Perl test program for the RPC extension. +Note that when wrapping a real library, you'll often need to add a line +like this to the .xs file: - use RPC; + #include - $netconf = getnetconfigent(); - $a = rpcb_gettime(); - print "time = $a\n"; - print "netconf = $netconf\n"; +and add entries like: - $netconf = getnetconfigent("tcp"); - $a = rpcb_gettime("poplar"); - print "time = $a\n"; - print "netconf = $netconf\n"; + LIBS => ['-lfoo', '-lbar'], -In Makefile.PL add -ltirpc and -I/usr/include/tirpc. +to F or similar. And don't forget to add test scripts under +t/. 
=head1 CAVEATS From 9afbfdafc5e26a350cb4c430f5d881e35e598f43 Mon Sep 17 00:00:00 2001 From: David Mitchell Date: Tue, 30 Sep 2025 18:35:52 +0100 Subject: [PATCH 42/42] perlxs.pod: update CAVEATS, AUTHOR, A DIAGNOSTICS Tweak the final few sections of perlxs.pod. --- dist/ExtUtils-ParseXS/lib/perlxs.pod | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 1157dd3fd97e..5ed2143ff960 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -4832,7 +4832,8 @@ t/. =head2 Use of standard C library functions -See L. +Often, the Perl API contains functions which you should use I of +the standard C library ones. See L. =head2 Event loops and control flow @@ -4853,17 +4854,20 @@ This document covers features supported by C =head1 AUTHOR DIAGNOSTICS -As of version 3.49 certain warnings are disabled by default. While developing -you can set C<$ENV{AUTHOR_WARNINGS}> to true in your environment or in your -Makefile.PL, or set C<$ExtUtils::ParseXS::AUTHOR_WARNINGS> to true via code, or -pass C<< author_warnings=>1 >> into process_file() explicitly. Currently this will -enable stricter alias checking but more warnings might be added in the future. -The kind of warnings this will enable are only helpful to the author of the XS -file, and the diagnostics produced will not include installation specific -details so they are only useful to the maintainer of the XS code itself. +As of version 3.49 a few parser warnings are disabled by default. While +developing you can set C<$ENV{AUTHOR_WARNINGS}> to true in your +environment or in your Makefile.PL, or set +C<$ExtUtils::ParseXS::AUTHOR_WARNINGS> to true via code, or pass C<< +author_warnings=>1 >> into process_file() explicitly. Currently this will +enable stricter alias checking but more warnings might be added in the +future. 
The kind of warnings this will enable are only helpful to the +author of the XS file, and the diagnostics produced will not include +installation specific details so they are only useful to the maintainer of +the XS code itself. =head1 AUTHOR Originally written by Dean Roehrich >. +Completely rewritten in 2025. -Maintained since 1996 by The Perl Porters >. +Maintained since 1996 by The Perl Porters, >.