-
Notifications
You must be signed in to change notification settings - Fork 589
Create new OP_MULTIPARAM
to implement subroutine signatures
#23574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
Weirdly, while github CI and most commentors on IRC report test failures in In case it's somehow related to build options, the script I use to configure and build is: test -f config.sh && rm config.sh
test -f Policy.sh && rm Policy.sh
./Configure -des \
-Dusedevel \
-Dprefix=$HOME/perl5/perlbrew/perls/bleadperl/ \
-DDEBUGGING \
-Uversiononly \
-Dman1dir=none -Dman3dir=none \
-Dusethreads \
-Dinc_version_list=none \
-Doptimize='-gdwarf-2 -g3' \
"$@"
make -j8
make -j8 test_prep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message: "This two" should be "These two"
op.c
Outdated
*/ | ||
varop->op_next = defop; | ||
defexpr->op_next = varop; | ||
param->padix = allocmy("", 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perl_allocmy
currently does not support empty names because it blindly uses name[1]
. It needs to be changed to check len
first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocmy()
is just a wrapper around pad_add_name_pvn()
with some extra sanity checking and flag-setting (I think it exists just for a convenience from the parser). I've instead changed this code to call pad_add_name_pvn()
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding more asserts in #23583 which would have caught this
482ac37
to
e961b24
Compare
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: Perl#23574 (comment)
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: #23574 (comment)
e961b24
to
5dd6e3a
Compare
EXTEND(SP, (IV)(3 + nparams + 1)); | ||
mPUSHu(p->min_args); | ||
mPUSHu(p->n_positional); | ||
PUSHs(sv_2mortal(p->slurpy ? newSVpvf("%c", p->slurpy) : &PL_sv_no)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why run a 1 byte long string through a printf engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because offhand I don't believe we have a newSVpvc
function. Can you suggest an alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newSVpvn(&p->slurpy, 1)
?
Also, why are we mortalizing PL_sv_no
? Couldn't we do this (warning: untested code, written in the browser):
PUSHs(p->slurpy ? newSVpvn_flags(&p->slurpy, 1, SVs_TEMP) : &PL_sv_no);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very true, there is missing Perl in C infrastructure in this area. I've seen various builds from various decades of Spidermonkey/V8 having all 1 char long string objects, fo all of low 7 bit printable ASCII or 0x00-0x7F ,pre-generated burned into HW RO memory as C structs. Perl 5 calls it a SvIMMORTAL. Its funny because demerphq very quietly added something similar for all of 1 char long, 0x00-0xff, to Perl 5, like 8 years ago, but its always # define off by default and nobody has paid attention to the feature ever since. And Configure/config.h doesn't know about it.
So currently in perl 5, there are 2 normalish ways to do this. and maybe 1 @bulk88 high-speed way to do this, I'd have to research if the cpan visible exported constant array I'm thinking could be inside libperl actually exists.
I'd say option 1 is see 60% of the time, option 2 is 40% in core.
normal option 1
char c = 'z';
sv = newpvn(&c, 1);
normal option 2
sv = newpvn(" ", STRLENs(" ")); // or sv = newpvn("\0", STRLENs("\0"));
SvPVX(sv)[0] = 'z'; // safe b/c we know the above isn't going to do
// COW tricks on us (and if it does it probably needs another dedicated name)
// ive personally smoke tested with only 2 lines SEGVing after a make test
// in all of blead where newSVpvs("some hw const c str");
// doesn't make a Newx() block inside SvPVX() but a
// SvLEN() == 0 no cow or SvLEN() == 0 with COW
// POSIX::.pm's 2000+ newCONSTSUB_flags() calls on initial load
// badly needs this for `SvPVX(cv)`'s buffer which is always 1 byte long string `""`
// not 2000 16 byte Newx() blocks with CUR=0 and LEN=16.
bulk option that probably doesn't exist inside libperl, lack of infrastructure
// below probably needs to be CPAN visible, if the fantasy wishlist feature
// of nukeing a module's *main::MyPkg:: and then de-mmaping the
// XS .so and .dll files from address space at any time actually was
// safe, we'd rather not have CPAN shared libs creating this kind of
// SV * POK with a "dbl quote" hardware literal string from their .rodata
// instead of libperl's .rodata if we can help it
const char ** PL_1_ch_strs[256] = {..... ,"1", "2", "3", .... "a", "b", "c", ,,,,};
sv = newSVtype(SVt_PV);
//calling this below is really bad, this whole sequence needs a dedicated Perl_newSV*() func
// or some more secret bitfield flags for Perl_newSVpvn_flags();
// or don't be scared and just use SvPV_set(); SvCUR_set(); yourself
// SvLEN_set(,0); i belive is redundant, the body allocators zero out
// SV body all non ghost fields IME
usepvn(sv, PL_1_ch_strs[c], 1, SV_I_DID_ADD_THE_NUL_CHAR);
SvLEN(sv, 0);
SvFLAGS(sv) |= (SVf_ISCOW | SVs_STATIC);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better infrastructure, most or all src code instance of "abcd"
, "abc"
, "ab"
, "a"
need to be exterminated. Those are not strings, they are call integers or CPU registers.
sv = newSVpvc('exit');
sv = newSVpvc('INIT');
See what I did?
All LE CPU will need a secret byte order flipper macro inside the #define newSVpvc(_chrs)
macro.
The real .i
machine code decl of the above is
extern SV* Perl_newSVpvc_x(pTHX_ U32 chrs);
sv = newSVpvc('exitexit');
isn't ISO C and way too rare as a vendor extension to pretend that (a U64
multi char literal) exists anywhere on any C compiler Perl can target now, historically, or in the future.
sv = newSVpvc8('exit','exit');
is possible, day code we have such rapid short string comparison/parsing macros. newSVpvc8
internally on i386 is
extern SV* Perl_newSVpvc8u32x2_x(pTHX_ U32 chrlo, U32 chrhi);
and on x64
extern SV* Perl_newSVpvc8u64_x(pTHX_ U64 chrs);
https://www.altium.com/ 's ISO C99 compliant compiler for ARM32/64 has uint2_t
, uint3_t
, uint7_t
, and a whole bunch of funny stuff built into its official C grammar/BNF notation file. Its non-ISO extensions are all GCC-style syntax.
https://valhalla.altium.com/Learning-Guides/GU0122%20C-to-Hardware%20Compiler%20User%20Manual.pdf
I want to crack a joke after reading its entire manual, if Perl 5 was compiled with that C compiler, the output will binary describing transistor gates. There is now a Perl CPU executing the Perl ISA. The Perl CPU's socket pins are an RS-232 serial port called STDIN
.
I'm not sure after reading that C compiler's manual and IDE, if Altium can generate an ELF/EFI binary of ARM machine code or not or it only generates transistor gates for random unmodified desktop POSIX software, and that transistor gate language can only execute in BOCHS
or QEMU
, or be emailed to Taiwan to get back as an actual microchip (big $$$).
Altium's C99 compiler has USA and EU govt certificates to generate GUI-aware binaries that draw automobile digital speedometers and aircraft pilot LCD GUIs. Enough said who their customers are. The last time I rented a Nissan, I found out the radio was a 5 year old no-Google Android 6.0 OS, on a "this year" 4 month old car.
Nissan bought the cheapest android SOC chips they could find, that were heading to be melted or burned anyway, because they were obsolete, and nobody, not at any cost, will solder them into a phone, not even a $40 phone.
The Nissans's dashboard has a 1 way serial port that pushes data to Android 6 (tire pressure icon, etc), the radio has absolutely no electrical way to talk back to the GUI on the instrument console. Proper design IMO.
{ | ||
struct op_multiparam_aux *p = (struct op_multiparam_aux *)aux; | ||
UV nparams = p->n_positional; | ||
EXTEND(SP, (IV)(3 + nparams + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the MEXTEND mortal stack stretcher macro here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other cases (OP_MULTIDEREF
, OP_MULTICONCAT
) don't do those either; I was following similar structure. Perhaps in a separate commit we could add that later. Though this is only during B
's inspection, it's not a hot performance path. I think the slight performance cost is fine for the simpler code.
for(UV parami = 0; parami < nparams; parami++) | ||
mPUSHu(p->param_padix[parami]); | ||
mPUSHu(p->slurpy_padix); | ||
XSRETURN(3 + nparams + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace with PUTBACK; return;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, just following similarity with the other cases.
if(namepv) | ||
sv_catpvf(retsv, ":%s=%" UVf, namepv, paramidx); | ||
else | ||
sv_catpvf(retsv, ":(anon)=%" UVf, paramidx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define PadnamePV(pn) (pn)->xpadn_pv
#define PadnameLEN(pn) (pn)->xpadn_len
#define PadnameUTF8(pn) 1
#define PadnameSV(pn) newSVpvn_flags(PadnamePV(pn), PadnameLEN(pn), SVs_TEMP|SVf_UTF8)
":(anon)=%"
can be de-duped to 1 branch above with ":(%s)=%". "anon",
in theory PadnameLEN(pn)
and sv_catXXX()
family calls should be taken advantage of here, rather than repeatedly going through the printf engine.
IDK and IDC enough, and probably its impossible to write a bug ticket with a failure/defect demo, about it, but I see UTF8 flag is perma-on in the data source API, but we arent propagating it to the higher level. And this is XS::APItest anyways, so perfection isnt critical. but someone might look at this in the future for "best practices" ideas, and then copy paste quicky hacky code into a more visible API.
lib/B/Deparse.pm
Outdated
# Look for OP_NULL[OP_PARAMTEST[OP_PARAMSTORE]] | ||
if ($o->name eq 'null' and $o->flags & OPf_KIDS and | ||
$o->first->name eq 'paramtest' and | ||
$o->first->first->name eq 'paramstore') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$o->first
cache the retval of the method or key ($first = $o->first)->name
.
|
||
length $sig[$parami] > 1 ? | ||
( $sig[$parami] .= ' ' ) : | ||
( $sig[$parami] = '$' ); # intentionally no trailing space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont look up same thing over and over in an array, do a $sref = \$sig[$parami];
or $s = $sig[$parami];
. probably the first is easier in this sub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance improvement of that indirection will be absolutely miniscule given this is deparse of a signatured sub; hardly a hot path. I prefer the clarity of the code as written.
|
||
my $defop = $paramtest->first->first; | ||
if ($defop->name eq "stub") { | ||
$sig[$parami] .= "$assign"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y the ""s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symmetry with the other case below.
char slurpy; /* the sigil of the slurpy var (or null) */ | ||
OP *elemops; /* NULL, or an OP_LINESEQ of individual element and fence ops */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move field char slurpy; to the bottom of the struct. don't have hidden alignment holes.
* zero or more arguments. | ||
*/ | ||
UV next_argix; /* the argument index of the next parameter we add */ | ||
UV opt_params; /* number of optional scalar parameters */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using 64 bit integers to store an arrays length on i386 CPUs? Since when is void *
8 bytes long on a Pentium 3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking a little bit more, I think this API should be counting things not in types STRLEN/Size_t, but in type U16 (perfection design/ideal design) or type I32/U32 (some user's bizzare and unreasonable definition of "production code"). Since this API represents pieces of .pl/.pm src code should be U32 is the maximum non-fuzzing/non-pentesting amount of named @_ args legally possible. Above that, above 4.2 billion named my $named+args_from_AT_UNDERSCOE;
, yy_lex()
is being fed infinite PP src code nonsense on FD STDIN
generated by another pen testing .pl
script running in another perl process on a 64 bit CPU.
So these counts ideally should be U16 on all 32 and 64b CPUs. With appropriate overflow/fatal effor bounds checks as needed, Or a compromise design for unrealistic PP src, code, of type U32 on all 32 and 64b CPUs.
Some years ago, there was a legit production PP code ticket filed and fixed, so 64b Perl can do this $obj->meth(@{$ref_to_arr_with_4_dot_5_billion_scalars});
. But this API in this PR, represents fixed length, named, incoming arguments,.
It doesn't represent or implement basic day 1 of Perl lang @_
variable length upto infinity (aka OOM) incoming arguments feature. So it doesn't need to count to 4.9 billion SV*
s on 64bit CPUs. Type U16
or max U32
should be used. Anything larger type is just creating a permanent ocean of 0x00
bytes that are wasteful. So no Size_t
, it should be U16
or I32
or U32
.
I32
can be a good choice efficient choice if negative non-0 value or sign bit on is used to mean some specific behavior vs the behavior of a positive non-0 value.
Perl C API historically likes to like negative I32
s or negative SSize_t
to mean the object's string/array/contents are UTF8-on/UTF8-yes. instead of creating a whole new U8 or U32 somewhere, just to transport a single 1/0 bit, which indicate UTF-yes vs UTF-no flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use UVs prettymuch universally for these sorts of things everywhere else.
if(signature->elemops) | ||
op_free(signature->elemops); | ||
if(signature->params) { | ||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i386 doesnt have 8 byte pointers, use Size_t.
/* handle '$=' special case */ | ||
if(varop) | ||
if(padix) | ||
yyerror("Optional parameter lacks default expression"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was struct yy_parser_signature_param *param = subsignature_push_param();
leaked if this error branch was entered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's been stored in the params
array of the current struct yy_parser_signature
, which will be tidied up on unwind by the deferred call to destroy_subsignature_context()
.
op.c
Outdated
UV end_argix = signature->next_argix; | ||
|
||
struct op_multiparam_aux *aux = (struct op_multiparam_aux *)PerlMemShared_malloc( | ||
sizeof(struct op_multiparam_aux) + end_argix * sizeof(PADOFFSET)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I know PEMDAS but a () would be nice and make it easier to read, on what gets multiplied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. Will add.
/* Move the entire chain of kid ops in one go */ | ||
OpMORESIB_set(cLISTOPx(sigops)->op_last, fenceops->op_first); | ||
cLISTOPx(sigops)->op_last = fenceops->op_last; | ||
OpLASTSIB_set(cLISTOPx(sigops)->op_last, sigops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out cLISTOPx(sigops)
to a C auto i suspect its many derefs deep and keeps being re-read after each of these assignments in -O1/-O2. cLISTOPx(sigops)->op_last
is written so many times its a puzzle what happened to the old target ptr struct, and the new target ptr, and is new or old target inside the container after this block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cLISTOPx()
is just a typecast macro; it isn't doing any active behaviour here.
op.h
line 472:
#define cLISTOPx(o) ((LISTOP*)(o))
struct yy_parser_signature_param *params = signature->params; | ||
UV max_argix = 0; | ||
|
||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no 64 bit pointers on ARM32/i386.
PADOFFSET padix = param->padix; | ||
|
||
while(max_argix < argix) { | ||
aux->param_padix[max_argix] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't double deref in a loop. CC has no idea if &(aux->param_padix)
and &(aux->param_padix[5])
happened to be the same mem addr.
IDK what realistic values are for max_argix, but if its over 16 x U32/U64 or 32 x U32/U64 call libc's memset aka perl Zero(), below that, this loop will win in speed b/c its move 4/8 aligned bytes at a time, while all libc memsets, need 4-8 branches to figure out alignment and if to use 2/4/8/16 sse/32 avx/64 avx512 loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As written elsewhere, honestly I don't expect many signatured functions to ever have more than about 4 actual parameters. People don't usually write the sort of code that declares hundreds of parameters. Optimising for large memset-style operations is pointless here, vs the clarity of the code as written.
OP *o = newUNOP(OP_NULL, 0, paramtest); | ||
|
||
paramstore->op_next = o; | ||
paramtest->op_next = o; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these zeroings really needed? use a step debugger to find out whats inside. are they already zeroed b/c of their earlier allocator() always zering new structs? is it uninit memory? was there a resource tracked ptr here that we need to free in this slot before zering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's o
, the current op pointer. Not zero.
UV n_positional; /* = the number of mandatory + optional scalar parameters, not counting a final slurpy */ | ||
char slurpy; | ||
PADOFFSET *param_padix; /* points at storage allocated along with the struct itself, immediately following */ | ||
PADOFFSET slurpy_padix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort all struct fields by alignment and size, dont use 64 bit ptrs on 32 bit CPUs
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ | ||
for(IV ix = startix; ix < endix; ix++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AV*s use SSize_t not IV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved the previous code, which also used an IV
. I wasn't going to change the type because of that.
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ | ||
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless var av is reachable from PP lang, use fetch_simple() variants, we dont need to call TIEARRAY
PP methods here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again - I just moved existing code, I wasn't going to spend a while thinking whether it should be different.
In any case, most of the time this runs it's against the special @_
variable, which already has !AvREAL()
behaviours; I wasn't sure if the _simple
variant can cope with that.
{ | ||
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes alot of work for sv_setsv_flags()
to figure out &PL_sv_undef
is empty. This isn't Chrome or NodeJS, its C. It won't constant fold. Grep the perl repo for the correct way to make a new read write undef SV*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I just moved the existing code. We can tidy it up later.
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, | ||
(SV_DO_COW_SVSETSV|SV_NOSTEAL)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SV_DO_COW_SVSETSV good idea
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, | ||
(SV_DO_COW_SVSETSV|SV_NOSTEAL)); | ||
if(!av_store(av, ix, newsv)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is av a TIEARRAY? Is this really going to return false? Why are we continuing execution and not throwing a panic:/heap corruption/OOM? Can it be a TIEARRAY? why not use av_simpleX() family of calls here?
Renew() Newx() handles all OOMs NULL RETVALS for us unless we explicitly temporarily disabled that, which is done approximately in only 1 place inside the entire P5P repo, and ~1-2 places on grep CPAN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above.
#define av_refresh_elements_range(av, startix, endix) S_av_refresh_elements_range(aTHX_ av, startix, endix) | ||
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the av_extend()? so we aren't constantly going inside libc's realloc every 1-5 SVs. I forgot the exact increment unit av_store() does on a grow event, but even wanting know that count is UB/bad coding practices. av_extend() it once. We know the max length, we aren't iterating a HV or a SQL DB or a readdir() or something off an AJAX socket here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to av_extend()
. We don't alter the size of the array, we're just refreshing SVs at certain indices within it. This is an in-place alteration that preserves the size.
pp.c
Outdated
|
||
UV parami; | ||
for(parami = 0; parami < nparams; parami++) { | ||
PADOFFSET padix = aux->param_padix[parami]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out of the loop reading's struct aux's param_padix field over and over , nothing will be reallocing it afaik
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, will do.
} | ||
|
||
SV **valp = av_fetch(defav, parami, FALSE); | ||
SV *val = valp ? *valp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is valp retval going to be NULL?
SV*
val should be set to NULL instead of &PL_sv_undef
here if SV**
was NULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good question. The same question could be applied to pp_argelem
where I copied this from. Perhaps there could be holes in @_
. It's trying to act defensively and treat those holes as just undef
, as it should.
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
SvSetMagicSV(*padentry, val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro
SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
Also from the comment above, if right side SV is NULL, there is a better way to take an existing SV ptr and set it to undef. grep the repo's root .c and .h files to learn how. Since you want to fire SetMagic here (why and when can there be SMG here?), make sure the correct set to undef macro or fn either fires SMG for you or fire SMG yourself right after it returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll have to explain more what you mean by that comment.
In any case again this is all behaviour copied from the existing pp_argelem
which I observe has run without reported issue for the past decade or so.
|
||
if(av_count(av)) { | ||
/* see "target should be empty" comments in pp_argelem above */ | ||
av_refresh_elements_range(defav, parami, parami + argc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should retval of av_count(av) be passed as arg 4 to av_refresh_elements_range()? does av_refresh_elements_range() need that info? dont look it up twice. IDK enough to say what to do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous code I copied it from didn't make use of it.
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
av_store(av, avidx++, newSVsv(val)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newSVsv is not cow aware, see your code from above how to make it cow aware. dont use 64b ptrs on i386/arm32. newSVsv(&PL_sv_undef) is not how to alloc a new SV with value undef. use av_simple variants unless TIEARRAY can happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm; that's unfortunate. Given how often we'd want to do "the normal thing", perhaps it suggests we should make a COW-aware version; maybe newSVsv_cow()
or somesuch? It gets a pain to have to remember the exact names of all these flags every time.
argc -= 2; | ||
|
||
if (UNLIKELY(SvGMAGICAL(key))) | ||
key = sv_mortalcopy(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define sv_mortalcopy(sv) Perl_sv_mortalcopy_flags(aTHX_ sv, SV_GMAGIC|SV_DO_COW_SVSETSV)
looks good on this line
SV **svp; | ||
|
||
svp = av_fetch(defav, parami, FALSE); parami++; | ||
SV *key = svp ? *svp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What on earth does hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
mean?
how about executing hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
5 or 9 times in a row, each time with a different addr in var "val"? what does that mean? why is the previous sentence valid correct behavior and not a panic/heap corruption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A SEGV here would be just as good and more efficient than an actual croak()
fn call located here and its C string that nobody will ever see unless they use a disassembler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What on earth does
hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
mean?
Same thing as
my $key = undef;
$hv{$key} = $val;
I.e. treat the key as an empty string but provoke the "uninitialized" (sic) warning. See hv.c
line 545.
Executing it 5 or 9 times in a row will have the effect of storing a copy of the final val in the hash, having first thrown away the first 4 or 8 temporary copies.
Exactly as per
$hv{+undef} = $_ for @five_or_nine_values;
|
||
if(!ok) | ||
return cLOGOP->op_other; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic tree makes no sense. test each c auto or malloc backed variable exactly once in this if/else tree. var
OP*
PL_op should be saved to a C auto at the very top of this PP func to stop multiple rereads.
PL_op is an lvalue but it is not a variable. If you believe its a C variable goto the top of pp.c and add this line. it will be harmless.
#include "EXTERN.h"
#define PERL_IN_PP_C
#include "perl.h"
#include "keywords.h"
#include "invlist_inline.h"
#include "reentr.h"
#include "regcharclass.h"
#undef PL_op /* <<<<< Add this line and push this commit to blead, its a C variable right? */
/* variations on pp_null */
PP(pp_stub)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what most of that comment is about, but I'll pull out the PL_op->op_private
into a U8 priv
as many ops seem to do.
SV *value = POPs; | ||
|
||
SvPADSTALE_off(TARG); | ||
SvSetMagicSV(TARG, value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
research with a step debugger and BPs if the macro is appropriate or should be sv_setsv(dsv, ssv) + SvSMG(dsv)
|
||
assert(PadnamePV(PadnamelistARRAY(PL_comppad_name)[padix])[0] == sigil); | ||
|
||
param->padix = padix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a /* this is a different struct member */
or /*different var */
comment, or maybe rename padix
the c auto to pad_ix
. these 2 lines took too many seconds with eyeballs to figure out they arent the same line accidentally written twice. makes it easier to read by future ppl.
The previous names were just copied from similar fields in previous signature handling code. These new names are a better fit for their current and intended future purpose.
By defining a helper term for optional defaulting expression we can remove one of the three alternate cases down to just two. This will help when adding more code to these action blocks in future,
These two new helper functions will be useful for sharing behaviour with upcoming code, when adding OP_MULTIPARAM.
…gument processing The specific behaviour of this obscure modification case is not documented or relied upon elsewhere in code. Rather than attempt to preserve this cornercase, it's easier just to no longer test for it, as upcoming changes will alter the values that are visible.
Creates a new UNOP_AUX op type, `OP_MULTIPARAM`, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops like `OP_MULTIDEREF` and `OP_MULTICONCAT` where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation. Also adds a LOGOP, `OP_PARAMTEST` and UNOP `OP_PARAMSTORE` which are responsible for implementing the default expressions of optional parameters. These use the `SvPADSTALE` flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack. This change is carefully designed to support two future ideas: * Named parameters as per PPC0024 https://github.com/Perl/PPCs/blob/main/ppcs/ppc0024-signature-named-parameters.md * "no-snails"; the performance optimisation that avoids storing incoming argument values in the `@_` AV and instead consumes them directly from the stack
5dd6e3a
to
0f590aa
Compare
I have made some slight improvements based on a few of @bulk88's comments that I could understand and actually made sense. Most of the rest I couldn't see the relevance of (including a long rambling about C compilers and Nissan cars?!), or other things such as subtle details about possibly-64bit UV values used as array indexes that were copied from other code anyway. Most of the rest of the latter may be valid things to look into, but were already present in the code before I started. They should be looked at separately, either as commits before this one is merged, or afterwards. |
Creates a new UNOP_AUX op type,
OP_MULTIPARAM
, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops likeOP_MULTIDEREF
andOP_MULTICONCAT
where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation.Also adds a LOGOP,
OP_PARAMTEST
and UNOPOP_PARAMSTORE
which are responsible for implementing the default expressions of optional parameters. These use theSvPADSTALE
flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack.This change is carefully designed to support two future ideas:
Named parameters as per PPC0024
"no-snails"; the performance optimisation that avoids storing incoming argument values in the
@_
AV and instead consumes them directly from the stack--
This set of changes currently does not have a perldelta entry, because it doesn't directly make any changes that are end-user visible. However, since it is quite a significant internal change, I could be talked into writing an entry in the "internal changes" section anyway. Reviewers: Thoughts?