-
Notifications
You must be signed in to change notification settings - Fork 576
Dedicated SV copying code in place of Perl_sv_setsv_flags #23202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
S_sv_freshcopy_flags is a drop-in replacement for `Perl_sv_setsv_flags`. It is designed for use when the destination SV is being freshly created and much of the logic in `sv_setsv_flags` is irrelevant. The intended users for this new function are: * Perl_sv_mortalcopy_flags * Perl_newSVsv_flags Those functions have been modified such that: * Bodyless destination SVs are created inline * SVt_PVs also have special casing * SVt_PVMG and below use S_sv_freshcopy_flags * Anything else drops back to using Perl_sv_setsv_flags S_sv_freshcopy_POK is a helper function that concentrates on the string assignment logic: * Swipe the buffer * CoW the buffer * Copy the buffer Using perl's test harness as a guide: * 45% of Perl_newSVsv_flags / 57% of Perl_sv_mortalcopy_flags calls use the bodyless code * 32% of Perl_newSVsv_flags / 36% of Perl_sv_mortalcopy_flags calls use the SVt_PV/SVp_POK code * The S_sv_freshcopy_flags code handles the bulk of the remainder. With these changes compared with a build of blead: * `perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }'` runs 30% faster * `perl -e 'for (1..100_000) { my $x = [ ("Perl") x 1000 ]; }' runs: * 15% faster if `newSV_type(SVt_PV)` is NOT inlined * 30% faster if it IS inlined The overall reduction in branches when cloning SVs, and refocusing of branch prediction within Perl_sv_setsv_flags, will hopefully give a meaningful boost to realistic Perl applications.
SvNV_set(dsv, SvNVX(ssv)); | ||
SvFLAGS(dsv) |= SVf_IsCOW|SVppv_STATIC| | ||
(sflags & (SVf_IOK|SVp_IOK|SVf_IVisUV|SVf_NOK|SVp_NOK|SVf_POK|SVf_POK|SVf_UTF8)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case:
is getting vague and indecisive and unsure of itself own purpose, is this case:
a generic SVf_IsCOW|SVppv_STATIC
duplicator, or a highly specific "3 SvIMMORTALs()" duplicator, or a duplicator of "3 SvIMMORTALS" + any regular RCed forks/descendants of the 3 IMMs/RO SV bools? We don't need to dynamically &
mask out current contents from sflags
, if we can prove what flag bits sflags
, and use a most efficient literal integer here,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a bit crufty, but seemed like a reasonable trade-off.
It's intended as a generic SVf_IsCOW|SVppv_STATIC
duplicator, even though PL_sv_yes
and PL_sv_no
are the only cases that I think it will actually handle right now. I left it somewhat generic in case other SVppv_STATIC
SVs are added in the future.
SVf_IsCOW|SVppv_STATIC
seems worth handling as a special case so that S_sv_freshcopy_POK
doesn't have to check for it before it gets to its hotter COW or copy cases.
This case didn't seem hot enough to optimise hard and I didn't want to bulk out S_sv_freshcopy_flags
with variants to cover all potential future special SVs that might not have the same flags that PL_sv_yes
& PL_sv_no
do. I also didn't want to leave future additions unhandled, as it would likely not be obvious to any future implementers of special SVs that they might need to add a new case in this function.
return dsv; | ||
case SVp_NOK: | ||
SvNV_set(dsv, SvNVX(ssv)); | ||
SvFLAGS(dsv) |= (sflags & (SVf_NOK|SVp_NOK)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why read and extract bits from sflags
? does SVp_NOK WITHOUT SVf_NOK and no other "OK" bits, does that have any meaning on a PP level? is it ever encountered at runtime in a blead/stable perl running normal production PP and CPAN XS code? do we need to consider buggy/weird/beginner CPAN XS code that would make such a SV* head or not?
|=
with a literal on the right side, is much more efficient that bit extraction from var sflags, in CC optimized C code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically it's defensive coding because I don't know what flags are set by XS code out in the wild. Or to what extend we should support such code.
Happy for those discussions to happen, I just didn't want to arbitrarily decide in private. 😄
@@ -10178,20 +10509,76 @@ parameter. | |||
SV * | |||
Perl_newSVsv_flags(pTHX_ SV *const old, I32 flags) | |||
{ | |||
SV *sv; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Perl_mortalcopy_flags
needs to be a thin wrapper around Perl_newSVsv_flags
, or a tailcall to Perl_newSVsv_flags
with a super secret or minimally secret flag in Perl_newSVsv_flags
's I32 flags
arg. Alot of duplicate looking code between mortalcopy and newSVsv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I definitely recognise that and don't want it to stay like that.
I wasn't sure how to best to handle it, given the different handling of ssv == NULL
and Perl_newSVsv_flags
's early SvIS_FREED
check.
(I'm still unclear why Perl_newSVsv_flags
behaves like that. Under what circumstances is returning a NULL
for the SV*
not going to cause problems?)
Figured I'd give reviewers the chance to suggest implementation ideas... 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am tempted to promote S_sv_freshcopy_flags
to Perl_sv_freshcopy_flags
and convert the callers to macros or inline functions:
SV *
Perl_sv_mortalcopy_flags(pTHX_ SV *const old, U32 flags)
{
if (old)
return sv_2mortal(sv_freshcopy_flags(old, flags));
else
return sv_newmortal();
}
^ especially if sv_2mortal
separately becomes a macro or inline function of some sort. The alternative is to pass SVs_TEMP
to sv_freshcopy_flags
within the flags
argument.
SV *
Perl_newSVsv_flags(pTHX_ SV *const old, I32 flags)
{
if (old) {
if (!SvIS_FREED(old))
return sv_freshcopy_flags(old, flags);
ck_warner_d(packWARN(WARN_INTERNAL), "semi-panic: attempt to dup freed string");
}
return NULL;
}
Having a quick look at call sites in core, it's very often apparent that the SV to be copied is non-NULL, so compilers have the opportunity to optimise away the if (old)
check.
Still not entirely sure about the best way to handle the differing behaviour around SvIS_FREED
though.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstances is returning a
NULL
for theSV*
not going to cause problems?)
Ignore that comment. My brain forgot that AV elements are currently NULL initialized. 🤦
Cloning is rather unfortunate choice of words, given that it has a very specific meaning in our codebase that is quite different from what this PR is about. Renaming the PR may be helpful. |
Perl_sv_setsv_flags
is the heavyweight function for assigning the value(s) ofa source SV to a destination SV. It contains many branches for preparing the
destination SV prior to assignment. However:
This set of commits:
Perl_sv_setsv_flags
into a macro.S_sv_freshcopy_flags
andS_sv_freshcopy_POK
helper functions.Perl_newSVsv_flags
andPerl_sv_mortalcopy_flags
to use them -and optimise the hot cases in particular.
should use
Perl_newSVsv_flags
orPerl_sv_mortalcopy_flags
.Using perl's test harness as a guide:
Perl_newSVsv_flags
and57% of calls to
Perl_sv_mortalcopy_flags
.SVt_PV/SVp_POK
code handles 32% of calls toPerl_newSVsv_flags
and 36% of calls toPerl_sv_mortalcopy_flags
.S_sv_freshcopy_flags
code handles 95% of the remainder inPerl_newSVsv_flags
and 91% of the remainder in toPerl_sv_mortalcopy_flags
.With these changes compared with a build of blead:
perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }'
runs 30% fasterperl -e 'for (1..100_000) { my $x = [ ("Perl") x 1000 ]; }'
runs:newSV_type(SVt_PV)
is NOT inlined