Skip to content

Dedicated SV copying code in place of Perl_sv_setsv_flags #23202

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: blead
Choose a base branch
from

Conversation

richardleach
Copy link
Contributor

Perl_sv_setsv_flags is the heavyweight function for assigning the value(s) of
a source SV to a destination SV. It contains many branches for preparing the
destination SV prior to assignment. However:

  • If the destination SV has just been created, much of that logic isn't needed.
  • When cloning a SV, simple assignments (particularly IVs and PVs) dominate.

This set of commits:

  • Extracts the "is this CoWable?" test from Perl_sv_setsv_flags into a macro.
  • Adds S_sv_freshcopy_flags and S_sv_freshcopy_POK helper functions.
  • Modifies Perl_newSVsv_flags and Perl_sv_mortalcopy_flags to use them -
    and optimise the hot cases in particular.
  • Standardizes a number of call sites that did their own things but really
    should use Perl_newSVsv_flags or Perl_sv_mortalcopy_flags.

Using perl's test harness as a guide:

  • Bodyless code handles 45% of calls to Perl_newSVsv_flags and
    57% of calls to Perl_sv_mortalcopy_flags.
  • The SVt_PV/SVp_POK code handles 32% of calls to
    Perl_newSVsv_flags and 36% of calls to Perl_sv_mortalcopy_flags.
  • S_sv_freshcopy_flags code handles 95% of the remainder in
    Perl_newSVsv_flags and 91% of the remainder in to Perl_sv_mortalcopy_flags.

With these changes compared with a build of blead:

  • perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }' runs 30% faster

  • perl -e 'for (1..100_000) { my $x = [ ("Perl") x 1000 ]; }' runs:

    • 15% faster if newSV_type(SVt_PV) is NOT inlined
    • 30% faster if it IS inlined

  • This set of changes does not require a perldelta entry.

S_sv_freshcopy_flags is a drop-in replacement for `Perl_sv_setsv_flags`.
It is designed for use when the destination SV is being freshly created
and much of the logic in `sv_setsv_flags` is irrelevant.

The intended users for this new function are:
* Perl_sv_mortalcopy_flags
* Perl_newSVsv_flags

Those functions have been modified such that:
* Bodyless destination SVs are created inline
* SVt_PVs also have special casing
* SVt_PVMG and below use S_sv_freshcopy_flags
* Anything else drops back to using Perl_sv_setsv_flags

S_sv_freshcopy_POK is a helper function that concentrates on the string
assignment logic:
* Swipe the buffer
* CoW the buffer
* Copy the buffer

Using perl's test harness as a guide:
* 45% of Perl_newSVsv_flags / 57% of Perl_sv_mortalcopy_flags calls use
  the bodyless code
* 32% of Perl_newSVsv_flags / 36% of Perl_sv_mortalcopy_flags calls use
  the SVt_PV/SVp_POK code
* The S_sv_freshcopy_flags code handles the bulk of the remainder.

With these changes compared with a build of blead:

* `perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }'` runs 30% faster

* `perl -e 'for (1..100_000) { my $x = [ ("Perl") x 1000 ]; }' runs:
   * 15% faster if `newSV_type(SVt_PV)` is NOT inlined
   * 30% faster if it IS inlined

The overall reduction in branches when cloning SVs, and refocusing of branch
prediction within Perl_sv_setsv_flags, will hopefully give a meaningful
boost to realistic Perl applications.
@richardleach richardleach added the defer-next-dev This PR should not be merged yet, but await the next development cycle label Apr 15, 2025
SvNV_set(dsv, SvNVX(ssv));
SvFLAGS(dsv) |= SVf_IsCOW|SVppv_STATIC|
(sflags & (SVf_IOK|SVp_IOK|SVf_IVisUV|SVf_NOK|SVp_NOK|SVf_POK|SVf_POK|SVf_UTF8));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case: is getting vague and indecisive and unsure of itself own purpose, is this case: a generic SVf_IsCOW|SVppv_STATIC duplicator, or a highly specific "3 SvIMMORTALs()" duplicator, or a duplicator of "3 SvIMMORTALS" + any regular RCed forks/descendants of the 3 IMMs/RO SV bools? We don't need to dynamically & mask out current contents from sflags, if we can prove what flag bits sflags, and use a most efficient literal integer here,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit crufty, but seemed like a reasonable trade-off.

It's intended as a generic SVf_IsCOW|SVppv_STATIC duplicator, even though PL_sv_yes and PL_sv_no are the only cases that I think it will actually handle right now. I left it somewhat generic in case other SVppv_STATIC SVs are added in the future.

SVf_IsCOW|SVppv_STATIC seems worth handling as a special case so that S_sv_freshcopy_POK doesn't have to check for it before it gets to its hotter COW or copy cases.

This case didn't seem hot enough to optimise hard and I didn't want to bulk out S_sv_freshcopy_flags with variants to cover all potential future special SVs that might not have the same flags that PL_sv_yes & PL_sv_no do. I also didn't want to leave future additions unhandled, as it would likely not be obvious to any future implementers of special SVs that they might need to add a new case in this function.

return dsv;
case SVp_NOK:
SvNV_set(dsv, SvNVX(ssv));
SvFLAGS(dsv) |= (sflags & (SVf_NOK|SVp_NOK));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why read and extract bits from sflags? does SVp_NOK WITHOUT SVf_NOK and no other "OK" bits, does that have any meaning on a PP level? is it ever encountered at runtime in a blead/stable perl running normal production PP and CPAN XS code? do we need to consider buggy/weird/beginner CPAN XS code that would make such a SV* head or not?

|= with a literal on the right side, is much more efficient that bit extraction from var sflags, in CC optimized C code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically it's defensive coding because I don't know what flags are set by XS code out in the wild. Or to what extend we should support such code.

Happy for those discussions to happen, I just didn't want to arbitrarily decide in private. 😄

@@ -10178,20 +10509,76 @@ parameter.
SV *
Perl_newSVsv_flags(pTHX_ SV *const old, I32 flags)
{
SV *sv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Perl_mortalcopy_flags needs to be a thin wrapper around Perl_newSVsv_flags, or a tailcall to Perl_newSVsv_flags with a super secret or minimally secret flag in Perl_newSVsv_flags's I32 flags arg. Alot of duplicate looking code between mortalcopy and newSVsv.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I definitely recognise that and don't want it to stay like that.

I wasn't sure how to best to handle it, given the different handling of ssv == NULL and Perl_newSVsv_flags's early SvIS_FREED check.

(I'm still unclear why Perl_newSVsv_flags behaves like that. Under what circumstances is returning a NULL for the SV* not going to cause problems?)

Figured I'd give reviewers the chance to suggest implementation ideas... 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am tempted to promote S_sv_freshcopy_flags to Perl_sv_freshcopy_flags and convert the callers to macros or inline functions:

SV *
Perl_sv_mortalcopy_flags(pTHX_ SV *const old, U32 flags)
{
    if (old)
        return sv_2mortal(sv_freshcopy_flags(old, flags));
    else
        return sv_newmortal();
}

^ especially if sv_2mortal separately becomes a macro or inline function of some sort. The alternative is to pass SVs_TEMP to sv_freshcopy_flags within the flags argument.

SV *
Perl_newSVsv_flags(pTHX_ SV *const old, I32 flags)
{
    if (old) {
        if (!SvIS_FREED(old))
            return sv_freshcopy_flags(old, flags);
        ck_warner_d(packWARN(WARN_INTERNAL), "semi-panic: attempt to dup freed string");
        
    }
    return NULL;
}

Having a quick look at call sites in core, it's very often apparent that the SV to be copied is non-NULL, so compilers have the opportunity to optimise away the if (old) check.

Still not entirely sure about the best way to handle the differing behaviour around SvIS_FREED though.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances is returning a NULL for the SV* not going to cause problems?)

Ignore that comment. My brain forgot that AV elements are currently NULL initialized. 🤦

@Leont
Copy link
Contributor

Leont commented Apr 29, 2025

Cloning is rather unfortunate choice of words, given that it has a very specific meaning in our codebase that is quite different from what this PR is about. Renaming the PR may be helpful.

@richardleach richardleach changed the title Dedicated SV cloning code in place of Perl_sv_setsv_flags Dedicated SV copying code in place of Perl_sv_setsv_flags Apr 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defer-next-dev This PR should not be merged yet, but await the next development cycle
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants