-
Notifications
You must be signed in to change notification settings - Fork 585
Dedicated SV copying code in place of Perl_sv_setsv_flags #23202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
Cloning is rather unfortunate choice of words, given that it has a very specific meaning in our codebase that is quite different from what this PR is about. Renaming the PR may be helpful. |
c392526
to
91c2b99
Compare
I've made a lot of changes following earlier comments - thanks for those - and have finally force-pushed. These changes aren't complete. For example:
|
sv.c
Outdated
if (sflags & SVf_IsCOW) { | ||
sv_buf_to_rw(ssv); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear if there's a bug here carried over from Perl_sv_setsv_flags
.
If (!(sflags & SVf_IsCOW))
above, sflags
isn't updated to reflect that SVf_Is_COW
is now set on ssv
, but why would we not want to do sv_buf_to_rw(ssv)
in that situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to make another PR about a very sketchy function in Perl 2003-2025 with 1 parent caller, its name is sv_setsv_cow()
. I might be mistaken, but your code highlight, and what I think 1 branch in sv_setsv_cow()
is doing, in my eyes, is a virgin never been HW ROed buffer, that for the first time will be getting HW RO perms added, therefore no ring 0 call is needed to mark it RW perms. verify I am making any sense, this feature isn't available on Windows and I don't see the point/personal necessity to port it to Windows. It would take me 2-5 hours to do that, somewhere between easy and monotonous/unchallenging on the difficulty scale.
Edit:
Since my Win32::API::Callback for many years already has MsMalloc()
blocks that are inter-ithread safe, clone safe, Perl ref counted, executable perm-ed (x86/x64 mach code) read only perm-ed (DEP/being a good citizen). If I wrote such an API, I would probably write the patch for WinPerl to have HW RO COW buffers for production grade default WinPerl stable. Win32::API::Callback
flips the RO RW perm bit manually, with minor help from the MS-only Public API HeapCreate()
.
AFAIK Father C's (???) #define PERL_DEBUG_READONLY_COW
is un-usable in a production stable perl b/c extreme memory waste. I think it give each 'Newx/
malloc()block an entire 4096 byte long page valgrind/dr memory style. Win 7's undocumented turn-key HW RO malloc feature,
RtlProtectHeap()has no !!! memory/RAM overhead, since the blocks still have WinOS's default 8 byte long headers.
RtlProtectHeap()'s runtime CPU overhead, ...., well ....., that is a different story.
RtlProtectHeap()internally is a large tree of if/else branches and some loops. B/c its closed source, I can't describe exactly what all the asm code logic does. But
RtlProtectHeap()is very simple internally, the only 2 functions it is capable of calling are
ZwQueryVirtualMemory()and
ZwProtectVirtualMemory()which are
interrupt 0x2e` system calls.
More realistically, Perl 5 needs to just verbatim copy Chrome Browser's hardware RO COW/Heap implementation. I'm sure by now Webkit and Mozilla also use Chrome's HW RO COW/Heap design.
Remember Chrome's design is penetration tested and extensively fuzzed, to prevent rootkit-ed/exploited chroot
jailed OS processes aka same origin browser tabs aka V8 isolates (my_perl
in Perl-ese) aka JS VMs instances, where blackhat javascript code escaped the javascript sandbox and now has C lang/Asm machine code control over an OS thread in a chroot
jailed OS process.
Chrome's design is simple, 2 mmaps at different address ranges, or in 2 different OS processes, to the same phy RAM pages, or to the same pagefile virtual RAM pages. The JS runtime/runloop engine only knows the address of RO-perm-ed view or address range of mmap() object. The hypervisor
, which is a same process different OS thread, or a different OS process, knows the location of BOTH address ranges, the RO public
address range, and the RW secret
address range. The "hypervisor" (in Perl-ese that means the "proprietary undocumented internals of sv.c and libperl.so"), so the "hypervisor", gets called from untrusted JS bytecode, to RC++ or RC-- an JS data-ish object or a JS Function object, and the hypervisor uses the secret RW memory window to change the reference count on a ref counted/color counted JS object, Untrusted byte code can read the RC and other JS object header fields (SV *
header), but untrusted JS byte code, even if exploited to machine code/C level, can't rewrite a COWed/RO/immutable/static/const JS object.
Since libperl.so and PP code fundamentally, as designed, are a https://en.wikipedia.org/wiki/Arbitrary_code_execution exploit, P5P's only duty is user error/dev error/idiot mitigation. Perl and P5P have minimal security obligation to think of, P5P would just embed the RO ptr mmap to RW ptr mmap conversion equation into macro #define share_hek_hek(hek)
and mission accomplished,
If a user has suffered damages from Perl or /bin/sh
being an arbitrary code execution exploit, please send all legal threats to https://kernel.org/ and https://www.opengroup.org and https://canonical.com/ .
91c2b99
to
583bbac
Compare
583bbac
to
d1e53fd
Compare
Since around the 5.10 era, `Perl_sv_setsv_flags` has unconditionally set `SvPOK_only(dsv)` in the `SVp_POK` branch. The associated comment reads: /* Whichever path we take through the next code, we want this true, and doing it now facilitates the COW check. */ Things have changed since 5.10 though, in particular using the `SVf_POK` to distinguish between a value that started off as a string from one that was originally an integer/float and later stringified. This commit: * Removes the `SvPOK_only(dsv)` in favour of `SvOK_off(dsv)` and hoisting the copying of `sflags` over. * Transforms the subsequent now-redundant `SVf_POK` toggles into asserts (to help reduce) the chance of inadvertent behaviour changes.
`Perl_sv_setsv_flags` is a hot function that contains liberal sprinklings of `SvOK_off()`. This commit changes two instances, where the operand SV cannot possibly be using the OOK hack, to do direct flag twiddling instead. `SvOK_off()` does two things: 1. Toggles off some flags: SvFLAGS(sv) &= ~(SVf_OK|SVf_IVisUV|SVf_UTF8) 2. Checks for use of the OOK hack and undoes it: ((void)(SvOOK(sv) && (sv_backoff(sv),0))) At least some compilers seem to struggle to figure out when `SvOOK(sv)` cannot be true and to then elide the call to `sv_backoff()`. This is desirable when: 1. ssv & dsv are both lower types than SVt_PV and cannot support OOK 2. inside a block following a conditional check that OOK is not in use In the two cases identified, the flag toggling is now done explicitly.
d1e53fd
to
05d5efa
Compare
Perl_sv_freshcopy_flags creates a fresh SV that contains the values of it source SV argument. It's like calling `new_SV(dsv)` followed by `sv_setsv_flags(dsv, ssv, flags`, but is optimized for a brand new destination SV and the most common code paths. The intended initial users for this new function were: * Perl_sv_mortalcopy_flags (still in sv.c) * Perl_newSVsv_flags (now a simple function in sv_inline.h) Perl_sv_freshcopy_flags handles the following cases: * SVt_NULL * SVt_IV * SVt_NV * SVt_PV * SVt_LAST For everything else, it calls S_sv_freshcopy_PVxx which handles: * SVt_INVLIST * SVt_PVIV * SVt_PVNV * SVt_PVMG - with no GET magic For everything else, there's a fall back to sv_setsv_flags. S_sv_freshcopy_POK is a dedicated helper for string swipe/COW/copy logic and is called from both Perl_sv_freshcopy_flags and S_sv_freshcopy_PVxx. With these changes compared with the previous commit: * `perl -e 'for (1..100_000_0) { my $x = { (1) x 1000 }; }'` runs about 20% faster * `perl -e 'for (1..100_000_0) { my $x = { ("Perl") x 250 }' runs about 40% faster * `perl -e 'for (1..100_000_0) { my $x = { a => 1, b => 2, c => 3, d => 4, e => 5 }; }'` is a touch faster, but within the margin for error * `perl -e 'for (1..100_000_0) { my $x = { a => "Perl", b => "Perl", c => "Perl", d => "Perl", e => "Perl" } ; }'` runs about 17% faster
Besides using the just-introduced faster path for SV copying, this allows the check for SV_GMAGIC to be pushed into the called function without having to worry about SV leaks. Two additional micro-optimizations are also in this commit: * A pointer to xav_fill is cached for use in the loop. This can be used directly to update AvFILLp(av), rather than having to get there from av's SV* each time. * The value of the loop iterator, i, is directly written into xav_fill, rather than getting the value in that slot, incrementing it (to get the same value as i), and writing it back.
05d5efa
to
bf95ea0
Compare
(void)SvPOK_only(dsv); | ||
(void)SvOK_off(dsv); | ||
SvFLAGS(dsv) |= sflags & | ||
(SVf_POK|SVp_POK|SVf_IOK|SVp_IOK|SVf_NOK|SVp_NOK|SVf_UTF8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isnt SVF_ISUV
copied over from sflags
? did anything earlier wipe the SVf_UTF8
inside SvFLAGS(dsv)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, I think the flag handling around this bit of code could be tidied up further. Will look at it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember (void)SvOK_off(dsv);
being a func call was IIRC a Father C 2014 or 2012 quick fix for random memory corruption/leaks involving SvOOK()
back when the OOK delta count was stored in a SVt_PVIV
inside field SvIVX()
. Someone later on changed OOK()
's design to be a BER/ASN.1/SMI/compressed integer stored in the PV buffer. Eliminating the former design limitation, that an OOK POK SV, can NEVER EVER EVER be a dual var with POK + OOK + IOK.
I'm not gonna touch cactus that is sv_backoff()
/SvOK_off(dsv)
unless I'm at a YAPC or a Toolchain summit with another qualified Perl C human 3-6 feet away.
/* Passes the swipe test. */ | ||
SvLEN_set(dsv, len); | ||
SvCUR_set(dsv, cur); | ||
SvPV_set(dsv, SvPVX_mutable(ssv)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
split mem read SvPVX_mutable(ssv)
from SvPV_set(dsv, )
, and move the mem read SvPVX_mutable(ssv)
and save it to a C auto, right before SvLEN_set(dsv, len);
. pointer aliasing rules in C. C abstract machine doesn't know if SvCUR_set(dsv, cur)
will write ontop of 8 bytes of memory backing SvPVX_mutable(ssv)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make any difference on gcc builds. What do you see on MSVC builds?
/* (void)SvOK_off(ssv); but without the superfluous SvOOK_off(ssv)) */ | ||
SvFLAGS(ssv) &= ~(SVf_OK|SVf_IVisUV|SVf_UTF8|SVs_TEMP); | ||
|
||
SvPV_set(ssv, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this line to below SvCUR_set(ssv, 0);
, pointer aliasing, ILP, pipeline stall reasons. The value of SvANY(ssv) is unknown until SvPV_set(ssv, NULL);
is 100% completed by the CPU pipeline/ CPU conveyor belt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move which of these two lines? Neither of them need to know SvANY(ssv)
, so I'm confused.
#else | ||
dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv ); | ||
#endif | ||
SvFLAGS(dsv) = SVt_IV|SVf_ROK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move SvFLAGS(dsv) = SVt_IV|SVf_ROK;
up to between these 2 lines
assert(SvANY(dsv));
/////////////// here
#if defined (DEBUGGING) || defined (PERL_DEBUG_COW)
that way
dsv->sv_u.svu_iv = old->sv_u.svu_iv;
dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv );
are likely to get some cpu ops deduped by the CC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCC adds in more mov
s and a few additional branches with that change. I'd need more convincing to put it in.
#if defined (DEBUGGING) || defined (PERL_DEBUG_COW) | ||
dsv->sv_u.svu_rv = SvREFCNT_inc(SvRV(old)); | ||
#else | ||
dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ADD SvREFCNT_inc_simple_NN(sv)
its faster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's 100% safe to assume that (SvRV(old))
is true?
* taint and vstring magic, which are also handled here. */ | ||
|
||
static void | ||
S_sv_freshcopy_PVxx(pTHX_ SV* dsv, SV* ssv, const I32 flags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change retval type to SV*
and return SV* dsv
as the retval. Never waste a free register. Tailcall or reducing liveness or reducing peak non volatile reg usage opportunities exist now, or will exist in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try it and (IIRC) it performed either worse or no better. I'll look again before force-pushing changes for some of the other comments above though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't be benchmark able, Its just a provision for the future. A human still needs to single step the asm code and manually look for callsites where that "setter" call is very close or right above a possible return some_size_t;
statement.
I have a never finished branch turning all to Perl_sv_set**v**_*mg*()
to return their SV*
arg, sv_2mortal()
style. Doing the change I described would allow 25%-50% of all XSUBs on CPAN, to use 100% volatile CPU registers, and zero reads/writes to the C stack, and zero save/restores of non-vol registers, since that 25%-50% group of XSUBs don't need to carry any C land variables, around any function call in their body.
* This function is marked for inlining, also to benefit the hot SVt_PV case. | ||
*/ | ||
PERL_STATIC_FORCE_INLINE void | ||
S_sv_freshcopy_POK(pTHX_ SV* dsv, SV* ssv, const I32 flags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change void to SV*
, return as retval SV*
dsv
see oth cmt 4 rational
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See reply to other comment. ;) I will still look at it again though.
Perl_sv_setsv_flags
is the heavyweight function for assigning the value(s) ofa source SV to a destination SV. It contains many branches for preparing the
destination SV prior to assignment. However:
This set of commits:
Perl_sv_setsv_flags
into a macro.Perl_sv_freshcopy_flags
and two static helper functions.Perl_newSVsv_flags
andPerl_sv_mortalcopy_flags
to use them.should use
Perl_newSVsv_flags
orPerl_sv_mortalcopy_flags
.Using perl's test harness as a guide:
Perl_newSVsv_flags
and57% of calls to
Perl_sv_mortalcopy_flags
.SVt_PV/SVp_POK
code handles 32% of calls toPerl_newSVsv_flags
and 36% of calls toPerl_sv_mortalcopy_flags
.S_sv_freshcopy_flags
code handles 95% of the remainder inPerl_newSVsv_flags
and 91% of the remainder in toPerl_sv_mortalcopy_flags
.With these changes compared with a build of blead:
perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }'
runs 10% fasterperl -e 'for (1..100_000) { my $x = [ ("Perl") x 250 ]; }'
runs 45% faster