pp_ref() builtin_pp_reftype(): strlen()+Newx()+memcpy()->100% pre-made COWs #23391
base: blead
On Sun, Jun 29, 2025 at 03:56:07PM -0700, bulk88 wrote:
-ref() PP keyword has extremely high usage. Grepping my blead repo shows:
[ snip 100 further lines]
Please try to use meaningful commit summary lines and messages.
I tried to read the commit message. I had no idea what the commit was
about, apart from something to do with a badly designed sv_ref/sv_reftype
API perhaps? Looking at the actual diff I *guess* the commit is about
adding two new functions, sv_refhek and sv_reftypehek and then making use
of them to speed up pp_ref() etc.? And perhaps adding some new SV
constants? Who knows?
The commit also seems to have snuck in an unrelated change to pp_const().
…--
31 Dec 1661: "I have newly taken a solemne oath about abstaining from plays".
1 Jan 1662: "And after ... we went by coach to the play".
-- The Diary of Samuel Pepys
|
All tech decisions are documented with rationale. Read them bullet point by bullet point. If I am the only Subject Matter Expert who knows the Perl VM C code, I can't really help out a React JSX SME or Go SME guru who tries to review the Perl C VM code. At that point I would have to offer a 6 hour pre-conference class at a TPRC or YAPCEU event on P5 VM C level design/optimization/O(n) complexity of interp internals to my students. Not a joke.
Correct. I didn't invent Current Utf8 isn't original to P5, but those 2 can't return a yes/no utf8 flag either. Also the backing storage and lifetime of those Returning HEKs always with the 2 new fns fixes pretty much every design problem I can think of. Returning new SV heads with RC=1, or new SV heads with RC=1+mortal, or accepting an in Also I decided returning the global/permanent SV*s, back to callers is a bad idea, I would have to mark the
Now what? Line 2 fatal errored. But if its a SVPV holding a HEK, it is silently decowed on line 2 without problems. Thats why the new API returns HEKs and doesn't use SV APIs. The SpiderMonkey JS engine's src code's initial commit is 1 year or max 2 years, after Perl 5's initial commit. So SM JS engine and Perl 5 engine are the same exact age. Since Netscape's/Mozilla's/Firefox's JS engine is very well used, tried, true, and tested for decades, borrowing design choices from it, can not be a bad idea. Perl's Rest of this is FF JS VM vs P5 VM management of CC/link time constants and how they appear on a C runtime level and at a ECMAScript/PP level. Spidermonkey calls them "Atoms", Perl calls them "HEK *"s or "U32 hash"s. Spidermonkey uses words like "Pinned" and "JSExternalString", to mean Perl's https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String.h Here is a list of what Spidermonkey says are critical "" string/token/identifier literals that are required to run the JS engine. https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/CommonPropertyNames.h https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/Keywords.h https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/jsatom.cpp#L56 Spidermonkey Immortals https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/Id.cpp Spidermonkey has C global RW HEK*s structs baked into the engine (libperl.so or libspidermonkey.so) at CC time. 
Notice Spidermonkey has 1 byte long (latin 1) Immortal Currently in Perl, splitting a 8+24+16+16=64 bytes, detailed math: 8 SV* in AV* + 24 SV head + 16 XPV body + 8 OS malloc header + 16 min buf alloc rule of newSVpvn = 72 bytes offtopic: stolen buzzword/tech word from Perl VM lol https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/SelfHosting.cpp#L247 Not how SM burns in/attaches/binds https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/SelfHosting.cpp#L793 SM's analog of Here is your (davem's) short string experiment perl branch , as production code in SM https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String-inl.h#L46 I think machine integers 0-99, can be converted to https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/String.h#L1043 This is for another ticket, but SM decided on > 1/4th unused space, or 75% mark, to do a https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/vm/StringBuffer.cpp#L30 offtopic, the JS stack, internally is the OS's C stack with some tiny Asm tricks, generic RISC and stack grows up HPUX PARISC compliant https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/jsnativestack.cpp
It is unrelated, but a tiny meaningless change, not worth a PR on its own, and then 2 lines long commit in the P5P repo. I can BP that line now to see what is inside the SV*. It makes no machine code difference in -O1/-O2 before and after. But I can now set a BP on the line, and see what is inside the SV* struct. If someone doesn't like the change, it means they don't know what a C debugger is, or how to use one, and can't call themselves a professional C dev if the only C level diag tool they know how to use is |
Let me summarise the above: "I am very smart." |
On Mon, Jun 30, 2025 at 11:06:34AM -0700, bulk88 wrote:
bulk88 left a comment (Perl/perl5#23391)
> On Sun, Jun 29, 2025 at 03:56:07PM -0700, bulk88 wrote: -ref() PP keyword has extremely high usage. Grepping my blead repo shows:
> [ snip 100 further lines] Please try to use meaningful commit summary lines and messages.
All tech decisions are documented with rationale. Read them bullet point
by bullet point.
If I am the only Subject Matter Expert who knows the Perl VM C code, I
can't really help out a React JSX SME or Go SME guru who tries to review
the Perl C VM code. At that point I would have to offer a 6 hour
pre-conference class at a TPRC or YAPCEU event on P5 VM C level
design/optimization/O(n) complexity of interp internals to my students.
Not a joke.
[ snip lots more stuff]
Ok, I will try once again to explain what I mean in really simple terms.
A commit message often has three parts:
1) the subject line;
2) an initial paragraph or two describing in general terms the purpose of
the commit;
3) further paragraphs to explain in more detail what the problem was and
how it was fixed.
These three are potentially for different audiences.
(1) is so that, for example, people bisecting, preparing perldeltas etc
can get a quick idea of the general category of this commit. It may be
read by people who have no particular interest in, or expertise in this
area.
For example a commit subject message of:
"add sv_refhek, sv_reftypehek functions for better ref() performance"
shows, at a casual glance, the two main bullet points of the commit: that
it's adding new functions, and that it's to address a particular
performance issue.
The proposed commit message of
"pp_ref() builtin_pp_reftype(): strlen()+Newx()+memcpy()->100% pre-made COWs"
says nothing about what the commit actually does, and is entirely cryptic
as to what issue it is addressing.
In the second part, you can mention briefly that the existing sv_ref() and
sv_reftype() functions have a poorly designed interface and why this
is inefficient, and how the new functions address this issue.
Then in the remainder of the commit message you can (but only if
necessary) address at great length details which aren't immediately
apparent from the commit diff, such as benchmark results, disassembly
listings, etc.
As regards the pp_const() change: even although it's a small change, it
should be done as a separate commit, with its own commit message. That
message will explain why that change is useful. And will stop people who
are examining the main commit trying to understand why pp_const() needs
changing to address the ref() performance issue.
…--
A power surge on the Bridge is rapidly and correctly diagnosed as a faulty
capacitor by the highly-trained and competent engineering staff.
-- Things That Never Happen in "Star Trek" #9
|
"better ref() performance" is a side effect of this commit. My original hacking attempt 9 months ago ago was inspired by how bad EU::PXS's/P5P's official Line 222 in 5551b97
Line 248 in 5551b97
and Line 91 in 5551b97
Line 159 in 5551b97
and I'm annoyed at having to always copy-paste reimplement those templates with my own code in every single CPAN or private XS module. The 9 months ago hacking attempt branch started to modify I recently got annoyed again at and even more annoyed at this copy-paste
Since
I will assume anyone with a commit bit has read https://perldoc.perl.org/perlguts and https://perldoc.perl.org/perlxs at least once in their life, even if they have never clicked "Save" on a HTML5/AJAX/DHTML/DOM are child safe programming languages. C is not. Even if you have never been at a gun range/actual training, every dev of any high level lang should know to ignore it (C), or stay away from C code when they see C code, or start writing complaints in forums and bug trackers. Any adult knows not to pick up a gun and "hey is this thing real?" and pull the trigger. Guns don't come with yellow and orange safety labels on them. Just like I can't read a single optree or regexp engine related commit, PP devs can't read a single I can read Python, I've never clicked the "Save" button in my life. I know not to try it. Beyond my skills. I don't expect the 50th percentile of PP-only devs to understand my commit message. And PP-only devs aren't capable of break-pointing C code or fixing it, or maintaining it. They aren't the audience. If they read https://perldoc.perl.org/perlguts and https://perldoc.perl.org/perlxs top to bottom, at least once in their life, they will understand enough of the commit message, even if they can't judge, debug, criticise, or modify the code. My original title was very clear explaining what the commit does: pp_ref() = P5P word I will revise the commit title to "add sv_refhek, sv_reftypehek functions for better ref() performance" since I don't have any reason not to, if other people with other eyeballs think that sounds better, then it is. I know I'm writing for someone else to read this in the future after I'm dead/bussed. I'm not writing it for me to read later, but someone like me in the future to know what I did and why, and what led to the choices I made long ago, of all possible choices I had available to me at that time in history.
Where is the split or balance between .git repo message bloat, and accurate tech info to know what the original author was thinking at the time, years ago in history? The todo list can definitely be stripped from my original commit message, and just left in this PR, with a short sentence "Go read the PR associated with this commit, for info about things not fixed in this commit, and todo ideas".
Agree. |
Is the use of HEKs like this the ideal approach? For example, would we get better mileage in the long run out of something like an additional (I hear the groans) COW implementation for internal core consts (like the ref constants) that can also hold the string values of POK CONST OPs? (i.e. If some Perl code uses the same constant more than once, possibly more than 255 times across a large app, there's only one actual string buffer.) Or perhaps more focussed on reftypes, should they be implemented as If not, what are the downsides or blockers to those alternatives? |
Some people say that about me, but I don't think I am. PCs only do what their owners tell them to do. And they are damn good at it. My degree says EE, not CS. Material properties, entropy, solar noise, stuff overheating, stuff rusting, stuff UV rotting, stuff cracking/abrasion/flex damage, catching fire, estimating service life, human factors (end users, and even all trained professionals make mistakes, how many redundancies are in the design?). Intel/AMD/ARM chips are created by humans. The veins in a tree leaf or flower petal, or pattern of fur on your cat, are not created by humans. Because humans created Intel/AMD/ARM chips, they only do what people tell them to do. If it wasn't you, some other man or woman sat down, and sketched out the blueprints/code/CAD/FEA, to make your melted nugget of 1/4 part beach sand and 3/4 parts copper https://download.intel.com/newsroom/kits/chipmaking/pdfs/Sand-to-Silicon_32nm-Version.pdf into an Intel Xeon CPU. You're not any better or worse than the other person. You can learn enough to be the other person if you want. Because they did. Some books I own: click to expand, way too off topic.https://www.amazon.com/Manga-Guide-Electricity-Kazuhiro-Fujitaki/dp/1593271972 I think I met the book's author in person at a convention/conference and bought it when it came out in 2010. Good read. |
On Wed, Jul 02, 2025 at 05:53:48AM -0700, bulk88 wrote:
My original title was very clear explaining what the commit does:
pp_ref() = P5P word
builtin_pp_reftype() = P5P word
strlen() = CS101 C lang class @ any community college
Newx() = P5P word
memcpy() = CS101 C lang class @ any community college
100% pre-made = my words (the benefit)
COWs = basic Unix/Linux word, fork is not an eating utensil
Ok I've tried gently cajoling you into the right direction. This is having
no effect, so I'll try a direct approach.
For most people's posts on p5p, or commit messages etc, I'd generally rate
them somewhere in the range of 5..10 on a scale of 1..10 in terms of
comprehensibility. I would rate most of your posts at about -20.
They are truly awful and incomprehensible. But the problem is that you
seem to be *completely* unaware of just how awful they are. So when someone
politely suggests ways that they might be improved, you don't accept that
point and just write lots of (awful and incomprehensible) prose about how
everything's fine. You are like someone who, having had a stroke, talks
slurred gibberish, but is completely convinced that their speech is normal.
As an example, you seem to be convinced that the commit message is
comprehensible. It's not. It's just a random collection of terms. Yes, I
understand what each term means. I just have no idea what you are trying
to communicate with them, even after having examined the diff.
I generally skip most of your p5p emails and PR requests after reading the
subject line and the first couple of paragraphs, because usually I have
(at that point) no idea what the post is about, and life is too short to
slog my way through the entirety of them.
For this particular PR, for once I took the effort to read the entire
commit message. I still had no real idea what the commit was trying to
achieve. So I read the diff in its entirety. I started to get a vague idea
of what it *might* be about. But there was a whole bunch of stuff in there
which immediately rang alarm bells:
- It introduced new functions, ones which had no documentation or
comments to give even a hint of their purpose.
- It was using HEKs, but it wasn't clear to me why; and in general I like
to avoid entanglements with other unrelated parts of the perl core where
possible.
- It added SV_CONST_FOO symbols, but without apparently using the
SV_CONST() mechanism. Again, an entanglement.
- It had a change to pp_const() which was undocumented and seemed
unrelated.
In other words it looked like a rubbish commit, but I couldn't even be
sure whether it was rubbish because the commit message was so rubbish that
I couldn't determine whether the things in the commit were rubbish or not.
And to determine all this and try and help get the PR into a suitable
state to be accepted, is going to take a whole bunch of time and emotional
effort by me or some other p5p committer.
In summary: your commit messages are really bad, and often the contents of
your PRs are too. I'm not going to waste more of my time helping you get
them into an acceptable state. I am, however, going to be more vocal in
future about us not accepting your poor-quality PRs. I will continue to
skip your p5p posts unless they are clearly written.
…--
Red sky at night - gerroff my land!
Red sky at morning - gerroff my land!
-- old farmers' sayings #14
|
They are the ONLY solution. The entirety of Perl 5 lang's
Basically any If I find any
You're walking on thin ice. The technical debate, if COW 255 obj's RC reaches == 255, then "forever-pin" the Newx backed buffer, for the rest of the perl PID's lifetime, has gotten formerly active P5P devs banned by P5P moderators previously in history, see https://perldoc.perl.org/perlpolicy#STANDARDS-OF-CONDUCT I have no engineering comment on that debate if to forever process pin to "faux-C static storage" any Newx() backed COW 255 object that naturally reaches RC == 255 in the runloop. If I read your sentence of How about Here is a dump of -O1 PP keyword
Too dangerous b/c too little usage. 80% of all PP code wants to know It's hard to find CPAN modules with PUBLIC APIs, that say pass a ref to Also ISO C drama hardware read only ISO C string "literals" ARE NOT A DEFAULT OPTIMIZATION!!!!! (obv WinPerl -O0 -Od -DDEBUGGING still turns on this optimization) https://learn.microsoft.com/en-us/cpp/build/reference/gf-eliminate-duplicate-strings?view=msvc-170
Also, while I like What if the next C function is MUHAHAAH!!!!! >:-< Propagating a I haven't really decompiled or researched how I don't think |
Benchmarks show an improvement of 6% in the CPU burn loop. A runtime A/B C bool var, set from an env var, was used to select before/after behavior in the exact same perl541.dll file, with no recompile.
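A minimal sketch of that kind of runtime A/B switch, assuming the flag is read once from the environment and cached. The variable name PERL_RR is taken from the console log further down; the helper names and the two stand-in code paths are hypothetical and are not the instrumented code from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the two implementations being compared. */
static size_t type_name_old(const char *name)
{
    return strlen(name);                 /* old path: recompute the length */
}

static size_t type_name_new(const char *name, size_t cached_len)
{
    (void)name;
    return cached_len;                   /* new path: reuse a stored length */
}

/* Returns true when the environment asks for the old code path.
 * The getenv() lookup happens once; later calls reuse the cached answer. */
static bool use_old_mode(void)
{
    static int cached = -1;              /* -1 means "not read yet" */
    if (cached < 0) {
        const char *v = getenv("PERL_RR");
        cached = (v != NULL && v[0] != '\0' && v[0] != '0') ? 1 : 0;
    }
    return cached == 1;
}

/* Same binary, two paths: the branch is chosen at run time. */
static size_t type_name_len(const char *name, size_t cached_len)
{
    assert(name != NULL);
    return use_old_mode() ? type_name_old(name)
                          : type_name_new(name, cached_len);
}
```

Because both paths live in one binary, a before/after comparison differs only in the branch taken, not in compiler flags or link layout.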
FORCE OLD MODE
NEW MODE SV_SETHEK
Raw data/bench code/instrumented C code hidden b/c it's not that important
```
use Benchmark qw( :all :hireswallclock );
use v5.30;
my $gcnt = 0;
my $m;
#$m = (time() >> 12)+0;
$m = 42761;    # number of array elements to fill
#say $m;exit;
my @a;
$a[$m+1] = 1;  # pre-extend the array
{
    # fill @a with a repeating mix: plain scalar, ref to scalar, ref to code
    my ($i, $scal, $rcode, $mod) = (0, 1, \&Internals::V);
    my $rscal = \$scal;
    for(;$i<$m;$i++){
        $mod = $i % 3;
        $a[$i] = $mod == 0 ? $scal : $mod == 1 ? $rscal : $rcode;
    }
}
# count the elements for which ref() returns 'SCALAR'
sub ben {
    my ($m2, $cnt, $i, $aref) = ($m, 0, 0, \@a);
    #system 'pause';
    #for(;$i < $m2 ; $i++) { $cnt++ if ref($aref->[$i]) eq 'SCALAR';}
    foreach my $el (@{$aref}) { $cnt++ if ref($el) eq 'SCALAR';}
    $gcnt += $cnt;
}
# b1 and b2 run the same sub, so any difference between them is noise
cmpthese(undef,{b1 => \&ben, b2 => \&ben});
print $gcnt ."\n";
```
```
C:\sources\perl5\win32>cd .. && timeit perl.exe -Ilib win32\benchref.pl & cd win32
Rate b1 b2
b1 293/s -- -0%
b2 295/s 0% --
30874164
Exit code : 0
Elapsed time : 7.95
Kernel time : 0.05 (0.6%)
User time : 7.85 (98.7%)
page fault # : 2504
Working set : 9864 KB
Paged pool : 94 KB
Non-paged pool : 8 KB
Page file size : 5260 KB
C:\sources\perl5\win32>set PERL_RR=1 ##### FORCE OLD MODE
C:\sources\perl5\win32>cd .. && timeit perl.exe -Ilib win32\benchref.pl & cd win32
Rate b2 b1
b2 276/s -- -0%
b1 277/s 1% --
29676828
Exit code : 0
Elapsed time : 8.06
Kernel time : 0.05 (0.6%)
User time : 8.02 (99.5%)
page fault # : 2500
Working set : 9848 KB
Paged pool : 94 KB
Non-paged pool : 8 KB
Page file size : 5244 KB
C:\sources\perl5\win32>
```
temp A/B runtime branch selector new vs old
```
PP(pp_ref)
{
SV * const sv = *PL_stack_sp;
do_sv_ref:
U32 g_old = 0; void
#ifdef MULTIPLICITY {
```
|
-ref() PP keyword has extremely high usage. Grepping my blead repo shows: Searched "ref(" 4347 hits in 605 files of 5879 searched
-High level PP keyword ref(), aka C function Perl_pp_ref(), uses slow, inefficient, badly designed, backend public XS/C API functions called Perl_sv_ref()/Perl_sv_reftype().
-This commit fixes all design problems with Perl_sv_ref()/Perl_sv_reftype(), and will speed up the very high usage PP keyword ref(), along with a very similar but very new and very little used PP keyword called "use builtin qw( reftype );" which is near identical to Perl_pp_ref().
-a crude benchmark, with the array ref in $aref holding 43000 SV*s, split 1/3rd SV* IOK, 1/3rd RV* to SV* IOK, and 1/3rd RV* to CV*, showed a 6% speed increase for this code: sub benchme { foreach my $el (@{$aref}) { $cnt++ if ref($el) eq 'SCALAR';} }
-The all UPPERCASE strings keyword ref() returns are part of the Perl 5 BNF grammar. Changing their spelling or lowercasing them is not for debate, and i18n-ing them dynamically in real time against glibc.so's current "OS global locale" with inotify()/kqueue() in the runloop to monitor a text file in /etc or /var so this race condition works as designed in a unit test will never happen: $perl -E "dire('hello')" Routine indéfinie &cœur::dire aufgerufen bei -e Zeile 1
-sv_reftype() and sv_ref() have very badly designed prototypes, and the first time a new Perl in C dev reads their source code, they will think these 2 will cause infinite C stack recursion and a SEGV. Probably most automated C code analytic tools will complain these 2 functions do infinite recursion.
-The 2 functions don't return a string length, forcing all callers to execute a libc strlen() call on a string, that could be 8 bytes, or 80 MB.
-All null term-ed strings that they return, are already sitting in virtual address space. Either const HW RO, or RCed HEK*s from the PL_strtab pool, that were found inside something similar to a GV*/HV*/HE*/CV*/AV*/GP*/OP*/SV* in a OP* (no threads).
-COW 255 buffers from Newx() under 9 chars can't COW currently by policy. CODE is 4, SCALAR is 6. HASH is 4. ARRAY is 5. But very short SV HEK* COWs will COW propagate without problems. ref() is also used to retrieve "Local::My::Class" strings, which have an extremely high chance to wind up getting passed to hv_common() through some high level PP keyword like bless or @isa, and hv_common() extracts precalculated U32 hash values from SV* with HEK* buffers, speeding up hv_common(). So SV* POKs with COW 255 and COW SVs_STATIC buffers are bad choices compared to using SV* POK HEK* buffers for a new faster version of sv_reftype()/sv_ref().
-PP code "if(ref($self) eq 'HASH') {}" should never involve all 3-5 calls Newx()/Realloc()/strlen()/memcpy()/Safefree(), on each execution of the line.

To improve the src code dev-friendliness of the prototypes of, the speed inside of, and the speed of all libperl callers of Perl_sv_ref()/Perl_sv_reftype(), make HEK* variants of them. Initially the HEK* variants are private to libperl. Maybe 1-3 years into the future, they can be made official public C API for CPAN XS authors. These 2 new functions are undocumented/private API until further notice.

Using SV* holding RC-ed HEK* SvPVX() buffers removes all these libc C lang logical and/or Asm machine code steps from the execution of PP keyword ref(). The pre-allocated PAD TARG SV* just keeps getting a RC-- on the old HEK* inside SvPVX(), and a RC++ on the new HEK* written to SvPVX() of the PAD TARG SV*. Touching only 6 void*s/size_t addresses total, each one a single read/write CPU instruction pair: SvPVX, SvCUR, SvLEN, old_hek.shared_he.shared_he_he.he_valu.hent_refcount, new_hek.shared_he.shared_he_he.he_valu.hent_refcount, new_hek.shared_he.shared_he_hek.hek_len.

This brings PP KW ref() closer to C++ style RTTI that just compares const read-only vtable pointers. Some design and optimization problems with the old and new pp_ref()/pp_reftype()/sv_ref()/sv_reftype()/sv_refhek()/sv_reftypehek() calls are intentionally not being fixed in this commit to keep this commit small. Check the associated PR of the commit for details.
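The point about hv_common() reusing a precalculated hash can be illustrated with a small standalone model. This is a sketch under stated assumptions, not Perl's HEK or hv_common(): the Key struct, the FNV-1a hash, the bucket count and all function names are hypothetical and chosen only to show where the hashing work is saved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy interned key: bytes plus a length and a hash computed once at creation. */
typedef struct {
    uint32_t    hash;
    size_t      len;
    const char *pv;
} Key;

static unsigned hash_calls;   /* counts how often the bytes are actually hashed */

/* FNV-1a, used here only as a stand-in hash function. */
static uint32_t hash_bytes(const char *p, size_t len)
{
    uint32_t h = 2166136261u;
    hash_calls++;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash and measure the string once, when the key is created. */
static Key key_new(const char *pv)
{
    size_t len = strlen(pv);
    Key k = { hash_bytes(pv, len), len, pv };
    assert(pv != NULL);
    return k;
}

#define NBUCKETS 8

/* Lookup from a plain C string: strlen() and a full hash on every call. */
static unsigned bucket_for_cstr(const char *pv)
{
    return hash_bytes(pv, strlen(pv)) % NBUCKETS;
}

/* Lookup from an interned key: the stored hash is reused, no bytes are read. */
static unsigned bucket_for_key(const Key *k)
{
    return k->hash % NBUCKETS;
}
```

A caller that builds the key once with key_new("Local::My::Class") can call bucket_for_key() any number of times without hash_calls increasing, while every bucket_for_cstr() call rehashes the bytes.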
…ations
-faster method lookups, faster new SVPV creation (COWs), some of these locations were missed by the original branch/PRs/commits that added the SV_CONST() macro/api.
-I believe all "" C string literals that match a SV_CONST_UPPERCASE SV* HEK* cached constant have been replaced with their SV* POK HEK* COW buffer equivalents inside libperl with this commit, excluding some instances of "__ANON__" strings. Only PERL_CORE files qualify for the SV_CONST() optimization, because of design choices made previously about the SV_CONST() API. Changing the PERL_CORE-only design choice is out of scope of this patch.
-in pp_dbmopen() add SV_CONST(TIEHASH) macros for faster lookup/U32 hash pre-calc, and change newSVpvs_flags("AnyDBM_File", SVs_TEMP) to newSVpvs_share("AnyDBM_File"), because this sv is used multiple times in this pp_*() function, and it is a package name, and it is guaranteed to get passed into hv_common() somewhere eventually in some child function call we are making.
-some "__ANON__" locations were not changed from sv_*newSV*pvs("__ANON__"); to sv_*newSV*hek(SV_CONST(__ANON__)); because right after, there is a sv_catpvs(""); that will make the SVPV HEK* COW instantly de-COW, which saves no CPU or memory resources in the end, and only wastes them. Or it didn't look "safe" for a SV* COW buffer to be on that line.
-pp_tie() call_method() is a thin inefficient wrapper that makes a mortal SVPV around a C string, since the real backend API is call_sv(), so switch the call_method() in pp_tie() to the real backend function call_sv() and avoid making that mortal SVPV
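Why a shared "__ANON__" buffer gains nothing when it is immediately appended to can be shown with a toy copy-on-write model. This is a sketch only: SharedStr, Str and the str_* functions are hypothetical names and do not correspond to Perl's SV, HEK or sv_catpvs() implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a shared, refcounted string buffer (not Perl's HEK). */
typedef struct {
    unsigned refcnt;
    size_t   len;
    char     buf[32];
} SharedStr;

/* Toy string value: either borrows a SharedStr or owns private bytes. */
typedef struct {
    SharedStr *shared;   /* non-NULL while the value is still copy-on-write */
    char      *own;      /* private heap copy after de-COW */
    size_t     len;
} Str;

/* Sharing costs one refcount increment and no byte copy. */
static Str str_from_shared(SharedStr *s)
{
    Str v = { s, NULL, s->len };
    s->refcnt++;
    return v;
}

/* Appending must not modify the shared bytes, so it copies everything first. */
static void str_append(Str *v, const char *tail)
{
    size_t tlen = strlen(tail);
    const char *src = v->shared ? v->shared->buf : v->own;
    char *fresh = malloc(v->len + tlen + 1);
    assert(fresh != NULL);
    memcpy(fresh, src, v->len);
    memcpy(fresh + v->len, tail, tlen + 1);   /* includes the trailing NUL */
    if (v->shared) {                          /* de-COW: give the share back */
        v->shared->refcnt--;
        v->shared = NULL;
    }
    free(v->own);
    v->own = fresh;
    v->len += tlen;
}

static void str_free(Str *v)
{
    if (v->shared)
        v->shared->refcnt--;
    free(v->own);
    v->shared = NULL;
    v->own = NULL;
    v->len = 0;
}
```

In this model, str_from_shared() followed directly by str_append() performs the refcount increment, the allocation, the full copy and the refcount decrement, so the sharing step was pure overhead compared with allocating a private buffer from the start.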
36e7ab4
to
2ad59ed
Compare
repushed, less detailed commit title, shorter commit message body, pp_const() optimization removed, short API docs added as C comments, the 2nd commit expands usage of the new |
ref() PP keyword has extremely high usage. Grepping my blead repo shows:
Searched "ref(" 4347 hits in 605 files of 5879 searched
The strings that keyword ref() returns are part of the Perl 5 BNF grammar.
Changing their spelling, lowercasing them, i18n-ing them dynamically in
real time against glibc.so's current OS process global locale, or wiring
inotify/kqueue into the runloop to monitor /etc or /var so that this race
condition works as designed in a unit test, is not up for debate:
$perl -E "dire('hello')"
Routine indéfinie &cœur::dire aufgerufen bei -e Zeile 1
sv_reftype() and sv_ref() have very badly designed prototypes, and the
first time a new Perl in C dev reads their source code, they will think
these 2 will cause infinite C stack recursion and a SEGV. Probably most
automated C code analytic tools will complain these 2 functions do
infinite recursion too.
The 2 functions don't return a string length, forcing all callers to
execute a libc strlen() call on a string, that could be 8 bytes, or 80 MB.
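The cost of a pointer-only return value can be sketched in a few lines of standalone C. The enum, the table and both accessor functions below are hypothetical illustrations, not the prototypes of sv_ref()/sv_reftype() or of the new functions in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference kinds, for illustration only. */
typedef enum { T_SCALAR, T_ARRAY, T_HASH, T_CODE, T_COUNT } RefKind;

/* A name together with its length, computed at compile time. */
typedef struct {
    const char *pv;
    size_t      len;
} NameLen;

#define NAME(s) { s, sizeof(s) - 1 }

static const NameLen kind_names[T_COUNT] = {
    NAME("SCALAR"), NAME("ARRAY"), NAME("HASH"), NAME("CODE"),
};

/* Pointer-only interface: every caller that needs the length must strlen(). */
static const char *kind_name(RefKind k)
{
    assert(k < T_COUNT);
    return kind_names[k].pv;
}

/* Pointer-plus-length interface: the length travels with the pointer. */
static NameLen kind_name_len(RefKind k)
{
    assert(k < T_COUNT);
    return kind_names[k];
}
```

With the second interface a caller can pass .pv and .len straight to a length-taking routine; with the first it has to scan the bytes again to rediscover a length the callee already knew.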
The 2 functions don't split, parse, cat, or glue multiple strings to
create their output. All null term-ed strings that they return, are
already sitting in virtual address space. Either const HW RO, or
RCed HEK*s from the PL_strtab pool, that were found inside something
similar to a GV*/HV*/HE*/CV*/AV*/GP*/OP*/SV* in a OP* (no threads).
COW 255 buffers from Newx() under 9 chars can't COW currently by policy.
CODE is 4, SCALAR is 6. HASH is 4. ARRAY is 5. But very short SV HEK* COWs
will COW propagate without problems.
PP code
if(ref($self) eq 'HASH') {}
should never involve all 3-4 calls Newx()/Realloc()/strlen()/memcpy().
So fix all of this, and make pp_ref()/PP KW ref() closer in speed
to C/C++/Asm style object type checking, which is almost always going to
be 1 or 2 or 3 ptr equality tests against C constant &sum_vtbl_sum_class,
or in Microsoft ecosystem SW, it's an equality test of a 16 byte GUID in
memory, against a 16 byte SSE literal stored in a SSE opcode (TLDR ver).
Just convert backends sv_ref()/sv_reftype() to HEK* retvals, and convert
the front end pp_*() ops to fetch HEK*s and return SV*s with
POK_on SvPVX() == HEK*. In all likelihood, with PP code like
if (ref($self) eq 'HASH') {}
during the execution of memcmp(pv1, pv2, len) as part of pp_eq, pv1 and pv2 are the same mem addr.
But I didn't single step the eq operator to verify that yet.
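The idea that two strings backed by the same canonical buffer can be compared without reading their bytes can be sketched as follows. This is a standalone illustration with hypothetical names (StrView, view_eq and friends); it makes no claim about whether Perl's eq operator actually contains such a fast path, which the text above says is unverified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One canonical copy of each type name; producers hand out these addresses. */
static const char NAME_HASH[]  = "HASH";
static const char NAME_ARRAY[] = "ARRAY";

typedef struct {
    const char *pv;
    size_t      len;
} StrView;

static StrView view_of_hash(void)
{
    StrView v = { NAME_HASH, sizeof NAME_HASH - 1 };
    return v;
}

static StrView view_of_array(void)
{
    StrView v = { NAME_ARRAY, sizeof NAME_ARRAY - 1 };
    return v;
}

/* Equality with a pointer fast path: same address and same length means
 * same bytes, so the byte-by-byte comparison can be skipped.
 * *byte_compares counts how often memcmp() was really needed. */
static bool view_eq(StrView a, StrView b, unsigned *byte_compares)
{
    assert(byte_compares != NULL);
    if (a.len != b.len)
        return false;
    if (a.pv == b.pv)
        return true;
    (*byte_compares)++;
    return memcmp(a.pv, b.pv, a.len) == 0;
}
```

Comparing view_of_hash() with itself never reaches memcmp(), while comparing it with a private copy of "HASH" does, which is the difference between a vtable-style pointer test and a string compare.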
inside PP(pp_reftype) previously the branch sv_setsv(TARG, &PL_sv_undef);
did not fire SMG, after this commit it does, IDK why it wasn't firing
before, or the consequences of SMG firing now on the sv_set_undef(rsv); path.
I suspect "sv_setsv(TARG, &PL_sv_undef);" and "sv_set_undef(rsv);" are
not perfect behavior copies of each other, in extreme/bizarre/user error
and bad CPAN XS code situations, but I haven't found any side effects of
the switch from sv_setsv(TARG, &PL_sv_undef); to sv_set_undef(rsv).
Untested hypothetical cases like
sv_setsv(gv_star, &PL_sv_undef); sv_setsv(hv_star, &PL_sv_undef);
sv_setsv(svt_regexp_star, &PL_sv_undef);
sv_setsv(svt_invlist_star, &PL_sv_undef);
sv_setsv(svt_object_star, &PL_sv_undef);
sv_setsv(svt_io_star, &PL_sv_undef);
sv_sethek() has a severe pathological performance problem, if args
SV* dsv
and HEK* src_hek
test true for
But it's still better than a strlen()/Newx()/memcpy()/push_save_stack()/
delayed_Safefree(); cycle. Any fix for this would be for the future.
These 2 functions are experimental for now, hence undocumented and not
public API. If they are made public, arg
const int ob
should be removed because of its confusing faux-infinite recursion (not real life
infinite recursion). The functions are exported so P5P hackers and
CPAN XS devs (unsanctioned by P5P) can benchmark and research these 2 new
functions using Inline::C/EU::PXS.
future improvements not done here, make sv_reftype() and sv_ref() wrappers
around their HEK* counterparts. Note the HEK* must be RC++ed and stuffed
in a new SV*, or a PAD TARG SV*, before the rpp_replace_1_1_NN(TARG); call
because in artificial situations/fuzzing, strange things can happen during
a SvREFCNT_dec_NN(); call, and the HEK* sitting in a C auto might
get freed during the SvREFCNT_dec_NN();
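The ordering hazard described above, taking your own reference before releasing the object that currently keeps the string alive, can be modelled in standalone C. Name, Obj and all functions here are hypothetical stand-ins, not Perl's HEK, SV or SvREFCNT_dec_NN().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string. */
typedef struct {
    unsigned refcnt;
    char     name[16];
} Name;

/* Toy object that owns exactly one reference to a Name. */
typedef struct {
    unsigned refcnt;
    Name    *name;
} Obj;

static unsigned names_freed;   /* counts Name buffers that were released */

static void name_dec(Name *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        names_freed++;
        free(n);
    }
}

static void obj_dec(Obj *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        name_dec(o->name);     /* the object gives up its Name reference */
        free(o);
    }
}

static Obj *obj_new(const char *s)
{
    Obj  *o = malloc(sizeof *o);
    Name *n = malloc(sizeof *n);
    assert(o != NULL && n != NULL);
    n->refcnt = 1;
    strncpy(n->name, s, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    o->refcnt = 1;
    o->name = n;
    return o;
}

/* Drops the caller's reference to o and returns its name, with a reference
 * now held by the caller. The increment must come before obj_dec(). */
static Name *take_name_then_release(Obj *o)
{
    Name *n = o->name;
    n->refcnt++;      /* pin first ... */
    obj_dec(o);       /* ... because this may free o and o's Name reference */
    return n;         /* still valid: the caller holds its own reference */
}
```

If the two statements in take_name_then_release() were swapped, the pointer held in the local variable would refer to freed memory whenever the object held the last reference.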
another improvement, sv_sethek(rsv, hek); is somewhat heavy, and doesn't
have a shortcut, to RC-- an existing SVPV HEK* COW itself, instead it
uses SV_THINKFIRST_***() and sv_force_normal***() to RC-- an existing
SVPV HEK* COW. If the SV* PAD TARG, is being used over and over by ref()
opcode, its always going to have a stale HEK* SVPVX() that needs to be
RC--ed.
another improvement, check
if(sv_reftypehek() == SvPVX(targ))
before calling sv_sethek(rsv, hek);
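A toy version of that "already holding it" shortcut, assuming a target that stores one refcounted shared string at a time. Shared, Target, target_set() and the rc_writes counter are hypothetical and only model the refcount traffic, not sv_sethek() itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy shared string with a reference count. */
typedef struct {
    unsigned    refcnt;
    const char *pv;
} Shared;

/* Toy reusable result slot, standing in for a repeatedly reused target. */
typedef struct {
    Shared *cur;
} Target;

static unsigned rc_writes;   /* counts refcount updates actually performed */

/* Store s in t. If t already holds s, nothing needs to change. */
static void target_set(Target *t, Shared *s)
{
    assert(t != NULL && s != NULL);
    if (t->cur == s)
        return;                 /* fast path: same buffer, skip RC++ and RC-- */
    s->refcnt++;                /* take the new reference first */
    rc_writes++;
    if (t->cur != NULL) {
        t->cur->refcnt--;       /* then give up the stale one */
        rc_writes++;
    }
    t->cur = s;
}
```

In a loop that keeps producing the same type name, only the first target_set() call touches the reference counts; every later call is a single pointer comparison.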
another improvement, beyond scope for me, make into 1 OP*/opcode:
and
another improvement, don't deref my_perl->Iop/PL_ptr many times in a row.
I didn't do any CPU opcode/instruction stripping in this commit. That's
for a future commit.
another improvement, investigate if most of the large switch() inside
Perl_sv_reftypehek() can be turned into a
const I8 arr_of_PL_sv_consts_idxs[];
with a couple tiny special cases.
todo invert
if (!rsv) {
branch, so hot path (yes cached in PL_sv_consts) comes first in machine code/asm order.
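The switch()-to-table idea and the "hot path first" ordering can be sketched together in standalone C. The enums, the table contents and name_for_kind() are hypothetical; they are not Perl's svtype values or the layout of PL_sv_consts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value kinds and constant-table slots, for illustration only. */
typedef enum { K_SCALAR, K_ARRAY, K_HASH, K_CODE, K_OBJECT, K_COUNT } Kind;
typedef enum { C_SCALAR, C_ARRAY, C_HASH, C_CODE, C_COUNT } ConstIdx;

/* The cached constants, indexed by ConstIdx. */
static const char *const const_names[C_COUNT] = {
    "SCALAR", "ARRAY", "HASH", "CODE",
};

/* One small signed index per kind; -1 marks kinds that need special handling. */
static const int8_t kind_to_const[K_COUNT] = {
    [K_SCALAR] = C_SCALAR,
    [K_ARRAY]  = C_ARRAY,
    [K_HASH]   = C_HASH,
    [K_CODE]   = C_CODE,
    [K_OBJECT] = -1,
};

/* Table lookup instead of a large switch(). The common case is tested first
 * so it is the fall-through path; the rare special case comes last. */
static const char *name_for_kind(Kind k, const char *class_name)
{
    int8_t idx;
    assert((int)k >= 0 && (int)k < K_COUNT);
    idx = kind_to_const[k];
    if (idx >= 0)
        return const_names[idx];   /* hot path: one table load */
    return class_name;             /* special case, e.g. an object's class */
}
```

Whether a compiler already turns the real switch() into an equivalent jump or lookup table is not something this sketch can answer; that would need a look at the generated machine code.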