-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Optimize performances of str_pad()
#19272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ext/standard/string.c
Outdated
@@ -5737,6 +5737,27 @@ PHP_FUNCTION(substr_count) | |||
} | |||
/* }}} */ | |||
|
|||
static inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) { | |
static zend_always_inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) { |
Using zend_always_inline
increases the likelihood that the function will actually be inlined.
Also, instead of passing char *dest
and size_t pad_chars
as separate arguments, how about accepting a zend_string
and extracting the values within this inline function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! PR updated with both suggestions
fbfa654
to
24d2823
Compare
24d2823
to
ba1eb4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this makes the API of php_str_pad_fill()
a little safer to use, because it makes less assumptions about the target pointer.
With regard to the inlining: I generally trust the compiler to make better than decisions than I can do myself. I would make the function just static void
without any inlining hints and let the compiler decide.
for (i = 0; i < left_pad; i++) | ||
ZSTR_VAL(result)[ZSTR_LEN(result)++] = pad_str[i % pad_str_len]; | ||
if (left_pad > 0) { | ||
php_str_pad_fill(result, left_pad, pad_str, pad_str_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
php_str_pad_fill(result, left_pad, pad_str, pad_str_len); | |
php_str_pad_fill(ZSTR_VAL(result) + ZSTR_LEN(result), left_pad, pad_str, pad_str_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SakiTakamachi advised otherwise if I get it right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears so, but the suggestion doesn't really make sense, since pad_chars
cannot be piggy-backed on the zend_string. Perhaps Saki meant to pass pad_str
and pad_str_len
as a zend_string*
(which would make sense to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes the code a bit slower, going from ~48ms per run to ~58ms. Do we still want this?
Please also see #19220 (comment) and #19220 (comment) for benchmarking advice. |
Thanks for the link, I'm having a look to hyperfine for this PR and #19276 |
ba1eb4c
to
78dd005
Compare
PR description updated to use hyperfine |
The |
I think that my system-wide install right now is PHP 8.5 nightly. Just to be sure, I rerun against master and the branch:
|
AWS x86_64 (c7i.24xl)
Laravel 11.1.2 demo app - 30 consecutive runs, 100 requests (sec)
Symfony 2.6.0 demo app - 30 consecutive runs, 100 requests (sec)
Wordpress 6.2 main page - 30 consecutive runs, 20 requests (sec)
bench.php - 25 consecutive runs (sec)
micro_bench.php - 25 consecutive runs (sec)
|
The above comment was due to test the real time benchmark :) Unfortunately, I guess most results should show zero difference, which is not currently the case :( The wordpress results are especially weird. I'll have to dig into it... cc. @iluuu1994 @arnaud-lb |
If I can be of any help, don't hesitate to tell me! It is indeed surprising |
AWS x86_64 (c7i.24xl)
Laravel 11.1.2 demo app - 30 consecutive runs, 100 requests (sec)
Symfony 2.6.0 demo app - 30 consecutive runs, 100 requests (sec)
Wordpress 6.2 main page - 30 consecutive runs, 20 requests (sec)
bench.php - 25 consecutive runs (sec)
micro_bench.php - 25 consecutive runs (sec)
|
I would like to propose this optimization for
str_pad()
. Here is the benchmark code:And the results with right padding:
Left padding results:
The idea is to avoid modulo operation in loops and copying char by char. Instead, this PR prefers the bulk copy approach.