Optimize performances of str_pad() #19272

Open
wants to merge 1 commit into
base: master

Contributor

@alexandre-daubois alexandre-daubois commented Jul 28, 2025

I would like to propose this optimization for str_pad(). Here is the benchmark code:

<?php

$iterations = 1000000;

for ($i = 0; $i < $iterations; $i++) {
    str_pad('hello', 2000, ' ', STR_PAD_RIGHT);
}

And the results with right padding:

Benchmark 1: ./sapi/cli/php.branch validation_benchmark.php
  Time (mean ± σ):      44.6 ms ±   0.8 ms    [User: 41.3 ms, System: 2.3 ms]
  Range (min … max):    43.2 ms …  47.4 ms    64 runs
 
Benchmark 2: ./sapi/cli/php.master validation_benchmark.php
  Time (mean ± σ):      2.977 s ±  0.076 s    [User: 2.962 s, System: 0.009 s]
  Range (min … max):    2.863 s …  3.061 s    10 runs
 
Summary
  ./sapi/cli/php.branch validation_benchmark.php ran
   66.75 ± 2.06 times faster than ./sapi/cli/php.master validation_benchmark.php

Left padding results:

alex@alex-macos php-src % hyperfine './sapi/cli/php.branch validation_benchmark.php' './sapi/cli/php.master validation_benchmark.php' --warmup 10
Benchmark 1: ./sapi/cli/php.branch validation_benchmark.php
  Time (mean ± σ):      43.8 ms ±   1.5 ms    [User: 40.5 ms, System: 2.3 ms]
  Range (min … max):    42.2 ms …  49.8 ms    58 runs
 
Benchmark 2: ./sapi/cli/php.master validation_benchmark.php
  Time (mean ± σ):      1.150 s ±  0.020 s    [User: 1.130 s, System: 0.009 s]
  Range (min … max):    1.125 s …  1.174 s    10 runs
 
Summary
  ./sapi/cli/php.branch validation_benchmark.php ran
   26.23 ± 1.01 times faster than ./sapi/cli/php.master validation_benchmark.php

The idea is to avoid a modulo operation on every loop iteration and to stop copying char by char. Instead, this PR copies the pad string in bulk.
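
To make the difference concrete, here is a minimal sketch of the two strategies in plain C. It is an illustration under assumptions, not the code from this PR: the names `pad_fill_modulo` and `pad_fill_bulk` are hypothetical, both expect `pad_str_len > 0`, and neither writes a terminating NUL byte.

```c
#include <stddef.h>
#include <string.h>

/* Baseline (hypothetical name): one byte per iteration, with a modulo
 * on every step. Requires pad_str_len > 0. */
static void pad_fill_modulo(char *dest, size_t pad_chars,
                            const char *pad_str, size_t pad_str_len)
{
    for (size_t i = 0; i < pad_chars; i++) {
        dest[i] = pad_str[i % pad_str_len];
    }
}

/* Bulk variant (hypothetical name): memset for a single-byte pad,
 * otherwise whole copies of pad_str followed by one partial copy for
 * the remainder. Requires pad_str_len > 0. */
static void pad_fill_bulk(char *dest, size_t pad_chars,
                          const char *pad_str, size_t pad_str_len)
{
    if (pad_str_len == 1) {
        memset(dest, (unsigned char)pad_str[0], pad_chars);
        return;
    }

    size_t full_copies = pad_chars / pad_str_len;
    size_t remainder = pad_chars % pad_str_len;

    for (size_t i = 0; i < full_copies; i++) {
        memcpy(dest, pad_str, pad_str_len);
        dest += pad_str_len;
    }
    if (remainder > 0) {
        memcpy(dest, pad_str, remainder);
    }
}
```

Both helpers write exactly `pad_chars` bytes and produce the same output; the bulk variant trades the per-byte modulo for a handful of `memcpy`/`memset` calls.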

@@ -5737,6 +5737,27 @@ PHP_FUNCTION(substr_count)
}
/* }}} */

static inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) {
Member

Suggested change
-static inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) {
+static zend_always_inline void php_str_pad_fill(char *dest, size_t pad_chars, const char *pad_str, size_t pad_str_len) {

Using zend_always_inline increases the likelihood that the function will actually be inlined.

Also, instead of passing char *dest and size_t pad_chars as separate arguments, how about accepting a zend_string and extracting the values within this inline function?
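
For readers unfamiliar with such hints, an always-inline macro is commonly built on compiler-specific attributes. The sketch below is an assumption for illustration only: `demo_always_inline` and `demo_min` are hypothetical names, and this is not presented as the definition php-src uses for `zend_always_inline`.

```c
#include <stddef.h>

/* Hypothetical macro: pick the strongest inlining hint the compiler offers,
 * falling back to plain `inline` elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
# define demo_always_inline inline __attribute__((always_inline))
#elif defined(_MSC_VER)
# define demo_always_inline __forceinline
#else
# define demo_always_inline inline
#endif

/* Small helper marked with the hint; the function body is unaffected. */
static demo_always_inline size_t demo_min(size_t a, size_t b)
{
    return a < b ? a : b;
}
```

The hint changes only how the compiler treats call sites, not what the function computes.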

Contributor Author

Good idea! PR updated with both suggestions

Member

@TimWolla TimWolla left a comment

I feel that this makes the API of php_str_pad_fill() a little safer to use, because it makes fewer assumptions about the target pointer.

With regard to the inlining: I generally trust the compiler to make better decisions than I can myself. I would make the function just static void without any inlining hints and let the compiler decide.

-	for (i = 0; i < left_pad; i++)
-		ZSTR_VAL(result)[ZSTR_LEN(result)++] = pad_str[i % pad_str_len];
+	if (left_pad > 0) {
+		php_str_pad_fill(result, left_pad, pad_str, pad_str_len);
Member

Suggested change
-php_str_pad_fill(result, left_pad, pad_str, pad_str_len);
+php_str_pad_fill(ZSTR_VAL(result) + ZSTR_LEN(result), left_pad, pad_str, pad_str_len);

Contributor Author

@SakiTakamachi advised otherwise, if I understand correctly?

Member

It appears so, but the suggestion doesn't really make sense, since pad_chars cannot be piggy-backed on the zend_string. Perhaps Saki meant to pass pad_str and pad_str_len as a zend_string* (which would make sense to me).
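
The two call shapes under discussion can be sketched as follows. This is only an illustration under assumptions: `demo_string` is a hypothetical stand-in, not the real `zend_string`, and `fill_ptr` / `fill_str` are made-up names. Both expect `pad_str_len > 0` and enough capacity in the destination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string; NOT the real zend_string. */
typedef struct {
    size_t len;
    char val[64];
} demo_string;

/* Shape A: raw destination pointer. The caller computes where to write
 * and updates the length afterwards. Requires pad_str_len > 0. */
static void fill_ptr(char *dest, size_t pad_chars,
                     const char *pad_str, size_t pad_str_len)
{
    while (pad_chars >= pad_str_len) {
        memcpy(dest, pad_str, pad_str_len);
        dest += pad_str_len;
        pad_chars -= pad_str_len;
    }
    if (pad_chars > 0) {
        memcpy(dest, pad_str, pad_chars);
    }
}

/* Shape B: the result string is passed in. The helper appends at the
 * current end and bumps the length itself; pad_chars is still a
 * separate argument because the string does not carry it. */
static void fill_str(demo_string *result, size_t pad_chars,
                     const char *pad_str, size_t pad_str_len)
{
    fill_ptr(result->val + result->len, pad_chars, pad_str, pad_str_len);
    result->len += pad_chars;
}
```

Shape A assumes nothing about where the bytes live, while shape B hides the length bookkeeping from the caller.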

Contributor Author

It makes the code a bit slower, going from ~48ms per run to ~58ms. Do we still want this?

@TimWolla
Member

Please also see #19220 (comment) and #19220 (comment) for benchmarking advice.

@alexandre-daubois
Contributor Author

Thanks for the link, I'm having a look at hyperfine for this PR and #19276

@alexandre-daubois
Contributor Author

PR description updated to use hyperfine

@TimWolla
Member

> PR description updated to use hyperfine

The php binary appears to be the system binary; it is not unlikely that it was compiled with different flags or a different compiler (version). You should make sure to test two binaries compiled with identical flags and the same compiler.

@alexandre-daubois
Contributor Author

I think that my system-wide install right now is PHP 8.5 nightly. Just to be sure, I reran the benchmark against master and the branch:

Benchmark 1: ./sapi/cli/php.branch validation_benchmark.php
  Time (mean ± σ):      44.6 ms ±   0.8 ms    [User: 41.3 ms, System: 2.3 ms]
  Range (min … max):    43.2 ms …  47.4 ms    64 runs
 
Benchmark 2: ./sapi/cli/php.master validation_benchmark.php
  Time (mean ± σ):      2.977 s ±  0.076 s    [User: 2.962 s, System: 0.009 s]
  Range (min … max):    2.863 s …  3.061 s    10 runs
 
Summary
  ./sapi/cli/php.branch validation_benchmark.php ran
   66.75 ± 2.06 times faster than ./sapi/cli/php.master validation_benchmark.php

github-actions bot commented Aug 1, 2025

AWS x86_64 (c7i.24xl)

| Attribute | Value |
|---|---|
| Environment | aws |
| Runner | host |
| Instance type | c7i.metal-24xl (dedicated) |
| Architecture | x86_64 |
| CPU | 48 cores |
| CPU settings | disabled deeper C-states, disabled turbo boost, disabled hyper-threading |
| RAM | 188 GB |
| Kernel | 6.1.144-170.251.amzn2023.x86_64 |
| OS | Amazon Linux 2023.8.20250721 |
| GCC | 11.5.0 |
| Time | 2025-08-01 08:20:50 UTC |

Laravel 11.1.2 demo app - 30 consecutive runs, 100 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Memory |
|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.46107 | 0.46896 | 0.00123 | 0.46674 | 0.00% | 0.46677 | 0.00% | 43.54 MB |
| PHP - str-pad-opt | 0.46154 | 0.46938 | 0.00127 | 0.46765 | 0.20% | 0.46767 | 0.19% | 43.46 MB |

Symfony 2.6.0 demo app - 30 consecutive runs, 100 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Memory |
|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.72773 | 0.73390 | 0.00138 | 0.72954 | 0.00% | 0.72921 | 0.00% | 39.69 MB |
| PHP - str-pad-opt | 0.73090 | 0.73512 | 0.00087 | 0.73262 | 0.42% | 0.73241 | 0.44% | 39.77 MB |

Wordpress 6.2 main page - 30 consecutive runs, 20 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Memory |
|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.57803 | 0.58039 | 0.00059 | 0.57886 | 0.00% | 0.57869 | 0.00% | 43.54 MB |
| PHP - str-pad-opt | 0.58101 | 0.65067 | 0.02497 | 0.63512 | 9.72% | 0.64595 | 11.62% | 43.54 MB |

bench.php - 25 consecutive runs (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Memory |
|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.21721 | 0.22099 | 0.00093 | 0.21825 | 0.00% | 0.21808 | 0.00% | 26.53 MB |
| PHP - str-pad-opt | 0.21425 | 0.21712 | 0.00077 | 0.21580 | -1.12% | 0.21585 | -1.02% | 26.53 MB |

micro_bench.php - 25 consecutive runs (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Memory |
|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 1.26166 | 1.28426 | 0.00498 | 1.26842 | 0.00% | 1.26635 | 0.00% | 20.82 MB |
| PHP - str-pad-opt | 1.31617 | 1.33954 | 0.00573 | 1.32695 | 4.61% | 1.32688 | 4.78% | 20.82 MB |

@kocsismate
Member

The above comment was posted to test the real-time benchmark :) Unfortunately, most results should show zero difference, I guess, which is currently not the case :( The WordPress results are especially weird. I'll have to dig into it...

cc. @iluuu1994 @arnaud-lb

@alexandre-daubois
Contributor Author

If I can be of any help, don't hesitate to tell me! It is indeed surprising

github-actions bot commented Aug 1, 2025

AWS x86_64 (c7i.24xl)

| Attribute | Value |
|---|---|
| Environment | aws |
| Runner | host |
| Instance type | c7i.metal-24xl (dedicated) |
| Architecture | x86_64 |
| CPU | 48 cores |
| CPU settings | disabled deeper C-states, disabled turbo boost, disabled hyper-threading |
| RAM | 188 GB |
| Kernel | 6.1.144-170.251.amzn2023.x86_64 |
| OS | Amazon Linux 2023.8.20250721 |
| GCC | 11.5.0 |
| Time | 2025-08-01 09:41:35 UTC |

Laravel 11.1.2 demo app - 30 consecutive runs, 100 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Instr count | Memory |
|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.45780 | 0.46629 | 0.00139 | 0.46472 | 0.00% | 0.46488 | 0.00% | 177247194 | 43.48 MB |
| PHP - str-pad-opt | 0.46514 | 0.46736 | 0.00050 | 0.46594 | 0.26% | 0.46581 | 0.20% | 177251261 | 43.48 MB |

Symfony 2.6.0 demo app - 30 consecutive runs, 100 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Instr count | Memory |
|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.72645 | 0.72932 | 0.00071 | 0.72785 | 0.00% | 0.72779 | 0.00% | 288207116 | 39.78 MB |
| PHP - str-pad-opt | 0.72983 | 0.73390 | 0.00103 | 0.73178 | 0.54% | 0.73152 | 0.51% | 288197741 | 39.78 MB |

Wordpress 6.2 main page - 30 consecutive runs, 20 requests (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Instr count | Memory |
|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.57872 | 0.58180 | 0.00064 | 0.57969 | 0.00% | 0.57960 | 0.00% | 1129301005 | 43.56 MB |
| PHP - str-pad-opt | 0.57772 | 0.58038 | 0.00056 | 0.57900 | -0.12% | 0.57888 | -0.12% | 1129302558 | 43.56 MB |

bench.php - 25 consecutive runs (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Instr count | Memory |
|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 0.21719 | 0.22107 | 0.00105 | 0.21863 | 0.00% | 0.21866 | 0.00% | 1791734911 | 26.55 MB |
| PHP - str-pad-opt | 0.21282 | 0.21641 | 0.00090 | 0.21439 | -1.94% | 0.21441 | -1.94% | 1794186569 | 26.55 MB |

micro_bench.php - 25 consecutive runs (sec)

| PHP | Min | Max | Std dev | Average | Average diff % | Median | Median diff % | Instr count | Memory |
|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@eaf2 | 1.25910 | 1.27670 | 0.00515 | 1.26772 | 0.00% | 1.26679 | 0.00% | 1269754366 | 20.84 MB |
| PHP - str-pad-opt | 1.31593 | 1.33956 | 0.00516 | 1.32448 | 4.48% | 1.32418 | 4.53% | 1271643119 | 20.83 MB |
