
LocalizeChildren pass #7708


Draft
wants to merge 5 commits into
base: main


kripken
Member

@kripken kripken commented Jul 10, 2025

This pass finds Binary instructions whose children have effects,
and moves those children to locals. This is meant to help with the situation
in #7557 (see details there, but basically, the effects of children can
prevent OptimizeInstructions from optimizing, and moving those
effects outside can help).
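
To make the transformation concrete, here is a toy sketch in Python. It is not Binaryen's code: the classes `Const`, `LocalGet`, `LocalSet`, `Call`, and `Binary` and the helper `localize_children` are hypothetical names made up for illustration. It assumes calls and local sets are the only things with effects, and it moves every non-constant child up to the last effectful one so that evaluation order stays the same.

```python
from dataclasses import dataclass


@dataclass
class Const:
    value: int


@dataclass
class LocalGet:
    index: int


@dataclass
class LocalSet:
    index: int
    value: object


@dataclass
class Call:
    target: str  # assumption: every call is treated as having effects


@dataclass
class Binary:
    op: str
    left: object
    right: object


def has_effects(e) -> bool:
    """Toy effect analysis: calls and local sets have effects."""
    if isinstance(e, (Call, LocalSet)):
        return True
    if isinstance(e, Binary):
        return has_effects(e.left) or has_effects(e.right)
    return False


def localize_children(b: Binary, next_local: int):
    """Return (prefix statements, rewritten Binary, next free local index)."""
    children = [b.left, b.right]
    effectful = [i for i, c in enumerate(children) if has_effects(c)]
    if not effectful:
        return [], b, next_local
    last = effectful[-1]
    prefix = []
    for i in range(last + 1):
        if isinstance(children[i], Const):
            continue  # a constant cannot observe effects, so it stays in place
        prefix.append(LocalSet(next_local, children[i]))
        children[i] = LocalGet(next_local)
        next_local += 1
    return prefix, Binary(b.op, children[0], children[1]), next_local


if __name__ == "__main__":
    expr = Binary("add", Call("foo"), Const(1))
    prefix, rewritten, _ = localize_children(expr, next_local=0)
    print(prefix)     # the effect, now a separate statement
    print(rewritten)  # the Binary, now reading a local and a constant
```

After the rewrite, the Binary's children are a local read and a constant, which is the shape a later optimization can reason about without worrying about the call.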

This seems generally helpful, though usually it reduces code size
by less than 1%, and it does make compilation 5% slower (not
because of the pass itself, which is very fast, but likely due to the
new locals it adds, which make later passes slower). In more
detail, I tested and saw a small improvement on Kotlin, Dart, and
Rust testcases, and on Emscripten's code size tests I see this:

Emscripten code size diff

diff --git a/test/code_size/audio_worklet_wasm.json b/test/code_size/audio_worklet_wasm.json
index 5aad516bd2..feeaeefafe 100644
--- a/test/code_size/audio_worklet_wasm.json
+++ b/test/code_size/audio_worklet_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 519,
   "a.html.gz": 364,
   "a.js": 3853,
   "a.js.gz": 2050,
   "a.wasm": 1294,
-  "a.wasm.gz": 864,
+  "a.wasm.gz": 866,
   "total": 5666,
-  "total_gz": 3278
+  "total_gz": 3280
 }
diff --git a/test/code_size/embind_hello_wasm.json b/test/code_size/embind_hello_wasm.json
index a9884cc49d..c2c96c190a 100644
--- a/test/code_size/embind_hello_wasm.json
+++ b/test/code_size/embind_hello_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 380,
   "a.js": 7266,
   "a.js.gz": 3321,
   "a.wasm": 7300,
-  "a.wasm.gz": 3348,
+  "a.wasm.gz": 3349,
   "total": 15118,
-  "total_gz": 7049
+  "total_gz": 7050
 }
diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 542c19cf1f..34c9da070e 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 380,
   "a.js": 5367,
   "a.js.gz": 2540,
   "a.wasm": 9101,
-  "a.wasm.gz": 4699,
+  "a.wasm.gz": 4705,
   "total": 15020,
-  "total_gz": 7619
+  "total_gz": 7625
 }
diff --git a/test/code_size/hello_wasm_worker_wasm.json b/test/code_size/hello_wasm_worker_wasm.json
index 4b31168c56..37fd223c48 100644
--- a/test/code_size/hello_wasm_worker_wasm.json
+++ b/test/code_size/hello_wasm_worker_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 519,
   "a.html.gz": 364,
   "a.js": 830,
   "a.js.gz": 530,
   "a.wasm": 1891,
-  "a.wasm.gz": 1082,
+  "a.wasm.gz": 1083,
   "total": 3240,
-  "total_gz": 1976
+  "total_gz": 1977
 }
diff --git a/test/code_size/hello_webgl2_wasm.json b/test/code_size/hello_webgl2_wasm.json
index c9afcedc35..b496ff409e 100644
--- a/test/code_size/hello_webgl2_wasm.json
+++ b/test/code_size/hello_webgl2_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 454,
   "a.html.gz": 328,
   "a.js": 4386,
   "a.js.gz": 2252,
-  "a.wasm": 8286,
-  "a.wasm.gz": 5617,
-  "total": 13126,
-  "total_gz": 8197
+  "a.wasm": 8292,
+  "a.wasm.gz": 5618,
+  "total": 13132,
+  "total_gz": 8198
 }
diff --git a/test/code_size/hello_webgl2_wasm2js.json b/test/code_size/hello_webgl2_wasm2js.json
index 89e28d08c8..3929ae6843 100644
--- a/test/code_size/hello_webgl2_wasm2js.json
+++ b/test/code_size/hello_webgl2_wasm2js.json
@@ -1,8 +1,8 @@
 {
   "a.html": 346,
   "a.html.gz": 262,
   "a.js": 18078,
-  "a.js.gz": 9781,
+  "a.js.gz": 9784,
   "total": 18424,
-  "total_gz": 10043
+  "total_gz": 10046
 }
diff --git a/test/code_size/hello_webgl_wasm.json b/test/code_size/hello_webgl_wasm.json
index 2e1ba8e7f8..86cce56c94 100644
--- a/test/code_size/hello_webgl_wasm.json
+++ b/test/code_size/hello_webgl_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 454,
   "a.html.gz": 328,
   "a.js": 3924,
   "a.js.gz": 2092,
-  "a.wasm": 8286,
-  "a.wasm.gz": 5617,
-  "total": 12664,
-  "total_gz": 8037
+  "a.wasm": 8292,
+  "a.wasm.gz": 5618,
+  "total": 12670,
+  "total_gz": 8038
 }
diff --git a/test/code_size/hello_webgl_wasm2js.json b/test/code_size/hello_webgl_wasm2js.json
index 3a2fcd28a1..c9d40e9826 100644
--- a/test/code_size/hello_webgl_wasm2js.json
+++ b/test/code_size/hello_webgl_wasm2js.json
@@ -1,8 +1,8 @@
 {
   "a.html": 346,
   "a.html.gz": 262,
   "a.js": 17605,
-  "a.js.gz": 9614,
+  "a.js.gz": 9622,
   "total": 17951,
-  "total_gz": 9876
+  "total_gz": 9884
 }
diff --git a/test/code_size/math_wasm.json b/test/code_size/math_wasm.json
index 9d06a35db4..568a479d2e 100644
--- a/test/code_size/math_wasm.json
+++ b/test/code_size/math_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 380,
   "a.js": 110,
   "a.js.gz": 125,
-  "a.wasm": 2687,
-  "a.wasm.gz": 1658,
-  "total": 3349,
-  "total_gz": 2163
+  "a.wasm": 2693,
+  "a.wasm.gz": 1662,
+  "total": 3355,
+  "total_gz": 2167
 }
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index fa6667bef3..66b36767c9 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
 {
   "a.html": 12515,
-  "a.html.gz": 6857,
+  "a.html.gz": 6858,
   "total": 12515,
-  "total_gz": 6857
+  "total_gz": 6858
 }
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 87d6dfdb9a..29024d0de1 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 17224,
-  "a.html.gz": 7558,
-  "total": 17224,
-  "total_gz": 7558
+  "a.html": 17228,
+  "a.html.gz": 7560,
+  "total": 17228,
+  "total_gz": 7560
 }
diff --git a/test/other/codesize/test_codesize_cxx_ctors1.size b/test/other/codesize/test_codesize_cxx_ctors1.size
index 4cd9784974..9f7ea6c403 100644
--- a/test/other/codesize/test_codesize_cxx_ctors1.size
+++ b/test/other/codesize/test_codesize_cxx_ctors1.size
@@ -1 +1 @@
-129523
+129499
diff --git a/test/other/codesize/test_codesize_cxx_ctors2.size b/test/other/codesize/test_codesize_cxx_ctors2.size
index 825f9c99dd..82fb741f2b 100644
--- a/test/other/codesize/test_codesize_cxx_ctors2.size
+++ b/test/other/codesize/test_codesize_cxx_ctors2.size
@@ -1 +1 @@
-128951
+128927
diff --git a/test/other/codesize/test_codesize_cxx_except.size b/test/other/codesize/test_codesize_cxx_except.size
index 9c576d550e..c0b75c2973 100644
--- a/test/other/codesize/test_codesize_cxx_except.size
+++ b/test/other/codesize/test_codesize_cxx_except.size
@@ -1 +1 @@
-171291
+171264
diff --git a/test/other/codesize/test_codesize_cxx_except_wasm.size b/test/other/codesize/test_codesize_cxx_except_wasm.size
index 73186ab753..1db904cf3c 100644
--- a/test/other/codesize/test_codesize_cxx_except_wasm.size
+++ b/test/other/codesize/test_codesize_cxx_except_wasm.size
@@ -1 +1 @@
-144653
+144594
diff --git a/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size b/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
index 17dc029298..c53f6bcc31 100644
--- a/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
+++ b/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
@@ -1 +1 @@
-142242
+142183
diff --git a/test/other/codesize/test_codesize_cxx_lto.size b/test/other/codesize/test_codesize_cxx_lto.size
index 50420f0c57..0a9a5f1e95 100644
--- a/test/other/codesize/test_codesize_cxx_lto.size
+++ b/test/other/codesize/test_codesize_cxx_lto.size
@@ -1 +1 @@
-121790
+121789
diff --git a/test/other/codesize/test_codesize_cxx_mangle.size b/test/other/codesize/test_codesize_cxx_mangle.size
index 06a97f0e9f..f3ed20a321 100644
--- a/test/other/codesize/test_codesize_cxx_mangle.size
+++ b/test/other/codesize/test_codesize_cxx_mangle.size
@@ -1 +1 @@
-235338
+235311
diff --git a/test/other/codesize/test_codesize_cxx_noexcept.size b/test/other/codesize/test_codesize_cxx_noexcept.size
index d502821bac..01e6459671 100644
--- a/test/other/codesize/test_codesize_cxx_noexcept.size
+++ b/test/other/codesize/test_codesize_cxx_noexcept.size
@@ -1 +1 @@
-131941
+131917
diff --git a/test/other/codesize/test_codesize_cxx_wasmfs.size b/test/other/codesize/test_codesize_cxx_wasmfs.size
index 24149d6189..0709f692df 100644
--- a/test/other/codesize/test_codesize_cxx_wasmfs.size
+++ b/test/other/codesize/test_codesize_cxx_wasmfs.size
@@ -1 +1 @@
-169798
+169789
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 3c46971663..98177161f2 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50330
+50354
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b339887848..2cbd341de0 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1733
+1679
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index a858d2d47b..ad0b314c27 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1724
+1722
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..b0536459d7 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1205
diff --git a/test/other/codesize/test_codesize_hello_dylink.size b/test/other/codesize/test_codesize_hello_dylink.size
index afd43f8827..c36f42d794 100644
--- a/test/other/codesize/test_codesize_hello_dylink.size
+++ b/test/other/codesize/test_codesize_hello_dylink.size
@@ -1 +1 @@
-18547
+18521
diff --git a/test/other/codesize/test_codesize_hello_single_file.gzsize b/test/other/codesize/test_codesize_hello_single_file.gzsize
index 64d519daab..13f3698f99 100644
--- a/test/other/codesize/test_codesize_hello_single_file.gzsize
+++ b/test/other/codesize/test_codesize_hello_single_file.gzsize
@@ -1 +1 @@
-3620
+3587
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 8755c7be20..b0b20f1984 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6611
+6539
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b339887848..2cbd341de0 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1733
+1679
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 7193414dbf..cb15afe743 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4097
+4101
diff --git a/test/other/codesize/test_codesize_minimal_pthreads.size b/test/other/codesize/test_codesize_minimal_pthreads.size
index 45f705e322..f637d9f6da 100644
--- a/test/other/codesize/test_codesize_minimal_pthreads.size
+++ b/test/other/codesize/test_codesize_minimal_pthreads.size
@@ -1 +1 @@
-19417
+19422
diff --git a/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size b/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
index 11826fc9de..7baaed78af 100644
--- a/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
+++ b/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
@@ -1 +1 @@
-19418
+19423

The results there are mixed, but e.g. hello world -O3 is 3% smaller.
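
For anyone who wants to check the percentages, a small Python sketch follows. The before/after byte counts are copied from a few of the `.size` entries in the diff above; the helper name `percent_change` and the choice of which files to include are just for illustration.

```python
def percent_change(before: int, after: int) -> float:
    """Relative size change in percent; negative means smaller."""
    return (after - before) / before * 100.0


# (before, after) byte counts taken from the code size diff above.
sizes = {
    "test_codesize_hello_O3": (1733, 1679),
    "test_codesize_hello_Oz": (1259, 1205),
    "test_codesize_files_wasmfs": (50330, 50354),
    "test_codesize_cxx_except_wasm": (144653, 144594),
}

for name, (before, after) in sizes.items():
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} ({change:+.2f}%)")
```

The hello world -O3 entry works out to a bit over 3% smaller, while the larger C++ tests move by well under 0.1% in either direction.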

This may make sense to land, but

  1. The cases where the results are worse should be investigated.
  2. The compilation slowdown should be mitigated - perhaps we can
    pick more carefully when to use locals.

@kripken
Member Author

kripken commented Jul 10, 2025

The last commits make this localize only when we see a falling-through constant, which we take as a sign that it is worth adding locals in the hope that later optimizations find something. That is enough for the motivating use cases. This makes the pass take almost 0 time, so speed is no longer an issue. It also reduces the changes to real-world code, leaving only this in Emscripten:

diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 3c46971663..002a2213dd 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50330
+50336
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b339887848..357a340dae 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1733
+1681
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..de0cde04c8 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1207
diff --git a/test/other/codesize/test_codesize_hello_single_file.gzsize b/test/other/codesize/test_codesize_hello_single_file.gzsize
index 64d519daab..468cbfccfe 100644
--- a/test/other/codesize/test_codesize_hello_single_file.gzsize
+++ b/test/other/codesize/test_codesize_hello_single_file.gzsize
@@ -1 +1 @@
-3620
+3589
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 8755c7be20..aff74eed6b 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6611
+6543
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b339887848..357a340dae 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1733
+1681

Still not 100% positive, but mostly so, and keeps that 3% win on hello world -O3.
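
As a rough illustration of the "falling-through constant" condition, here is a toy Python sketch. It is one possible reading of the heuristic, not the code in this PR: the classes and the helpers `fallthrough`, `has_effects`, and `worth_localizing` are hypothetical, and the model assumes that a value flows out of a block through its last item and out of a tee through its value.

```python
from dataclasses import dataclass


@dataclass
class Const:
    value: int


@dataclass
class Call:
    target: str  # assumption: calls always have effects


@dataclass
class LocalTee:
    index: int
    value: object


@dataclass
class Block:
    body: list  # the last item's value flows out of the block


@dataclass
class Binary:
    op: str
    left: object
    right: object


def fallthrough(e):
    """Peel wrappers to find the value that flows out of an expression."""
    while True:
        if isinstance(e, Block) and e.body:
            e = e.body[-1]
        elif isinstance(e, LocalTee):
            e = e.value
        else:
            return e


def has_effects(e) -> bool:
    if isinstance(e, (Call, LocalTee)):
        return True
    if isinstance(e, Block):
        return any(has_effects(item) for item in e.body)
    if isinstance(e, Binary):
        return has_effects(e.left) or has_effects(e.right)
    return False


def worth_localizing(b: Binary) -> bool:
    """True if some child has effects but its fallthrough value is constant."""
    return any(
        has_effects(c) and isinstance(fallthrough(c), Const)
        for c in (b.left, b.right)
    )


if __name__ == "__main__":
    hidden_const = Block([Call("foo"), Const(1)])
    print(worth_localizing(Binary("add", hidden_const, Const(2))))
    print(worth_localizing(Binary("add", Call("foo"), Const(2))))
```

In this model only the first Binary qualifies: its left child has effects, yet the value it produces is a constant that a later pass could fold once the effects are moved out of the way.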

@kripken
Member Author

kripken commented Jul 11, 2025

I investigated that small regression, and it is basically noise. Looking at more real-world code, on most of it there are a few bytes of noise one way or the other, but it does help a bit sometimes (50 bytes on LZMA, which is 0.1%), and sometimes by more than a bit (6% better on Poppler).

Overall this looks like it might be worth landing. We can consider expanding what it does later (more than Binary, and more things than falling-through constants).

@xuruiyang2002
Contributor

Looks good. However, the evaluation focuses on compile-time metrics like code size and compilation time. This pass introduces more local get/set operations, which could affect runtime performance. Have we considered measuring the runtime impact to check for any regressions?

@kripken
Member Author

kripken commented Jul 14, 2025

The locals that it adds should get removed by later passes (unless they end up important). In theory that could be missed by the data I reported above (size could be smaller while local operations increase, if something else decreases enough), but it's unlikely. Here is the diff for hello world:

12c12
<  [total]        : 797     
---
>  [total]        : 770     
14,16c14,16
<  Binary         : 94      
<  Block          : 45      
<  Break          : 50      
---
>  Binary         : 91      
>  Block          : 41      
>  Break          : 45      
19,20c19,20
<  Const          : 137     
<  Drop           : 8       
---
>  Const          : 132     
>  Drop           : 7       
24,27c24,27
<  Load           : 75      
<  LocalGet       : 198     
<  LocalSet       : 67      
<  Loop           : 11      
---
>  Load           : 73      
>  LocalGet       : 196     
>  LocalSet       : 66      
>  Loop           : 10      
34c34
<  Unary          : 11      
---
>  Unary          : 8       

Everything decreases there.
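
A quick way to double-check that claim is to compute the per-kind deltas. The sketch below copies the counts from the diff above; the variable names are only illustrative, and kinds that do not appear in the diff are assumed unchanged.

```python
# Expression counts for hello world, copied from the diff above.
before = {
    "[total]": 797, "Binary": 94, "Block": 45, "Break": 50, "Const": 137,
    "Drop": 8, "Load": 75, "LocalGet": 198, "LocalSet": 67, "Loop": 11,
    "Unary": 11,
}
after = {
    "[total]": 770, "Binary": 91, "Block": 41, "Break": 45, "Const": 132,
    "Drop": 7, "Load": 73, "LocalGet": 196, "LocalSet": 66, "Loop": 10,
    "Unary": 8,
}

deltas = {kind: after[kind] - before[kind] for kind in before}
per_kind = {kind: d for kind, d in deltas.items() if kind != "[total]"}

for kind, d in per_kind.items():
    print(f"{kind:9s} {d:+d}")

# Every listed kind shrinks, and the per-kind deltas add up to the total.
print("all decrease:", all(d < 0 for d in per_kind.values()))
print("sum of deltas:", sum(per_kind.values()), "total delta:", deltas["[total]"])
```

The per-kind deltas sum to the same value as the change in `[total]`, and the local operations in particular (LocalGet plus LocalSet) go down rather than up.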

Note also that adding more local operations usually does not affect runtime performance. VMs lower local operations into SSA form, which optimizes away unneeded operations. (Though interpreters and baseline tiers do less, and might be slower.)
