Track indirect call types in RemoveUnusedModuleElements #7728
I see we already have signature-based analysis for CallRef. Would it be possible to reuse that? (It should be if we handle the existence of TableGet correctly, but it looks like maybe we don't?)
It also looks like maybe doing a table.init with a passive segment + a call_indirect into the table will not properly keep the function body alive.
Ran some more measurements, using the Emscripten benchmark suite. Nothing improves by anything close to 1%, except for Poppler, which shrinks by 6.5%. Odd that just that one benefits and none of the many others...
Yeah, the TODOs for table operations may get in the way. But in general I think we want to consider the two things separately, say, if we see no
I added tests for that now. Note that they pass: even before this PR, we carefully assume the worst about table operations (binaryen/src/passes/RemoveUnusedModuleElements.cpp, lines 149 to 153 in 962e62f).
Improving that could help this PR do better.
binaryen/src/passes/RemoveUnusedModuleElements.cpp Lines 146 to 154 in 962e62f
@aheejin The main difference is that before, a reference to a table would mark everything in the table as used: binaryen/src/passes/RemoveUnusedModuleElements.cpp Lines 416 to 423 in c303248
That code is removed in this PR. Instead, we only mark functions of the right type: if the table contains a function of signature S and no call_indirect of that type is reached, we can remove that function (and, potentially, the things it reaches).
An indirect call to a type in a table now only forces functions of
that type to be marked as used: functions of other types are
left alone, potentially leaving them unreached.
This is more precise than assuming any indirect call can
reach anywhere, which is more or less what we did before.
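To make the difference concrete, here is a minimal Python sketch of the two marking strategies. It is a toy model, not the pass's actual code: the `reachable` helper, the string signatures, and the function names (`main`, `log`, `physics`, `helper`) are all hypothetical, signatures are compared by exact equality, and table-mutating operations (where the pass assumes the worst) are not modeled.

```python
def reachable(roots, direct_calls, indirect_call_types, table, func_type, type_aware):
    """Return the set of functions reachable from `roots` in a toy module.

    direct_calls:        function -> list of directly called functions
    indirect_call_types: function -> list of signatures it uses in call_indirect
    table:               list of functions placed in the (single) table
    func_type:           function -> signature
    type_aware:          True  = only table functions of the called signature
                         False = any indirect call reaches the whole table
    """
    # Index the table contents by signature once, up front.
    by_type = {}
    for f in table:
        by_type.setdefault(func_type[f], []).append(f)

    seen = set(roots)
    work = list(roots)
    while work:
        f = work.pop()
        targets = list(direct_calls.get(f, []))
        for sig in indirect_call_types.get(f, []):
            if type_aware:
                targets += by_type.get(sig, [])
            else:
                targets += table
        for t in targets:
            if t not in seen:
                seen.add(t)
                work.append(t)
    return seen


# Hypothetical module: main makes one indirect call of type (i32)->().
func_type = {
    "main": "()->()",
    "log": "(i32)->()",
    "physics": "(f64)->f64",
    "helper": "()->()",
}
direct_calls = {"physics": ["helper"]}
indirect_call_types = {"main": ["(i32)->()"]}
table = ["log", "physics"]

precise = reachable(["main"], direct_calls, indirect_call_types, table, func_type, True)
coarse = reachable(["main"], direct_calls, indirect_call_types, table, func_type, False)
print(sorted(precise))  # ['log', 'main']
print(sorted(coarse))   # ['helper', 'log', 'main', 'physics']
```

In this toy case the type-aware walk leaves `physics` unreached, and with it `helper`, which is only called from `physics`.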
There is a downside to this: the pass is around 10% slower. This is one
of our faster passes, so this may be acceptable, however.
This has some benefits. Here is the Emscripten diff:
Example
diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 939ad737b3..1c8b8a1546 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 373,
   "a.js": 5356,
   "a.js.gz": 2526,
-  "a.wasm": 7468,
-  "a.wasm.gz": 3461,
-  "total": 13376,
-  "total_gz": 6360
+  "a.wasm": 5831,
+  "a.wasm.gz": 2713,
+  "total": 11739,
+  "total_gz": 5612
 }
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index 89da22d7c8..9685b59d93 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 12511,
-  "a.html.gz": 6848,
-  "total": 12511,
-  "total_gz": 6848
+  "a.html": 12507,
+  "a.html.gz": 6822,
+  "total": 12507,
+  "total_gz": 6822
 }
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 5b21705c95..7d168dbd6a 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 17224,
-  "a.html.gz": 7551,
-  "total": 17224,
-  "total_gz": 7551
+  "a.html": 17229,
+  "a.html.gz": 7542,
+  "total": 17229,
+  "total_gz": 7542
 }
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 82b16397a9..20191f896a 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50314
+50233
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index 1c38c9071a..9b5f360cc2 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1725
+1723
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..6bbc2a3cd4 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1257
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 4cd877762a..8755c7be20 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6615
+6611
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
index 86fd2dc144..7f12daaeba 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
@@ -1,2 +1 @@
-$__wasm_call_ctors
 $_start
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
index 7296f257eb..94361d49fd 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
@@ -1 +1 @@
-136
+132
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
index ab5b9efed7..848ef7c501 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
@@ -1 +1 @@
-5553
+5549
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.size b/test/other/codesize/test_codesize_mem_O3_standalone.size
index 7bcda5ba23..7e9732ae43 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.size
@@ -1 +1 @@
-5478
+5474
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
index 05112f24d5..b54c900141 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
@@ -1 +1 @@
-5271
+5267
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 603c2df295..bbdd8cef02 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4084
+4080
One lucky embind test shrinks by 20%, but all other changes are just
a few bytes, far less than 1%. I looked at real-world codebases, and
see no real benefit there. My hunch is that this is expected because of
signature overlap: when you generate random graphs of size n with
chance p for each edge to exist, then even if p decreases to 0 as n
grows (as long as it stays above roughly ln(n)/n), the graph will tend
to end up connected [1]. And, in wasm, p does not even decrease to 0:
Consider a common signature such as {i32} -> {} (i32 param, no result).
There is some chance q>0 for that signature to be called, and some chance r>0 for that signature to exist in the code.
Then p >= O(rq) > 0, because all it takes for a connection to exist is that the signature exists on one side and is called on the other.
That is, in large codebases there is an overlap in signatures, and
statistically this means that all the code will end up reachable, at
least in the limit. In small programs you may get lucky, but not in
the long run. And even in the mid run, you will quickly see weird
stuff like a game engine's physics code seeming to be able to call
networking or audio (impossible in general, but they can overlap
on signatures).
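The overlap argument can be explored with a small simulation. The sketch below is an illustration under loud assumptions, not a measurement of real code: signatures are drawn uniformly at random (real programs are heavily skewed toward a few common signatures), every function sits in the table, only indirect calls are modeled, and the name `reachable_fraction` and all of its parameters are made up for this example.

```python
import random


def reachable_fraction(num_funcs, num_sigs, calls_per_func, seed=0):
    """Fraction of functions reachable from function 0 via typed indirect calls.

    Each function gets one random signature and makes `calls_per_func`
    indirect calls, each with a random signature. An indirect call of
    signature s can reach every function whose signature is s.
    """
    rng = random.Random(seed)
    sig_of = [rng.randrange(num_sigs) for _ in range(num_funcs)]
    called_sigs = [
        {rng.randrange(num_sigs) for _ in range(calls_per_func)}
        for _ in range(num_funcs)
    ]

    # Group functions by signature so each signature is expanded only once.
    funcs_by_sig = {}
    for f, s in enumerate(sig_of):
        funcs_by_sig.setdefault(s, []).append(f)

    seen_funcs = {0}
    seen_sigs = set()
    work = [0]
    while work:
        f = work.pop()
        for s in called_sigs[f]:
            if s in seen_sigs:
                continue
            seen_sigs.add(s)
            for g in funcs_by_sig.get(s, []):
                if g not in seen_funcs:
                    seen_funcs.add(g)
                    work.append(g)
    return len(seen_funcs) / num_funcs


# Grow the program while keeping the number of distinct signatures fixed.
for n in (10, 100, 1000, 10000):
    print(n, reachable_fraction(num_funcs=n, num_sigs=50, calls_per_func=2))
```

Varying `num_sigs` and `calls_per_func` gives a rough feel for how quickly the reachable fraction approaches 1 once many functions share a limited pool of signatures.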
To really fix that we need more than structural typing of indirect
calls, something like knowing the possible targets at each callsite.
Devirtualization can provide this, based on source language info.
Still, this PR may be of some benefit in some cases.
[1] https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model#Properties_of_G(n,_p)