Track indirect call types in RemoveUnusedModuleElements #7728

Draft
wants to merge 45 commits into
base: main

Member

@kripken kripken commented Jul 15, 2025

An indirect call to a type in a table now only forces functions of
that type to be marked as used: functions of other types are
left alone, potentially leaving them unreached.

This is more precise than assuming any indirect call can
reach anywhere, which is more or less what we did before.
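
As a rough illustration of the idea (not the actual pass), the sketch below models a table as a list of entries and keeps only the functions whose signature matches some reachable call_indirect. The names `TableEntry` and `usedTableFuncs`, and the use of plain strings for signatures, are hypothetical simplifications; exact string matching also ignores subtyping.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model; these are not Binaryen's real data structures.
struct TableEntry {
  std::string func;      // name of the function placed in the table
  std::string signature; // its signature, written here as a plain string
};

// Returns the table functions that must be kept alive, given the set of
// signatures used by reachable call_indirect instructions.
std::set<std::string> usedTableFuncs(
    const std::vector<TableEntry>& table,
    const std::set<std::string>& calledSignatures) {
  std::set<std::string> used;
  for (const auto& entry : table) {
    // Only a call_indirect of the matching signature can reach this entry.
    if (calledSignatures.count(entry.signature)) {
      used.insert(entry.func);
    }
  }
  return used;
}
```

With a table holding two `i32->()` functions and one `f64->()` function, and only `i32->()` being called indirectly, the `f64->()` function is left unmarked and can be removed if nothing else uses it.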

There is a downside to this: the pass is around 10% slower. This is one
of our faster passes, so this may be acceptable, however.

This has some benefits. Here is the Emscripten diff:

Example

diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 939ad737b3..1c8b8a1546 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 373,
   "a.js": 5356,
   "a.js.gz": 2526,
-  "a.wasm": 7468,
-  "a.wasm.gz": 3461,
-  "total": 13376,
-  "total_gz": 6360
+  "a.wasm": 5831,
+  "a.wasm.gz": 2713,
+  "total": 11739,
+  "total_gz": 5612
 }
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index 89da22d7c8..9685b59d93 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 12511,
-  "a.html.gz": 6848,
-  "total": 12511,
-  "total_gz": 6848
+  "a.html": 12507,
+  "a.html.gz": 6822,
+  "total": 12507,
+  "total_gz": 6822
 }
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 5b21705c95..7d168dbd6a 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 17224,
-  "a.html.gz": 7551,
-  "total": 17224,
-  "total_gz": 7551
+  "a.html": 17229,
+  "a.html.gz": 7542,
+  "total": 17229,
+  "total_gz": 7542
 }
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 82b16397a9..20191f896a 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50314
+50233
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index 1c38c9071a..9b5f360cc2 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1725
+1723
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..6bbc2a3cd4 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1257
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 4cd877762a..8755c7be20 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6615
+6611
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
index 86fd2dc144..7f12daaeba 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
@@ -1,2 +1 @@
-$__wasm_call_ctors
 $_start
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
index 7296f257eb..94361d49fd 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
@@ -1 +1 @@
-136
+132
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
index ab5b9efed7..848ef7c501 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
@@ -1 +1 @@
-5553
+5549
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.size b/test/other/codesize/test_codesize_mem_O3_standalone.size
index 7bcda5ba23..7e9732ae43 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.size
@@ -1 +1 @@
-5478
+5474
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
index 05112f24d5..b54c900141 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
@@ -1 +1 @@
-5271
+5267
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 603c2df295..bbdd8cef02 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4084
+4080

One lucky embind test shrinks by about 20% (its a.wasm goes from 7468 to
5831 bytes), but all other changes are just a few bytes, typically far less
than 1%. I also looked at real-world codebases and saw no real benefit there.
My hunch is that this is expected because of signature overlap: when you
generate random graphs with n nodes and a chance p for each edge to exist,
then even if p decreases toward 0 as n grows (as long as it stays above
roughly ln(n)/n), the graph will tend to end up connected [1]. And, in
wasm, p does not even decrease to 0:

  • Consider some common signature like {i32} -> {} (i32 param, no result).
  • In real-world code, there is some chance q>0 for that signature to be called,
    and some chance r>0 for that signature to exist in the code.
  • p >= O(rq) > 0 because all it takes for a connection to exist is for that
    signature to exist on one side and be called on the other.

That is, in large codebases there is an overlap in signatures, and
statistically this means that all the code will end up reachable, at
least in the limit. In small programs you may get lucky, but not in
the long run. And even in the mid run, you will quickly see weird
stuff like a game engine's physics code seeming to be able to call
networking or audio (impossible in general, but they can overlap
on signatures).
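
To get a feel for the numbers, here is a small back-of-the-envelope helper for the random-graph model in [1]. It computes the expected number of isolated nodes in G(n, p), which is n(1-p)^(n-1). Treating functions as nodes and signature overlap as edges is only a loose analogy, and the function name and the sample values of n and p are made up for illustration.

```cpp
#include <cmath>

// Expected number of isolated nodes in an Erdos-Renyi graph G(n, p):
// each node has n - 1 potential edges, and each of those edges is absent
// with probability 1 - p, independently of the others.
double expectedIsolatedNodes(double n, double p) {
  return n * std::pow(1.0 - p, n - 1.0);
}
```

For example, `expectedIsolatedNodes(10000, 0.002)` is on the order of 1e-5: even with an edge probability of only 0.2%, essentially no node is expected to be left unconnected.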

To really fix that we need more than structural typing of indirect
calls, something like knowing the possible targets at each callsite.
Devirtualization can provide this, based on source language info.
Still, this PR may be of some benefit in some cases.

[1] https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model#Properties_of_G(n,_p)

@kripken kripken requested a review from tlively July 15, 2025 22:51
Member

@tlively tlively left a comment

I see we already have signature-based analysis for CallRef. Would it be possible to reuse that? (It should be if we handle the existence of TableGet correctly, but it looks like maybe we don't?).

Member

@tlively tlively left a comment

It also looks like maybe doing a table.init with a passive segment + a call_indirect into the table will not properly keep the function body alive.

Member Author

kripken commented Jul 15, 2025

Ran some more measurements, using the Emscripten benchmark suite. Nothing improves by anywhere close to 1%, except for Poppler, which shrinks by 6.5%. Odd that just that one benefits and none of the many others...

Member Author

kripken commented Jul 16, 2025

> I see we already have signature-based analysis for CallRef. Would it be possible to reuse that? (It should be if we handle the existence of TableGet correctly, but it looks like maybe we don't?).

Yeah, the TODOs for table operations may get in the way. But in general I think we want to consider the two things separately, say, if we see no table.sets then we could know call_indirect can't call address-taken functions (but call_ref still can).

> It also looks like maybe doing a table.init with a passive segment + a call_indirect into the table will not properly keep the function body alive.

I added tests for that now. Note that they pass: even before this PR, we carefully assume the worst about table operations:

// Note a possible call of a function reference as well, as something might
// be written into the table during runtime. With precise tracking of what
// is written into the table we could do better here; we could also see
// which tables are immutable. TODO
noteCallRef(curr->heapType);

Improving that could help this PR do better.
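
A minimal sketch of what that more precise rule might look like, under stated assumptions: `FuncInfo`, `mayBeCalledIndirectly`, and the `tableMayBeWritten` flag are hypothetical names for illustration, not existing Binaryen APIs. The current behavior corresponds to always passing `true` for `tableMayBeWritten`.

```cpp
#include <string>

// Hypothetical summary of one function, for illustration only.
struct FuncInfo {
  std::string signature; // signature written as a plain string
  bool inElementSegment; // placed in the table by an element segment
  bool addressTaken;     // referenced by a reachable ref.func
};

// Could a reachable call_indirect of `calledSignature` end up calling `func`?
// `tableMayBeWritten` says whether a table.set (or similar write) may run.
bool mayBeCalledIndirectly(const FuncInfo& func,
                           const std::string& calledSignature,
                           bool tableMayBeWritten) {
  // A call_indirect can only reach functions of the signature it checks.
  if (func.signature != calledSignature) {
    return false;
  }
  // Functions in element segments may be present in the table.
  if (func.inElementSegment) {
    return true;
  }
  // Otherwise the function can only get into the table via a runtime write.
  return func.addressTaken && tableMayBeWritten;
}
```

Under this sketch, an address-taken function that appears in no element segment would stop being a call_indirect target once we can prove the table is never written, while call_ref could still reach it.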

Member

aheejin commented Jul 16, 2025

void visitCallIndirect(CallIndirect* curr) {
  note({ModuleElementKind::Table, curr->table});
  noteIndirectCall(curr->table, curr->heapType);
  // Note a possible call of a function reference as well, as something might
  // be written into the table during runtime. With precise tracking of what
  // is written into the table we could do better here; we could also see
  // which tables are immutable. TODO
  noteCallRef(curr->heapType);
}

The previous visitCallIndirect was already doing something regarding types... So what was this function doing before, and what does this PR improve on?

Member Author

kripken commented Jul 16, 2025

@aheejin The main difference is that before, a reference to a table would mark everything in the table as used:

case ModuleElementKind::Table:
  ModuleUtils::iterTableSegments(
    *module, value, [&](ElementSegment* segment) {
      if (!segment->data.empty()) {
        use({ModuleElementKind::ElementSegment, segment->name});
      }
    });
  break;

That code is removed in this PR, and instead, we only mark functions of the right type:

https://github.com/WebAssembly/binaryen/pull/7728/files#diff-f0c968562d115b75fb1671c7b72222e08776d01059355efc9271c96e7b720c30R357-R374

So if there is a function of signature S in the table and no call_indirect of that type is reached, we can remove that function (and, potentially, the things it reaches).


3 participants