Skip to content

Strip the producers section in --strip #1875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 31, 2019
Merged

Strip the producers section in --strip #1875

merged 5 commits into from
Jan 31, 2019

Conversation

kripken
Copy link
Member

@kripken kripken commented Jan 18, 2019

The producers section is a form of metadata (no semantic information / not necessary for executing the wasm). Like debug info, names, etc., --strip should remove it.

As emcc runs binaryen --strip at higher optimization levels (and when no debug info / profiling info, etc., which is the default in higher optimization levels), this means that such binaries will not have producer sections.

This saves 187 bytes on each wasm file. It will also unbreak the wasm waterfall (the metadce tests fails on a file now being over 3x larger than expected).

cc @lukewagner

@kripken kripken requested a review from tlively January 18, 2019 19:00
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine and the change makes sense to me. Let's see what Luke thinks.

@lukewagner
Copy link
Member

Is --strip run by default as part of normal opt builds? If so, then I think it'd be better to leave the producers section in, since it's fairly miniscule in size, unlike the names and other sections that people are probably meaning to remove when they --strip. How about having a separate --strip-producers or --strip-all for testing and for when people really mean to remove the producers?

@kripken
Copy link
Member Author

kripken commented Jan 18, 2019

Is --strip run by default as part of normal opt builds?

Yes, when no debug or profiling or other metadata is requested by the user.

it's fairly miniscule in size

It is over 100 bytes, which is small compared to a large wasm I agree, but not all wasms are large, and also it's 100 bytes times every wasm downloaded over the internet, which adds up.

How about having a separate --strip-producers or --strip-all for testing and for when people really mean to remove the producers?

This is the key issue, I think. When someone tells emscripten "optimize my code for size as best you can" then they do mean to remove any metadata and unnecessary space, which implies removing this section.

I don't see developers asking for this extra size increase to their binaries. I think you need to convince them of that value if we want this on by default. Meanwhile we can add a flag for the devs that do want to emit this.

@lukewagner
Copy link
Member

It is over 100 bytes, which is small compared to a large wasm I agree, but not all wasms are large, and also it's 100 bytes times every wasm downloaded over the internet, which adds up.

Since this is mostly just a matter of deciding what are reasonable defaults, how about only stripping the producers section by default when doing so provides a meaningful size reduction, say, >.1%.

@kripken
Copy link
Member Author

kripken commented Jan 18, 2019

It still would not fit in with what emcc currently does - when users optimize in emcc, they want the smallest and fastest binary. These new 100 bytes have no practical benefit for the developer, so it's not like a speed/size tradeoff. The consistent behaviour in emcc is to strip them.

However, we are in binaryen here, and some devs may want to keep these, so I'm fine with adding a --strip variant that does not remove this section. However, I don't think emcc would use it by default.

@lukewagner
Copy link
Member

There's also a benefit to the toolchain and ecosystem to get reporting on what tools are being used and how much, and that will, practically speaking, be entirely lost if the defaults that everyone use strip the producers section by default. So if there is a <.1% size reduction, it seems like the downside is negligible while there is a concrete upside.

@kripken
Copy link
Member Author

kripken commented Jan 18, 2019

I am open to being convinced about that benefit to the ecosystem, but I don't see it yet. What would that benefit be? (Not that I'm not curious on how used the tools I work on are, but I see that as idle curiosity.)

Let's discuss this more. For now, I opened emscripten-core/emscripten#7892 to hack the test to pass for the wasm waterfall.

@lukewagner
Copy link
Member

More than idle curiosity, seeing what tools are used, and how much, can help prioritize time spent on various projects and validate time already spent. Just normal data-driven development type stuff.

@kripken
Copy link
Member Author

kripken commented Jan 18, 2019

I'm not sure I see how useful that is. Yes it's interesting, but we already have a vague notion of how popular the various tools are, and that's more than good enough for prioritization decisions, isn't it?

Is there precedent for this? Do JS bundlers, for example, bundle some producer info about themselves, which minifiers were used, etc.?

@lukewagner
Copy link
Member

Getting away from vague notions of popularity is the goal of using real measurement data though :) Ultimately, usefulness is subjective, but this was discussed a couple times in various wasm CG meetings and the issue and multiple groups seemed think it was useful. @dschuff thoughts?

@kripken
Copy link
Member Author

kripken commented Jan 30, 2019

WebAssembly/tool-conventions#93 has a summary of emscripten's current thinking on this. For Binaryen, we don't want to do anything to the producers section by default, but do want it to be possible to optionally remove it. To achieve that, this PR now

  • creates a --strip-producers pass that removes that section.
  • creates a --strip-debug pass that removes debug info, same as the old --strip, which is still around but deprecated.

A followup in emscripten will use this pass by default.

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM apart from missing word in warning string.

@@ -719,7 +719,7 @@ void WasmBinaryBuilder::readUserSection(size_t payloadLen) {
} else {
// an unfamiliar custom section
if (sectionName.equals(BinaryConsts::UserSections::Linking)) {
std::cerr << "warning: linking section is present, which binaryen cannot handle yet - relocations will be invalidated!\n";
std::cerr << "warning: linking section is present, so this is not a standard wasm file - binaryen cannot this properly!\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot handle?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixing.

@kripken kripken merged commit ddb5acd into master Jan 31, 2019
@kripken kripken deleted the strip2 branch January 31, 2019 17:35
kripken added a commit to emscripten-core/emscripten that referenced this pull request Jan 31, 2019
Update binaryen to get the pass. Also use --strip-debug for debug (the old --strip means the same, but is less specific). See WebAssembly/binaryen#1875

This lets us re-enable part of the metadce test (where the producers section was skewing the size of a tiny binary).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants