Tweak the vec-calloc runtime check to only apply to shortish-arrays #96596

scottmcm · 2022-05-01T07:25:32Z

@nbdd0121 pointed out in #95362 (comment) that LLVM currently doesn't constant-fold the IsZero check for long arrays, so that seems like a reasonable justification for limiting it.

It appears that the cutoff is based on length, not byte size (https://godbolt.org/z/4s48Y81dP), so that's what I used in the PR. Maybe it's a "the number of inlining shall be three" sort of situation.

Certainly there's more that could be done here -- the generated code that checks long arrays byte-by-byte is highly suboptimal, for example -- but this is an easy, low-risk tweak.
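For readers who want to see the shape of the change, here is a minimal sketch of a length-limited zero check. It is not the standard library's code: the trait name `IsZeroSketch` and the constant `MAX_CHECKED_LEN` (with its value of 16) are hypothetical, chosen only for illustration. Because the array length is a compile-time constant, the length comparison folds away, and long arrays report "not known to be zero" without any per-element loop.

```rust
// Hypothetical stand-in for the internal zero-check trait; the name is an
// assumption made for this sketch.
trait IsZeroSketch {
    fn is_zero(&self) -> bool;
}

impl IsZeroSketch for u8 {
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

impl IsZeroSketch for i64 {
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

// Assumed cutoff, picked for illustration only. It limits the number of
// elements, not the number of bytes.
const MAX_CHECKED_LEN: usize = 16;

impl<T: IsZeroSketch, const N: usize> IsZeroSketch for [T; N] {
    fn is_zero(&self) -> bool {
        // `N` is known at compile time, so for long arrays this whole
        // expression is a constant `false` and the element scan is skipped.
        N <= MAX_CHECKED_LEN && self.iter().all(|x| x.is_zero())
    }
}

fn main() {
    // Short all-zero arrays qualify regardless of element size.
    assert!([0_u8; 16].is_zero());
    assert!([0_i64; 16].is_zero());
    // A short array with a non-zero element does not qualify.
    assert!(![1_i64; 4].is_zero());
    // A long all-zero array is conservatively reported as "not zero".
    assert!(![0_u8; 1024].is_zero());
    println!("all checks passed");
}
```

A `false` result in this sketch only means the zeroed-allocation fast path is not taken; the resulting vector would still be filled correctly by the ordinary path.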

Mark-Simulacrum · 2022-05-01T11:30:25Z

Would a codegen test be feasible to make sure LLVM keeps doing it (and maybe a test for "one more" where LLVM doesn't)?

r=me with that added, or if it proves too difficult to be worthwhile

scottmcm · 2022-05-02T06:36:14Z

I've added a couple of codegen tests checking that different kinds of vec![ZERO; n] fold away the non-zeroed allocation calls. I'm marking rollup=iffy because while they pass locally and in CI, the variety of builders in the full bors set is somewhat more prone to failing codegen tests.

@bors r=Mark-Simulacrum rollup=iffy

I've intentionally not included a negative codegen test, because it's really hard to make those detect anything meaningful, and it's not clear to me that having them break for an improved implementation would be more valuable than troublesome. Let me know if there's something specific you'd like to see, though, and I'm happy to add things.
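As a rough illustration of what a positive codegen test of this kind can look like, here is a sketch. The directive comments follow FileCheck conventions as assumed for the compiler's codegen test suite; the function names and the exact CHECK patterns are hypothetical and are not copied from the tests added in this PR. The `main` function is only there so the sketch also builds and runs as an ordinary program.

```rust
// compile-flags: -O
// (Assumed directive syntax; a real codegen test would be built as a library.)

// CHECK-LABEL: @vec_zero_scalar
#[no_mangle]
pub fn vec_zero_scalar(n: usize) -> Vec<i32> {
    // Expect only the zeroed allocation call to survive optimization.
    // CHECK-NOT: __rust_alloc(
    // CHECK: __rust_alloc_zeroed(
    // CHECK-NOT: __rust_alloc(
    vec![0; n]
}

// CHECK-LABEL: @vec_zero_array_8
#[no_mangle]
pub fn vec_zero_array_8(n: usize) -> Vec<[i64; 8]> {
    // A short array of zeros should fold the same way as a scalar zero.
    // CHECK-NOT: __rust_alloc(
    // CHECK: __rust_alloc_zeroed(
    // CHECK-NOT: __rust_alloc(
    vec![[0_i64; 8]; n]
}

fn main() {
    // Behavioral sanity check only; the CHECK lines above are what a
    // FileCheck-based harness would verify against the emitted IR.
    assert_eq!(vec_zero_scalar(3), vec![0_i32; 3]);
    assert_eq!(vec_zero_array_8(2), vec![[0_i64; 8]; 2]);
    println!("ok");
}
```

A negative test would assert the opposite pattern for a longer array, which is the kind of check that could start failing as soon as the implementation or LLVM improves.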

bors · 2022-05-02T06:36:17Z

📌 Commit 2830dbd has been approved by Mark-Simulacrum

bors · 2022-05-02T09:05:25Z

⌛ Testing commit 2830dbd with merge 6b6c1ff...

bors · 2022-05-02T11:22:20Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 6b6c1ff to master...

rust-timer · 2022-05-02T13:37:53Z

Finished benchmarking commit (6b6c1ff): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

AngelicosPhosphoros · 2022-05-30T19:47:21Z

@scottmcm It seems that the constant-folding optimization for arrays is even more fragile: if there is another call to Vec::from_element with the same type parameter, LLVM prefers not to inline it, so constant folding breaks.

If you add this function to your vec-calloc.rs test, the codegen test fails.

// CHECK-LABEL: @vec_one_array_32
#[no_mangle]
pub fn vec_one_array_32(n: usize) -> Vec<[i64; 32]> {
    vec![[1_i64; 32]; n]
}

It seems that we should either measure the performance gain from checking the array (which is tricky, since there are different allocators available with different performance) or limit our implementation to very small array lengths, like 8 for u8 or 3 for u32.

Also, I have a small issue with the IsZero trait now lying: it says that [0; 1024] is not zero. Maybe we should rename the trait somehow?

AngelicosPhosphoros · 2022-05-31T09:47:33Z

Also, godbolt link with compilation results.
godbolt

rust-highfive assigned Mark-Simulacrum May 1, 2022

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 1, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 1, 2022

Tweak the calloc optimization to only apply to shortish-arrays

2830dbd

scottmcm force-pushed the limited-calloc branch from 2b7c8be to 2830dbd Compare May 2, 2022 05:28

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 2, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label May 2, 2022

bors merged commit 6b6c1ff into rust-lang:master May 2, 2022

rustbot added this to the 1.62.0 milestone May 2, 2022

scottmcm deleted the limited-calloc branch May 2, 2022 15:23

scottmcm mentioned this pull request May 6, 2022

Support arrays of zeros in Vec's __rust_alloc_zeroed optimization #95362

Merged
