madhav-madhusoodanan
Contributor

Context

This is a redo of PR #1814, since many details have changed with PRs #1863, #1862, #1861, and #1852.

r? @folkertdev
cc: @Amanieu

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 2 times, most recently from 41db5a8 to 5fc0f3b Compare August 5, 2025 10:16
Comment on lines +199 to +407
match str::parse::<u32>(etype_processed.as_str()) {
    Ok(value) => data.bit_len = Some(value),
    Err(_) => {
        data.bit_len = match data.kind() {
            TypeKind::Char(_) => Some(8),
            TypeKind::BFloat => Some(16),
            TypeKind::Int(_) => Some(32),
            TypeKind::Float => Some(32),
            _ => None,
        };
    }
}
Contributor

Why are only some type kinds covered here? Maybe this could be a method on TypeKind?
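A minimal sketch of what such a method could look like. The `TypeKind` and `Sign` definitions below are simplified stand-ins (the real crate's variants and payloads may differ; `Mask` and `Void` are placeholders), and the names `default_bit_len` and `resolve_bit_len` are hypothetical. The widths are the fallback values from the snippet under review.

```rust
/// Simplified stand-in for the crate's sign type (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Sign {
    Signed,
    Unsigned,
}

/// Simplified stand-in for the crate's `TypeKind` (variant list is an assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TypeKind {
    Char(Sign),
    BFloat,
    Int(Sign),
    Float,
    Mask,
    Void,
}

impl TypeKind {
    /// Hypothetical helper: fallback bit width for kinds whose element type
    /// carries no explicit width.
    pub fn default_bit_len(&self) -> Option<u32> {
        match self {
            TypeKind::Char(_) => Some(8),
            TypeKind::BFloat => Some(16),
            TypeKind::Int(_) | TypeKind::Float => Some(32),
            // Naming the remaining variants (instead of `_`) makes the compiler
            // flag this match whenever a new variant is added.
            TypeKind::Mask | TypeKind::Void => None,
        }
    }
}

/// Prefer a width parsed from the element type string, else use the default.
pub fn resolve_bit_len(etype: &str, kind: TypeKind) -> Option<u32> {
    etype.parse::<u32>().ok().or_else(|| kind.default_bit_len())
}
```

With an exhaustive match, the question of which kinds are covered is answered in one place, and the call site shrinks to a single `resolve_bit_len` call.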

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 9e28106 to 111cd5d Compare August 5, 2025 16:22
Contributor

@folkertdev folkertdev left a comment

You should rebase on top of the upstream master branch instead of merging it in. That keeps the git history clean.

Comment on lines 188 to 204
x86_64-unknown-linux-gnu*)
    CPPFLAGS="${TEST_CPPFLAGS}" RUSTFLAGS="${HOST_RUSTFLAGS}" RUST_LOG=warn \
        cargo run "${INTRINSIC_TEST}" "${PROFILE}" \
        --bin intrinsic-test -- intrinsics_data/x86-intel.xml \
        --runner "${TEST_RUNNER}" \
        --cppcompiler "${TEST_CXX_COMPILER}" \
        --target "${TARGET}"
    ;;
Contributor

We'll have to see how to do this exactly, but we want to split these out of the main CI job to speed it up.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from f3f87f2 to 2ec747c Compare August 9, 2025 12:20
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 2ec747c to a8313d0 Compare September 5, 2025 08:16
@madhav-madhusoodanan
Contributor Author

Seems like the CI run at this point failed due to this error. I'll retry shortly:

#6 39.45   Could not connect to archive.ubuntu.com:80 (185.125.190.82), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.83), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.83), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.81), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.81), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.36), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.39), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.82), connection timed out
#6 39.45   Unable to connect to archive.ubuntu.com:80:
#6 39.45 Fetched 126 kB in 39s (3208 B/s)
#6 39.45 Reading package lists...
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing-updates/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing-backports/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Some index files failed to download. They have been ignored, or old ones used instead.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 13 times, most recently from 126a3ce to e79129a Compare September 10, 2025 05:58
@folkertdev
Contributor

Can you rebase this (and remove the merge commits) sometime? That would make it a lot easier to see what has actually changed.

Also, why push only to have CI fail? Running this locally is a lot faster than waiting for CI, because you can skip the earlier steps and run just the intrinsic tests.

Contributor

@folkertdev folkertdev left a comment

Oh, wait, it probably is up to date? It's just that the XML file is enormous. That's crazy. Should we try to trim that down?

@madhav-madhusoodanan
Contributor Author

@folkertdev I've been syncing using the below command whenever I'm updating from master:

git rebase master

Am I doing it correctly?

@madhav-madhusoodanan
Contributor Author

About the XML file specification, may I ask what "trim down" means?

@folkertdev
Contributor

Depends a bit on your git setup, but it seems to work all right. I was just confused by the enormous number of lines changed, but that's all due to that XML file.

@folkertdev
Contributor

What I mean is, could we process that XML file into a smaller XML file that only stores what we need? That would reduce the size of the repo (not sure how big that file is, maybe it compresses well?) and speed up the intrinsic tests.

There are risks too, e.g. what we generate could go out of date with the official XML file. So maybe it's fine this way.

@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Sep 10, 2025

Ahh, makes sense.

I think it might be best to keep the source of truth as unchanged as possible, since there is no direct way to obtain the sources sometimes.

For example, the XML file originally existed in the stdarch-verify crate, and I had to verify it by manually downloading the x86 reference (which comes as a folder containing HTML, CSS and JS files) from the link called "Download: Offline Intel® Intrinsics Guide" on the Intel intrinsics reference site.

The entire XML data is stored there as a hardcoded string assigned to a variable in a JS file.

variants for compatibility with the Rust version
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 4 times, most recently from c9ea3bd to 18aed56 Compare October 15, 2025 06:45
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 18aed56 to 5162772 Compare October 15, 2025 07:03
@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Oct 15, 2025

@folkertdev I've added the ability to sample intrinsics. Currently it's implemented for x86 and about 5% of intrinsics are randomly chosen for testing.

This is really intended as a temporary solution; we can remove it once we restructure the C++/Rust test files to run all intrinsics in one go.

@folkertdev
Contributor

Neat!

So, this does increase the longest CI job to 20 minutes, up from 12 minutes. I think we should try to make this its own CI job at least. I'm also not sure that we want to randomly sample the tests on each run. If we do hit a failure, that will be hard to reproduce. The status quo is no testing at all, so I think testing even a fixed subset is an improvement.
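One way to get a fixed, reproducible subset is to select intrinsics by a hash of their name rather than by a random draw. The sketch below is an illustration only, not what this PR implements; the function names are hypothetical, and FNV-1a is written out inline because the standard library's default hasher does not guarantee stable output across Rust releases.

```rust
/// FNV-1a 64-bit hash, implemented inline so the result is identical on
/// every machine and toolchain.
fn fnv1a(name: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for byte in name.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Keep an intrinsic when its hash lands in the first `percent` of 100 buckets.
fn is_selected(name: &str, percent: u64) -> bool {
    fnv1a(name) % 100 < percent
}

/// Hypothetical helper: the same names always yield the same subset,
/// regardless of their order in the input.
fn select_subset<'a>(names: &[&'a str], percent: u64) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| is_selected(name, percent))
        .collect()
}
```

Because membership depends only on the intrinsic's name, a failure seen in CI can be reproduced locally by running with the same percentage.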

@madhav-madhusoodanan there is already a version of the x86-intel.xml file in crates/stdarch-verify, you can just move that file to intrinsics-data (and change the path in the crates/stdarch-verify/tests/x86-intel.rs)

Did this happen? It still looks like the file is added again here.

@madhav-madhusoodanan
Contributor Author

Ohh no, let me do that too.

@madhav-madhusoodanan
Contributor Author

About failures being hard to reproduce: yes, you have a point.
Let's do a naive partitioning in that case.
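A minimal sketch of what a naive partitioning could look like, assuming the intrinsics are held in a slice and each CI job is given a partition index. The function name and signature are hypothetical and are not taken from this PR.

```rust
/// Hypothetical helper: split `items` into `parts` contiguous, near-equal
/// chunks and return the chunk at `index` (zero-based).
fn partition<T>(items: &[T], parts: usize, index: usize) -> &[T] {
    assert!(parts > 0 && index < parts, "index must be in 0..parts");
    // Round up so that `parts` chunks always cover every item.
    let chunk = items.len().div_ceil(parts);
    // Clamp both ends so trailing partitions may be short or empty.
    let start = (index * chunk).min(items.len());
    let end = (start + chunk).min(items.len());
    &items[start..end]
}
```

The partitions only stay stable between runs if the input order is stable, so sorting the intrinsics by name before splitting would be a sensible precaution.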

to char 2. including variable names in template strings instead of
passing them as arguments to macros
@madhav-madhusoodanan
Contributor Author

I'll need to rename the variables in arm/config.rs and x86/config.rs. I'll do that.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 524bc52 to 6df5eca Compare October 16, 2025 07:01
@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Oct 16, 2025

Okay, so I've split the intrinsic-test run into separate CI jobs.
The x86 intrinsic-test job is the longest of them and takes 10 minutes.

The longest of all the CI jobs is now the x86 Test job, which takes 12 minutes.

Contributor

@folkertdev folkertdev left a comment

I think this looks good, but the CI changes are quite large, so could you split those out into their own PR (and then just add the x86 stuff here)?

ci/run.sh Outdated
Comment on lines 94 to 95
PATH="$PATH":"$(pwd)"/c_programs
export PATH
Contributor

Why is this needed? It really should not be, I think.

Contributor Author

Okay, so it turns out that for aarch64, qemu's executable resolution algorithm looks into the current directory too.
However, sde64 looks only at PATH.

This was intended to fix that discrepancy, but now that I think about it, it might be cleaner to use ./intrinsic-test-programs in compare_outputs.rs.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 6df5eca to 1abb103 Compare October 17, 2025 16:57
Comment on lines +41 to +47
    uint16_t temp = 0;
    memcpy(&temp, &value, sizeof(float16_t));
    std::stringstream ss;
    ss << "0x" << std::setfill('0') << std::setw(4) << std::hex << temp;
    os << ss.str();
    return os;
}
Contributor Author

@folkertdev do we need stringstream here? Seems to me that we can make this definition shorter.

Contributor

My C++ is not good enough to know for sure. If there is something shorter that works, I'm all for it.

}
}

pub fn chunk_info(intrinsic_count: usize) -> (usize, usize) {
Contributor Author

I'll need to remove this, since we aren't using it.
The CI processes on x86 broke because of RAM constraints.

@madhav-madhusoodanan
Contributor Author

Okay, so I've reverted that last commit (the one that split the CI job) and the PATH update.
I'll make that PR as soon as we merge this one.

@folkertdev
Contributor

Maybe this was not clear, but my intention was to merge the CI changes separately first, then rebase this PR on top.

@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Oct 18, 2025

Ohh ohh, my bad.
Sure let me do that.

Created a PR for that: #1941

@madhav-madhusoodanan
Contributor Author

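Turns out std::hex is sticky: once applied, it stays in effect on that stream for all later output.
So using a separate stringstream actually makes sense, since it keeps the formatting state of the caller's stream unchanged.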
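For comparison, Rust's formatting flags apply to a single argument and leave no state behind, so a Rust-side printer needs no temporary buffer. This is a sketch only, assuming the Rust side is meant to print the raw 16 bits in the same layout as the C++ operator above (`0x` followed by four hex digits); the helper name is hypothetical.

```rust
/// Hypothetical helper: format the raw bits of a 16-bit float as `0x`
/// followed by four zero-padded, lowercase hex digits.
fn format_f16_bits(bits: u16) -> String {
    // `#` adds the `0x` prefix, `0` pads with zeros, and the width of 6
    // counts the two prefix characters plus four digits.
    format!("{bits:#06x}")
}
```

A later `format!("{}", 255)` still prints in decimal, which is the behaviour the stringstream emulates on the C++ side.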

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 212e16d to 1abb103 Compare October 18, 2025 07:13