Skip to content

Conversation

erifan
Copy link
Contributor

@erifan erifan commented Sep 5, 2025

Intrinsic support for VectorMask.laneIsSet with a variable input index was introduced in PR #14200, but was inadvertently broken by PR #25673. This PR restores the intrinsic functionality and adds some JTReg tests.

Benchmarks on Nvidia Grace machine with 128-bit SVE:

Benchmark			            Unit	Before		Score Error	After		Score Error	Uplift
microMaskLaneIsSetByte128_var	ops/ms	21702.14415	91.902159	103472.9391	36.057447	4.767867
microMaskLaneIsSetByte64_var	ops/ms	21468.51868	107.94177	103365.6561	69.47736	4.814754
microMaskLaneIsSetDouble128_var	ops/ms	77489.32791	153.242699	413499.4127	311.854079	5.336211
microMaskLaneIsSetFloat128_var	ops/ms	41034.95204	399.421823	206840.0988	74.702234	5.040583
microMaskLaneIsSetFloat64_var	ops/ms	77607.40268	175.938921	413745.3001	149.716794	5.33126
microMaskLaneIsSetInt128_var	ops/ms	41452.48893	76.143208	206845.9754	59.371129	4.989953
microMaskLaneIsSetInt64_var	    ops/ms	77726.2542	173.180518	413427.8838	363.575023	5.319024
microMaskLaneIsSetLong128_var	ops/ms	77646.11218	177.496587	413403.4404	236.609314	5.3242
microMaskLaneIsSetShort128_var	ops/ms	21374.93265	48.13101	103417.4618	34.827021	4.838259
microMaskLaneIsSetShort64_var	ops/ms	41066.19395	353.320621	206801.109	106.408938	5.035799

Benchmarks on Intel 6444y machine with 512-bit avx3:

Benchmark			            Unit	Before		Score Error	After		Score Error	Uplift
microMaskLaneIsSetByte128_var	ops/ms	57658.45497	240.209309	211643.8406	29.214532	3.670647
microMaskLaneIsSetByte256_var	ops/ms	57451.68169	116.994128	211609.4652	160.48513	3.683259
microMaskLaneIsSetByte512_var	ops/ms	57530.22411	311.63868	199802.8084	408.144015	3.473005
microMaskLaneIsSetByte64_var	ops/ms	57642.2672	161.406221	205252.4464	196.86852	3.560797
microMaskLaneIsSetDouble256_var	ops/ms	114401.3789	231.797375	361400.344	565.593984	3.159055
microMaskLaneIsSetDouble512_var	ops/ms	57379.27882	159.699503	211476.1138	136.980026	3.685583
microMaskLaneIsSetFloat128_var	ops/ms	113943.9512	141.062663	360855.3915	494.471996	3.166955
microMaskLaneIsSetFloat256_var	ops/ms	57682.78182	138.142053	211659.5098	30.167972	3.66937
microMaskLaneIsSetFloat512_var	ops/ms	57617.66405	301.748599	211246.8588	597.18949	3.666355
microMaskLaneIsSetInt128_var	ops/ms	113914.5062	118.681382	360856.4465	555.097397	3.167783
microMaskLaneIsSetInt256_var	ops/ms	57681.79883	112.391639	211555.6742	217.556981	3.667633
microMaskLaneIsSetInt512_var	ops/ms	57350.20346	206.146723	211657.7207	68.461571	3.690618
microMaskLaneIsSetLong256_var	ops/ms	113838.3822	415.784529	360782.0645	710.076899	3.169247
microMaskLaneIsSetLong512_var	ops/ms	57314.02695	190.1762	211690.8492	26.47233	3.693526
microMaskLaneIsSetShort128_var	ops/ms	57675.58965	65.940976	211549.9551	276.57545	3.667928
microMaskLaneIsSetShort256_var	ops/ms	57628.8642	91.957833	211694.0864	16.559412	3.673403
microMaskLaneIsSetShort512_var	ops/ms	57845.35211	160.537421	211358.872	660.777147	3.65386
microMaskLaneIsSetShort64_var	ops/ms	113848.8846	222.787418	360294.6295	491.425656	3.164674

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8366588: VectorAPI: Re-intrinsify VectorMask.laneIsSet where the input index is a variable (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27113/head:pull/27113
$ git checkout pull/27113

Update a local copy of the PR:
$ git checkout pull/27113
$ git pull https://git.openjdk.org/jdk.git pull/27113/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27113

View PR using the GUI difftool:
$ git pr show -t 27113

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27113.diff

Using Webrev

Link to Webrev Comment

…t index is a variable

Intrinsic support for `VectorMask.laneIsSet` with a **variable** input
index was introduced in PR openjdk#14200, but was inadvertently broken by PR openjdk#25673.
This PR restores the intrinsic functionality and adds some JTReg tests.

Benchmarks on Nvidia Grace machine with 128-bit SVE:
```
Benchmark			Unit	Before		Score Error	After		Score Error	Uplift
microMaskLaneIsSetByte128_var	ops/ms	21702.14415	91.902159	103472.9391	36.057447	4.767867
microMaskLaneIsSetByte64_var	ops/ms	21468.51868	107.94177	103365.6561	69.47736	4.814754
microMaskLaneIsSetDouble128_var	ops/ms	77489.32791	153.242699	413499.4127	311.854079	5.336211
microMaskLaneIsSetFloat128_var	ops/ms	41034.95204	399.421823	206840.0988	74.702234	5.040583
microMaskLaneIsSetFloat64_var	ops/ms	77607.40268	175.938921	413745.3001	149.716794	5.33126
microMaskLaneIsSetInt128_var	ops/ms	41452.48893	76.143208	206845.9754	59.371129	4.989953
microMaskLaneIsSetInt64_var	ops/ms	77726.2542	173.180518	413427.8838	363.575023	5.319024
microMaskLaneIsSetLong128_var	ops/ms	77646.11218	177.496587	413403.4404	236.609314	5.3242
microMaskLaneIsSetShort128_var	ops/ms	21374.93265	48.13101	103417.4618	34.827021	4.838259
microMaskLaneIsSetShort64_var	ops/ms	41066.19395	353.320621	206801.109	106.408938	5.035799
```

Benchmarks on Intel 6444y machine with 512-bit avx3:
```
Benchmark			Unit	Before		Score Error	After		Score Error	Uplift
microMaskLaneIsSetByte128_var	ops/ms	57658.45497	240.209309	211643.8406	29.214532	3.670647
microMaskLaneIsSetByte256_var	ops/ms	57451.68169	116.994128	211609.4652	160.48513	3.683259
microMaskLaneIsSetByte512_var	ops/ms	57530.22411	311.63868	199802.8084	408.144015	3.473005
microMaskLaneIsSetByte64_var	ops/ms	57642.2672	161.406221	205252.4464	196.86852	3.560797
microMaskLaneIsSetDouble256_var	ops/ms	114401.3789	231.797375	361400.344	565.593984	3.159055
microMaskLaneIsSetDouble512_var	ops/ms	57379.27882	159.699503	211476.1138	136.980026	3.685583
microMaskLaneIsSetFloat128_var	ops/ms	113943.9512	141.062663	360855.3915	494.471996	3.166955
microMaskLaneIsSetFloat256_var	ops/ms	57682.78182	138.142053	211659.5098	30.167972	3.66937
microMaskLaneIsSetFloat512_var	ops/ms	57617.66405	301.748599	211246.8588	597.18949	3.666355
microMaskLaneIsSetInt128_var	ops/ms	113914.5062	118.681382	360856.4465	555.097397	3.167783
microMaskLaneIsSetInt256_var	ops/ms	57681.79883	112.391639	211555.6742	217.556981	3.667633
microMaskLaneIsSetInt512_var	ops/ms	57350.20346	206.146723	211657.7207	68.461571	3.690618
microMaskLaneIsSetLong256_var	ops/ms	113838.3822	415.784529	360782.0645	710.076899	3.169247
microMaskLaneIsSetLong512_var	ops/ms	57314.02695	190.1762	211690.8492	26.47233	3.693526
microMaskLaneIsSetShort128_var	ops/ms	57675.58965	65.940976	211549.9551	276.57545	3.667928
microMaskLaneIsSetShort256_var	ops/ms	57628.8642	91.957833	211694.0864	16.559412	3.673403
microMaskLaneIsSetShort512_var	ops/ms	57845.35211	160.537421	211358.872	660.777147	3.65386
microMaskLaneIsSetShort64_var	ops/ms	113848.8846	222.787418	360294.6295	491.425656	3.164674
```
@bridgekeeper
Copy link

bridgekeeper bot commented Sep 5, 2025

👋 Welcome back erifan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Sep 5, 2025

@erifan This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8366588: VectorAPI: Re-intrinsify VectorMask.laneIsSet where the input index is a variable

Reviewed-by: shade, xgong, epeter

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 101 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@shipilev, @XiaohongGong, @eme64) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Sep 5, 2025

@erifan The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Sep 5, 2025

Webrevs

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. I took another look at JDK-8358749, and I think this is the only place where we can really accept the non-constant input. In all other cases, we either pull is_con() or const_oop() out of the input.

I think we will bikeshed about the tests a bit.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 5, 2025
@eme64
Copy link
Contributor

eme64 commented Sep 9, 2025

@erifan This is a regression / bug fix for #25673, right? If so, please convert the JBS issue into a bug.

@eme64
Copy link
Contributor

eme64 commented Sep 9, 2025

The patch looks reasonable, thanks for fixing this and writing an IR test!
I'm launching some internal testing now, should hopefully not take much more than 24h.

@XiaohongGong
Copy link

@erifan This is a regression / bug fix for #25673, right? If so, please convert the JBS issue into a bug.

Thanks for your review! I'v changed the JBS type to bug.

@erifan
Copy link
Contributor Author

erifan commented Sep 9, 2025

Thanks for your help @eme64 @XiaohongGong @shipilev

Copy link

@XiaohongGong XiaohongGong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@eme64 eme64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch looks good to me, testing passed :)

Thanks for working on this @erifan !

@erifan
Copy link
Contributor Author

erifan commented Sep 10, 2025

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Sep 10, 2025
@openjdk
Copy link

openjdk bot commented Sep 10, 2025

@erifan
Your change (at version a672dd2) is now ready to be sponsored by a Committer.

@XiaohongGong
Copy link

/sponsor

@openjdk
Copy link

openjdk bot commented Sep 10, 2025

Going to push as commit 53b3e05.
Since your change was applied there have been 101 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 10, 2025
@openjdk openjdk bot closed this Sep 10, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Sep 10, 2025
@openjdk
Copy link

openjdk bot commented Sep 10, 2025

@XiaohongGong @erifan Pushed as commit 53b3e05.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@erifan erifan deleted the JDK-8366588-laneIsSet branch September 10, 2025 02:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants