Vector Permutation Design Document #251
base: master
core/InstArchInfo.hpp
Outdated
COMPRESS,
WHOLE_REG_MOVE,
UNKNOWN,
NONE
`NONE` should be before `UNKNOWN`.
Modified.
@govardhnn your design doc is missing the critical information that I'm looking for, which is "what uops are generated for each type of permutation instruction?". It would be great to include a simple example of uop generation for each type of uop generator that you plan to add.
From today's meeting (2nd June, 2025), the following is the link to
my slides with the initial block diagram for the vector slide instructions.
Link: Olympia: Vector Permutation Design Proposal
<https://docs.google.com/presentation/d/1JPNQCGP9xFT4H0yEiLLE2OtRa35D_gz6fdhwKsWmmy0/edit?usp=sharing>
The reference uArch for the other instructions, `vcompress` and `vrgather`, that I presented today is linked below:
Efficient Implementation of RISC-V Vector Permutation Instructions, arXiv:2505.07112 <https://arxiv.org/abs/2505.07112>
It will be cited in `docs/vector_permutation.adoc`.
The vector permutation design document PR#251 will also soon be extended
based on today's review feedback.
Thanks,
Govardhan
…On Mon, May 19, 2025 at 9:47 PM Kathlene Magnus ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In arches/isa_json/olympia_uarch_rv64v.json:
> @@ -104,7 +104,7 @@
{
"mnemonic": "vcompress.vm",
"pipe": "vpermute",
- "uop_gen": "PERMUTE",
+ "uop_gen": "COMPRESS",
This file is generated by gen_uarch_rv64v_json.py so it shouldn't be
updated directly. You can modify the Python script and then run it to
generate this file.
--
Sai Govardhan
InCore Semiconductors
Hey @govardhnn, where are you with this PR? Specifically, did you address @kathlenemagnus's requests? Also, can you get regression to pass again?
Hi @klingaard and @kathlenemagnus-mips, I have been on a personal break for a while. I should have informed the team earlier, apologies. Thanks,
Hey @govardhnn, yes, please feel free to take the time you need. We do appreciate the contributions that folks make to the model.
Preemptive acceptance.
Hi @klingaard, I will have to discontinue this submission since I recently joined a startup in stealth. I will be unable to contribute to open source for some time, and intend to come back in full force once I have the permissions to do so. I would be happy if any other volunteers would like to build on this vector permutation proposal. Thanks,
Understood and we do appreciate the contributions you've made! We can take it from here. Best of luck in your endeavors.
Thanks @klingaard!
&VectorUopGenerator::generateSlideUops_<InstArchInfo::UopGenType::SLIDE1DOWN>);

// Vector permute uop generator
// Vector general slide uop generators
It would be great if you could add some examples for the new uop generators.
}

InstPtr VectorUopGenerator::generatePermuteUops_()
template <InstArchInfo::UopGenType Type>
Not necessary to have a template parameter if this method is only valid for `InstArchInfo::UopGenType::SCALAR_MOVE` (according to the static assert on line 512).
InstPtr VectorUopGenerator::generatePermuteUops_()
template <InstArchInfo::UopGenType Type>
InstPtr VectorUopGenerator::generateScalarMoveUops_()
I'm not sure I understand what this new generator type provides that isn't already supported. Right now, scalar moves are not sequenced at all, so no uops are generated and we just send the parent inst down the pipe. If there is a need to generate a uop, we could use the `ELEMENTWISE` generator and set `num_uops_to_generate_` to 1.
return makeInst_(srcs, dests);
}

template <InstArchInfo::UopGenType Type> InstPtr VectorUopGenerator::generateSlideGeneralUops_()
Again, this generator looks like it's doing the same thing as `ELEMENTWISE`. Not sure why it is needed.
@govardhnn This work looks incomplete to me. I would expect this PR to contain modifications to Execute since the uops generated cannot be executed independently of each other. For example, with your vrgather example in your doc:
It's possible that the indexes specified in v4 will need to gather elements from multiple vs2 registers (v8-v11) to write the result for v20. UOP 1 on its own will not read the right source registers to be able to write the correct value to v20. The result is that v20 will be marked ready earlier than is functionally possible. I see a similar issue with some of the vslide instructions.
Did you have more work planned for this project that you did not get to? If so, please document it so someone else can continue this work.
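To illustrate the dependency concern for `vrgather.vv`, here is a minimal sketch. It is not Olympia code: `gatherSourceRegs` and its parameters are hypothetical names introduced only for this example. It computes which vs2 registers a single destination register actually reads, given the indices held in the matching vs1 register. It assumes a register group starting at `vs2_base` and follows the RVV rule that an index at or beyond VLMAX produces zero and reads no source element.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not part of Olympia): returns the vs2 register numbers
// that one destination register of a vrgather.vv needs to read.
//   vs2_base      - first register of the vs2 group (e.g. 8 for v8-v11)
//   lmul          - number of registers in the group
//   elems_per_reg - VLEN / SEW
//   indices       - index values held in the vs1 register paired with this dest
std::set<uint32_t> gatherSourceRegs(uint32_t vs2_base,
                                    uint32_t lmul,
                                    uint32_t elems_per_reg,
                                    const std::vector<uint64_t>& indices)
{
    assert(elems_per_reg > 0);
    const uint64_t vlmax = static_cast<uint64_t>(lmul) * elems_per_reg;
    std::set<uint32_t> regs;
    for (uint64_t idx : indices) {
        if (idx >= vlmax) {
            continue; // out-of-range index writes 0 and reads no source element
        }
        regs.insert(vs2_base + static_cast<uint32_t>(idx / elems_per_reg));
    }
    return regs;
}
```

With 4 elements per register and LMUL=4, the indices {0, 5, 10, 15} make a single destination register depend on all of v8, v9, v10 and v11, so a uop that reads only one vs2 register cannot produce the correct result on its own.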
|General slides |`SLIDEUP`/`SLIDEDOWN` |VPERMUTE |4 UOPs
|Slide1 operations |`SLIDE1UP`/`SLIDE1DOWN` |VINT/VFLOAT |4 UOPs
|Register gather |`RGATHER` |VPERMUTE |4 UOPs
|Vector compress |`COMPRESS` |VPERMUTE |1 (always)
Vector compress always has 1 dest but could have up to 8 vector sources. Is that how you planned to support this instruction type?
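One way to see why a single `vcompress.vm` destination register can need many sources is the sketch below. It is illustrative only and not Olympia code: `compressSourceRegs` is a hypothetical helper. It assumes the mask covers the whole register group and reports register offsets within the group rather than architectural register numbers. Active elements are packed contiguously, so the k-th active element lands in destination register `k / elems_per_reg` regardless of which source register it came from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not part of Olympia): for each destination register in
// the group, collect the source register offsets (0..lmul-1) whose elements
// are packed into it by vcompress.vm.
std::vector<std::set<uint32_t>> compressSourceRegs(uint32_t elems_per_reg,
                                                   uint32_t lmul,
                                                   const std::vector<bool>& mask)
{
    assert(elems_per_reg > 0);
    assert(mask.size() == static_cast<std::size_t>(lmul) * elems_per_reg);
    std::vector<std::set<uint32_t>> srcs(lmul);
    uint32_t packed = 0; // number of active elements seen so far
    for (uint32_t i = 0; i < mask.size(); ++i) {
        if (!mask[i]) {
            continue; // inactive elements are dropped
        }
        srcs[packed / elems_per_reg].insert(i / elems_per_reg);
        ++packed;
    }
    return srcs;
}
```

With 8 elements per register, LMUL=8, and exactly one active element in each source register, the first destination register collects elements from all 8 source registers, which matches the "1 dest, up to 8 sources" case raised above.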