Skip to content

Conversation

hpkfft
Copy link

@hpkfft hpkfft commented Jul 19, 2024

Intrinsics such as _mm512_add_round_ps take a rounding mode argument to specify the floating point rounding mode. This, and similar instructions, do NOT round their result to an integer. Thus it is inappropriate for user code to specify the existing _MM_FROUND_TO_NEAREST_INT when desiring to round to the nearest floating point number. This commit adds a suitable macro definition.

Intrinsics such as `_mm512_add_round_ps` take a rounding mode argument to specify the floating point rounding mode.  This, and similar instructions, do NOT round their result to an integer.   Thus it is inappropriate for user code to specify the existing `_MM_FROUND_TO_NEAREST_INT` when desiring to round to the nearest floating point number.  This commit adds a suitable macro definition.
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Jul 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-x86

Author: hpkfft.com (hpkfft)

Changes

Intrinsics such as _mm512_add_round_ps take a rounding mode argument to specify the floating point rounding mode. This, and similar instructions, do NOT round their result to an integer. Thus it is inappropriate for user code to specify the existing _MM_FROUND_TO_NEAREST_INT when desiring to round to the nearest floating point number. This commit adds a suitable macro definition.


Full diff: https://github.com/llvm/llvm-project/pull/99691.diff

1 Files Affected:

  • (modified) clang/lib/Headers/avx512fintrin.h (+1)
diff --git a/clang/lib/Headers/avx512fintrin.h b/clang/lib/Headers/avx512fintrin.h
index 4f172c74b31cb..2d7b5c534e554 100644
--- a/clang/lib/Headers/avx512fintrin.h
+++ b/clang/lib/Headers/avx512fintrin.h
@@ -43,6 +43,7 @@ typedef unsigned short __mmask16;
 
 /* Rounding mode macros.  */
 #define _MM_FROUND_TO_NEAREST_INT   0x00
+#define _MM_FROUND_TIES_TO_EVEN     0x00
 #define _MM_FROUND_TO_NEG_INF       0x01
 #define _MM_FROUND_TO_POS_INF       0x02
 #define _MM_FROUND_TO_ZERO          0x03

@hpkfft
Copy link
Author

hpkfft commented Jul 19, 2024

Note that IEEE Std 754-2019 for floating-point arithmetic specifies two rounding direction attributes to nearest: roundTiesToEven and roundTiesToAway. Also, it specifies three directed rounding attributes: roundTowardPositive, roundTowardNegative, and roundTowardZero.

This patch follows that naming precedent by assuming round-to-nearest is understood. If there is a consensus to use a longer macro name (such as _MM_FROUND_TO_NEAREST_TIES_EVEN or similar) that is fine too.

@phoebewang
Copy link
Contributor

What's the ties mean here? The SDM uses Round to nearest (even). As for target intrinsic, I think we sholuld be close to target document. Also, will GCC accept the name? We need compatibility with GCC, or even MSVC.

@hpkfft
Copy link
Author

hpkfft commented Jul 20, 2024

I understand that the details of floating point arithmetic are somewhat of a specialized area of computer science. I suggest that this obligates a greater duty to show care and diligence, and so I thank you for not dismissing my concerns as too trivial to be worth discussing.

Rounding is necessary when the infinitely precise mathematical result of a floating point computation is not representable as a floating point number in the specified format (e.g., single precision). Let's ignore directed rounding (roundTowardPositive, roundTowardNegative, and roundTowardZero) for this comment. Often the mathematical result is closer to one particular representable floating point number than it is to any other. Then, this result (the nearest floating point number) is returned as the result. However, it may happen that the infinitely precise mathematical result is exactly half-way between two representable values. This is a tie. In this case, the IEEE Std 754-2019 rounding attribute roundTiesToEven specifies that, of the two equidistant representable results, the result returned should have a bit pattern ending in 0.

Hardware correctly implements the IEEE rounding attribute roundTiesToEven, so it seems best not to innovate in naming the macro. I would suggest that the Intel SDM phrasing Round to nearest (even) is less than optimal. A reader might be forgiven for mistakenly thinking that hardware always rounds to an even result. In a macro definition, there is merit in brevity, but not so much in a developer manual. I suggest Round to nearest with ties rounding to even.

Note that some few instructions, e.g., ROUNDPS, do round to an integer, so the existing macro definition _MM_FROUND_TO_NEAREST_INT ought not be deprecated.

I have written to the GCC mailing list to inquire as to whether they will accept this additional macro definition.

Regards,
Paul

@phoebewang
Copy link
Contributor

Thanks @hpkfft for the thorough explanation. I'm fine with either _MM_FROUND_TIES_TO_EVEN or _MM_FROUND_TO_NEAREST_TIES_EVEN. Let's see GCC community preference.

@hpkfft hpkfft changed the title Add _MM_FROUND_TIES_TO_EVEN to avx512fintrin.h Add _MM_FROUND_TO_NEAREST_TIES_EVEN to avx512fintrin.h Jul 22, 2024
Copy link
Contributor

@KanRobert KanRobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the tests for m512_maskz_cvt_roundepu64_ps e.t.c (in avx512dq-builtins.c )?

@topperc
Copy link
Collaborator

topperc commented Jul 23, 2024

I happened to find this document from Intel https://www.intel.com/content/www/us/en/content-details/669773/intel-avx-512-fp16-instruction-set-for-intel-xeon-processor-based-products-technology-guide.html that contains this text

"The C intrinsic constant selector name _MM_FROUND_TO_NEAREST_INT is not ideal, but that name has been historically used for so long in all common compilers that it is difficult to change to something more meaningful."

@hpkfft
Copy link
Author

hpkfft commented Jul 23, 2024

I disagree, of course, that it's difficult to change. One need only merge this PR. I don't see a reason not to do so.
I suggest that this rounding control is rarely used, and then only by floating point experts. That implies that this PR has limited value to the greater world, but also that the relevant user code is small enough to change if developers wish to opt-in. ROUND_TO_NEAREST_INT is wrong for most floating point instructions. Why not show a sense of pride and craftsmanship in LLVM? I do in my FFT code....

Yes, I plan to update the tests if this PR is merged. I'll make that a separate PR. (It looks to be about 100 lines in the GCC code base and fewer here. I consider that an easy change, but there's no point in my doing the work if this PR is rejected.)

@robincharlesgilbert
Copy link

Former dev for the math libraries of the Intel compiler here.

It is true that the name _MM_FROUND_TO_NEAREST_INT is quite old. For instance (since I see an Intel document quoted here for which I may or may not have been one of the authors, and that "all common compilers" is typical codeword for MSVC or GCC), this name has been sleeping in GCC inside config/i386/smmintrin.h for 17 years now --- a file, like many before and since, that was added by an Intel colleague, H.J. Lu, in GCC. However, even back then, this was seen as a problematic name in the microscopic world of computational arithmetic (which is called Numerics at Intel), since, indeed, it does not round to an integer. Given that's the default rounding mode, that name did not leave the greatest impression. At least it's _MM_ROUND_NEAREST in xmmintrin.h, but that was a few years earlier.

This commit does not change the existing name, it only adds a much less confusing one while respecting the nomenclature pattern. It will give the name that should have been, and it will help remove some confusion about what round to nearest actually is.

@hpkfft
Copy link
Author

hpkfft commented Jul 23, 2024

Note that the following in xmmintrin.h

#define _MM_ROUND_NEAREST     (0x0000U)
#define _MM_ROUND_DOWN        (0x2000U)
#define _MM_ROUND_UP          (0x4000U)
#define _MM_ROUND_TOWARD_ZERO (0x6000U)
#define _MM_ROUND_MASK        (0x6000U)

sets different bits than the _MM_FROUND_TO_* naming scheme which is the subject of this patch.
The former applies to _mm_getcsr(); this patch applies to instruction intrinsics (_mm512_add_round_ps, etc.)

@topperc
Copy link
Collaborator

topperc commented Jul 24, 2024

How can we get MSVC to add the new name too?

@robincharlesgilbert
Copy link

How can we get MSVC to add the new name too?

Regarding the Intel compiler, which is (unlike the LLVM project but like MSVC) proprietary, all such modifications are (by far) prompted by customer requests. These requests are communicated either directly to the engineers (for example, the Argonne National Laboratory makes all sorts of requests to MKL directly) or indirectly via tickets opened by the team who handles customers. Assuming that the process is similar at Microsoft, then you would need a (significant) customer of MSVC to contact them regarding adding the name.

Now, it is possible that features found in open-source projects end up being ported in MSVC. All it needs is for a (significant) customer of MSVC to request it, because for some reason this is something they saw in GCC/LLVM/etc and they want it in MSVC. But before that happens, it needs a precedent; if the name is never put anywhere, then nobody will ever request it in MSVC.

@topperc
Copy link
Collaborator

topperc commented Jul 24, 2024

Regarding the Intel compiler, which is (unlike the LLVM project but like MSVC) proprietary

Aren't the latest versions of Intel's compiler, icx, based on LLVM?

@robincharlesgilbert
Copy link

Aren't the latest versions of Intel's compiler, icx, based on LLVM?

Correct. However, the libraries (e.g. math libraries) were all ported from icl to icx. Getting the math libraries of icl to compile with icx was among the last things I did for the company.

@RKSimon
Copy link
Collaborator

RKSimon commented Jul 25, 2024

GCC thread (for reference): https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657957.html

@hpkfft
Copy link
Author

hpkfft commented Jul 26, 2024

I see that both phoebewang and KanRobert work for Intel, so Intel's LLVM compiler icx is sure to get this patch.
Intel itself is both a customer and partner of Microsoft, so MSVC ought not be a problem. After all, it is in both Intel's and Microsoft's interest to delight their developer community by supplying a macro that is suitably named for what it does.
But it needs to start here (to demonstrate some measure of community review and support).

Regarding the use of NEAREST_INT for instructions that do not round to integer, I am reminded of the line spoken by Iñigo Montoya:

You keep using that word. I do not think it means what you think it means.

Yes, it will take some time for this change to percolate though to other code bases and documents. But as in the proverb:

Society grows great when old men plant trees in whose shade they know they shall never sit.

@phoebewang
Copy link
Contributor

I'm good with adding the macro. But how it propagates to other compilers would be an chicken or the egg problem. User won't use this flag if they care the portability, while other compilers may be reluctant to add it if no requiest from their user.

@robincharlesgilbert
Copy link

To be fair, regarding MSVC, it might be more than just a chicken and the egg problem. It is sometimes in the best interests of Microsoft to make it as painful as possible for their customers to port their code onto other platforms, or to use standard libraries which are not implemented by Microsoft. On the other hand, when it comes to the rounding modes (for other functions than the intrinsics of this PR), MSVC has non-standard functions found nowhere else (e.g. _controlfp) which are using more objectionable names (_RC_CHOP for round toward zero... instead of a more natural name like _RC_ZERO).

Outside the Microsoft ecosystem, code portability is usually less of an issue. There is also less of a need to twist the hand of a patron to get the owner of a proprietary compiler to fix something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants