Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROO-70] [mlir] [arith] emulate wide int #3

Conversation

egebeysel
Copy link

@egebeysel egebeysel commented Jan 27, 2025

Well, emulation pass for arith.fptoui and arith.fptosi.

The basic algorithm looks like this:

const double TWO_POW_BW = (uint_64_t(1) << bitwidth); // 2^BW

// f is a floating-point value representing the 64-bit number.
uint32_t hi = (uint32_t)(f / TWO_POW_BW);         // Truncates the division result.
uint32_t lo = (uint32_t)(f - hi * TWO_POW_BW);       // Subtracts to get the lower bits.

f - hi * TWO_POW_BW is emitted via arith.remf.

arith.fptosi emits fptoui with the absolute value of the input fp. It also does a bounds check and emits MAX_SIGNED_INT or MIN_SIGNED_INT w.r.t. the sign of the input.

I added the runner tests, but here are some remarks:

According to LLVM LangRef https://llvm.org/docs/LangRef.html#fptoui-to-instruction "If the value cannot fit in target int type, the result is a poison value.". So basically, the +-inf and overflow values - but they result in the same poison value in the case of fptoui when we run it with and without emulation (see the integration test). Also, NaN values result in different results with fptosi and fptoui, namely -2^63(INT64_MIN) and -1 respectively. I'm not entirely sure if this is UB/poison or not.

Lastly, numbers that are representable with unsigned integers but not with signed ones (>=2^63 in the case of int64) also result in poison-looking numbers: fptosi(2^63) emits -2^63 with the mlir-cpu-runner without the emulation, which seems vague to be honest.

https://llvm.org/docs/LangRef.html#behavior-of-floating-point-nan-values also does not say much about the behavior of NaN bitcast/conversion ops.

https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic gives saturating semantics but default lowering does not result in these. I guess not doing anything and leaving it to the specific target/lowering might emit these? Not sure...

So if I actually have to handle the cases of abs(fp) >= MAX_SINT in arith.fptosi and overflows in arith.fptosi are unclear. I guess the best course of action here would be that I post the PR upstream and ask for reviews over there, after your rounds of review.

Copy link
Author

egebeysel commented Jan 27, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@egebeysel egebeysel changed the title Emulates arith.fptoui for wide integers. vector<*xi2N> is emulated as vector<*x2xiN>. scalars are also emulated with vector<2xiN>. [ROO-70] [mlir] [arith] emulate wide int Jan 27, 2025
@egebeysel egebeysel marked this pull request as ready for review January 27, 2025 14:38
@egebeysel egebeysel marked this pull request as draft January 27, 2025 14:38
@egebeysel egebeysel marked this pull request as ready for review January 27, 2025 14:38
@egebeysel egebeysel marked this pull request as draft January 27, 2025 14:39
@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch 3 times, most recently from cc383a2 to 06d0eaf Compare February 3, 2025 16:54
@egebeysel egebeysel marked this pull request as ready for review February 4, 2025 09:40
Copy link

Can you create LIT tests? Will be a lot easier to review with them

@egebeysel
Copy link
Author

yes, sorry, was going to do it the next time I'm working

chrsmcgrr pushed a commit that referenced this pull request Feb 5, 2025
llvm#123877)

Reverts llvm#122811 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio

ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    #2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    #3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    #4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    #5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    llvm#6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    llvm#7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    llvm#8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    llvm#9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    llvm#10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    llvm#11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    llvm#12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    llvm#13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    llvm#14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    llvm#15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    llvm#16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    llvm#17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    llvm#18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 06d0eaf to a44d373 Compare February 7, 2025 21:49
Copy link
Author

Can you create LIT tests? Will be a lot easier to review with them

I just pushed the LIT Tests for the fptoui in scalar and vector forms and fptosi in scalar form. I'll also push the vector form of the latter when I have time. Also, I tried keeping the tests as relevant to the actual operation and isolated from other emulations as possible, but am open to any suggestions over here.

Copy link
Author

I hope this makes it a little bit easier to review, although the tests themselves are also pretty crowded.

Copy link
Author

egebeysel commented Feb 7, 2025

Also, please write up the naming suggestions for the variables in code/tests if they're really not understandable :)

Copy link
Collaborator

@chrsmcgrr chrsmcgrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm going to try and review the code this week, but just want to submit my comment for also adding runner tests to make sure we are functionally correct at least with a simple CPU pipeline.

@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from a44d373 to 5e3ee1f Compare February 14, 2025 23:13
chrsmcgrr pushed a commit that referenced this pull request Feb 17, 2025
In contrast to SelectionDAG, GlobalISel created a new virtual register
for the return value of invariant.start, leaving subsequent users of the
invariant.start value with an undefined reference.
A minimal example:
```
  %tmp = alloca i32, align 4, addrspace(5)
  %tmpI = call ptr @llvm.invariant.start.p5(i64 4, ptr addrspace(5) %tmp) #3
  call void @llvm.invariant.end.p5(ptr %tmpI, i64 4, ptr addrspace(5) %tmp) #3
  store i32 %i, ptr %tmpI, align 4
```
Although the return value of invariant.start might not be intended for
any use beyond invariant.end (the fuzzer might not have created a
sensible situation here), an implicit definition of the corresponding
virtual register avoids a segfault in the target instruction selector
later.

This LLVM defect was identified via the AMD Fuzzing project.
chrsmcgrr pushed a commit that referenced this pull request Feb 17, 2025
We can't guaranty that underlying string is
0-terminated and [String.size()] is even in the
same allocation.


https://lab.llvm.org/buildbot/#/builders/94/builds/4152/steps/17/logs/stdio
```
==c-index-test==1846256==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0  in clang::cxstring::createRef(llvm::StringRef) llvm-project/clang/tools/libclang/CXString.cpp:96:36
    #1  in DumpCXCommentInternal llvm-project/clang/tools/c-index-test/c-index-test.c:521:39
    #2  in DumpCXCommentInternal llvm-project/clang/tools/c-index-test/c-index-test.c:674:7
    #3  in DumpCXCommentInternal llvm-project/clang/tools/c-index-test/c-index-test.c:674:7
    #4  in DumpCXComment llvm-project/clang/tools/c-index-test/c-index-test.c:685:3
    #5  in PrintCursorComments llvm-project/clang/tools/c-index-test/c-index-test.c:768:7

  Memory was marked as uninitialized
    #0  in __msan_allocated_memory llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1023:5
    #1  in Allocate llvm-project/llvm/include/llvm/Support/Allocator.h:172:7
    #2  in Allocate llvm-project/llvm/include/llvm/Support/Allocator.h:216:12
    #3  in Allocate llvm-project/llvm/include/llvm/Support/AllocatorBase.h:53:43
    #4  in Allocate<char> llvm-project/llvm/include/llvm/Support/AllocatorBase.h:76:29
    #5  in convertCodePointToUTF8 llvm-project/clang/lib/AST/CommentLexer.cpp:42:30
    llvm#6  in clang::comments::Lexer::resolveHTMLDecimalCharacterReference(llvm::StringRef) const llvm-project/clang/lib/AST/CommentLexer.cpp:76:10
    llvm#7  in clang::comments::Lexer::lexHTMLCharacterReference(clang::comments::Token&) llvm-project/clang/lib/AST/CommentLexer.cpp:615:16
    llvm#8  in consumeToken llvm-project/clang/include/clang/AST/CommentParser.h:62:9
    llvm#9  in clang::comments::Parser::parseParagraphOrBlockCommand() llvm-project/clang/lib/AST/CommentParser.cpp
    llvm#10 in clang::comments::Parser::parseFullComment() llvm-project/clang/lib/AST/CommentParser.cpp:925:22
    llvm#11 in clang::RawComment::parse(clang::ASTContext const&, clang::Preprocessor const*, clang::Decl const*) const llvm-project/clang/lib/AST/RawCommentList.cpp:221:12
    llvm#12 in clang::ASTContext::getCommentForDecl(clang::Decl const*, clang::Preprocessor const*) const llvm-project/clang/lib/AST/ASTContext.cpp:714:35
    llvm#13 in clang_Cursor_getParsedComment llvm-project/clang/tools/libclang/CXComment.cpp:36:35
    llvm#14 in PrintCursorComments llvm-project/clang/tools/c-index-test/c-index-test.c:756:25
 ```
chrsmcgrr pushed a commit that referenced this pull request Feb 17, 2025
Reverts llvm#125020


https://lab.llvm.org/buildbot/#/builders/24/builds/5252/steps/12/logs/stdio

```
==c-index-test==2512295==ERROR: AddressSanitizer: heap-use-after-free on address 0xe19338c27992 at pc 0xc66be4784830 bp 0xe0e33660df00 sp 0xe0e33660d6e8
READ of size 23 at 0xe19338c27992 thread T1
    #0 0xc66be478482c in printf_common(void*, char const*, std::__va_list) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors_format.inc:563:9
    #1 0xc66be478643c in vprintf /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:1699:1
    #2 0xc66be478643c in printf /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:1757:1
    #3 0xc66be4839384 in FilteredPrintingVisitor /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/c-index-test/c-index-test.c:1359:5
    #4 0xe4e3454f12e8 in clang::cxcursor::CursorVisitor::Visit(CXCursor, bool) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CIndex.cpp:227:11
    #5 0xe4e3454f48a8 in bool clang::cxcursor::CursorVisitor::visitPreprocessedEntities<clang::PreprocessingRecord::iterator>(clang::PreprocessingRecord::iterator, clang::PreprocessingRecord::iterator, clang::PreprocessingRecord&, clang::FileID) CIndex.cpp
    
0xe19338c27992 is located 82 bytes inside of 105-byte region [0xe19338c27940,0xe19338c279a9)
freed by thread T1 here:
    #0 0xc66be480040c in free /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:51:3
    #1 0xc66be4839728 in GetCursorSource c-index-test.c
    #2 0xc66be4839368 in FilteredPrintingVisitor /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/c-index-test/c-index-test.c:1360:12
    #3 0xe4e3454f12e8 in clang::cxcursor::CursorVisitor::Visit(CXCursor, bool) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CIndex.cpp:227:11
    #4 0xe4e3454f48a8 in bool clang::cxcursor::CursorVisitor::visitPreprocessedEntities<clang::PreprocessingRecord::iterator>(clang::PreprocessingRecord::iterator, clang::PreprocessingRecord::iterator, clang::PreprocessingRecord&, clang::FileID) CIndex.cpp


previously allocated by thread T1 here:
    #0 0xc66be4800680 in malloc /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3
    #1 0xe4e3456379b0 in safe_malloc /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xe4e3456379b0 in createDup /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CXString.cpp:95:40
    #3 0xe4e3456379b0 in clang::cxstring::createRef(llvm::StringRef) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CXString.cpp:90:10
```
chrsmcgrr pushed a commit that referenced this pull request Feb 17, 2025
…127087)

Fixes the following crash in clang-repl

```c++
clang-repl> try { throw 1; } catch { 0; }
In file included from <<< inputs >>>:1:
input_line_1:1:23: error: expected '('
    1 | try { throw 1; } catch { 0; }
      |                       ^
      |                       (
clang-repl: /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/AST/DeclBase.cpp:1757: void clang::DeclContext::addHiddenDecl(clang::Decl*): Assertion `D->getLexicalDeclContext() == this && "Decl inserted into wrong lexical context"' failed.
 #0 0x000059b28459e6da llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/llvm/lib/Support/Unix/Signals.inc:804:22
 #1 0x000059b28459eaed PrintStackTraceSignalHandler(void*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/llvm/lib/Support/Unix/Signals.inc:880:1
 #2 0x000059b28459bf7f llvm::sys::RunSignalHandlers() /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/llvm/lib/Support/Signals.cpp:105:20
 #3 0x000059b28459df8e SignalHandler(int, siginfo_t*, void*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/llvm/lib/Support/Unix/Signals.inc:418:13
 #4 0x000077cdf444ea50 (/usr/lib/libc.so.6+0x42a50)
 #5 0x000077cdf44aee3b pthread_kill (/usr/lib/libc.so.6+0xa2e3b)
 llvm#6 0x000077cdf444e928 raise (/usr/lib/libc.so.6+0x42928)
 llvm#7 0x000077cdf443156c abort (/usr/lib/libc.so.6+0x2556c)
 llvm#8 0x000077cdf44314d2 __assert_perror_fail (/usr/lib/libc.so.6+0x254d2)
 llvm#9 0x000077cdf4444c56 (/usr/lib/libc.so.6+0x38c56)
llvm#10 0x000059b28495bfc4 clang::DeclContext::addHiddenDecl(clang::Decl*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/AST/DeclBase.cpp:1759:3
llvm#11 0x000059b28495c0f5 clang::DeclContext::addDecl(clang::Decl*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/AST/DeclBase.cpp:1785:37
llvm#12 0x000059b28773cc2a clang::Sema::ActOnStartTopLevelStmtDecl(clang::Scope*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Sema/SemaDecl.cpp:20302:18
llvm#13 0x000059b286f1efdf clang::Parser::ParseTopLevelStmtDecl() /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Parse/ParseDecl.cpp:6024:62
llvm#14 0x000059b286ef18ee clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Parse/Parser.cpp:1065:35
llvm#15 0x000059b286ef0702 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Parse/Parser.cpp:758:36
llvm#16 0x000059b28562dff2 clang::IncrementalParser::ParseOrWrapTopLevelDecl() /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Interpreter/IncrementalParser.cpp:66:36
llvm#17 0x000059b28562e5b7 clang::IncrementalParser::Parse(llvm::StringRef) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Interpreter/IncrementalParser.cpp:132:8
llvm#18 0x000059b28561832b clang::Interpreter::Parse(llvm::StringRef) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Interpreter/Interpreter.cpp:570:8
llvm#19 0x000059b285618cbd clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/lib/Interpreter/Interpreter.cpp:649:8
llvm#20 0x000059b2836f9343 main /home/vipul-cariappa/Documents/Workspace/cpp-py/llvms/llvm-project-a/clang/tools/clang-repl/ClangRepl.cpp:255:59
llvm#21 0x000077cdf443388e (/usr/lib/libc.so.6+0x2788e)
llvm#22 0x000077cdf443394a __libc_start_main (/usr/lib/libc.so.6+0x2794a)
llvm#23 0x000059b2836f7965 _start (./bin/clang-repl+0x73b8965)
fish: Job 1, './bin/clang-repl' terminated by signal SIGABRT (Abort)
```

With this change:
```c++
clang-repl> try { throw 1; } catch { 0; }
In file included from <<< inputs >>>:1:
input_line_1:1:23: error: expected '('
    1 | try { throw 1; } catch { 0; }
      |                       ^
      |                       (
error: Parsing failed.
clang-repl> 1;
clang-repl> %quit
```
@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 5e3ee1f to 7a41ae7 Compare March 4, 2025 17:01
@egebeysel egebeysel requested a review from chrsmcgrr March 4, 2025 17:01
@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 7a41ae7 to 36a89ea Compare March 4, 2025 17:18
Copy link
Author

added runner tests 👍

// CHECK-NEXT: 1099512676352
func.call @check_fptoui(%cst_pow40ppow20) : (f64) -> ()
// CHECK-NEXT: -1
func.call @check_fptoui(%cst_i64_overflow) : (f64) -> ()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm not sure if I'm supposed to handle/check the cases below this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems that for float->int IEEE defines it as this:

If a NaN, infinite, or out-of-range numeric value cannot be represented in the destination format and no other indication is possible, an invalid operation exception will be triggered.

Which is what for example we see with x86: https://www.felixcloutier.com/x86/vcvtss2usi. So its left to the platform to handle this it seems, not the language AFAICS.

So I would assume that all the arith ops we use for this pattern should also handle these special cases and it is left to the platform to signal an exception. What I am surprised about is we get an actual value out.

Since we are using a lower-bitwdith fptoui anyway I don't think this transform can do anything to handle these edge cases. Maybe only ensure that if there is a overflow we set both high and low to one of these special values.

@maxbartel what do you think?

Copy link
Collaborator

@chrsmcgrr chrsmcgrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its almost there. Just some changes maybe. And can you retarget this PR to our latest integrate-iree-20250217 branch that would be great.

Real tricky on what to do with the special cases.

Just looking at what llvm does and gcc they seem to be more pragmatic in the emulation and just saturate when large or go to zero in other cases.
https://github.com/llvm-mirror/compiler-rt/blob/master/lib/builtins/fp_fixuint_impl.inc#L16
https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html

Don't know if you read through these as well. AFAICS the special cases are undefined behavior.

// CHECK-NEXT: 1099512676352
func.call @check_fptoui(%cst_pow40ppow20) : (f64) -> ()
// CHECK-NEXT: -1
func.call @check_fptoui(%cst_i64_overflow) : (f64) -> ()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems that for float->int IEEE defines it as this:

If a NaN, infinite, or out-of-range numeric value cannot be represented in the destination format and no other indication is possible, an invalid operation exception will be triggered.

Which is what for example we see with x86: https://www.felixcloutier.com/x86/vcvtss2usi. So its left to the platform to handle this it seems, not the language AFAICS.

So I would assume that all the arith ops we use for this pattern should also handle these special cases and it is left to the platform to signal an exception. What I am surprised about is we get an actual value out.

Since we are using a lower-bitwdith fptoui anyway I don't think this transform can do anything to handle these edge cases. Maybe only ensure that if there is a overflow we set both high and low to one of these special values.

@maxbartel what do you think?

Copy link
Collaborator

@chrsmcgrr chrsmcgrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it looks good to me. The LLVM lowering in CPU runner seems to generate different results than what we get with clang. I would be curious to know why. We just need to make sure its not this transform but another pass.

Copy link
Author

Do you mean with an emulated fptou/si cast or a non-emulated one? And different results for the edge cases or the normal ones? @chrsmcgrr

@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 36a89ea to 2e338b4 Compare March 14, 2025 10:23
Copy link
Collaborator

I would expect the emulated fptoui should generate the same output as the non-emulated for nan and +inf/-inf what I would be curious to know is why it does not. The LIT CPU Runner produces -1 in the case of nan and if I compile with clang on a x86_64 system I get -9223372036854775808.

@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 2e338b4 to 55ce2b0 Compare March 14, 2025 10:25
@egebeysel egebeysel changed the base branch from integrate-llvm-project-20250121 to integrate-llvm-project-20250217 March 14, 2025 10:28
Copy link
Author

egebeysel commented Mar 14, 2025

For nan and +-inf I get -1 with the emulated and non-emulated cases on the LIT CPU runner for the fptoui case. For the fptosi case, the non-emulated version emits 9223372036854775808 for all 3 while I get different results for these with the emulated case. But that was exactly what I was asking previously, if we have UB here anyways, how does that matter what result we emit over here? I don't think we should match the UB over here, do we? If we should, then I have to handle these as special cases of some sort, because I believe the working on the absolute value is what's changing the result, so those would not be deferred to fptoui from the signed case.

Regarding clang producing different results, did you compile the result of the mlir-opt or did you just compile an equivalent C++ code? Can that be the case that you looked at the signed case with the clang program and the unsigned case with the LIT CPU runner?

@chrsmcgrr

@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from 55ce2b0 to c21c159 Compare March 14, 2025 11:35
Copy link
Author

@chrsmcgrr looked into this a little bit more, the fptosi case yielding different results for nan and -+inf are related to the pass itself, but I don't really see a way of achieving that without special-casing for those cases, even if we don't defer to fptoui and handle the both parts in the signed case explicitly.

Copy link
Collaborator

@chrsmcgrr chrsmcgrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then let's merge it as is. At least we are fully aware of the corner cases.

Make sure you rebase on the latest integration and then ping me here and I will merge this.

@egebeysel egebeysel force-pushed the beysel/roo-70-spirv-add-pattern-for-fptosi-wide-int-emulation branch from c21c159 to bdc32ac Compare March 18, 2025 15:01
Copy link
Author

The latest integration is integrate-llvm-project-20250217, right? If so, I rebased it on that :) @chrsmcgrr

Copy link
Collaborator

chrsmcgrr commented Mar 18, 2025

@egebeysel exactly that's it. I will merge it now.

@chrsmcgrr chrsmcgrr merged commit ebb2984 into integrate-llvm-project-20250217 Mar 18, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants