Fix build failure in issues #178 #187

zhaoshiz · 2024-10-28T23:11:57Z

Fixes the compilation errors below in include/triton-shared/Conversion/TritonArithToLinalg/ConversionPatterns.hpp:

/home/runner/work/triton-shared/triton-shared/triton_shared/include/triton-shared/Conversion/TritonArithToLinalg/ConversionPatterns.hpp:839:68: error: no member named 'getFile' in 'mlir::triton::AssertOp'
llvm::formatv("{0}.py:{1}: {2} Assertion {3} failed", op.getFile(),
~~ ^
/home/runner/work/triton-shared/triton-shared/triton_shared/include/triton-shared/Conversion/TritonArithToLinalg/ConversionPatterns.hpp:840:26: error: no member named 'getLine' in 'mlir::triton::AssertOp'
op.getLine(), op.getFunc(), op.getMessage());
~~ ^
/home/runner/work/triton-shared/triton-shared/triton_shared/include/triton-shared/Conversion/TritonArithToLinalg/ConversionPatterns.hpp:840:40: error: no member named 'getFunc' in 'mlir::triton::AssertOp'
op.getLine(), op.getFunc(), op.getMessage());
~~ ^
3 errors generated.

This fix builds with triton @ab07e5472bcb414a0c8dd7ecab80f84370c4894e, and llvm @cfd3289a1f9a87e220737a634904a886a82d424a.

nhat-nguyen · 2024-11-05T19:56:10Z

@zhaoshiz Thank you so much! Would you mind also updating the submodule commit to match in this PR too? Otherwise, our build will fail.

zhaoshiz · 2024-11-06T01:49:33Z

@nhat-nguyen, I've updated the submodule sha and fixed additional compilation errors. I'm working with legal dept. to get the CLA approved.

zhaoshiz · 2024-11-15T01:08:37Z

@microsoft-github-policy-service agree company="Qualcomm Innovation Center, Inc."

mdehling · 2024-11-20T18:35:44Z

Thank you, this saved me some time! :)

Newer triton versions put an additional symlink in the llvm folder, so the `ls` command ends up listing two file names separated by a newline which breaks the pipeline run in #187. Use `find` with `top -1` to ensure we only ever get one llvm path.
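The same idea can be sketched in Python. This is a minimal illustration, assuming a hypothetical layout in which the llvm folder holds one real install directory plus a symlink pointing at it; the function name `single_llvm_path` is made up for this example and is not part of the pipeline script.

```python
from pathlib import Path


def single_llvm_path(llvm_root: Path) -> Path:
    """Return exactly one LLVM install directory under llvm_root.

    Assumption: any symlink in the folder is an alias of a real directory,
    so symlinks are skipped rather than listed as a second result.
    """
    candidates = sorted(
        p for p in llvm_root.iterdir() if p.is_dir() and not p.is_symlink()
    )
    if not candidates:
        raise FileNotFoundError(f"no llvm directory found under {llvm_root}")
    # Keep only the first entry so callers never see a multi-line result.
    return candidates[0]
```

Returning a single `Path` instead of a raw listing means a later step that interpolates the path into a command cannot be broken by an extra entry.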

nhat-nguyen · 2025-01-10T20:11:06Z

Looks like the CPU backend needs to be updated with some of the new methods from the BaseBackend class

zhaoshiz · 2025-01-10T21:19:43Z

> Looks like the CPU backend needs to be updated with some of the new methods from the BaseBackend class

Oh, sorry I missed that part.

That's get_active_torch_device: https://github.com/triton-lang/triton/blob/f9d9fad1b7b648e73ef03332737f000bed258f13/python/triton/backends/driver.py#L22C1-L24C13.

Should I just add the following to the CPUDriver class?

def get_active_torch_device(self):
    import torch
    return torch.device("cpu")

zhaoshiz · 2025-01-10T23:02:23Z

test_core.py:1650: in <module>
@pytest.mark.skipif(torch.cuda.get_device_capability()[0] < 9 or is_hip(),
...
ERROR test_core.py - AssertionError: Torch not compiled with CUDA enabled

Shall we add a check for torch.cuda.is_available()?
@pytest.mark.skipif(not torch.cuda.is_available() or torch.cuda.get_device_capability()[0] < 9 or is_hip(),

nhat-nguyen · 2025-01-10T23:07:23Z

Yeah looks like we need to. Although the test_core file is symlinked to the file in the triton submodule. @parsifal-47 is there a way we can work around this?

zhaoshiz · 2025-01-10T23:10:43Z

https://github.com/triton-lang/triton/blob/110b66e649711a8fdda66359db5054a8a0ede9d2/python/test/unit/language/test_core.py#L1661C21-L1661C34

seems fixed by Triton already, let me try updating Triton

nhat-nguyen · 2025-01-10T23:16:43Z

Could we also extract the splat op change to a separate PR? That would make it easier to keep track of things.

zhaoshiz · 2025-01-10T23:22:32Z

> Could we also extract the splat op change to a separate PR? That would make it easier to keep track of things.

sure. I'll revert the commit and xfail related test in this PR, and create another one.

parsifal-47 · 2025-01-11T00:24:12Z

> Yeah looks like we need to. Although the test_core file is symlinked to the file in the triton submodule. @parsifal-47 is there a way we can work around this?

Yes, so far I have been able to work around this by describing exceptions in this file: https://github.com/microsoft/triton-shared/blob/main/python/examples/conftest.py. If you need to disable test cases, please let me know; I am not sure whether the issue is resolved or not.
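For readers unfamiliar with this approach, here is a minimal sketch of how a conftest.py can mark selected upstream tests as expected failures without editing the symlinked test file. The table contents, the test name, and the helper `xfail_reason` are hypothetical; this is not the actual content of the triton-shared conftest.py.

```python
# conftest.py (illustrative sketch only)
from typing import Optional

# Hypothetical table: test function name -> reason it is expected to fail.
EXPECTED_FAILURES = {
    "test_example_fp8": "hypothetical: FP8 types not handled by the CPU backend",
}


def xfail_reason(test_name: str) -> Optional[str]:
    """Return the recorded reason for a test, ignoring any [param] suffix."""
    base_name = test_name.split("[", 1)[0]
    return EXPECTED_FAILURES.get(base_name)


def pytest_collection_modifyitems(config, items):
    # Imported lazily so the lookup helper above stays usable without pytest.
    import pytest

    for item in items:
        reason = xfail_reason(item.name)
        if reason is not None:
            # Mark the collected test instead of editing the upstream file.
            item.add_marker(pytest.mark.xfail(reason=reason))
```

Keeping the reasons in one table next to the hook makes each disabled test self-documenting, which is what the later review of the disabled tests relies on.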

nhat-nguyen · 2025-01-14T19:01:56Z

looks like lots of tests are failing in test_core -- some of them seem to be because triton added a constexpr type which we don't handle in the CPU backend yet

zhaoshiz · 2025-01-14T19:37:30Z

> looks like lots of tests are failing in test_core -- some of them seem to be because triton added a constexpr type which we don't handle in the CPU backend yet

I've looked into the failures and am working on constexpr.
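One plausible shape for such handling is sketched below. It assumes the backend receives the kernel signature as a mapping from argument name to a type string, with compile-time constants tagged "constexpr"; the function name `runtime_arguments` and that assumption about the data layout are illustrative, not taken from the triton-shared sources.

```python
from typing import Dict


def runtime_arguments(signature: Dict[str, str]) -> Dict[str, str]:
    """Keep only arguments that are passed to the kernel at run time.

    Assumption: entries typed "constexpr" are baked in at compile time and
    therefore must not appear in the generated launcher's parameter list.
    """
    return {name: ty for name, ty in signature.items() if ty != "constexpr"}
```

For example, `runtime_arguments({"x_ptr": "*fp32", "n": "i32", "BLOCK": "constexpr"})` keeps `x_ptr` and `n` and drops `BLOCK`.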

Fix compilation errors caused by LLVM commits:
f18c3e4e7335df282c468b6dff3d29be1822a96d [mlir][Transforms] Dialect Conversion: Simplify materialization fn result type (#113031)
8c4bc1e75de27adfbaead34b895b0efbaf17bd02 [mlir][Transforms] Merge 1:1 and 1:N type converters (#113032)

Update Triton to 94684d326723b67b146f23f342623ea058a32098

Add `sanitize_overflow: bool = True` to class CPUOptions in compiler.py and `get_benchmarker(self)` to class CPUDriver in driver.py to run the tests.

XFAILing TritonToLinalg tests since this pass will be retired soon:
test/Conversion/TritonToLinalg/wraparound_side_by_side.mlir
test/Conversion/TritonToLinalg/wraparound_stacked.mlir

XFAILing StructuredToMemref tests because LLVM commit 889b67c9d30e3024a1317431d66c22599f6c2011 asserts that dynamic shapes like <2x?> and <?x?> are a mismatch:
test/Conversion/StructuredToMemref/wraparound_side_by_side.mlir
test/Conversion/StructuredToMemref/wraparound_stacked.mlir
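A rough sketch of what those two additions might look like is shown below. Only the field `sanitize_overflow: bool = True` and the method name `get_benchmarker(self)` come from the commit message; the class names with the `Sketch` suffix, the `debug` field, and the wall-clock benchmarker are assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CPUOptionsSketch:
    # Hypothetical pre-existing option, shown only for context.
    debug: bool = False
    # Field named in the commit message, with the default given there.
    sanitize_overflow: bool = True


def _wall_clock_bench(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the best wall-clock time in milliseconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1e3)
    return best


class CPUDriverSketch:
    def get_benchmarker(self) -> Callable[..., float]:
        # Assumption: the caller expects a callable that times a function.
        return _wall_clock_bench
```

Giving the new option a default keeps existing call sites that construct the options object without it working unchanged.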

Update CMakeLists.txt for python and pybind11 headers. Fixed test/Conversion/TritonArithToLinalg/split.mlir. Working on test/Conversion/StructuredToMemref/get_num_programs.mlir. Builds with Triton@acc25d91fba850c18c099e7e577962ba56bdd06c and LLVM@86b69c31642e98f8357df62c09d118ad1da4e16a.

Add rewriteSplatOp() in PtrAnalysis pass. This function creates a tts.makeptr for the case below:
%6 = tt.splat %arg0 : !tt.ptr<i32> -> tensor<1x!tt.ptr<i32>>

Previously we relied on rewriteAddPtrOp to create the tts.makeptr:
%3 = arith.constant 0 : index
%6 = tt.splat %arg0 : !tt.ptr<i32> -> tensor<1x!tt.ptr<i32>>
%7 = tt.addptr %6, %3 : tensor<1x!tt.ptr<i32>>, tensor<1xi32>

Creating a constant 0 and adding it to a pointer is optimized away by Triton.

zhaoshiz · 2025-01-17T22:56:17Z

rebased and fixed build failures

zhaoshiz · 2025-01-17T22:58:41Z

> yes, so far I was able to workaround by describing exceptions in this file https://github.com/microsoft/triton-shared/blob/main/python/examples/conftest.py if you need to disable test cases please let me know, I am not sure whether the issue is resolved or not

I have disabled several tests from Triton's test_core.py in conftest.py. I think some are not supported by Triton-Shared but I'm unsure about FP8 data types on CPUs. Please take a look: 12d816e

parsifal-47 · 2025-01-17T23:16:38Z

> I have disabled several tests from Triton's test_core.py in conftest.py. I think some are not supported by Triton-Shared but I'm unsure about FP8 data types on CPUs. Please take a look: 12d816e

you also provided comments for each disabled test, looks good to me, thanks a lot for doing that!

nhat-nguyen · 2025-01-23T20:43:43Z

@zhaoshiz Looks like we're very close! For the modulo tests, you can use my patch here to fix both the lit tests and the CPU backend tests:

diff --git a/lib/Conversion/StructuredToMemref/StructuredToMemref.cpp b/lib/Conversion/StructuredToMemref/StructuredToMemref.cpp
index fa195c6..8fac5d8 100644
--- a/lib/Conversion/StructuredToMemref/StructuredToMemref.cpp
+++ b/lib/Conversion/StructuredToMemref/StructuredToMemref.cpp
@@ -176,9 +176,9 @@ private:
         SmallVector<int64_t>(resultShape.size(), ShapedType::kDynamic),
         /* result shape */
         SmallVector<int64_t>{
-
-            // Row stays the same
-            resultShape[0],
+            // Row stays the same, but mlir doesn't allow this anymore. Put
+            // dynamic.
+            ShapedType::kDynamic,
 
             // Column is dynamic, in most cases, this
             // should be the same as the original column.
@@ -286,9 +286,9 @@ private:
             // around.
             ShapedType::kDynamic,
 
-            // Col stays the same.
-            resultShape[1],
-        });
+            // Col stays the same, which is resultShape[1], but mlir doesn't
+            // allow this anymore. So we put dynamic instead.
+            ShapedType::kDynamic});
 
     Value rowSize = rewriter.create<arith::ConstantOp>(
         loc, rewriter.getIndexAttr(op.getSizes()[0]));

We can disable some of the tests in core.py to unblock this update.

zhaoshiz · 2025-01-24T00:53:23Z

> @zhaoshiz Looks like we're very close! For the modulo tests, you can use my patch here to fix both the lit tests and the CPU backend tests:
> ...
> We can disable some of the tests in core.py to unblock this update.

Thanks! I was looking to fix it in MLIR but this is a better solution.

nhat-nguyen

thank you @zhaoshiz for your help here!

zhaoshiz · 2025-01-27T23:26:28Z

> thank you @zhaoshiz for your help here!

my pleasure!

zhaoshiz mentioned this pull request Nov 1, 2024

Nightly Build Failure 2024-09-21 #178

Closed

mdehling reviewed Nov 20, 2024

include/triton-shared/Conversion/TritonArithToLinalg/ConversionPatterns.hpp (outdated)

zhaoshiz closed this Dec 12, 2024

zhaoshiz force-pushed the main branch from 7aaea82 to d5b7bee on December 12, 2024 01:13

zhaoshiz reopened this Dec 12, 2024

nhat-nguyen mentioned this pull request Jan 9, 2025

Fix incorrect llvm path in pipeline #220

Merged

zhaoshiz added 4 commits January 17, 2025 14:25

Fix format string in ConversionPatterns.hpp

2a9097e

Remove the parts of the llvm::formatv string that correspond to the File, Line and Func arguments of triton::AssertOp.

zhaoshiz and others added 9 commits January 17, 2025 14:30

Update Triton to 755d4164081b92a909df2e1ad4c56174c8ce5529

3d23cac

Add get_active_torch_device(self) to CPUDriver class

f49ac69

Update Triton to 110b66e649711a8fdda66359db5054a8a0ede9d2

53cdebb

Revert "Rewrite triton::SplatOp in PtrAnalysis"

36144c6

This reverts commit 2153c53. The commit being reverted will be sent in a separate PR.

Handle "constexpr" type in CPU backend

2707e84

In commit 9743ec0dca5bbd9dbce20adc3ee273af6b095f94, Triton moved to use "constexpr"s instead of "constant"s in its function signature.

Run pytest in python/examples and xfail unsupported tests

12d816e

Also update Triton to 2efb067bfc0f9acabcd8b4ffe7c55ad248dfb282.

Fix build failures

26565d4

Change various lambda return type from `std::optional<Value>` to `Value` per LLVM API change.

zhaoshiz force-pushed the main branch from e9c51be to 26565d4 on January 17, 2025 22:53

nhat-nguyen reviewed Jan 20, 2025

test/Conversion/StructuredToMemref/get_num_programs.mlir (outdated)

zhaoshiz added 2 commits January 23, 2025 16:25

Fix dynamic shape conversion

1f17d12

Thanks Nhat Nguyen for the fix. UnXFAILed and updated wraparound_side_by_side.mlir and wraparound_stacked.mlir in test/Conversion/StructuredToMemref.

Add note on XFAILing test/Conversion/StructuredToMemref/get_num_programs.mlir

19fdd6d

nhat-nguyen approved these changes Jan 27, 2025

nhat-nguyen merged commit 560c064 into microsoft:main Jan 27, 2025
3 checks passed
