Issue on run_matmul #135

oysx · 2024-10-31T10:06:50Z

Describe the bug
I'm currently running inference on my laptop using this Python package, and I found the issue described below.

I use the following code to test the result of a classical matrix multiplication:

import torch
from intel_npu_acceleration_library.backend.runtime import run_matmul
a = torch.randn(1000, 1000, dtype=torch.float16)
b = torch.randn(1000, 1000, dtype=torch.float16)

c1 = run_matmul(a, b.transpose(0, 1))
c2 = torch.matmul(a, b)
assert c1.equal(c2)

The difference between c1 and c2 is huge, and I don't know why.

Test environment:
Ubuntu 24.04, Intel Arc Graphics with NPU enabled.
I checked out commit 3e3dee3 from GitHub and installed it locally.
I have also installed the required NPU drivers, and the NPU is being used: inference is much faster than on the CPU alone.


alessandropalla · 2024-10-31T13:33:40Z

A few questions:

What driver version do you have?
Do you have the same behaviour in this example?

I suspect the issue here is in the b.transpose(0, 1) part
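
One way to probe this suspicion is sketched below, assuming (this is not confirmed anywhere in the thread) that the backend might read the tensor's raw memory without honouring its strides. In PyTorch, `transpose` returns a view that shares storage with the original tensor, while `.contiguous()` materialises a row-major copy. The commented-out `run_matmul` call at the end is a hypothetical diagnostic, not a confirmed fix.

```python
import torch

# Small non-square matrix so the effect on shape and strides is easy to see.
b = torch.arange(12, dtype=torch.float16).reshape(3, 4)

bt = b.transpose(0, 1)     # view: same storage as b, strides swapped
bt_copy = bt.contiguous()  # new tensor: transposed data laid out row-major

print(bt.shape, bt.stride(), bt.is_contiguous())
print(bt_copy.shape, bt_copy.stride(), bt_copy.is_contiguous())
print(bt.data_ptr() == b.data_ptr())  # the view points at b's memory

# Hypothetical diagnostic on the NPU (needs the library and driver installed):
# c1 = run_matmul(a, b.transpose(0, 1).contiguous())
```

If the result changes once the contiguous copy is passed in, that would point at stride handling rather than at the multiplication itself.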

oysx · 2024-11-01T15:56:30Z

The driver versions are as below:
/usr/lib/x86_64-linux-gnu/libze_intel_gpu.so.1.3.29735
/usr/lib/x86_64-linux-gnu/libze_intel_vpu.so.1.8.0
/usr/lib/x86_64-linux-gnu/libze_loader.so.1.17.44
/usr/lib/x86_64-linux-gnu/libze_tracing_layer.so.1.17.44
I've tried this example, and the result also differs.
Here is my code:

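import numpy as np
import torch
from intel_npu_acceleration_library.backend import MatMul

def run_matmul(inC, outC, batch):
    # Create both inputs: X1 is (batch, inC), X2 is (outC, inC)
    X1 = np.random.uniform(-1, 1, (batch, inC)).astype(np.float16)
    X2 = np.random.uniform(-1, 1, (outC, inC)).astype(np.float16)

    # Build the NPU matmul for these dimensions and run it
    mm = MatMul(inC, outC, batch, profile=False)

    intel = mm.run(X1, X2)
    intel = torch.tensor(intel)

    # CPU reference: X1 @ X2^T, compared with exact equality
    tor = torch.matmul(torch.tensor(X1), torch.tensor(X2).transpose(-2, -1))
    print(tor.equal(intel))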

Because torch.matmul() expects its second operand in a different shape than run_matmul() does, I transpose it.
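
Both reproductions in this thread compare results with `Tensor.equal`, which requires bit-identical values. float16 results from two different backends can differ by rounding alone, so a tolerance-based check helps separate rounding noise from a genuinely wrong product. The sketch below runs on the CPU only. `compare_results` is a hypothetical helper, and the `rtol`/`atol` values are assumptions chosen for illustration, not thresholds documented by the library.

```python
import torch

def compare_results(expected, actual, rtol=1e-2, atol=1e-2):
    """Hypothetical helper: summarise how far `actual` is from `expected`."""
    if expected.shape != actual.shape:
        return {"same_shape": False}
    # Compare in float32 so the comparison itself adds no float16 rounding.
    exp32 = expected.to(torch.float32)
    act32 = actual.to(torch.float32)
    return {
        "same_shape": True,
        "max_abs_err": (exp32 - act32).abs().max().item(),
        "allclose": torch.allclose(act32, exp32, rtol=rtol, atol=atol),
    }

torch.manual_seed(0)
a = torch.randn(64, 64)
b = torch.randn(64, 64)

reference = a @ b                      # float32 reference product
rounded = reference.to(torch.float16)  # same product after float16 rounding
wrong = a @ b.T                        # deliberately wrong: b used transposed

rounding_report = compare_results(reference, rounded)
wrong_report = compare_results(reference, wrong)
print(rounding_report)
print(wrong_report)
```

Applied to the NPU output, a small `max_abs_err` would suggest ordinary float16 rounding, while an error on the order of the values themselves would suggest the operands are being interpreted differently, for example transposed.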

oysx · 2024-11-05T15:01:51Z

Is this an actual issue, or am I misusing it?
Is there any equivalent acceleration method I can use to replace the torch.matmul() API?
