Describe the bug
We have a model called elser_model_2_linux-x86_64.pt, which can be downloaded from https://ml-models.elastic.co/elser_model_2_linux-x86_64.pt. As the name suggests, it has been quantized for Intel x86_64 hardware.
When this model was used with PyTorch 1.13.1 and IPEX v1.13.100+cpu it worked. When it is used with PyTorch 2.1.1 and IPEX v2.1.0+cpu it trips an internal assertion inside PyTorch.
The following Python script reproduces the problem (assuming elser_model_2_linux-x86_64.pt has been downloaded into the current directory using the link above):
```python
import torch
import intel_extension_for_pytorch

model = torch.jit.load("elser_model_2_linux-x86_64.pt")
model.eval()

# First inference: succeeds.
input_ids = [101, 1996, 2143, 2001, 2307, 999, 999, 102]
attention_mask = [1, 1, 1, 1, 1, 1, 1, 1]
token_type_ids = [0, 0, 0, 0, 0, 0, 0, 0]
position_ids = [0, 1, 2, 3, 4, 5, 6, 7]
results = model(torch.tensor([input_ids]), torch.tensor([attention_mask]), torch.tensor([token_type_ids]), torch.tensor([position_ids]))
print(results)

# Second inference: trips the internal assertion.
input_ids = [101, 1996, 3185, 2001, 12476, 999, 999, 102]
attention_mask = [1, 1, 1, 1, 1, 1, 1, 1]
token_type_ids = [0, 0, 0, 0, 0, 0, 0, 0]
position_ids = [0, 1, 2, 3, 4, 5, 6, 7]
results = model(torch.tensor([input_ids]), torch.tensor([attention_mask]), torch.tensor([token_type_ids]), torch.tensor([position_ids]))
print(results)
```
Output:

```
$ python3.10 ./test_elser.py
tensor([[0., 0., 0., ..., 0., 0., 0.]])
Traceback (most recent call last):
  File "/home/ubuntu/./test_elser.py", line 19, in <module>
    results = model(torch.tensor([input_ids]), torch.tensor([attention_mask]), torch.tensor([token_type_ids]), torch.tensor([position_ids]))
  File "/usr/local/gcc103/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/gcc103/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: outputs_.size() == 1 INTERNAL ASSERT FAILED at "/root/anaconda3/envs/pytorch_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/ir.h":510, please report a bug to PyTorch.
```
Output if the `import intel_extension_for_pytorch` line is deleted:

```
$ python3.10 ./test_elser.py
tensor([[0., 0., 0., ..., 0., 0., 0.]])
tensor([[0., 0., 0., ..., 0., 0., 0.]])
```
So something in IPEX is causing the assertion failure. The failing check (`outputs_.size() == 1` at ir.h:510) appears to guard `Node::output()` in PyTorch's JIT IR, which requires a node to have exactly one output, suggesting a graph rewrite has left a node in an inconsistent state. Note that the first inference succeeds and the second fails, which points at the re-optimization the profiling executor performs after the first profiled run.
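One way to narrow this down (a diagnostic sketch, not part of the original report — it uses a small scripted module as a stand-in, since the ELSER model requires the download above) is to dump the graph the JIT actually executed and to re-run with graph optimization disabled. Diffing the dumped graph with and without the IPEX import shows what IPEX rewrote, and if the assertion no longer trips under `torch.jit.optimized_execution(False)`, the failure lies in an optimization pass rather than in the model itself:

```python
import torch

# Stand-in for the real model: any scripted module exercises the same
# profiling-executor path (the assertion in the report trips on the
# second call, when the profiled graph is re-optimized).
class Toy(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1.0

model = torch.jit.script(Toy())
model.eval()

x = torch.ones(1, 8)
with torch.no_grad():
    model(x)
    model(x)  # second call runs the re-optimized graph

# Dump the graph the JIT actually executed; diff this output with and
# without `import intel_extension_for_pytorch` to see what was rewritten.
print(torch.jit.last_executed_optimized_graph())

# Re-running with optimization disabled skips the rewrite passes;
# if the crash disappears here, a graph rewrite is the culprit.
with torch.no_grad(), torch.jit.optimized_execution(False):
    out = model(x)
print(out)
```

Running the real repro under `optimized_execution(False)` would be the quickest check of whether an IPEX graph pass is responsible.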
Versions
```
Collecting environment information...
PyTorch version: 2.1.1+cu121
PyTorch CXX11 ABI: No
IPEX version: 2.1.0+cpu
IPEX commit: 94f4320
Build type: Release
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (GCC) 10.3.0
Clang version: N/A
IGC version: N/A
CMake version: version 3.27.9
Libc version: glibc-2.27
Python version: 3.10.9 (main, Dec 11 2023, 11:18:08) [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-5.4.0-1103-aws-x86_64-with-glibc2.27
Is XPU available: False
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Stepping:            4
CPU MHz:             3402.319
BogoMIPS:            5999.99
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            25344K
NUMA node0 CPU(s):   0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke

Versions of relevant libraries:
[pip3] intel-extension-for-pytorch==2.1.0
[pip3] numpy==1.26.2
[pip3] torch==2.1.1
[conda] N/A
```