@@ -742,12 +742,12 @@ descriptors that can be queried using `get_info` API.
[frame="none",options="header"]
|======================
| Device descriptors | Return type| Description
- |`ext::oneapi::experimental::info::device::matrix::combinations` |
+ |`ext::oneapi::experimental::info::device::matrix_combinations` |
`std::vector<combination>`| tells the set of supported matrix sizes
and types on this device
|======================

- The runtime query returns a vector of `combinations` of `combination`
+ The runtime query returns a vector of `matrix_combinations` of `combination`
type. Each combination includes the sizes and the types for the
matrices A, B, C, and D. Note that for each matrix hardware,
the query returns `max_msize, max_nsize, max_ksize` or `msize, nsize,
@@ -791,7 +791,7 @@ struct combination {
} // namespace sycl::ext::oneapi::experimental::matrix
```

- Each combination of the `combinations` vector composes the types and
+ Each combination of the `matrix_combinations` vector composes the types and
sizes of A, B, C, and D matrices supported by the device
implementation. The table below provides a description of each member
of the `combination` struct.
@@ -833,7 +833,7 @@ the `T` template parameter as follows: +
```c++
// Ta, Tb, Tc, and Td are the types used in applications
std::vector<combination> combinations =
-     device.get_info<info::device::matrix::combinations>();
+     device.get_info<info::device::matrix_combinations>();
for (int i = 0; sizeof(combinations); i++) {
  if (Ta == combinations[i].atype &&
      Tb == combinations[i].btype &&
@@ -850,7 +850,7 @@ for (int i = 0; sizeof(combinations); i++) {
The table below provides a list of the combinations that
`joint_matrix` implementations support on each of Intel AMX and Intel
XMX hardware. Note that these can be returned using
- `ext::oneapi::experimental::info::device::matrix::combinations`.
+ `ext::oneapi::experimental::info::device::matrix_combinations`.

==== Intel AMX Supported Combinations
This is currently available in devices with the architecture
@@ -864,44 +864,52 @@ table below.
| A type | B type | C and D type | M | N | K
| `matrix_type::uint8` | `matrix_type::uint8` |
`matrix_type::sint32` | +<=+ 16 | +<=+ 16 | +<=+ 64
- | `matrix_type::uint8` | `matrix_type::int8` |
+ | `matrix_type::uint8` | `matrix_type::sint8` |
`matrix_type::sint32` | +<=+ 16 | +<=+ 16 | +<=+ 64
- | `matrix_type::int8` | `matrix_type::uint8` |
+ | `matrix_type::sint8` | `matrix_type::uint8` |
`matrix_type::sint32` | +<=+ 16 | +<=+ 16 | +<=+ 64
- | `matrix_type::int8` | `matrix_type::int8` |
+ | `matrix_type::sint8` | `matrix_type::sint8` |
`matrix_type::sint32` | +<=+ 16 | +<=+ 16 | +<=+ 64
| `matrix_type::bf16` | `matrix_type::bf16` |
`matrix_type::fp32` | +<=+ 16 | +<=+ 16 | +<=+ 32
|======================

==== Intel XMX Supported Combinations
This is currently available in devices with the architecture
- `architecture::intel_gpu_pvc` and `architecture::intel_gpu_dg2`. In
- these architectures' implementation, the type of the C matrix must be
- the same as the type of the D matrix. Therefore, that common type is
- shown in a single column in the table below.
+ `architecture::intel_gpu_pvc`, `architecture::intel_gpu_dg2_g10`,
+ `architecture::intel_gpu_dg2_g11`, and
+ `architecture::intel_gpu_dg2_g12`. In these architectures'
+ implementation, the type of the C matrix must be the same as the type
+ of the D matrix. Therefore, that common type is shown in a single
+ column in the table below.

[frame="none",options="header"]
|======================
| A type | B type | C and D type | M | N | K | device
- | `matrix_type::uint8` | `matrix_type::uint8` |
- `matrix_type::int32` | +<=+ 8 | 16 | 32 | architecture::intel_gpu_pvc
- | | | | |8||architecture::intel_gpu_dg2
- | `matrix_type::uint8` | `matrix_type::int8` |
- `matrix_type::int32` | +<=+ 8 | 16 | 32 | architecture::intel_gpu_pvc
- | | | | |8||architecture::intel_gpu_dg2
- | `matrix_type::int8` | `matrix_type::uint8` |
- `matrix_type::int32` | +<=+ 8 | 16 | 32 | architecture::intel_gpu_pvc
- | | | | |8||architecture::intel_gpu_dg2
- | `matrix_type::int8` | `matrix_type::int8` |
- `matrix_type::int32` | +<=+ 8 | 16 | 32 | architecture::intel_gpu_pvc
- | | | | |8||architecture::intel_gpu_dg2
- | `matrix_type::fp16` | `matrix_type::fp16` |
- `matrix_type::fp32` | +<=+ 8 | 16 | 16 | architecture::intel_gpu_pvc
- | | | | |8|| architecture::intel_gpu_dg2
- | `matrix_type::bf16` | `matrix_type::bf16` |
- `matrix_type::fp32` | +<=+ 8 | 16 | 16 | architecture::intel_gpu_pvc
- | | | | |8|| architecture::intel_gpu_dg2
+ .2+| `matrix_type::uint8` .2+| `matrix_type::uint8` .2+|
+ `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32
+ |`architecture::intel_gpu_pvc`|8|`architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
+ .2+| `matrix_type::uint8` .2+| `matrix_type::sint8` .2+|
+ `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
+ `architecture::intel_gpu_pvc`|8|`architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
+ .2+| `matrix_type::sint8` .2+| `matrix_type::uint8` .2+|
+ `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
+ `architecture::intel_gpu_pvc`|8|`architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
+ .2+| `matrix_type::sint8` .2+| `matrix_type::sint8` .2+|
+ `matrix_type::sint32` .2+| +<=+ 8 | 16 .2+| 32 |
+ `architecture::intel_gpu_pvc`|8|`architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
+ .2+|`matrix_type::fp16` .2+| `matrix_type::fp16` .2+|
+ `matrix_type::fp32` .2+| +<=+ 8 | 16 .2+| 16 |
+ `architecture::intel_gpu_pvc`|8| `architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
+ .2+| `matrix_type::bf16` .2+| `matrix_type::bf16` .2+|
+ `matrix_type::fp32` .2+| +<=+ 8 | 16 .2+| 16 |
+ `architecture::intel_gpu_pvc` |8| `architecture::intel_gpu_dg2_g10,
+ architecture::intel_gpu_dg2_g11, architecture::intel_gpu_dg2_g12`
|======================

==== Nvidia Tensor Cores Supported Combinations
@@ -933,11 +941,11 @@ supported parameter combination is specified in the following table.
|16 |16 |16
|8 |32 |16
|32 |8 |16
- .3+| `matrix_type::int8` .3+| `matrix_type::int32`
+ .3+| `matrix_type::sint8` .3+| `matrix_type::sint32`
|16 |16 |16 .6+| sm_72
|8 |32 |16
|32 |8 |16
- .3+|`matrix_type::uint8` .3+|`matrix_type::int32`
+ .3+|`matrix_type::uint8` .3+|`matrix_type::sint32`
|16 |16 |16
|8 |32 |16
|32 |8 |16
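The lookup loop touched by the hunk at line 833 can be sketched in a self-contained form that compiles without a SYCL toolchain. Everything below is a hypothetical stand-in rather than the extension's actual API: `matrix_type` and `combination` are local mocks whose member names follow those visible in the diff, `find_combination`, `size_ok`, `sample_combinations`, and `self_check` are illustrative helpers, and the sketch assumes that a `max_*` member of 0 means the corresponding fixed size applies. A real program would fill the vector from `device.get_info<info::device::matrix_combinations>()`. The loop here is bounded by `combinations.size()`; the `sizeof(combinations)` condition in the context lines is a nonzero compile-time constant and would not terminate the loop on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the extension's types (not the real SYCL API).
enum class matrix_type { bf16, fp16, fp32, sint8, sint32, uint8 };

struct combination {
  std::size_t max_msize, max_nsize, max_ksize; // upper bounds; 0 if unused (assumption)
  std::size_t msize, nsize, ksize;             // fixed sizes; 0 if unused (assumption)
  matrix_type atype, btype, ctype, dtype;
};

// True when `requested` satisfies the upper bound if one is given,
// otherwise when it equals the fixed size.
inline bool size_ok(std::size_t max_size, std::size_t fixed_size,
                    std::size_t requested) {
  return max_size != 0 ? requested <= max_size : requested == fixed_size;
}

// Returns the index of the first entry matching the types and sizes, or -1.
inline int find_combination(const std::vector<combination>& combinations,
                            matrix_type ta, matrix_type tb,
                            matrix_type tc, matrix_type td,
                            std::size_t m, std::size_t n, std::size_t k) {
  for (std::size_t i = 0; i < combinations.size(); ++i) {
    const combination& c = combinations[i];
    if (c.atype == ta && c.btype == tb && c.ctype == tc && c.dtype == td &&
        size_ok(c.max_msize, c.msize, m) &&
        size_ok(c.max_nsize, c.nsize, n) &&
        size_ok(c.max_ksize, c.ksize, k)) {
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Sample data mirroring one XMX table row (uint8 x sint8 -> sint32,
// M <= 8, N = 16, K = 32); a real program would query the device instead.
inline std::vector<combination> sample_combinations() {
  return {{8, 0, 0, 0, 16, 32, matrix_type::uint8, matrix_type::sint8,
           matrix_type::sint32, matrix_type::sint32}};
}

// Small self-check showing the intended use of find_combination.
inline void self_check() {
  const std::vector<combination> combos = sample_combinations();
  // M = 8 is within the bound, so the single sample entry matches.
  assert(find_combination(combos, matrix_type::uint8, matrix_type::sint8,
                          matrix_type::sint32, matrix_type::sint32,
                          8, 16, 32) == 0);
  // M = 9 exceeds the bound, so nothing matches.
  assert(find_combination(combos, matrix_type::uint8, matrix_type::sint8,
                          matrix_type::sint32, matrix_type::sint32,
                          9, 16, 32) == -1);
}
```

Returning an index instead of setting a flag inside the loop keeps the caller free to read the matched entry's sizes afterwards.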