Skip to content

[SYCL][matrix] Update the query interface with the latest joint matrix approved syntax #11004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Sep 21, 2023

Conversation

dkhaldi
Copy link
Contributor

@dkhaldi dkhaldi commented Aug 29, 2023

No description provided.

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Aug 29, 2023

This PR replaces #10847. It addresses most of the comments raised there.
An outstanding issue is on how to add a new value for CPU device (intel_cpu_spr), see #10847 (comment)

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Aug 29, 2023

It's more complicated than adding new item here for CPU device: it won't be usable if user tries to use it in device AOT or host scenarios.
(1) What we can do right now is that we can add explicit limitation to the spec and add the actual functionality later
(2) Another option is to introduce all intel_cpu_* items as separate PR and make this PR dependent on that PR (e.g., regarding host scenario we already have the updated OpenCL extension in OpenCL CPU RT which allows us to query the unique IDs of Intel CPU architectures)
@gmlueck what do you think between (1) and (2) or maybe other option?

I'd like to have a plan for how this new SPR enumerator fits in with all of the other uses of the architecture enum. Currently, all of the following are related:

The enumerators in architecture.
The legal value for the -fsycl-targets command line switch.
The if_architecture_is function that can be called from device code.
The device::ext_oneapi_architecture_is function that can be called from host code.
It seems like it would be nice to support syntax like -fsycl-targets=intel_cpu_spr, which would AOT compile device code for SPR. This would provide a consistent way to AOT compile for either GPU or CPU devices. I think this would also enable support for if_architecture_is as a side-effect.

For device::ext_oneapi_architecture_is, can we use the CPUID instruction to determine if the CPU is SPR?

It might be useful to have a meeting to discuss this further.

Currently -march=sapphirerapids can be used to specify compilation for SPR.
@yubingex007-a11y, is there something like -fsycl-targets=intel_cpu_spr for SPR?
Yes, I have used CPUID to detect whether this is SPR but not sure this is reliable:
(model == 15 && family == 6 && extended_model == 8 &&
extended_family == 0)

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Sep 11, 2023

In an offline discussion, we agreed on adding the new SPR enumerator intel_cpu_spr to the list of devices.
But we will need to confirm that there are no issues/objections on having -fsycl-targets include intel_cpu_spr for AOT compilation.

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Sep 15, 2023

@dm-vodopyanov, @gmlueck can you please approve the spec changes?

Copy link
Contributor

@YuriPlyakhin YuriPlyakhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dm-vodopyanov
Copy link
Contributor

SYCL Pre Commit on Windows / Windows / SYCL E2E on Windows (pull_request) https://github.com/intel/llvm/actions/runs/6223657584/job/16898974221?pr=11004

seems like a common CI problem.

@dkhaldi is this patch ready to be merged?

@dm-vodopyanov
Copy link
Contributor

@dkhaldi oh last-minute comment: spec has usage of matrix_type::int8, but there is no int8 in matrix_type enum. Should it be sint8? I can fix it together with my upcoming patch.

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Sep 19, 2023

@dkhaldi oh last-minute comment: spec has usage of matrix_type::int8, but there is no int8 in matrix_type enum. Should it be sint8? I can fix it together with my upcoming patch.

Thanks, I fixed it here.

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Sep 19, 2023

@gmlueck, do the recent changes to the SPR addition look good now? If so, please approve the spec portion.

@dkhaldi dkhaldi temporarily deployed to WindowsCILock September 20, 2023 19:19 — with GitHub Actions Inactive
@dkhaldi dkhaldi temporarily deployed to WindowsCILock September 20, 2023 20:16 — with GitHub Actions Inactive
<< combinations.size() << std::endl;

bool max_sizes;
if (combinations[0].maxsize == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is maxsize? There is no maxsize in combination struct. I will fix in the upcoming patch and enable (and re-write) the test for PVC, as this is a platform where we definitely test in this particular CI.

@dm-vodopyanov dm-vodopyanov merged commit 56de25d into intel:sycl Sep 21, 2023
void matrix_runtime_query(queue q) {

std::vector<combination> combinations =
q.get_device().get_info<sycl::info::device::matrix_combinations>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl::ext::oneapi::experimental::info::device::matrix_combinations

@@ -39,7 +39,8 @@ void matrix_multiply(big_matrix<T1, NUM_ROWS_C, NUM_COLS_C> &C,
size_t K = NUM_COLS_A;
assert(NUM_ROWS_C == NUM_ROWS_A && NUM_COLS_A == NUM_ROWS_B * 4);

using myparams2 = tpu_params<tpu::amx, int8_t, int8_t, int>;
using myparams2 =
matrix_params<architecture::intel_cpu_spr, int8_t, int8_t, int>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be sycl::ext::oneapi::experimental::architecture.
otherwise, errmsg is reported
error: use of undeclared identifier 'architecture'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants