-
Notifications
You must be signed in to change notification settings - Fork 765
[SYCL][L0] Implement reuse of l0 events in plugin #5947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Alexander Flegontov <[email protected]>
/summary:run |
/summary:run |
// This counts how many PI events used a particular L0 event. If the counter | ||
// becomes zero when we release the PI event, this means it was the last user | ||
// of the L0 event and we can safely put it back in the cache. | ||
std::unordered_map<ze_event_handle_t, size_t> ZeEventUseCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why you need any new counts for this. It should be at the time we currently call zeEventDestroy
that the event is known to be no longer needed and returned to the cache.
Also how do you ensure that the pool containing the event is not destroyed to early but is destroyed when all events in it are destroyed? (since you are adding a new structure to cache unused events it needs to sync with the exisitng pool structure). And I think they can use the same mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why you need any new counts for this. It should be at the time we currently call zeEventDestroy that the event is known to be no longer needed and returned to the cache.
Since we reuse L0 event multiple times in different PI events, there is a case when we release the PI event but we should not really add the L0 event to cache and we can safely add L0 event into cache only after last PI event is released.
So we need to know that current PI event is last which used the L0 event, for that I added new counter here.
Also how do you ensure that the pool containing the event is not destroyed to early but is destroyed when all events in it are destroyed? (since you are adding a new structure to cache unused events it needs to sync with the exisitng pool structure). And I think they can use the same mutex.
L0 events from the cache will be removed inside _pi_context::finalize right before the pool is destroyed.
Is that enough to guarantee what you say? should we still have the same mutex per pool and per cache as you suggested before?
if (auto Res = Queue->resetLastEventIfNeeded(CommandList)) | ||
return Res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is reset of the unused events can't happen in the executeCommandList?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably can, but then we should add a new parameter to executeCommandList
to separate the cases when resetLastEventIfNeeded
is executed and when it is not.
For example, one of the problematic cases: we should not reset the L0 event when closing the command-list if we take this event later as a dependency (this situation occurs when the event is from another queue, this happens inside createAndRetainPiZeEventList
for LastCommandEvent
which should be used as a dependency.)
I'm trying to say that the general rule is to reset the L0 event immediately after queue operations such as zeCommandListAppendLaunchKernel
, zeCommandListAppendMemoryCopy
etc. (or after zeCommandListAppendWaitOnEvents
if the queue operation used it, see next paragraph about it).
On the other hand, in the future we may do some optimization: currently for cases when we already had zeCommandListAppendWaitOnEvents
, we could skip it in resetLastEventIfNeeded
and for this we can add one parameter to resetLastEventIfNeeded
function, but if we transfer it to executeCommandList
, then for such optimization you have to add two parameters to executeCommandList
function. That is, we can do as you suggest, but the only question is whether it is necessary?
Hope with this effort, the reset is unnecessary. |
Signed-off-by: Alexander Flegontov [email protected]