RFC: Add support for launch_attr in LaunchConfig ctor #496

@realarnavgoel

Description

Today, LaunchConfig only supports the cuLaunchKernel driver API, which launches kernels on a single GPU. Broader use cases that need inter-SM or multi-GPU synchronization must launch kernels with cuLaunchCooperativeKernel to be safe and deadlock-free. To support this, LaunchConfig could be extended with an optional launch_attr argument, i.e. LaunchConfig(..., launch_attr=None), that accepts the cuda-python equivalent of CUlaunchAttribute.
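To make the proposed constructor shape concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: `LaunchAttrID`, `LaunchAttr`, `LaunchConfigSketch`, and `is_cooperative` are stand-ins invented for illustration, not the cuda.core or cuda-python types, and the real LaunchConfig has more fields and validation.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Sequence, Tuple


class LaunchAttrID(Enum):
    """Hypothetical stand-in for CUlaunchAttributeID (not the real binding)."""
    COOPERATIVE = auto()


@dataclass(frozen=True)
class LaunchAttr:
    """Hypothetical stand-in for a CUlaunchAttribute: an ID plus a value."""
    id: LaunchAttrID
    value: int = 1


@dataclass
class LaunchConfigSketch:
    """Illustrative only; not the real cuda.core LaunchConfig."""
    grid: Tuple[int, ...]
    block: Tuple[int, ...]
    # Proposed optional argument; None keeps today's behavior.
    launch_attr: Optional[Sequence[LaunchAttr]] = None

    def is_cooperative(self) -> bool:
        # True if any attribute requests a cooperative launch.
        return any(
            a.id is LaunchAttrID.COOPERATIVE and a.value
            for a in (self.launch_attr or ())
        )


default_cfg = LaunchConfigSketch(grid=(4,), block=(128,))
coop_cfg = LaunchConfigSketch(
    grid=(4,),
    block=(128,),
    launch_attr=[LaunchAttr(LaunchAttrID.COOPERATIVE)],
)
```

Defaulting launch_attr to None means existing callers would be unaffected, and only code that opts in would change launch behavior.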

Background:
This issue came out of the discussion in NVIDIA/numba-cuda#128 (comment). The existing CUDA driver bindings in numba-cuda use either cuLaunchCooperativeKernel or cuLaunchKernel, depending on whether the kernel contains grid.sync(). Migrating that code to cuda.core requires a way to select the launch API variant at runtime based on the LaunchConfig.
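The runtime selection described above could look roughly like the sketch below. It is not numba-cuda's or cuda.core's actual code: the function names are hypothetical, the marker string is an assumption about what shows up in the compiled code of a kernel that calls grid.sync(), and the function only returns the name of the driver API rather than calling it.

```python
# Assumed for illustration: a symbol expected to appear in the compiled code
# of kernels that call grid.sync(). Verify against the real toolchain.
ASSUMED_GRID_SYNC_MARKER = "cudaCGGetIntrinsicHandle"


def needs_cooperative_launch(kernel_asm: str,
                             marker: str = ASSUMED_GRID_SYNC_MARKER) -> bool:
    """Hypothetical check: does the compiled kernel reference grid sync?"""
    return marker in kernel_asm


def select_launch_api(cooperative: bool) -> str:
    """Return the name of the driver API variant that would be used."""
    return "cuLaunchCooperativeKernel" if cooperative else "cuLaunchKernel"


# Toy inputs standing in for compiled kernel text.
plain_asm = "ld.param.u64 %rd1, [kernel_param_0];"
sync_asm = "call.uni (retval0), cudaCGGetIntrinsicHandle, (param0);"

plain_api = select_launch_api(needs_cooperative_launch(plain_asm))
sync_api = select_launch_api(needs_cooperative_launch(sync_asm))
```

With the proposal in this issue, the cooperative flag would come from the LaunchConfig instead of being inferred by the launcher, so the caller states the requirement explicitly.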

Metadata

Assignees

Labels

P1: Medium priority - Should do
cuda.core: Everything related to the cuda.core module
feature: New feature or request

Type

No type

Projects

Status

Todo

Relationships

None yet

Development

No branches or pull requests
