Description
CPU multithreading can be accomplished by adding `!$acc` directives to loops and compiling with the `-ta=multicore` command-line option. Since no data transfers between host and device are needed when the target is the CPU, no `update device` (and maybe even no `declare create`) directives are required, so this should be a relatively simple task. It would also require requesting multiple cores per task, but SLURM already supports this. This feature would be particularly useful for simulations that use unified memory on GH200 and MI300A chips, where `pre_process` and `post_process` can take a significant amount of time if run on only one core. It would also potentially be useful for problems that involve STLs, which require a ray tracing step in `pre_process`, and when derived quantities like vorticity or Q-criterion are needed in `post_process`. I know this works with NVHPC, but I haven't tried it with CCE yet.
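As a rough illustration of the idea, here is a minimal, self-contained sketch of a loop annotated with `!$acc` directives that can be built for a multicore CPU target. The program name, array names, and loop bodies are hypothetical and are not taken from `pre_process` or `post_process`. The compile line in the comment uses the flag mentioned above; to my understanding, newer NVHPC releases spell it `-acc=multicore`, and the `ACC_NUM_CORES` environment variable limits the number of cores used, but treat both as assumptions to check against your compiler version.

```fortran
! Hypothetical sketch, not code from this repository.
! Assumed build line (NVHPC): nvfortran -ta=multicore multicore_sketch.f90
! Without an OpenACC flag the directives are plain comments and the
! loops run serially, so the same source works either way.
program multicore_sketch
    implicit none
    integer, parameter :: n = 1000000
    integer :: i
    real(kind(0d0)), allocatable :: a(:), b(:)
    real(kind(0d0)) :: total

    allocate (a(n), b(n))

    ! Iterations are independent, so they can be split across CPU cores.
    ! No update or declare directives are used: host and "device" share
    ! the same memory for a multicore target.
    !$acc parallel loop
    do i = 1, n
        a(i) = real(i, kind(0d0))
        b(i) = 2d0*a(i) + 1d0
    end do

    ! A reduction clause is needed because every iteration adds to total.
    total = 0d0
    !$acc parallel loop reduction(+:total)
    do i = 1, n
        total = total + b(i)
    end do

    print *, 'sum =', total

    deallocate (a, b)
end program multicore_sketch
```

When run under SLURM, the job would also need to request several cores for the task (for example with `--cpus-per-task`) so that the threads have cores to run on.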