Use OpenACC multithreading in pre and post process #755

@wilfonba

Description

CPU multithreading can be accomplished easily by adding !$acc directives to loops and compiling with the -ta=multicore command line option. Since no data movement between host and device is needed when targeting the host cores, no update device (and maybe not even declare create) directives are required, so this should be a relatively simple task. It would also require requesting multiple cores per task, but SLURM already supports this. This feature would be particularly useful for simulations that use unified memory on GH200 and MI300A chips, where pre_process and post_process can take a significant amount of time if run on only one core. It could also be useful for problems that involve STLs, which require a ray tracing step in pre_process, and for cases where derived quantities like vorticity or Q-criterion are needed in post_process. I know this works with NVHPC, but I haven't tried it with CCE yet.
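
To make the idea concrete, here is a minimal standalone sketch of what such a loop could look like. It is not MFC code: the program name, array names (u, v, omega), grid size, and the solid-body-rotation test field are all made up for illustration. It computes a 2D vorticity field with central differences and uses only !$acc parallel loop directives, with no data clauses, on the assumption that the multicore target shares host memory.

```fortran
! Hypothetical sketch, not taken from MFC. All names and sizes are illustrative.
program multicore_vorticity_sketch
    implicit none
    integer, parameter :: n = 512
    real(kind(0d0)), parameter :: dx = 1d0/(n - 1), dy = 1d0/(n - 1)
    real(kind(0d0)), allocatable :: u(:, :), v(:, :), omega(:, :)
    real(kind(0d0)) :: err
    integer :: i, j

    allocate (u(n, n), v(n, n), omega(n, n))

    ! Solid-body rotation: u = -y, v = x, so dv/dx - du/dy = 2 everywhere.
    ! No data clauses: when targeting host cores the arrays are already in place.
    !$acc parallel loop collapse(2)
    do j = 1, n
        do i = 1, n
            u(i, j) = -(j - 1)*dy
            v(i, j) = (i - 1)*dx
        end do
    end do

    omega = 0d0

    ! Central-difference vorticity on interior points; iterations are independent.
    !$acc parallel loop collapse(2)
    do j = 2, n - 1
        do i = 2, n - 1
            omega(i, j) = (v(i + 1, j) - v(i - 1, j))/(2d0*dx) &
                          - (u(i, j + 1) - u(i, j - 1))/(2d0*dy)
        end do
    end do

    ! Sanity check against the analytic value of 2.
    err = maxval(abs(omega(2:n - 1, 2:n - 1) - 2d0))
    if (err > 1d-9) error stop 'vorticity check failed'
    print *, 'max |omega - 2| on interior points:', err
end program multicore_vorticity_sketch
```

With NVHPC this should build for the host cores using the flag mentioned above (recent releases also accept -acc=multicore), and the thread count can be limited with the ACC_NUM_CORES environment variable, which would pair with a SLURM --cpus-per-task request. A compiler without OpenACC enabled treats the directives as comments, so the same source still runs serially.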
