
Conversation

noemotiovon (Collaborator)
What does this PR do?

  • Added a check to skip aclrtSetDevice when the requested device is already the current device on the calling thread (see the sketch below).
  • Avoids unnecessary device context switches while preserving thread/device consistency.
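
For illustration, a minimal sketch of the check described above. It assumes the CANN runtime's `aclrtGetDevice`/`aclrtSetDevice` APIs; the wrapper name `cann_set_device_cached` and the error reporting are hypothetical and not the exact code in this patch (the backend's own wrapper and ACL_CHECK macro are used there).

```cpp
#include <acl/acl.h>
#include <cstdio>

// Hypothetical sketch: query the device currently bound to this thread and
// only call aclrtSetDevice when it actually differs from the requested one.
static void cann_set_device_cached(int32_t device) {
    int32_t current = -1;
    // aclrtGetDevice returns an error if no device has been bound on this
    // thread yet; in that case fall through and bind the device explicitly.
    if (aclrtGetDevice(&current) == ACL_SUCCESS && current == device) {
        return;  // already bound to the requested device, skip the context switch
    }
    const aclError err = aclrtSetDevice(device);
    if (err != ACL_SUCCESS) {
        // the real backend aborts through its error-checking macro; here we just report
        std::fprintf(stderr, "aclrtSetDevice(%d) failed with error %d\n", device, err);
    }
}
```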

@noemotiovon noemotiovon added the labels Ascend NPU (issues specific to Ascend NPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Sep 11, 2025
@noemotiovon noemotiovon (Collaborator, Author) commented Sep 11, 2025

Qwen2.5-0.5B model inference test on 2 NPUs:

......
llama_perf_sampler_print:    sampling time =      40.70 ms /   175 runs   (    0.23 ms per token,  4299.54 tokens per second)
llama_perf_context_print:        load time =    7582.66 ms
llama_perf_context_print: prompt eval time =      26.60 ms /    20 tokens (    1.33 ms per token,   751.94 tokens per second)
llama_perf_context_print:        eval time =     709.67 ms /   154 runs   (    4.61 ms per token,   217.00 tokens per second)
llama_perf_context_print:       total time =    1885.73 ms /   174 tokens
llama_perf_context_print:    graphs reused =        153

@hipudding hipudding (Collaborator) left a comment


LGTM

@noemotiovon noemotiovon force-pushed the set_device_opti branch 2 times, most recently from b711387 to 4a99538 on September 13, 2025 at 09:25
Labels: Ascend NPU (issues specific to Ascend NPUs), ggml (changes relating to the ggml tensor library for machine learning)