-
Notifications
You must be signed in to change notification settings - Fork 12.2k
ggml: introduce GGML_NUMA_MIGRATE to optimize cross NUMA op computation #14232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
53370cb
f6a69ed
1f3fdfb
1fc281f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -152,6 +152,11 @@ set(GGML_BLAS_VENDOR ${GGML_BLAS_VENDOR_DEFAULT} CACHE STRING | |
"ggml: BLAS library vendor") | ||
option(GGML_LLAMAFILE "ggml: use LLAMAFILE" ${GGML_LLAMAFILE_DEFAULT}) | ||
|
||
option(GGML_NUMA_MIGRATE "ggml: use NUMA_MIGRATE" OFF) | ||
set(GGML_NUMA_MIGRATE_NODES "2" CACHE STRING | ||
"ggml: the number of NUMA nodes during page migration") | ||
option(GGML_NUMA_MIGRATE_DEBUG "ggml: enable debugging of NUMA_MIGRATE" OFF) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Avoid this by checking an environment variable instead. See |
||
|
||
option(GGML_CUDA "ggml: use CUDA" OFF) | ||
option(GGML_MUSA "ggml: use MUSA" OFF) | ||
option(GGML_CUDA_FORCE_MMQ "ggml: use mmq kernels instead of cuBLAS" OFF) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,9 @@ extern "C" { | |
struct ggml_cplan { | ||
size_t work_size; // size of work buffer, calculated by `ggml_graph_plan()` | ||
uint8_t * work_data; // work buffer, to be allocated by caller before calling to `ggml_graph_compute()` | ||
#ifdef GGML_USE_NUMA_MIGRATE | ||
uint8_t * work_data_numa[GGML_NUMA_MIGRATE_NODES]; | ||
#endif | ||
Comment on lines
+15
to
+17
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have to find a way to avoid so many There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, it's a valid concern, i'll need to find a better way. |
||
|
||
int n_threads; | ||
struct ggml_threadpool * threadpool; | ||
|
@@ -28,6 +31,9 @@ extern "C" { | |
GGML_NUMA_STRATEGY_ISOLATE = 2, | ||
GGML_NUMA_STRATEGY_NUMACTL = 3, | ||
GGML_NUMA_STRATEGY_MIRROR = 4, | ||
#ifdef GGML_USE_NUMA_MIGRATE | ||
GGML_NUMA_STRATEGY_MIGRATE = 5, | ||
#endif | ||
GGML_NUMA_STRATEGY_COUNT | ||
}; | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should become
GGML_CPU_NUMA
. AFAIU, these changes require linkinglibnuma
, is that correct?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I'm OK with the name change since it's targeted for CPU:)