PR: Refine ggml-hexagon backend(Qualcomm Hexagon NPU backend) for latest ggml,whisper.cpp,llama.cpp #12326
Open: zhouwg wants to merge 146 commits into ggml-org:master from zhouwg:pr_to_upstream
+10,063
−0
146 commits
ef343cc ggml-qnn: add Qualcomm QNN backend for GGML (zhouwg)
8015ad7 ggml-qnn: santiy check (zhouwg)
4137ed1 ggml-qnn: update script build-run-android.sh to compare peformance of… (zhouwg)
436c599 ggml-qnn: fix minor issue in test-backend-ops.cpp (zhouwg)
7258496 ggml-qnn: merge QNN RPC feature from https://github.com/zhouwg/kantv/… (zhouwg)
b41d84e ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
d91f1ac ggml-qnn: a concise approach to offload mulmat to QNN backend(sync fr… (zhouwg)
835a9b4 ggml-qnn: remove redundant codes (zhouwg)
d563e40 ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
53ca7c0 ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
d3efd1a ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
bcd5ee8 ggml-qnn: add Qualcomm QNN backend for GGML (zhouwg)
5ccb9f2 ggml-qnn: merge QNN RPC feature from https://github.com/zhouwg/kantv/… (zhouwg)
513141f ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
c8455ea ggml-qnn: a concise approach to offload mulmat to QNN backend(sync fr… (zhouwg)
1e94524 ggml-qnn: remove redundant codes (zhouwg)
10014c4 ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
6d01dc1 ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
c750cc5 ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc (zhouwg)
36d9a23 ggml-qnn: fix a minior typo in internal doc (zhouwg)
d9a5e0f ggml-qnn: refine function ggml_qnn_create_general_tensor() to avoid c… (zhouwg)
6281630 ggml-qnn: fix a minor typo in source code (zhouwg)
f1cb636 build: avoid ggml-qnn backend breaking other backend's builds (zhouwg)
183099d ggml-qnn: remove redundant codes to make PR reviewers happy (zhouwg)
8812e72 ggml-qnn: refine code format (zhouwg)
48449ae ggml-qnn: offload quantized type mulmat to QNN backend (zhouwg)
c208133 ggml-qnn: refine source code structure to make code more clearly (zhouwg)
24c31ff ggml-qnn: enable release build with necessary logs to make reviewers … (zhouwg)
e874a5b ggml-qnn: enable all quantize type with 2d mulmat (zhouwg)
ed37e16 ggml-qnn: enable log output of GGMLQNN_LOG_INFO in command line mode … (zhouwg)
d290dc5 ggml-qnn: Windows port --- step2 (zhouwg)
3668810 ggml-qnn: merge UT code and corresponding script from local dev branc… (zhouwg)
12f0438 ggml-qnn: merge ggml_qnn_mul_mat_4d from local dev branch to make wor… (zhouwg)
e9cc7ba ggml-qnn: submit AI-assisted ggml_qnn_mul_mat_4d(not worked currently… (zhouwg)
0dbd545 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step2 (zhouwg)
5745fad ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step3 (zhouwg)
e700d2a ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step4 (zhouwg)
e5fdcb6 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step5 (zhouwg)
f53a27c ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step6 (zhouwg)
c8a8775 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step7 (zhouwg)
1c1e8d9 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step8 (zhouwg)
9796e3d ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- good in step9 (zhouwg)
ab6a2ec ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- narrow down t… (zhouwg)
df2551d ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step10 (zhouwg)
e603942 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- narrow down t… (zhouwg)
02bc00f ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step11 (zhouwg)
13b2f5c ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- both ok in st… (zhouwg)
3d92078 ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 ---finalizing ver… (zhouwg)
e2bdef3 ggml-qnn: refine ggml_qnn_mul_mat and ggml_qnn_general_node according… (zhouwg)
7df6c41 ggml-qnn: remove no-needed comments (zhouwg)
6fad271 ggml-qnn: Windows port --- step3 (zhouwg)
dc6f5e3 ggml-qnn: remove un-needed function (zhouwg)
a884d43 ggml-qnn:rebase to upstream (zhouwg)
4502022 ggml-qnn: fix a minior issue during rebase to upstream (zhouwg)
d3ced9b ggml-qnn: update script according to https://github.com/ggml-org/llam… (zhouwg)
db58469 ggml-qnn: fix a minior issue in ggmlqnn_create_general_tensor() (zhouwg)
d6c6d07 ggml-qnn: active member variable _device_id in class qnn_instance (zhouwg)
c73cf15 ggml-qnn: refine ggml_qnn_general_node and ggml_qnn_mul_mat to make c… (zhouwg)
9ff652a ggml-qnn: Windows port --- step4 (zhouwg)
05b68df ggml-qnn: Windows port -- step5 (zhouwg)
5dc4b4e ggml-qnn: WoA(Windows on ARM) -- step6 (zhouwg)
b13576a ggml-qnn: rebase to upstream (zhouwg)
f655720 ggml-qnn: pr to upstream (zhouwg)
8a9b88e ggml-qnn: rebase to upstream (zhouwg)
cf88a43 ggml-qnn: self code-review (zhouwg)
0b93da8 ggml-qnn: rebase upstream (zhouwg)
c6c6563 ggml-qnn: add approach through Hexagon cDSP (zhouwg)
7b8c9d2 ggml-qnn: refine general approach through Hexagon cDSP (zhouwg)
0b5d7a5 ggml-qnn: refine the entire ggml-qnn.cpp to make code more clear (zhouwg)
9e3ef48 ggml-qnn: refine the entire ggml-qnn.cpp to make code more clear (zhouwg)
f78beb5 ggml-qnn: add build script for libggmlop_skel.so (zhouwg)
474288e ggml-qnn: remove redundant functions in this PR and make codes more c… (zhouwg)
e45c627 ggml-qnn: original ggml_compute_forward_add and ggml_compute_forward_… (zhouwg)
d911099 ggml-qnn: modify build-run-android.sh to verify mulmat and validate m… (zhouwg)
2690a5c ggml-qnn: make host code(ggml-qnn.cpp) more clear and more stable (zhouwg)
320ef55 ggml-qnn: refine code according to self code-review and make code mor… (zhouwg)
eb19589 ggml-qnn: offload more ggml op to Hexagon cDSP (zhouwg)
23ef20f ggml-hexagon: code on AP(arm-cpu) side is stable now (zhouwg)
b8976d4 ggml-hexagon: optimize GGML_OP_ADD on cDSP side (zhouwg)
1835aac ggml-hexagon: simplify hexagon-kernel build logic in CMakeLists.txt (zhouwg)
767734e ggml-hexagon: release ggml-hexagon v0.98 (zhouwg)
c5d897f ggml-hexagon: release ggml-hexagon v0.99 (zhouwg)
ea595d0 ggml-hexagon: try to offload q6_k mulmat to cDSP (zhouwg)
3897dc3 ggml-hexagon: fix minior issue in ggml-hexagon.cpp after self code-re… (zhouwg)
5201594 ggml-hexagon: check validation of ggml-hexagon.cfg before create appr… (zhouwg)
686a1c8 ggml-hexagon: fix all compiler warnings in ggml-hexagon.cpp (zhouwg)
e58cd8d ggml-hexagon: enable only one backend device for HWACCEL_CDSP and ena… (zhouwg)
da08bfa ggml-hexagon: rpc ion memory pool and test-backend-ops works fine in … (zhouwg)
15c1f79 ggml-hexagon: make comprision of mulmat performance between HWACCEL_Q… (zhouwg)
4f80ac9 ggml-hexagon: release ggml-hexagon v1.00 (zhouwg)
b191a7b ggml-hexagon: rebase to upstream (zhouwg)
d242bc1 ggml-hexagon: check configuration of enable_rpc_dma_mempool in functi… (zhouwg)
36754a6 ggml-hexagon: uniform rpc_ion_memsize and rpc_ion_usage between HWACC… (zhouwg)
ce047b6 ggml-hexagon: make buffer mechanism more clear in HWACCEL_CDSP approach (zhouwg)
e92ffdb ggml-hexagon: add perf function in hexagon kernerls on cDSP side (zhouwg)
ee733dd ggml-hexagon: fix a stupid issue of why set rpc latency failure and i… (zhouwg)
fd10234 ggml-hexagon: make helper function ggmlhexagon_get_timestring() threa… (zhouwg)
0ebec99 ggml-hexagon: fix a typo in ggml-hexagon.cpp (zhouwg)
baecc2d ggml-hexagon: list all known todo and fixme tasks in ggml-hexagon.cpp (zhouwg)
8b58002 ggml-hexagon: fix units MB -> MiB (zhouwg)
ba4aaa9 ggml-hexagon: try to make ggml-hexagon backend works fine in a standa… (zhouwg)
fc1d9db ggml-hexagon: remove reduament code and make debug log more clear (zhouwg)
c75df4e ggml-hexagon: add gemma-3-4b-it-Q8_0.gguf to verify q8_0 mulmat on cDSP (zhouwg)
7fbae90 ggml-hexagon:add skeleton code of offload GGML_OP_SOFT_MAX/GGML_OP_RM… (zhouwg)
48a5ef5 ggml-hexagon: release ggml-dsp v0.60 on cDSP side (zhouwg)
07a4826 ggml-hexagon: merge build logic in kernels/Makefile to ggml-hexagon/C… (zhouwg)
3d2acf2 ggml-hexagon: fix a typo in ggml-hexagon.cpp (zhouwg)
473ea76 ggml-hexagon: uniform NDEBUG usage in ggml-hexagon.cpp and ggml-dsp.c (zhouwg)
9ebc58e ggml-hexagon: add profiler feature for purpose of visualize NPU perfo… (zhouwg)
c9ecd60 ggml-hexagon: remove so-called dma memory pool to avoid confusion and… (zhouwg)
83b0e4f ggml-hexagon: make function ggmlhexagon_init_rpcmempool in ggml-hexag… (zhouwg)
3a34101 ggml-hexagon: fix potential resource leak in class hexagon_profiler (zhouwg)
98fdc28 ggml-hexagon: enable multi-threading feature on cDSP side (zhouwg)
880976f ggml-hexagon: upgrade QNN SDK to v2.33.0.250327 (zhouwg)
67551bb ggml-hexagon: fix typo in ggml-hexagon.cpp (zhouwg)
9d43167 ggml-dsp: probe QuRT RTOS information in function ggmlop_dsp_open (zhouwg)
0b28da9 ggml-hexagon: setting enable_rpc_ion_mempool to 1 and make test-backe… (zhouwg)
ea970ca ggml-hexagon: check whether user's specified htp arch is valid in CMa… (zhouwg)
f12593a ggml-hexagon: sync with upstream (zhouwg)
828d465 ggml-hexagon: refine pinned-memory feature (zhouwg)
9839bd0 ggml-hexagon: refine build system in ggml-hexagon (zhouwg)
65c377a ggml-hexagon: remove redundant code in struct ggml_backend_hexagon_bu… (zhouwg)
7ad26b6 ggml-hexagon: upgrade Android NDK to android-ndk-r28 (zhouwg)
db15b6c ggml-dsp: split ggml-dsp.c into multiple files and cleanup (zhouwg)
a37f1b5 ggml-dsp: refine ggml-dsp and make ggml-dsp more clear (zhouwg)
90b2dc0 ggml-hexagon: fix a minior issue in dev ops (zhouwg)
e9bfbce ggml-hexagon: fix a build issue in CI (zhouwg)
4359824 ggml-dsp: cleanup code (zhouwg)
7bb2774 ggml-hexagon: sync with upstream (zhouwg)
0451d53 ggml-dsp: cleanup code (zhouwg)
da2545d ggml-dsp:refine ggmlhexagon_dsp_add_f32 (zhouwg)
80330d3 ggml-dsp: refine logic of thread_counts (zhouwg)
7f11fc1 ggml-hexagon: release v1.06 and ready for code review (zhouwg)
2285eb3 ggml-dsp: make GGML_OP_ADD more faster on cDSP side (zhouwg)
70206d7 ggml-hexagon: sync from project kantv(make ggml-hexagon backend can w… (zhouwg)
b79f396 sync with upstream llama.cpp and sync ggml-hexagon.cpp from project k… (zhouwg)
35bfc28 sync with upstream (zhouwg)
3ab7ddb sync with upstream (zhouwg)
5bbcd23 ggml-hexagon: upgrade QNN SDK to v2.34.0.250424 (zhouwg)
770061f sync with upstream (zhouwg)
5a588d1 ggml-hexagon: sync from project kantv(fix a long-term issue which int… (zhouwg)
057bf1b ggml-hexagon: sync with upstream llama.cpp (zhouwg)
700f039 ggml-hexagon: add set_hexagon_cfg(int new_hexagon_backend, int new_hw… (zhouwg)
0ef1e49 ggml-hexagon: sync with branch self-build (zhouwg)
1245c4e ggml-hexagon:sycn with branch self-build (zhouwg)
2864ed9 project: sync with upstream(PR-14501:remove kompute backend) (zhouwg)
@@ -146,3 +146,5 @@ poetry.toml
# Local scripts
/run-vim.sh
/run-chat.sh

/prebuilts
@@ -0,0 +1,48 @@
#pragma once

#include "ggml.h"
#include "ggml-backend.h"

#ifdef __cplusplus
extern "C" {
#endif

#define GGML_HEXAGON_MAX_DEVICES 4
#define GGML_HEXAGON_BACKEND_NAME "hexagon"

enum HEXAGONBackend {
    HEXAGON_BACKEND_QNNCPU = 0,
    HEXAGON_BACKEND_QNNGPU = 1,
    HEXAGON_BACKEND_QNNNPU = 2,
    HEXAGON_BACKEND_CDSP = 3,
    HEXAGON_BACKEND_GGML = 4, //"fake" HEXAGON backend for compare performance between HEXAGON backend and ggml backend
};

//0: general approach through QNN:offload ggmlop to QNN(QNNCPU, QNNGPU, QNNNPU)
//1: special approach through QNN-SINGLEGRAPH:mapping entire ggml cgraph to a single QNN graph
//2: general approach through Hexagon cDSP:offload ggmlop to Hexagon cDSP directly
enum hwaccel_approach_type {
    HWACCEL_QNN = 0,
    HWACCEL_QNN_SINGLEGRAPH= 1,
    HWACCEL_CDSP = 2,
};

GGML_BACKEND_API ggml_backend_t ggml_backend_hexagon_init(size_t dev_num, const char * qnn_lib_path);

GGML_BACKEND_API bool ggml_backend_is_hexagon(ggml_backend_t backend);

GGML_BACKEND_API int ggml_backend_hexagon_get_device_count(void);

GGML_BACKEND_API ggml_backend_reg_t ggml_backend_hexagon_reg(void);

GGML_BACKEND_API const char * ggml_backend_hexagon_get_devname(size_t dev_num);

GGML_BACKEND_API void ggml_backend_hexagon_set_cfg(int new_hexagon_backend, int new_hwaccel_approach);

GGML_BACKEND_API int ggml_backend_hexagon_get_mulmat_algotype(void);

GGML_BACKEND_API void ggml_backend_hexagon_set_mulmat_algotype(int new_mulmat_algotype);

#ifdef __cplusplus
}
#endif
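The header pairs a backend id with an acceleration approach in `ggml_backend_hexagon_set_cfg(int new_hexagon_backend, int new_hwaccel_approach)`, and the comments above `hwaccel_approach_type` suggest which pairs make sense. The sketch below is one possible reading of those comments, not code from this PR: it copies the two enums so that it compiles without the ggml tree, and the helpers `sketch_backend_name` and `sketch_cfg_is_consistent` are hypothetical names introduced only for illustration. The rule that the QNN approaches go with the three QNN backends and `HWACCEL_CDSP` goes only with `HEXAGON_BACKEND_CDSP` is an assumption, as is leaving `HEXAGON_BACKEND_GGML` outside both families.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copies of the enums from the header in this PR, repeated here so the
 * sketch builds standalone. */
enum HEXAGONBackend {
    HEXAGON_BACKEND_QNNCPU = 0,
    HEXAGON_BACKEND_QNNGPU = 1,
    HEXAGON_BACKEND_QNNNPU = 2,
    HEXAGON_BACKEND_CDSP   = 3,
    HEXAGON_BACKEND_GGML   = 4,
};

enum hwaccel_approach_type {
    HWACCEL_QNN             = 0,
    HWACCEL_QNN_SINGLEGRAPH = 1,
    HWACCEL_CDSP            = 2,
};

/* Compile-time guard: the copied values must match the header's values. */
static_assert(HEXAGON_BACKEND_GGML == 4, "enum copy out of sync");
static_assert(HWACCEL_CDSP == 2, "enum copy out of sync");

/* Hypothetical helper (not part of the PR): short label for a backend id.
 * The label strings are made up for this sketch. */
static const char * sketch_backend_name(int backend) {
    switch (backend) {
        case HEXAGON_BACKEND_QNNCPU: return "qnn-cpu";
        case HEXAGON_BACKEND_QNNGPU: return "qnn-gpu";
        case HEXAGON_BACKEND_QNNNPU: return "qnn-npu";
        case HEXAGON_BACKEND_CDSP:   return "cdsp";
        case HEXAGON_BACKEND_GGML:   return "ggml";
        default:                     return "unknown";
    }
}

/* Hypothetical helper (not part of the PR). ASSUMPTION drawn from the header
 * comments: both QNN approaches target QNNCPU/QNNGPU/QNNNPU, and the cDSP
 * approach targets only HEXAGON_BACKEND_CDSP. */
static bool sketch_cfg_is_consistent(int backend, int approach) {
    switch (approach) {
        case HWACCEL_QNN:
        case HWACCEL_QNN_SINGLEGRAPH:
            return backend >= HEXAGON_BACKEND_QNNCPU &&
                   backend <= HEXAGON_BACKEND_QNNNPU;
        case HWACCEL_CDSP:
            return backend == HEXAGON_BACKEND_CDSP;
        default:
            return false;
    }
}
```

A caller could run such a check before invoking `ggml_backend_hexagon_set_cfg`; whether the backend itself rejects inconsistent pairs is not visible in this part of the diff.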
@@ -0,0 +1,133 @@
project(ggml-hexagon)
message(STATUS "Using HEXAGON backend")
message("CMAKE_SYSTEM_NAME : ${CMAKE_SYSTEM_NAME}")

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

if(NOT DEFINED QNN_SDK_PATH)
    message(FATAL_ERROR "QNN_SDK_PATH not defined")
endif()

if(NOT DEFINED HEXAGON_SDK_PATH)
    message(FATAL_ERROR "HEXAGON_SDK_PATH not defined")
endif()

message("QNN_SDK_PATH : ${QNN_SDK_PATH}")
message("HEXAGON_SDK_PATH: ${HEXAGON_SDK_PATH}")
message("HTP_ARCH_VERSION: ${HTP_ARCH_VERSION}")

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(DEBUG_FLAG "-DDEBUG -Wall")
    message("Debug mode:${DEBUG_FLAG}")
else()
    set(DEBUG_FLAG "-DNDEBUG -Wall")
    #manually disable all verbose logs in ggml-hexagon/CMakeLists.txt to
    #make compare NPU performance through llama-bench more clear
    #set(DEBUG_FLAG "-DNDEBUG -Wall -DDISABLE_ALL_LOG")
    message("Release mode:${DEBUG_FLAG}")
endif()

#v68 --- Snapdragon 888
#v69 --- Snapdragon 8 Gen1
#v73 --- Snapdragon 8 Gen2
#v75 --- Snapdragon 8 Gen3
#v79 --- Snapdragon 8 Elite
if(NOT DEFINED HTP_ARCH_VERSION)
    message(FATAL_ERROR "HTP_ARCH_VERSION not defined, valid htp arch: v68,v69,v73,v75,v79")
endif()

#check whether user's specified htp arch is valid
set(CHECK_HTP_ARCH "WRONG")
foreach (feat v68 v69 v73 v75 v79)
    if (${feat} STREQUAL ${HTP_ARCH_VERSION})
        set(CHECK_HTP_ARCH "GOOD")
    endif()
endforeach()
if (${CHECK_HTP_ARCH} STREQUAL "WRONG")
    message(FATAL_ERROR "ggml-hexagon backend only support htp arch v68,v69,v73,v75,v79")
endif()

#check optimization flags
set(OPT_FLAG " ")
if (${HTP_ARCH_VERSION} STREQUAL "v75" OR ${HTP_ARCH_VERSION} STREQUAL "v79")
    #works fine on Snapdragon 8Gen3&8Elite with 1.5x - 3x performance gains with the default ggml backend
    set(OPT_FLAG " -O3 -march=armv8.7-a -mcpu=cortex-x1 -mtune=cortex-x1 -flto -D_GNU_SOURCE -fvectorize -ffp-model=fast -fno-finite-math-only")
endif()
message("OPT_FLAG:${OPT_FLAG}")

if(CMAKE_SYSTEM_NAME STREQUAL "Android")
    find_library(LOG_LIB log)

    add_library(cdsprpc
        SHARED
        IMPORTED)
    set_target_properties(cdsprpc
        PROPERTIES
        IMPORTED_LOCATION
        ${HEXAGON_SDK_PATH}/ipc/fastrpc/remote/ship/android_aarch64/libcdsprpc.so)

    set(QNN_LINK_LIBRARIES ${LOG_LIB} cdsprpc)
    set(QNN_DEFAULT_LIB_SEARCH_PATH "/data/local/tmp/" CACHE STRING "customized library search path for QNN backend")

    include_directories(${HEXAGON_SDK_PATH}/incs)
    include_directories(${HEXAGON_SDK_PATH}/incs/stddef)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/incs)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/rpcmem/inc)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/remote/ship/android_Debug_aarch64)
    include_directories(${HEXAGON_SDK_PATH}/utils/examples)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/rtld/ship/android_aarch64)
    include_directories(${HEXAGON_SDK_PATH}/libs/atomic/inc)
    include_directories(${HEXAGON_SDK_PATH}/libs/atomic/android_Debug_aarch64/ship)
    include_directories(${CMAKE_SOURCE_DIR}/ggml/src/ggml-hexagon/)
    include_directories(${CMAKE_SOURCE_DIR}/ggml/src/ggml-hexagon/kernels/)
elseif(CMAKE_SYSTEM_NAME STREQUAL "Windows")
    set(QNN_DEFAULT_LIB_SEARCH_PATH "C:\\" CACHE STRING "customized library search path for QNN backend")
else()
    message(FATAL_ERROR "ggml-hexagon now only available on Android and Windows(Windows on ARM)")
endif()

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -DGGML_USE_HEXAGON ${DEBUG_FLAG} ${OPT_FLAG}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DGGML_USE_HEXAGON ${DEBUG_FLAG} ${OPT_FLAG}")
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} -DGGML_USE_HEXAGON ${DEBUG_FLAG} ${OPT_FLAG}")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -DGGML_USE_HEXAGON ${DEBUG_FLAG} ${OPT_FLAG}")

file(GLOB HEXAGON_SOURCES "${CMAKE_CURRENT_LIST_DIR}/*.cpp" "${CMAKE_CURRENT_LIST_DIR}/kernels/stub.c")
ggml_add_backend_library(ggml-hexagon ${HEXAGON_SOURCES})

target_include_directories(ggml-hexagon PRIVATE ${QNN_SDK_PATH}/include/QNN ${HEXAGON_SDK_PATH} ${CMAKE_CURRENT_LIST_DIR})
target_link_libraries(ggml-hexagon PRIVATE ${QNN_LINK_LIBRARIES})

string(REGEX REPLACE "/$" "" QNN_DEFAULT_LIB_SEARCH_PATH "${QNN_DEFAULT_LIB_SEARCH_PATH}")
target_compile_definitions(ggml-hexagon PRIVATE QNN_DEFAULT_LIB_SEARCH_PATH="${QNN_DEFAULT_LIB_SEARCH_PATH}/")

#cross compiling source codes of hexagon kernels which running on cDSP side
function(ggml_hexagon_build_kernel KNAME)
    message(STATUS "ggml_hexagon: build hexagon-kernel ${KNAME}")

    add_custom_command(
        TARGET ${PROJECT_NAME}
        POST_BUILD
        COMMAND echo "current working path:`pwd`\n"
        COMMAND echo "${CMAKE_CURRENT_LIST_DIR}/kernels"
        COMMAND make -C ${CMAKE_CURRENT_LIST_DIR}/kernels/ clean
        COMMAND make -C ${CMAKE_CURRENT_LIST_DIR}/kernels/ HEXAGON_SDK_PATH=${HEXAGON_SDK_PATH} HTP_ARCH_VERSION=${HTP_ARCH_VERSION} DEBUG_FLAG=${DEBUG_FLAG}
        COMMAND echo "current working path:`pwd`\n"
        COMMAND ls -l ../../../bin/libggmldsp-skel.so
        COMMENT "build hexagon-kernel"
    )
endfunction()

function(ggml_hexagon_setup_cfg KNAME)
    message(STATUS "ggml_hexagon: setup runtime configuration file ${KNAME}")
    add_custom_command(
        TARGET ${PROJECT_NAME}
        POST_BUILD
        COMMAND echo "current working path:`pwd`\n"
        COMMAND /bin/cp -fv ../../../../../scripts/${KNAME} ../../../bin/
        COMMENT "setup runtime configuration file"
    )
endfunction()

ggml_hexagon_build_kernel("cdsp")
ggml_hexagon_setup_cfg("ggml-hexagon.cfg")
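The CMake file accepts exactly five `HTP_ARCH_VERSION` values, documents which Snapdragon SoC each one corresponds to, and enables the extra optimization flags only for v75 and v79. For readers who want the same rules on the host side, here is a minimal C sketch of that table. It is not code from this PR: `struct htp_arch_entry`, `sketch_htp_arch_to_soc` and `sketch_htp_arch_gets_opt_flags` are hypothetical names, and only the arch strings and SoC names come from the comments in the CMake file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row per HTP arch listed in the CMake comments. */
struct htp_arch_entry {
    const char * arch; /* value expected in HTP_ARCH_VERSION */
    const char * soc;  /* SoC named in the CMake comment */
};

static const struct htp_arch_entry k_htp_archs[] = {
    { "v68", "Snapdragon 888"     },
    { "v69", "Snapdragon 8 Gen1"  },
    { "v73", "Snapdragon 8 Gen2"  },
    { "v75", "Snapdragon 8 Gen3"  },
    { "v79", "Snapdragon 8 Elite" },
};

/* Hypothetical helper: returns the SoC name for a supported arch string,
 * or NULL when the arch is not in the list (the case where the CMake
 * check stops the configure step with FATAL_ERROR). The comparison is
 * case-sensitive, like CMake's STREQUAL. */
static const char * sketch_htp_arch_to_soc(const char * arch) {
    if (arch == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < sizeof(k_htp_archs) / sizeof(k_htp_archs[0]); i++) {
        if (strcmp(k_htp_archs[i].arch, arch) == 0) {
            return k_htp_archs[i].soc;
        }
    }
    return NULL;
}

/* Hypothetical helper: true for the two archs that receive the non-empty
 * OPT_FLAG in the CMake file (v75 and v79). */
static bool sketch_htp_arch_gets_opt_flags(const char * arch) {
    return arch != NULL &&
           (strcmp(arch, "v75") == 0 || strcmp(arch, "v79") == 0);
}
```

Keeping the list in one table means a new arch would need a single added row here, whereas the CMake file repeats the list in the comments, the `foreach` and both error messages.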