Qualcomm AI Engine Direct - gpu support part1 #12165
base: main
Conversation
- rename folders in backends/qualcomm/runtime/backends
- add gpu infra
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12165
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 1 Cancelled Job as of commit f21b2b8 with merge base 929ec94.
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "release notes: qualcomm"
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpContext.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpDevice.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpGraph.h>
#include <executorch/backends/qualcomm/runtime/backends/gpu/GpuBackend.h>
I'm slightly worried about the runtime size increase, since a small binary is usually a requirement for production. Do we know how much the size increases with this PR? If I have a model that runs on HTP only, can the runtime include HTP only?
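On the HTP-only question, one common pattern (a hedged sketch, not necessarily this PR's mechanism) is to gate each backend behind its own compile-time flag, so an HTP-only build never compiles or links the GPU pieces. The QNN_ENABLE_* flags, the RegisterBackend helper, and the backend class names below are hypothetical; only the header paths come from the diff:

```cpp
// Hypothetical sketch of compile-time backend gating; none of the flags
// or helpers below exist in this PR. Header paths match the diff above.
#include <memory>

#if defined(QNN_ENABLE_HTP)
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpGraph.h>
#endif
#if defined(QNN_ENABLE_GPU)
#include <executorch/backends/qualcomm/runtime/backends/gpu/GpuBackend.h>
#endif

void RegisterEnabledQnnBackends() {
#if defined(QNN_ENABLE_HTP)
  // Compiled only when the build opts into HTP support.
  RegisterBackend(std::make_unique<HtpBackend>()); // hypothetical helper/class
#endif
#if defined(QNN_ENABLE_GPU)
  // Compiled out entirely in an HTP-only build, so the shared library
  // would not grow for users who never enable GPU.
  RegisterBackend(std::make_unique<GpuBackend>()); // hypothetical helper/class
#endif
}
```

The same flags would also have to gate which source files get compiled at all, which is where most of the size savings would come from.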
The libqnn_executorch_backend.so grows from 630,984 to 652,672 bytes (about 21 KB). We'll deprecate a few files in the next PR, which should hopefully reduce the number further.
What files will be deprecated in the next PR?
I think it will be aot/ir and runtime/backend/CustomProtocol*. Now that we've switched to the QNN IR backend (DLC) for the online-prepare path, the qcir files and the legacy code for multi-method compilation can be fully deprecated. But it would break backward compatibility, since we used to wrap the preprocess result with the custom protocol. I'll probably let you decide when the right time is to apply the change.
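For context on why the wrapping matters: the BC question reduces to whether a new runtime can still recognize old payloads. Below is a minimal sketch of one generic way to keep loading both formats; the magic constant, function names, and parser split are hypothetical illustrations, not what this PR or the follow-up actually does:

```cpp
// Hypothetical sketch: sniff a magic value at the head of the preprocess
// payload to tell a legacy custom-protocol blob apart from a raw DLC.
// kLegacyProtocolMagic and the parser names are made up for illustration.
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr uint32_t kLegacyProtocolMagic = 0x51435052; // hypothetical tag

bool IsLegacyCustomProtocol(const void* payload, std::size_t size) {
  if (payload == nullptr || size < sizeof(uint32_t)) {
    return false;
  }
  uint32_t magic = 0;
  std::memcpy(&magic, payload, sizeof(magic)); // memcpy avoids unaligned reads
  return magic == kLegacyProtocolMagic;
}

// A loader could then branch on the result:
//   if (IsLegacyCustomProtocol(buf, len)) {
//     ParseCustomProtocol(buf, len);  // old wrapped format (hypothetical)
//   } else {
//     ParseDlc(buf, len);             // new QNN IR / DLC path (hypothetical)
//   }
```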
Hi, I was wrong about the impact of deprecating those files. We still need to keep the custom protocol implementation to make the multi-graph path work.
The change is in #12583 now and will guarantee BC.
Sorry, I need to spend a bit more time on this, because we don't have CI to test the pllm model and I'm worried this will cause breakage.
No worries, I think GA decoder models are way more important than this. This PR is mainly a proof of concept that we can extend the capability of the QNN backend.
Can we prioritize running stories.pte as part of CI to prevent BC breakage? Otherwise it's hard to catch failures.
Summary
Test plan