Add zero-copy ByteBuffer image prefill API for Android LlmModule #17767
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17767
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 2 Unrelated Failures
As of commit 2852bd7 with merge base 5f879ca:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR adds new Android LlmModule image prefill overloads that accept direct ByteBuffers and new corresponding JNI bindings, aiming to reduce JNI overhead compared to the existing int[] / float[] pathways.
Changes:
- Add JNI methods to accept direct `ByteBuffer` image inputs (uint8 and normalized float).
- Add Java `LlmModule.prefillImages(ByteBuffer, ...)` and `prefillNormalizedImages(ByteBuffer, ...)` overloads and register new native methods.
- Deprecate the legacy `prefillImages(int[], ...)` overload.
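For readers unfamiliar with direct buffers, the following is a minimal caller-side sketch of how image data might be staged for the new overloads. The class `ImageBuffers` and its methods are hypothetical helpers, not part of this PR, and the call into `LlmModule` is left out because the full argument list is elided above. Native byte order for the float buffer is an assumption about what the native side expects.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper; not part of the PR. */
final class ImageBuffers {
  private ImageBuffers() {}

  /** Stages uint8 pixel bytes in a direct (off-heap) buffer. */
  static ByteBuffer packUint8(byte[] pixels) {
    ByteBuffer buf = ByteBuffer.allocateDirect(pixels.length);
    buf.put(pixels);
    buf.rewind(); // reset position to 0 so the whole image is readable
    return buf;
  }

  /** Stages normalized float pixels in a direct buffer. */
  static ByteBuffer packFloat(float[] pixels) {
    // Assumption: native code reads floats in the platform's byte order.
    ByteBuffer buf =
        ByteBuffer.allocateDirect(pixels.length * Float.BYTES)
            .order(ByteOrder.nativeOrder());
    // The float view has its own position, so buf stays at position 0.
    buf.asFloatBuffer().put(pixels);
    return buf;
  }
}
```

Note that these helpers still copy once on the Java side. The copy can only be avoided entirely when the image producer (for example a decoder or camera pipeline) writes into the direct buffer itself.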
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| extension/android/jni/jni_layer_llama.cpp | Adds new JNI entrypoints that read direct ByteBuffer data for image prefill. |
| extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java | Adds new public ByteBuffer overloads and native declarations; deprecates the int[] overload. |
eccbdf3 to 12ab370
12ab370 to 06b45f4
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
06b45f4 to 2387efc
2387efc to 2feccf7
The existing prefillImages(int[]) and prefillImages(float[]) APIs perform two full data copies across the JNI boundary: one via getRegion() into a temporary vector, then an element-by-element conversion into the final vector.
This PR adds new prefillImages(ByteBuffer) and prefillNormalizedImages(ByteBuffer) overloads that call GetDirectBytes() on direct ByteBuffers to access the native memory pointer directly, eliminating the JNI array copy overhead.
Both the Java and native layers validate dimensions, buffer size, and float alignment before accessing the buffer. The float path uses memcpy instead of reinterpret_cast to avoid undefined behavior.
Both array-based overloads are now marked @deprecated.
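The checks described above (dimensions, buffer size, float alignment) could look roughly like the sketch below on the Java side. This is an illustration under stated assumptions, not the PR's code: the class `PrefillChecks`, its method names, the use of `remaining()` for the size check, the exact-size requirement, and the position-based alignment test are all hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical validation helper; the PR's actual checks may differ. */
final class PrefillChecks {
  private PrefillChecks() {}

  /** Returns width * height * channels * bytesPerElement, rejecting bad or overflowing sizes. */
  static long expectedBytes(int width, int height, int channels, int bytesPerElement) {
    if (width <= 0 || height <= 0 || channels <= 0) {
      throw new IllegalArgumentException("width, height and channels must be positive");
    }
    // multiplyExact throws ArithmeticException on long overflow.
    long elements = Math.multiplyExact(Math.multiplyExact((long) width, (long) height), (long) channels);
    return Math.multiplyExact(elements, (long) bytesPerElement);
  }

  /** Validates a buffer before it is handed to native code. */
  static void checkImageBuffer(
      ByteBuffer image, int width, int height, int channels, int bytesPerElement) {
    if (image == null || !image.isDirect()) {
      throw new IllegalArgumentException("image must be a direct ByteBuffer");
    }
    long expected = expectedBytes(width, height, channels, bytesPerElement);
    // Assumption: the readable region must match the image size exactly.
    if (image.remaining() != expected) {
      throw new IllegalArgumentException(
          "expected " + expected + " bytes but buffer has " + image.remaining());
    }
    // Assumption: multi-byte elements must start on an element boundary.
    if (bytesPerElement > 1 && image.position() % bytesPerElement != 0) {
      throw new IllegalArgumentException("buffer position is not element-aligned");
    }
  }
}
```

Running the same checks again in native code, as the description says the PR does, guards callers that reach the JNI entry points without going through the Java wrapper.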
2feccf7 to 2852bd7