Add model metadata storage in PTE via NamedData (#19466)
kirklandsign wants to merge 1 commit into
@kirklandsign has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104322796.

@claude review
Claude finished @kirklandsign's task in 1m 56s.

Code Review: Add model metadata storage in PTE via NamedData

Overall this is a clean, well-scoped addition. The design is sensible: leveraging the existing NamedData mechanism for lightweight metadata is much better than the current constant_methods approach. A few items worth addressing:

Issues

1. Missing
   The Python side has
2. In Python,
3. Accessing private edge_manager._named_data_store.add_named_data(...)
   This accesses an internal attribute directly. If the
4. No CMake target for the C++ metadata library
   The

Minor / Nits

5. auto buffer = std::move(result.get());
   // ...
   buffer.Free();
   Since
6. Encoding format is implicitly tied between Python and C++
   The wire format (little-endian int64, uint32 length-prefix for lists, etc.) is defined implicitly by matching
7. The deferred import of
8. No validation of key format (
   The docstring says keys use

Positive Notes
Pull request overview
Adds a lightweight mechanism to embed and retrieve model-level metadata in .pte files by storing key/value blobs in the existing NamedData section, avoiding the heavier constant_methods approach for simple constants.
Changes:
- Introduces a Python export helper (add_metadata) plus PTE-side readers for typed access (string/int/float/int-list).
- Adds a small C++ runtime helper header for reading metadata values from runtime::NamedDataMap.
- Adds a Python round-trip test and corresponding BUCK targets.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| extension/llm/runner/targets.bzl | Adds a new header-only C++ library target for metadata helpers. |
| extension/llm/runner/metadata.h | C++ helpers to decode int/string/int-list metadata values from NamedDataMap. |
| extension/llm/export/metadata.py | Python API to encode/store metadata into NamedData during export and read it back from PTE. |
| extension/llm/export/test/test_metadata_roundtrip.py | Round-trip tests for metadata storage and typed decoding. |
| extension/llm/export/test/BUCK | Adds BUCK test target for the new round-trip test. |
| extension/llm/export/BUCK | Adds BUCK python_library target for the new metadata module. |
size_t expected = sizeof(uint32_t) + count * sizeof(int64_t);
if (buffer.size() != expected) {
  buffer.Free();
  return runtime::Error::InvalidArgument;
}
std::vector<int64_t> values(count);
std::memcpy(
    values.data(),
    static_cast<const char*>(buffer.data()) + sizeof(uint32_t),
    count * sizeof(int64_t));
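The size check in the C++ snippet above can be mirrored on the Python side. The following is a minimal sketch, assuming the int-list layout described in this PR (a little-endian uint32 count followed by that many little-endian int64 values). The function name `decode_int_list` is hypothetical and is not taken from the PR.

```python
import struct

# Hypothetical helper, not part of the PR: decodes the int-list layout
# described in the review (uint32 count, then `count` int64 values,
# all little-endian).
def decode_int_list(buf: bytes) -> list[int]:
    header = struct.calcsize("<I")  # size of the length prefix
    if len(buf) < header:
        raise ValueError("buffer too short for the length prefix")
    (count,) = struct.unpack_from("<I", buf, 0)
    expected = header + count * struct.calcsize("<q")
    # Same exact-size check as the C++ snippet: reject truncated or
    # oversized payloads before copying anything out.
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    return list(struct.unpack(f"<{count}q", buf[header:]))


payload = struct.pack("<I3q", 3, 1, 2, 128001)
print(decode_int_list(payload))
```

Python integers do not overflow, so the `expected` computation here is safe as written; in C++ the equivalent multiplication may deserve an explicit overflow guard on platforms where `size_t` is 32 bits wide.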
def _encode_value(key: str, value: MetadataValue) -> bytes:
    if isinstance(value, str):
        return value.encode("utf-8")
    elif isinstance(value, (list, tuple)):
        return struct.pack(f"<I{len(value)}q", len(value), *value)
    elif isinstance(value, int):
        return struct.pack("<q", value)
    elif isinstance(value, float):
        return struct.pack("<d", value)
    elif isinstance(value, bytes):
        return value
    raise TypeError(f"Unsupported metadata type {type(value)} for key '{key}'")
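Two properties of the encoder above may be worth spelling out. The payload carries no type tag, so a reader has to know the expected type for each key, and `bool` is a subclass of `int` in Python, so a boolean flag would take the int64 branch. The sketch below illustrates both; the `encode_scalar`, `decode_int`, `decode_float`, and `decode_string` names are hypothetical stand-ins, not the PR's actual readers.

```python
import struct

# Hypothetical stand-in for the scalar branches of the encoder above.
def encode_scalar(value) -> bytes:
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, int):  # note: also matches bool
        return struct.pack("<q", value)
    if isinstance(value, float):
        return struct.pack("<d", value)
    raise TypeError(f"unsupported scalar type {type(value)}")


# Hypothetical typed readers: the caller chooses the type, because the
# bytes alone do not say whether they hold an int64 or a float64.
def decode_int(buf: bytes) -> int:
    if len(buf) != 8:
        raise ValueError("int64 payload must be exactly 8 bytes")
    return struct.unpack("<q", buf)[0]


def decode_float(buf: bytes) -> float:
    if len(buf) != 8:
        raise ValueError("float64 payload must be exactly 8 bytes")
    return struct.unpack("<d", buf)[0]


def decode_string(buf: bytes) -> str:
    return buf.decode("utf-8")


print(decode_int(encode_scalar(True)))      # a bool round-trips as an int
print(decode_float(encode_scalar(0.5)))
print(decode_int(encode_scalar(1.0)))       # wrong reader: reinterprets float bits
```

Reading a float payload with the int reader does not fail, since both are 8 bytes; it silently returns the reinterpreted bit pattern, which is one reason the later discussion about keeping keys and dtypes in sync matters.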
metadata: Dict mapping string keys to values (str, int, float, or bytes).
    Keys are automatically prefixed with "metadata." to avoid collision
    with backend named data.
785ac68 to a7031c3
Summary:
Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.
Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float
Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.
Differential Revision: D104322796
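The key convention in the summary (namespace.field keys, stored under a "metadata." prefix) can be illustrated with a small sketch. The validation rule and the names `qualified_key` and `build_entries` are assumptions made for illustration; the summary does not say that add_metadata validates keys, and a plain dict stands in for the NamedData store.

```python
import re

METADATA_PREFIX = "metadata."  # prefix named in the summary

# Hypothetical rule: two or more lowercase segments separated by dots,
# e.g. "tokenizer.bos_id". The PR itself may accept arbitrary strings.
_KEY_PATTERN = re.compile(r"[a-z0-9_]+(\.[a-z0-9_]+)+")


def qualified_key(key: str) -> str:
    """Return the stored key name for a caller-supplied key."""
    if key.startswith(METADATA_PREFIX):
        raise ValueError(f"key {key!r} already carries the prefix")
    if _KEY_PATTERN.fullmatch(key) is None:
        raise ValueError(f"key {key!r} does not follow namespace.field")
    return METADATA_PREFIX + key


def build_entries(metadata: dict) -> dict:
    """Map caller keys to stored keys; a dict stands in for NamedData."""
    return {qualified_key(k): v for k, v in metadata.items()}


entries = build_entries({"tokenizer.bos_id": 1, "context.max_seq_len": 2048})
print(sorted(entries))
```

Rejecting keys that already start with the prefix is one possible policy for avoiding double-prefixed names such as "metadata.metadata.foo"; silently accepting them would be another.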
a7031c3 to bfdcf96
bfdcf96 to 4d709cc
4d709cc to aa3d450
        or list[int]). Keys are automatically prefixed with "metadata." to
        avoid collision with backend named data.
    """
    for key, value in metadata.items():
Are we planning to standardize key names and dtypes?
Easiest way: follow what gguf did 🤣
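GGUF stores a value-type tag alongside each metadata key, so a reader does not need out-of-band knowledge of the dtype. A minimal sketch of that idea applied to the value types in this PR follows. The tag numbers and the `encode_tagged` / `decode_tagged` names are hypothetical; they are neither GGUF's actual enum values nor anything proposed in this PR.

```python
import struct

# Hypothetical one-byte type tags (not GGUF's enum values).
TAG_STR, TAG_INT, TAG_FLOAT, TAG_INT_LIST = 0, 1, 2, 3


def encode_tagged(value) -> bytes:
    """Prefix the payload with a one-byte tag naming its type."""
    if isinstance(value, str):
        return bytes([TAG_STR]) + value.encode("utf-8")
    if isinstance(value, (list, tuple)):
        body = struct.pack(f"<I{len(value)}q", len(value), *value)
        return bytes([TAG_INT_LIST]) + body
    if isinstance(value, int):  # bool is stored as an int64 here
        return bytes([TAG_INT]) + struct.pack("<q", value)
    if isinstance(value, float):
        return bytes([TAG_FLOAT]) + struct.pack("<d", value)
    raise TypeError(f"unsupported metadata type {type(value)}")


def decode_tagged(buf: bytes):
    """Dispatch on the tag instead of trusting the caller's expectation."""
    if not buf:
        raise ValueError("empty payload")
    tag, body = buf[0], buf[1:]
    if tag == TAG_STR:
        return body.decode("utf-8")
    if tag == TAG_INT:
        return struct.unpack("<q", body)[0]
    if tag == TAG_FLOAT:
        return struct.unpack("<d", body)[0]
    if tag == TAG_INT_LIST:
        (count,) = struct.unpack_from("<I", body, 0)
        return list(struct.unpack(f"<{count}q", body[4:]))
    raise ValueError(f"unknown type tag {tag}")


for sample in ["<s>", 128000, 0.5, [1, 2, 3]]:
    print(decode_tagged(encode_tagged(sample)))
```

The cost is one extra byte per entry and a format change for existing readers; the benefit is that a mismatched accessor can be reported as an error instead of returning reinterpreted bytes.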
inline constexpr const char* kBosId = "metadata.tokenizer.bos_id";
inline constexpr const char* kEosIds = "metadata.tokenizer.eos_ids";
inline constexpr const char* kMaxSeqLen = "metadata.context.max_seq_len";
inline constexpr const char* kMaxContextLen = "metadata.context.max_context_len";
inline constexpr const char* kVocabSize = "metadata.model.vocab_size";
inline constexpr const char* kUseKVCache = "metadata.model.use_kv_cache";
inline constexpr const char* kNLayers = "metadata.model.n_layers";
inline constexpr const char* kChatTemplate = "metadata.tokenizer.chat_template";
I meant: how do we keep them in sync? Either by serializing the dtypes along with the names in the PTE, or by having a shared header file exposed to Python, like backends/xnnpack/utils/gen_xnnpack_constants.sh.
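The shared-definition option could be approached by keeping one table of well-known keys in Python and generating the C++ constants from it. The sketch below is only an illustration of that idea: the `WELL_KNOWN_KEYS` table, the dtype labels, and `render_cpp_constants` are hypothetical, and the dtypes listed are assumptions rather than values confirmed by the PR.

```python
# Hypothetical single source of truth: C++ constant name -> (key, dtype).
# The dtype labels are assumptions made for illustration.
WELL_KNOWN_KEYS = {
    "kBosId": ("tokenizer.bos_id", "int"),
    "kEosIds": ("tokenizer.eos_ids", "list[int]"),
    "kMaxSeqLen": ("context.max_seq_len", "int"),
    "kChatTemplate": ("tokenizer.chat_template", "str"),
}


def render_cpp_constants(keys: dict, prefix: str = "metadata.") -> str:
    """Render C++ key constants in the style of the header shown above."""
    lines = ["// Generated from the Python key table; do not edit by hand."]
    for name, (key, dtype) in keys.items():
        lines.append(
            f'inline constexpr const char* {name} = "{prefix}{key}";  // {dtype}'
        )
    return "\n".join(lines) + "\n"


print(render_cpp_constants(WELL_KNOWN_KEYS))
```

With a table like this, the Python export side could also check each value against the declared dtype before encoding, so both languages would read from the same definitions.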
@claude are the metadata keys in metadata.h in sync with Python?
Claude finished @kirklandsign's task in 35s.

Checking metadata key sync between C++ and Python

Answer: Yes, they are in sync. The C++ and Python sides are consistent in terms of encoding format and accessor functions. Here's the detailed comparison:

Key Constants (C++ only; Python has no predefined key constants)

The C++ header defines well-known key constants (
The Python side doesn't define these key constants; it accepts arbitrary string keys and prefixes them with
These match correctly.

Accessor Functions

Prefix

Both use

One observation

The Python side also supports raw

TL;DR: The wire formats, prefix convention, and accessor functions are fully in sync between the two files. The C++ side additionally defines well-known key constants that the Python side leaves to the caller.
aa3d450 to 7460e27
7460e27 to 9834c9b
9834c9b to 9da7249