Skip to content

Add model metadata storage in PTE via NamedData (#19466)#19466

Open
kirklandsign wants to merge 1 commit into
mainfrom
export-D104322796
Open

Add model metadata storage in PTE via NamedData (#19466)#19466
kirklandsign wants to merge 1 commit into
mainfrom
export-D104322796

Conversation

@kirklandsign
Copy link
Copy Markdown
Contributor

@kirklandsign kirklandsign commented May 11, 2026

Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796

Copilot AI review requested due to automatic review settings May 11, 2026 21:08
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 11, 2026
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented May 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19466

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 1 New Failure

As of commit 9da7249 with merge base ce08814 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented May 11, 2026

@kirklandsign has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104322796.

@github-actions
Copy link
Copy Markdown

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@kirklandsign
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented May 11, 2026

Claude finished @kirklandsign's task in 1m 56s —— View job


Code Review: Add model metadata storage in PTE via NamedData

  • Read and analyze Python export API (extension/llm/export/metadata.py)
  • Read and analyze C++ runtime API (extension/llm/runner/metadata.h)
  • Read and analyze test file (extension/llm/export/test/test_metadata_roundtrip.py)
  • Review BUCK/bzl build changes
  • Post review feedback

Overall this is a clean, well-scoped addition. The design is sensible — leveraging the existing NamedData mechanism for lightweight metadata is much better than the current constant_methods approach. A few items worth addressing:


Issues

1. Missing get_float in C++ header (extension/llm/runner/metadata.h)

The Python side has get_float (which encodes as <d, an 8-byte double), but the C++ header has no corresponding get_float/get_double. If a user stores model.temperature (as shown in the test), there's no way to read it from C++. Either add get_double or document that float metadata is Python-only.

Fix this →

2. bool passes the int isinstance check silently (extension/llm/export/metadata.py:29)

In Python, isinstance(True, int) is True, so a boolean value will be silently encoded as int64_t (value 0 or 1). This may be intentional (the test uses "model.use_kv_cache": 1), but it's a subtle gotcha. Consider either:

  • Explicitly checking for bool before int and raising a TypeError (forcing users to pass 0/1), or
  • Documenting that booleans are stored as int64 0/1.

3. Accessing private _named_data_store (extension/llm/export/metadata.py:56)

edge_manager._named_data_store.add_named_data(...)

This accesses an internal attribute directly. If the EdgeProgramManager API changes, this will break silently. Consider whether a public method (e.g., edge_manager.add_named_data(key, data)) should be added to the EdgeProgramManager class instead, or at minimum add a comment noting the coupling.

4. No CMake target for the C++ metadata library

The targets.bzl (Buck) defines the metadata library, but extension/llm/runner/CMakeLists.txt has no corresponding target. OSS users building with CMake won't be able to link against it. This is header-only so it's low-friction to add — just needs an INTERFACE library target with the include path and deps.

Fix this →


Minor / Nits

5. buffer.Free() after move (extension/llm/runner/metadata.h:42-49)

auto buffer = std::move(result.get());
// ...
buffer.Free();

Since FreeableBuffer's destructor already calls Free(), the explicit Free() calls are redundant (the buffer is stack-allocated and will be destroyed at scope exit). They don't cause harm since Free() is idempotent, but they add noise. Consider letting RAII handle it, or add a brief comment explaining why explicit free is preferred (e.g., releasing memory before further processing).

6. Encoding format is implicitly tied between Python and C++

The wire format (little-endian int64, uint32 length-prefix for lists, etc.) is defined implicitly by matching struct.pack format strings in Python with memcpy in C++. Consider adding a brief format comment in either file so future maintainers don't have to cross-reference.

7. read_metadata imports at function body (extension/llm/export/metadata.py:68)

The deferred import of deserialize_pte_binary is fine for avoiding circular imports, but it means import errors won't surface until runtime. A comment explaining why would be helpful.

8. No validation of key format (extension/llm/export/metadata.py:55)

The docstring says keys use namespace.field convention, but there's no enforcement. Invalid keys (empty string, keys with the metadata. prefix already included) would silently produce unexpected results. Consider a lightweight validation or at least a warning.


Positive Notes

  • Good test coverage: round-trip, empty metadata, raw bytes, and the real-world LLM metadata use case.
  • The METADATA_PREFIX approach for namespacing is clean and simple.
  • The C++ API correctly validates buffer sizes before memcpy.
  • The well-defined constant keys in the C++ header (kBosId, kEosIds, etc.) will make adoption straightforward.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a lightweight mechanism to embed and retrieve model-level metadata in .pte files by storing key/value blobs in the existing NamedData section, avoiding the heavier constant_methods approach for simple constants.

Changes:

  • Introduces a Python export helper (add_metadata) plus PTE-side readers for typed access (string/int/float/int-list).
  • Adds a small C++ runtime helper header for reading metadata values from runtime::NamedDataMap.
  • Adds a Python round-trip test and corresponding BUCK targets.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
extension/llm/runner/targets.bzl Adds a new header-only C++ library target for metadata helpers.
extension/llm/runner/metadata.h C++ helpers to decode int/string/int-list metadata values from NamedDataMap.
extension/llm/export/metadata.py Python API to encode/store metadata into NamedData during export and read it back from PTE.
extension/llm/export/test/test_metadata_roundtrip.py Round-trip tests for metadata storage and typed decoding.
extension/llm/export/test/BUCK Adds BUCK test target for the new round-trip test.
extension/llm/export/BUCK Adds BUCK python_library target for the new metadata module.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread extension/llm/runner/metadata.h Outdated
Comment on lines +80 to +89
size_t expected = sizeof(uint32_t) + count * sizeof(int64_t);
if (buffer.size() != expected) {
buffer.Free();
return runtime::Error::InvalidArgument;
}
std::vector<int64_t> values(count);
std::memcpy(
values.data(),
static_cast<const char*>(buffer.data()) + sizeof(uint32_t),
count * sizeof(int64_t));
Comment on lines +26 to +37
def _encode_value(key: str, value: MetadataValue) -> bytes:
if isinstance(value, str):
return value.encode("utf-8")
elif isinstance(value, (list, tuple)):
return struct.pack(f"<I{len(value)}q", len(value), *value)
elif isinstance(value, int):
return struct.pack("<q", value)
elif isinstance(value, float):
return struct.pack("<d", value)
elif isinstance(value, bytes):
return value
raise TypeError(f"Unsupported metadata type {type(value)} for key '{key}'")
Comment thread extension/llm/export/metadata.py Outdated
Comment on lines +51 to +53
metadata: Dict mapping string keys to values (str, int, float, or bytes).
Keys are automatically prefixed with "metadata." to avoid collision
with backend named data.
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from 785ac68 to a7031c3 Compare May 11, 2026 22:23
@meta-codesync meta-codesync Bot changed the title Add model metadata storage in PTE via NamedData Add model metadata storage in PTE via NamedData (#19466) May 11, 2026
meta-codesync Bot pushed a commit that referenced this pull request May 11, 2026
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
Copilot AI review requested due to automatic review settings May 11, 2026 22:28
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from a7031c3 to bfdcf96 Compare May 11, 2026 22:28
@kirklandsign kirklandsign review requested due to automatic review settings May 11, 2026 22:28
meta-codesync Bot pushed a commit that referenced this pull request May 11, 2026
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
Copilot AI review requested due to automatic review settings May 11, 2026 22:44
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from bfdcf96 to 4d709cc Compare May 11, 2026 22:44
@kirklandsign kirklandsign review requested due to automatic review settings May 11, 2026 22:44
meta-codesync Bot pushed a commit that referenced this pull request May 11, 2026
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
Copilot AI review requested due to automatic review settings May 11, 2026 23:03
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from 4d709cc to aa3d450 Compare May 11, 2026 23:03
@kirklandsign kirklandsign review requested due to automatic review settings May 11, 2026 23:03
or list[int]). Keys are automatically prefixed with "metadata." to
avoid collision with backend named data.
"""
for key, value in metadata.items():
Copy link
Copy Markdown
Contributor

@digantdesai digantdesai May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we planning to standardize key names and dtypes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easiest way: follow what gguf did 🤣

Comment on lines +26 to +33
inline constexpr const char* kBosId = "metadata.tokenizer.bos_id";
inline constexpr const char* kEosIds = "metadata.tokenizer.eos_ids";
inline constexpr const char* kMaxSeqLen = "metadata.context.max_seq_len";
inline constexpr const char* kMaxContextLen = "metadata.context.max_context_len";
inline constexpr const char* kVocabSize = "metadata.model.vocab_size";
inline constexpr const char* kUseKVCache = "metadata.model.use_kv_cache";
inline constexpr const char* kNLayers = "metadata.model.n_layers";
inline constexpr const char* kChatTemplate = "metadata.tokenizer.chat_template";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync with python?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant how to keep it in sync either through serializing the dtypes along with the names in the PTE or have some shared header file exposed to python like backends/xnnpack/utils/gen_xnnpack_constants.sh

@kirklandsign
Copy link
Copy Markdown
Contributor Author

@claude is metadata.h metadata keys in sync with python?

@claude
Copy link
Copy Markdown

claude Bot commented May 12, 2026

Claude finished @kirklandsign's task in 35s —— View job


Checking metadata key sync between C++ and Python

  • Read extension/llm/runner/metadata.h (C++ keys)
  • Read extension/llm/export/metadata.py (Python keys)
  • Compare and report findings

Answer: Yes, they are in sync. The C++ and Python sides are consistent in terms of encoding format and accessor functions. Here's the detailed comparison:

Key Constants (C++ only — Python has no predefined key constants)

The C++ header defines well-known key constants (metadata.h:26-33):

C++ Constant Value
kBosId metadata.tokenizer.bos_id
kEosIds metadata.tokenizer.eos_ids
kMaxSeqLen metadata.context.max_seq_len
kMaxContextLen metadata.context.max_context_len
kVocabSize metadata.model.vocab_size
kUseKVCache metadata.model.use_kv_cache
kNLayers metadata.model.n_layers
kChatTemplate metadata.tokenizer.chat_template

The Python side doesn't define these key constants — it accepts arbitrary string keys and prefixes them with "metadata." (metadata.py:21). This means:

  • Python export: add_metadata(mgr, {"tokenizer.bos_id": 1}) → stored as "metadata.tokenizer.bos_id"
  • C++ read: get_int(map, kBosId) → reads "metadata.tokenizer.bos_id"

These match correctly.

Accessor Functions

Type Python (metadata.py) C++ (metadata.h) Wire format Sync?
int get_int get_int <q (little-endian int64)
float get_float get_float <d (little-endian double)
string get_string get_string raw UTF-8 bytes
int list get_int_list get_int_list <I count + <q × N

Prefix

Both use "metadata."METADATA_PREFIX in Python (metadata.py:21) and kPrefix in C++ (metadata.h:24).

One observation

The Python side also supports raw bytes values in _encode_value, but there's no corresponding get_bytes on either side (C++ or Python reader). This is fine since read_metadata() already returns raw bytes per key, but if you wanted a symmetric C++ accessor for opaque blobs, that could be added later.

TL;DR: The wire formats, prefix convention, and accessor functions are fully in sync between the two files. The C++ side additionally defines well-known key constants that the Python side leaves to the caller.


meta-codesync Bot pushed a commit that referenced this pull request May 12, 2026
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
Copilot AI review requested due to automatic review settings May 12, 2026 05:03
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from aa3d450 to 7460e27 Compare May 12, 2026 05:03
@kirklandsign kirklandsign review requested due to automatic review settings May 12, 2026 05:03
meta-codesync Bot pushed a commit that referenced this pull request May 12, 2026
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
Copilot AI review requested due to automatic review settings May 12, 2026 05:09
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from 7460e27 to 9834c9b Compare May 12, 2026 05:09
@kirklandsign kirklandsign review requested due to automatic review settings May 12, 2026 05:09
Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796
@meta-codesync meta-codesync Bot force-pushed the export-D104322796 branch from 9834c9b to 9da7249 Compare May 12, 2026 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants