Add model metadata storage in PTE via NamedData (#19466) by kirklandsign · Pull Request #19466 · pytorch/executorch

kirklandsign · 2026-05-11T21:08:49Z

Summary:

Add support for embedding model-level metadata (tokenizer config, chat
templates, architecture info) directly in PTE files using the existing
NamedData mechanism. This is a lightweight alternative to the current
constant_methods approach, which creates full ExecutionPlan entries for
simple constant values.

Python export API: add_metadata(edge_manager, {"key": value})
C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float

Supported types: str, int, float, bytes, list[int].
Keys use a namespace.field convention (e.g., tokenizer.bos_id,
context.max_seq_len) and are prefixed with "metadata." to avoid
collision with backend NamedData entries.

Differential Revision: D104322796

pytorch-bot · 2026-05-11T21:08:59Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19466

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Run pull jobs on OSDC in pull requests shadow mode

❌ 1 New Failure

As of commit 9da7249 with merge base ce08814 ():

NEW FAILURE - The following job has failed:

pull / test-multimodal-linux (gemma3-4b) / linux-job (gh)
RuntimeError: Command docker exec -t 2f3ae9be97c5e1c6a4290390d0c0ed571783f6252b5db9ec375af6614a497160 /exec failed with exit code 139

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-05-11T21:08:59Z

@kirklandsign has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104322796.

github-actions · 2026-05-11T21:09:43Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

kirklandsign · 2026-05-11T21:10:11Z

@claude review

claude · 2026-05-11T21:10:52Z

Claude finished @kirklandsign's task in 1m 56s —— View job

Code Review: Add model metadata storage in PTE via NamedData

Read and analyze Python export API (extension/llm/export/metadata.py)
Read and analyze C++ runtime API (extension/llm/runner/metadata.h)
Read and analyze test file (extension/llm/export/test/test_metadata_roundtrip.py)
Review BUCK/bzl build changes
Post review feedback

Overall this is a clean, well-scoped addition. The design is sensible — leveraging the existing NamedData mechanism for lightweight metadata is much better than the current constant_methods approach. A few items worth addressing:

Issues

1. Missing get_float in C++ header (extension/llm/runner/metadata.h)

The Python side has get_float (which encodes as <d, an 8-byte double), but the C++ header has no corresponding get_float/get_double. If a user stores model.temperature (as shown in the test), there's no way to read it from C++. Either add get_double or document that float metadata is Python-only.

Fix this →

2. bool passes the int isinstance check silently (extension/llm/export/metadata.py:29)

In Python, isinstance(True, int) is True, so a boolean value will be silently encoded as int64_t (value 0 or 1). This may be intentional (the test uses "model.use_kv_cache": 1), but it's a subtle gotcha. Consider either:

Explicitly checking for bool before int and raising a TypeError (forcing users to pass 0/1), or
Documenting that booleans are stored as int64 0/1.

3. Accessing private _named_data_store (extension/llm/export/metadata.py:56)

edge_manager._named_data_store.add_named_data(...)

This accesses an internal attribute directly. If the EdgeProgramManager API changes, this will break silently. Consider whether a public method (e.g., edge_manager.add_named_data(key, data)) should be added to the EdgeProgramManager class instead, or at minimum add a comment noting the coupling.

4. No CMake target for the C++ metadata library

The targets.bzl (Buck) defines the metadata library, but extension/llm/runner/CMakeLists.txt has no corresponding target. OSS users building with CMake won't be able to link against it. This is header-only so it's low-friction to add — just needs an INTERFACE library target with the include path and deps.

Fix this →

Minor / Nits

5. buffer.Free() after move (extension/llm/runner/metadata.h:42-49)

auto buffer = std::move(result.get());
// ...
buffer.Free();

Since FreeableBuffer's destructor already calls Free(), the explicit Free() calls are redundant (the buffer is stack-allocated and will be destroyed at scope exit). They don't cause harm since Free() is idempotent, but they add noise. Consider letting RAII handle it, or add a brief comment explaining why explicit free is preferred (e.g., releasing memory before further processing).

6. Encoding format is implicitly tied between Python and C++

The wire format (little-endian int64, uint32 length-prefix for lists, etc.) is defined implicitly by matching struct.pack format strings in Python with memcpy in C++. Consider adding a brief format comment in either file so future maintainers don't have to cross-reference.

7. read_metadata imports at function body (extension/llm/export/metadata.py:68)

The deferred import of deserialize_pte_binary is fine for avoiding circular imports, but it means import errors won't surface until runtime. A comment explaining why would be helpful.

8. No validation of key format (extension/llm/export/metadata.py:55)

The docstring says keys use namespace.field convention, but there's no enforcement. Invalid keys (empty string, keys with the metadata. prefix already included) would silently produce unexpected results. Consider a lightweight validation or at least a warning.

Positive Notes

Good test coverage: round-trip, empty metadata, raw bytes, and the real-world LLM metadata use case.
The METADATA_PREFIX approach for namespacing is clean and simple.
The C++ API correctly validates buffer sizes before memcpy.
The well-defined constant keys in the C++ header (kBosId, kEosIds, etc.) will make adoption straightforward.

Copilot

Pull request overview

Adds a lightweight mechanism to embed and retrieve model-level metadata in .pte files by storing key/value blobs in the existing NamedData section, avoiding the heavier constant_methods approach for simple constants.

Changes:

Introduces a Python export helper (add_metadata) plus PTE-side readers for typed access (string/int/float/int-list).
Adds a small C++ runtime helper header for reading metadata values from runtime::NamedDataMap.
Adds a Python round-trip test and corresponding BUCK targets.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
extension/llm/runner/targets.bzl	Adds a new header-only C++ library target for metadata helpers.
extension/llm/runner/metadata.h	C++ helpers to decode int/string/int-list metadata values from `NamedDataMap`.
extension/llm/export/metadata.py	Python API to encode/store metadata into NamedData during export and read it back from PTE.
extension/llm/export/test/test_metadata_roundtrip.py	Round-trip tests for metadata storage and typed decoding.
extension/llm/export/test/BUCK	Adds BUCK test target for the new round-trip test.
extension/llm/export/BUCK	Adds BUCK python_library target for the new metadata module.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+  size_t expected = sizeof(uint32_t) + count * sizeof(int64_t);
+  if (buffer.size() != expected) {
+    buffer.Free();
+    return runtime::Error::InvalidArgument;
+  }
+  std::vector<int64_t> values(count);
+  std::memcpy(
+      values.data(),
+      static_cast<const char*>(buffer.data()) + sizeof(uint32_t),
+      count * sizeof(int64_t));


+def _encode_value(key: str, value: MetadataValue) -> bytes:
+    if isinstance(value, str):
+        return value.encode("utf-8")
+    elif isinstance(value, (list, tuple)):
+        return struct.pack(f"<I{len(value)}q", len(value), *value)
+    elif isinstance(value, int):
+        return struct.pack("<q", value)
+    elif isinstance(value, float):
+        return struct.pack("<d", value)
+    elif isinstance(value, bytes):
+        return value
+    raise TypeError(f"Unsupported metadata type {type(value)} for key '{key}'")


+        metadata: Dict mapping string keys to values (str, int, float, or bytes).
+            Keys are automatically prefixed with "metadata." to avoid collision
+            with backend named data.


Summary: Add support for embedding model-level metadata (tokenizer config, chat templates, architecture info) directly in PTE files using the existing NamedData mechanism. This is a lightweight alternative to the current constant_methods approach, which creates full ExecutionPlan entries for simple constant values. Python export API: add_metadata(edge_manager, {"key": value}) C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float Supported types: str, int, float, bytes, list[int]. Keys use a namespace.field convention (e.g., tokenizer.bos_id, context.max_seq_len) and are prefixed with "metadata." to avoid collision with backend NamedData entries. Differential Revision: D104322796

digantdesai · 2026-05-12T03:07:36Z

+            or list[int]). Keys are automatically prefixed with "metadata." to
+            avoid collision with backend named data.
+    """
+    for key, value in metadata.items():


Are we planning to standardize key names and dtypes?

Easiest way: follow what gguf did 🤣

digantdesai · 2026-05-12T03:08:17Z

+inline constexpr const char* kBosId = "metadata.tokenizer.bos_id";
+inline constexpr const char* kEosIds = "metadata.tokenizer.eos_ids";
+inline constexpr const char* kMaxSeqLen = "metadata.context.max_seq_len";
+inline constexpr const char* kMaxContextLen = "metadata.context.max_context_len";
+inline constexpr const char* kVocabSize = "metadata.model.vocab_size";
+inline constexpr const char* kUseKVCache = "metadata.model.use_kv_cache";
+inline constexpr const char* kNLayers = "metadata.model.n_layers";
+inline constexpr const char* kChatTemplate = "metadata.tokenizer.chat_template";


sync with python?

I meant how to keep it in sync either through serializing the dtypes along with the names in the PTE or have some shared header file exposed to python like backends/xnnpack/utils/gen_xnnpack_constants.sh

kirklandsign · 2026-05-12T05:02:08Z

@claude is metadata.h metadata keys in sync with python?

claude · 2026-05-12T05:02:58Z

Claude finished @kirklandsign's task in 35s —— View job

Checking metadata key sync between C++ and Python

Read extension/llm/runner/metadata.h (C++ keys)
Read extension/llm/export/metadata.py (Python keys)
Compare and report findings

Answer: Yes, they are in sync. The C++ and Python sides are consistent in terms of encoding format and accessor functions. Here's the detailed comparison:

Key Constants (C++ only — Python has no predefined key constants)

The C++ header defines well-known key constants (metadata.h:26-33):

C++ Constant	Value
`kBosId`	`metadata.tokenizer.bos_id`
`kEosIds`	`metadata.tokenizer.eos_ids`
`kMaxSeqLen`	`metadata.context.max_seq_len`
`kMaxContextLen`	`metadata.context.max_context_len`
`kVocabSize`	`metadata.model.vocab_size`
`kUseKVCache`	`metadata.model.use_kv_cache`
`kNLayers`	`metadata.model.n_layers`
`kChatTemplate`	`metadata.tokenizer.chat_template`

The Python side doesn't define these key constants — it accepts arbitrary string keys and prefixes them with "metadata." (metadata.py:21). This means:

Python export: add_metadata(mgr, {"tokenizer.bos_id": 1}) → stored as "metadata.tokenizer.bos_id"
C++ read: get_int(map, kBosId) → reads "metadata.tokenizer.bos_id"

These match correctly.

Accessor Functions

Type	Python (`metadata.py`)	C++ (`metadata.h`)	Wire format	Sync?
int	`get_int`	`get_int`	`<q` (little-endian int64)	✅
float	`get_float`	`get_float`	`<d` (little-endian double)	✅
string	`get_string`	`get_string`	raw UTF-8 bytes	✅
int list	`get_int_list`	`get_int_list`	`<I` count + `<q` × N	✅

Prefix

Both use "metadata." — METADATA_PREFIX in Python (metadata.py:21) and kPrefix in C++ (metadata.h:24).

One observation

The Python side also supports raw bytes values in _encode_value, but there's no corresponding get_bytes on either side (C++ or Python reader). This is fine since read_metadata() already returns raw bytes per key, but if you wanted a symmetric C++ accessor for opaque blobs, that could be added later.

TL;DR: The wire formats, prefix convention, and accessor functions are fully in sync between the two files. The C++ side additionally defines well-known key constants that the Python side leaves to the caller.

Summary: Add support for embedding model-level metadata (tokenizer config, chat templates, architecture info) directly in PTE files using the existing NamedData mechanism. This is a lightweight alternative to the current constant_methods approach, which creates full ExecutionPlan entries for simple constant values. Python export API: add_metadata(edge_manager, {"key": value}) C++ runtime API: metadata::get_int(map, key), get_string, get_int_list, get_float Supported types: str, int, float, bytes, list[int]. Keys use a namespace.field convention (e.g., tokenizer.bos_id, context.max_seq_len) and are prefixed with "metadata." to avoid collision with backend NamedData entries. Differential Revision: D104322796

Copilot AI review requested due to automatic review settings May 11, 2026 21:08

kirklandsign requested review from larryliu0820 and mergennachin as code owners May 11, 2026 21:08

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 11, 2026

meta-codesync Bot added fb-exported meta-exported labels May 11, 2026

Copilot started reviewing on behalf of kirklandsign May 11, 2026 21:09 View session

Copilot AI reviewed May 11, 2026

View reviewed changes

meta-codesync Bot force-pushed the export-D104322796 branch from 785ac68 to a7031c3 Compare May 11, 2026 22:23

meta-codesync Bot changed the title ~~Add model metadata storage in PTE via NamedData~~ Add model metadata storage in PTE via NamedData (#19466) May 11, 2026

Copilot AI review requested due to automatic review settings May 11, 2026 22:28

meta-codesync Bot force-pushed the export-D104322796 branch from a7031c3 to bfdcf96 Compare May 11, 2026 22:28

kirklandsign review requested due to automatic review settings May 11, 2026 22:28

Copilot AI review requested due to automatic review settings May 11, 2026 22:44

meta-codesync Bot force-pushed the export-D104322796 branch from bfdcf96 to 4d709cc Compare May 11, 2026 22:44

kirklandsign review requested due to automatic review settings May 11, 2026 22:44

Copilot AI review requested due to automatic review settings May 11, 2026 23:03

meta-codesync Bot force-pushed the export-D104322796 branch from 4d709cc to aa3d450 Compare May 11, 2026 23:03

kirklandsign review requested due to automatic review settings May 11, 2026 23:03

digantdesai reviewed May 12, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings May 12, 2026 05:03

meta-codesync Bot force-pushed the export-D104322796 branch from aa3d450 to 7460e27 Compare May 12, 2026 05:03

kirklandsign review requested due to automatic review settings May 12, 2026 05:03

Copilot AI review requested due to automatic review settings May 12, 2026 05:09

meta-codesync Bot force-pushed the export-D104322796 branch from 7460e27 to 9834c9b Compare May 12, 2026 05:09

kirklandsign review requested due to automatic review settings May 12, 2026 05:09

meta-codesync Bot force-pushed the export-D104322796 branch from 9834c9b to 9da7249 Compare May 12, 2026 05:10

Conversation

kirklandsign commented May 11, 2026 • edited by meta-codesync Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pytorch-bot Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19466

❗ 1 Active SEVs

❌ 1 New Failure

Uh oh!

meta-codesync Bot commented May 11, 2026

Uh oh!

github-actions Bot commented May 11, 2026

This PR needs a release notes: label

Uh oh!

kirklandsign commented May 11, 2026

Uh oh!

claude Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: Add model metadata storage in PTE via NamedData

Issues

Minor / Nits

Positive Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

digantdesai May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kirklandsign May 12, 2026

Choose a reason for hiding this comment

Uh oh!

digantdesai May 12, 2026

Choose a reason for hiding this comment

Uh oh!

kirklandsign May 12, 2026

Choose a reason for hiding this comment

Uh oh!

digantdesai May 12, 2026

Choose a reason for hiding this comment

Uh oh!

kirklandsign commented May 12, 2026

Uh oh!

claude Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checking metadata key sync between C++ and Python

Key Constants (C++ only — Python has no predefined key constants)

Accessor Functions

Prefix

One observation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kirklandsign commented May 11, 2026 •

edited by meta-codesync Bot

Loading

pytorch-bot Bot commented May 11, 2026 •

edited

Loading

This PR needs a `release notes:` label

claude Bot commented May 11, 2026 •

edited

Loading

digantdesai May 12, 2026 •

edited

Loading

claude Bot commented May 12, 2026 •

edited

Loading