Skip to content

fix: _get_offset_tokenizer immune to global fastokens patch (concurrent-pool race)#86

Merged
hallerite merged 2 commits into
mainfrom
fix/offset-tokenizer-fastokens-race
Jun 14, 2026
Merged

fix: _get_offset_tokenizer immune to global fastokens patch (concurrent-pool race)#86
hallerite merged 2 commits into
mainfrom
fix/offset-tokenizer-fastokens-race

Conversation

@S1ro1

@S1ro1 S1ro1 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Problem

load_tokenizer(use_fastokens=True) installs a process-global fastokens monkeypatch for the duration of each pool-slot's from_pretrained, holding _FASTOKENS_PATCH_LOCK only around the patch/unpatch calls — not around the load itself. Under the concurrent renderer pool (create_renderer_pool fans loads across a thread pool), _get_offset_tokenizer's "vanilla" reload (_load_tokenizer_via_autoAutoTokenizer.from_pretrained) can run inside another slot's open patch window, come back with the offset-less fastokens shim, and get cached. From then on every offset attribution raises NotImplementedError: fastokens does not track character offsets (surfaced downstream as a 502 from the interception server).

Reproduced: a bare AutoTokenizer.from_pretrained issued during an open patch window returns an offset-less tokenizer.

Fix

In _get_offset_tokenizer, after the first load, verify offsets; if missing, reload with fastokens force-unpatched under _FASTOKENS_PATCH_LOCK (restoring the prior patch state) and re-probe before caching — so a non-offset tokenizer is never returned or cached. Verified: a reload landing in a patch window now returns an offset-capable tokenizer.

Note

Fix _get_offset_tokenizer to retry with fastokens patch disabled under a lock

  • Adds _has_offsets(tok) helper in renderers/base.py that verifies true offset support by checking is_fast and attempting tokenization with return_offsets_mapping=True.
  • Replaces the previous is_fast check with _has_offsets for both initial load and post-reload validation.
  • If the initial tokenizer fails _has_offsets, retries under _FASTOKENS_PATCH_LOCK by temporarily unpatching fastokens, reloading the tokenizer, then restoring the patch — suppressing stdout during patch/unpatch.
  • Only caches a tokenizer that passes _has_offsets; raises RuntimeError otherwise.
  • Risk: the temporary unpatch is process-global, so concurrent tokenizer loads during the retry window may see inconsistent patch state despite the lock.

Macroscope summarized 057f008.


Note

Medium Risk
Touches tokenizer loading and global fastokens patch coordination on a hot path for hand-coded renderers; incorrect locking or restore logic could regress attribution or pool startup, but scope is narrow to _get_offset_tokenizer.

Overview
Fixes a concurrent renderer pool race where _get_offset_tokenizer could load and cache a fastokens shim tokenizer (no offset_mapping) if AutoTokenizer.from_pretrained ran while another slot had the process-global fastokens patch active—breaking hand-coded renderer body/scaffold attribution downstream.

_get_offset_tokenizer now uses a _has_offsets probe (fast tokenizer + successful return_offsets_mapping=True) instead of relying on is_fast alone. After the first vanilla load, if offsets are missing it reloads under _FASTOKENS_PATCH_LOCK with fastokens temporarily unpatch (restoring prior patch state), then refuses to cache until offsets are verified; otherwise it raises a clearer RuntimeError.

Reviewed by Cursor Bugbot for commit 057f008. Bugbot is set up for automated code reviews on this repo. Configure here.

load_tokenizer toggles a process-global fastokens monkeypatch per pool-slot load,
holding _FASTOKENS_PATCH_LOCK only around the patch/unpatch calls, not the load. Under
the concurrent renderer pool, _get_offset_tokenizer's 'vanilla' reload could race an
open patch window, get an offset-less fastokens-backed tokenizer, and cache it —
permanently breaking offset attribution (renderers using attribute_text_segments then
raise 'fastokens does not track character offsets').

Reload with the patch forced off under _FASTOKENS_PATCH_LOCK and re-probe before
caching, so a poisoned (non-offset) tokenizer is never returned or cached.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mikasenghaas mikasenghaas marked this pull request as ready for review June 14, 2026 21:32
@mikasenghaas mikasenghaas requested a review from hallerite June 14, 2026 21:32
@macroscopeapp

macroscopeapp Bot commented Jun 14, 2026

Copy link
Copy Markdown

Approvability

Verdict: Approved

This is a well-scoped bug fix for a race condition in tokenizer loading. The changes add defensive detection and retry logic using existing lock infrastructure, with clear intent and limited scope to a single function.

You can customize Macroscope's approvability policy. Learn more.

@hallerite hallerite merged commit 67568f7 into main Jun 14, 2026
11 checks passed
@hallerite hallerite deleted the fix/offset-tokenizer-fastokens-race branch June 14, 2026 23:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants