Type-check opencontractserver.llms modules (issue #1484) by JSv4 · Pull Request #1543 · Open-Source-Legal/OpenContracts

JSv4 · 2026-05-05T03:19:12Z

Summary

Graduate the remaining opencontractserver.llms.* modules (agents, api/client, vector_stores) out of the mypy baseline by fixing type errors and adding proper type annotations. This resolves issue #1484 and completes the typing work started in #1468.

Key Changes

Core Agent Framework (`core_agents.py`)

Fixed CorpusAgentContext.documents to use field(default_factory=list) instead of Optional[list[Document]]
Updated initialize() to safely handle async document loading with proper type narrowing
Changed stream() protocol method signature from async def to def (returns async generator, not coroutine)
Fixed resume_with_approval() to return AsyncGenerator[UnifiedStreamEvent, None] consistently
Added Optional type hints to sources and metadata parameters across message methods
Fixed _stream_raw() abstract method with unreachable yield to satisfy type checker
Improved type safety in _normalise_source() with explicit string conversion

PydanticAI Agent Implementation (`pydantic_ai_agents.py`)

Refactored _stream_core() signature to accept **kwargs: Any for compatibility with mixin contract
Extracted typed kwargs (force_llm_id, force_user_msg_id, etc.) from **kwargs with proper defaults
Added explicit type annotation for effective_history variable
Removed unnecessary # type: ignore comments where types are now clear
Added null-safety checks before calling _finalise_llm_message() and complete_message()
Fixed resume_with_approval() to safely handle MessageState enum migration with fallback
Added type annotation for _vs_kwargs dictionary
Improved tool result handling with explicit type narrowing

Agent Factory (`agent_factory.py`)

Added Any import for proper type annotations
Fixed type hints in factory methods

API Module (`api.py`)

Changed ToolType from Union[str, CoreTool, callable] to Union[str, CoreTool, Callable[..., Any]]
Extracted framework resolution logic into _resolve_framework() helper
Added explicit type annotations for resolved tools lists
Fixed from_function() parameter type from callable to Callable[..., Any]
Improved type safety in get_structured_response_and_sources_from_document()

Vector Store Modules

core_vector_stores.py: Fixed embedder initialization with proper null checks and assertions; updated return type to AnnotationQuerySet
core_conversation_vector_stores.py: Added ChatMessageQuerySet import; improved embedder resolution with conditional logic
pydantic_ai_vector_stores.py: Added Callable import for proper type hints

Timeline & Mixin Support

timeline_stream_mixin.py: Fixed _stream_core() return type to AsyncGenerator[UnifiedStreamEvent, None] with unreachable yield
timeline_utils.py: Added cast import and TimelineEntry import for type safety

Configuration

Removed 9 ignore_errors = True directives from mypy.ini for llms modules
Added explanatory comment documenting the graduation of these modules

Implementation Details

Used sync_to_async(lambda: ...)() pattern for safe queryset execution in async contexts
Added # nosec B101 comments for intentional assertions used as type guards
Preserved backward compatibility by accepting **kwargs in public methods while extracting typed parameters internally
Used cast() for type narrowing where runtime behavior is guaranteed but type checker needs help
Added null-safety checks before database operations that require valid IDs

https://claude.ai/code/session_016L1wbQaPdTKtpuJnxErihF

…line Closes #1484. Removes the per-module ignore_errors blocks for the nine remaining llms.* modules (agents, api/client, vector_stores) and fixes the underlying ~125 type errors so mypy runs clean on this surface. Notable fixes: - Replaced `callable`/`any` placeholders with `typing.Callable[..., Any]` and `typing.Any` across api/client/vector_stores. - Made implicit Optionals explicit (`no_implicit_optional` defaults). - Tightened `_resolve_framework`/`_resolve_tools` helpers in llms.api so framework normalisation produces a concrete `AgentFramework`. - Typed Annotation/ChatMessage querysets as their custom subclasses so the `search_by_embedding` extension method resolves. - Switched `CorpusAgentContext.documents` to a non-Optional list with `default_factory=list`; aligned `CoreAgent.stream` / `CoreAgent.resume_with_approval` Protocol signatures with their PydanticAI implementations. - Pruned now-unused `# type: ignore` comments and the unused `CoreAnnotationVectorStore` annotation from `pydantic_ai_tools`. - Pruned the corresponding 189 advisory entries from `docs/typing/mypy_baseline.txt`.

Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com>

claude · 2026-05-05T03:32:02Z

Code Review: PR #1543 — Type-check opencontractserver.llms modules

Overview

This PR graduates 9 modules from the mypy ignore_errors = True baseline, fixing ~125 type errors across opencontractserver.llms.*. The work is valuable beyond pure type-hygiene: the typing pass uncovered several real runtime bugs that are fixed here.

Real bugs fixed — worth highlighting

These are not just type annotations; they are correctness fixes:

dataclasses.asdict() on a class instead of an instance (pydantic_ai_agents.py): Previously dataclasses.is_dataclass(usage) was truthy for both instances and class objects; the new guard and not isinstance(usage, type) prevents asdict() from being called on a bare class, which would crash at runtime.
Null message IDs passed to pause_for_approval() / _finalise_llm_message() / complete_message(): The null-safety guards (if llm_msg_id is not None:) prevent a TypeError in persistence-disabled flows.
String comparison for event type dispatch replaced with isinstance (pydantic_ai_agents.py): getattr(ev, "type", "") == "sources" was fragile and would silently break if the attribute name changed. isinstance(ev, SourceEvent) is both more correct and type-safe.
New permission pre-checks in tool closures (add_document_note_tool, add_exact_string_annotations_tool, duplicate_annotations_tool): Raises PermissionError before attempting the DB write when user_id is None or corpus is None. Good security tightening.

Positive patterns used well

sync_to_async(lambda: list(qs))() everywhere instead of sync_to_async(list)(qs) — the lambda form correctly wraps queryset evaluation inside the thread-pool executor.
cast(AnnotationQuerySet, queryset) / cast(ChatMessageQuerySet, queryset) — minimal surface casts to expose the custom manager's search_by_embedding method; the casts are safe because the managers actually return these custom queryset types.
_resolve_framework() helper extraction — the three copy-pasted framework-resolution blocks in api.py are collapsed into one function. Clean DRY improvement.
nosec B101 on intentional assertions — accurate: these are type-narrowing invariants, not security-relevant assertions.
Changelog entry is detailed and accurate.

Issues and suggestions

Medium — `CorpusAgentContext.documents` empty-vs-uninitialized ambiguity

Before: documents: Optional[list[Document]] = None / if self.documents is None:
After: documents: list[Document] = field(default_factory=list) / if not self.documents:

For a corpus with zero active documents, initialize() will re-query the database on every call because an empty list is falsy. You can no longer distinguish "not yet loaded" from "loaded and empty". A sentinel _initialized: bool = field(default=False, init=False) would be cleaner. Worth a comment at minimum.

Minor — import inside unreachable dead code

In core_agents.py _stream_raw(), the from typing import cast as _cast inside unreachable code is unusual and may confuse linters. Consider the conventional if False: idiom which avoids the unusual local import and is immediately recognizable as a type-narrowing stub:

raise NotImplementedError
if False:  # pragma: no cover
    yield cast(UnifiedStreamEvent, None)

Minor — `vector_store: Optional[Any]` is a type regression

pydantic_ai_tools.py loosens Optional[CoreAnnotationVectorStore] to Optional[Any]. The runtime reason is valid (two concrete store types share the field), but it removes all type-checking on this attribute. A VectorStoreProtocol defining search / async_search methods would be the tightest fix. Acceptable as a follow-up.

Minor — `_stream_core(**kwargs)` loses compile-time parameter checking

The **kwargs: Any + .pop() pattern satisfies the mixin contract but loses mypy verification on call-site arguments (force_llm_id, etc.). A TypedDict for the kwargs would recover that. Low priority but worth noting.

Minor — silent empty `tool_name` on malformed approval data

tool_name = str(pending.get("name") or "") silently produces "" when the field is absent. A warning log would help diagnose malformed persistence state.

Test coverage gap

The PR fixes real runtime behavior (the asdict guard, null-ID guards, permission pre-checks in tools) but adds no new tests for them. The _usage_to_dict / asdict guard would benefit from a unit test that passes a dataclass class object (not an instance) to confirm the old crash no longer occurs. Non-blocking since these are improvements over the pre-existing untested baseline, but should be tracked.

Summary

Solid, careful typing work that caught real bugs along the way. The dataclasses.asdict fix and null-ID guards are the most impactful; the permission pre-checks are a nice security improvement. None of the raised issues block merging. Recommend addressing the unreachable-import style and the silent empty tool_name before merge; the rest can be tracked as follow-ups.

codecov · 2026-05-05T03:36:53Z

Codecov Report

❌ Patch coverage is 74.23313% with 42 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...encontractserver/llms/agents/pydantic_ai_agents.py	68.08%	15 Missing ⚠️
opencontractserver/llms/agents/core_agents.py	62.50%	12 Missing ⚠️
...s/vector_stores/core_conversation_vector_stores.py	73.91%	6 Missing ⚠️
...actserver/llms/vector_stores/core_vector_stores.py	80.00%	4 Missing ⚠️
...ontractserver/llms/agents/timeline_stream_mixin.py	40.00%	3 Missing ⚠️
opencontractserver/llms/client.py	0.00%	1 Missing ⚠️
opencontractserver/tasks/agent_tasks.py	0.00%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

- embeddings.py: cast get_component_by_name to type[BaseEmbedder] and annotate embedder_class so the Optional return type is satisfied. - core_vector_stores.py / core_conversation_vector_stores.py: raise ValueError when get_embedder() yields no path (search_by_embedding requires str, not Optional[str]). Failing fast at construction time is preferable to a downstream type error. - core_agents.py: lift cast import to module-level and replace the unreachable-by-NotImplementedError yield with the conventional if False: idiom (review nit). - pydantic_ai_agents.py: log a warning when a paused message's pending_tool_call is missing 'name' so malformed persistence state doesn't silently drive downstream lookups with an empty tool name.

claude · 2026-05-05T04:34:15Z

Code Review: Type-check opencontractserver.llms modules (#1484)

This PR graduates the remaining opencontractserver.llms.* modules from the mypy baseline, fixing ~125 type errors across agents, API, client, and vector store modules. The work is well-scoped and the changes are clearly motivated. A few items worth discussing before merging:

Behavioral Changes Hidden in a Typing PR

Several changes go beyond annotation-only fixes and alter runtime behavior. These are worth calling out explicitly.

CorpusAgentContext.documents field change (core_agents.py:317)

# Before
documents: Optional[list[Document]] = None

# After
documents: list[Document] = field(default_factory=list)

The initialize() method now uses if not self.documents: instead of if self.documents is None:. This means a context pre-loaded with an intentionally empty document list will trigger a DB fetch on initialize(), changing semantics. If any caller deliberately passes documents=[] to signal "this corpus has no documents," it will be silently overwritten.

_normalise_source() falsy-check change (core_agents.py:904-508)

# Before (None-safe default)
content=raw.get("content", raw.get("text", ""))

# After (falsy-check, treats empty string "" as missing)
content = raw.get("content") or raw.get("text") or ""

The new version will fall through to text if content exists but is an empty string "". The old version would have kept the empty string. Minor, but worth confirming this is the intended behavior.

_resolve_framework() now raises TypeError (api.py:1232)
Previously, unresolvable framework values were silently passed downstream and would fail at the factory level. Now they raise at the API boundary. This is strictly better, but it's a new exception type callers might not expect.

Protocol Contract Changes

CoreAgent.stream() signature change (core_agents.py:344)

# Before
async def stream(self, message: str, **kwargs) -> AsyncGenerator[...]

# After
def stream(self, message: str, **kwargs: Any) -> AsyncGenerator[...]

Changing the Protocol method from async def to def is correct for returning an async generator (the generator itself is the async iterable, not the coroutine result). However, any caller that currently does await agent.stream(...) will silently stop awaiting after this change — the object returned by a sync method returning AsyncGenerator is just the generator, not a coroutine. Existing callers should be audited to confirm none do await agent.stream(...).

resume_with_approval() return type narrowed (core_agents.py:387-394)
The Protocol return type dropped UnifiedChatResponse from the union. If any caller relies on the UnifiedChatResponse path, they'll now get type errors (or runtime surprises). Verify no callers branch on isinstance(result, UnifiedChatResponse).

`**kwargs` Erosion in `_stream_core()`

The signature change from explicit parameters to **kwargs + manual extraction (pydantic_ai_agents.py:701-715) keeps the mixin contract happy but removes the key benefit of typed parameters — IDE completeness and call-site type checking. If this method is called from many places, consider whether a TypedDict or Protocol for the kwargs would preserve some safety without requiring the explicit signature.

`Optional[Any]` in `PydanticAIDependencies`

# pydantic_ai_tools.py:274
vector_store: Optional[Any] = Field(...)

The comment explains the reasoning (two different concrete types both work at runtime), but this erases all type safety for a field that the agent directly calls methods on. A small Protocol with just the search methods would restore meaningful type checking here without coupling to either concrete class:

class VectorStoreProtocol(Protocol):
    async def async_search(self, query: VectorSearchQuery) -> list[VectorSearchResult]: ...
    def search(self, query: VectorSearchQuery) -> list[VectorSearchResult]: ...

Positive Changes Worth Noting

Real bug fix: sync_to_async(lambda: list(queryset))() vs the old sync_to_async(list)(queryset) — the lambda form correctly defers queryset evaluation to the thread pool. The old form evaluated the queryset in the calling async context before handing off.
Real bug fix: dataclasses.is_dataclass(usage) and not isinstance(usage, type) prevents asdict() being called on a class rather than an instance.
DRY improvement: _resolve_framework() cleanly extracts the repeated framework resolution pattern across three callers.
Security improvement: The new explicit user_id is None and corpus is None guards in add_document_note_tool, duplicate_annotations_tool, and add_exact_string_annotations_tool add meaningful pre-flight validation.
isinstance over getattr(ev, "type", "") comparisons in ask_document event handling (pydantic_ai_agents.py:2936-2950) is strictly better — safe, readable, and not fragile to type attribute changes.

Test Coverage

No tests accompany the behavioral changes noted above. Specifically, the new PermissionError/ValueError guards in tool closures, the _resolve_framework() TypeError path, and the documents=[] initialize behavior would each benefit from a test. Given the CLAUDE.md guidance to ensure all affected tests pass, it would be worth confirming the existing agent test suite still passes without changes.

Minor

The # nosec B101 comments on assertions are appropriate and correctly annotated.
The if False: yield cast(UnifiedStreamEvent, None) pattern in abstract methods is unconventional but valid. A short comment explaining the pattern to future readers (beyond what's already in _stream_raw) would help, since it looks like dead code on first scan.
The inspected: Any = part workaround in _event_to_text_and_meta is a reasonable way to tame mypy's union narrowing complaints, though it obscures the type at the dispatch site.

Overall this is solid typing work. The main asks are: (1) confirm the documents=[] semantic change is intentional, (2) audit callers of stream() for any await agent.stream() usage, and (3) consider adding tests for the new runtime guards.

claude · 2026-05-05T04:59:40Z

Code Review: Type-check opencontractserver.llms modules (#1543)

This is a well-scoped, systematic typing PR that cleans up ~125 mypy errors across 9 modules and graduates them out of the baseline. Overall the approach is sound and the changes are correct. A few items worth discussing:

Positive Changes Worth Calling Out

sync_to_async pattern fix — The change from sync_to_async(list)(queryset) to sync_to_async(lambda: list(queryset))() is an actual correctness fix, not just cosmetic. The old form passes the queryset as an argument into a sync-wrapped list() call, meaning queryset attribute access (lazy evaluation, Django's _result_cache) happens in the wrong thread context. The lambda-capture form evaluates everything inside the thread pool worker. The changelog entry should call this out as a behavior fix, not just a typing change.
isinstance event checks — Replacing getattr(ev, "type", "") == "thought" with isinstance(ev, ThoughtEvent) throughout pydantic_ai_agents.py is strictly better: it's type-safe, avoids attribute access on Any, and doesn't silently pass when an event happens to carry a matching string literal for unrelated reasons.
dataclasses.asdict guard — dataclasses.is_dataclass(usage) and not isinstance(usage, type) correctly excludes class objects (which is_dataclass returns True for). This is a real bug fix.
_normalise_source improvement — Using str(content) after or-chain fallback avoids passing None as SourceNode.content.
Permission guards before annotation tools — Adding if config.user_id is None: raise PermissionError(...) before aadd_document_note, aduplicate_annotations_with_label, and aadd_annotations_from_exact_strings closes a real security gap.

Issues / Concerns

1. _stream_core(**kwargs: Any) loses static call-site type safety

Changing from explicit keyword parameters to **kwargs: Any with internal kwargs.pop() satisfies the mixin's Liskov requirement, but it silently accepts any kwargs (including typos) without error and makes IDE assistance useless for callers. The hidden contract should at least be documented in the docstring with a Parameters section. Consider whether a TypedDict overload would be practical here.

2. CorpusAgentContext.documents = field(default_factory=list) edge case

The initialize() check changed from if self.documents is None: to if not self.documents:. This means a caller who pre-populates documents=[] (empty list, intentionally) will trigger a DB fetch when initialize() is called. The old behaviour was explicit: only fetch if None. This is an unlikely edge case but the semantic change is invisible in the PR summary. Adding a comment on initialize() noting this behaviour would prevent future confusion.

3. resume_with_approval return-type narrowing is a breaking API change

The Protocol's resume_with_approval was changed from Union[UnifiedChatResponse, AsyncGenerator[...]] to AsyncGenerator[...] only. Any external code holding a typed reference to a CoreAgent Protocol and calling resume_with_approval expecting a possible UnifiedChatResponse will now get a type error — or worse, silently do the wrong thing at runtime if they await the result expecting a response object. Confirm no callers branch on the old union type before merging.

4. New # type: ignore[assignment] introduced

# pydantic_ai_agents.py
async for event in tool_stream:  # type: ignore[assignment]

This PR removes many # type: ignore comments but adds one new one. The suppression should be replaced with a proper fix (e.g., an explicit cast) with a comment explaining why the cast is safe — otherwise it's just moving the problem.

5. vector_store: Optional[Any] in PydanticAIDependencies

Typing vector_store as Optional[Any] erases all type checking for this field. The comment explains the runtime reason correctly, but a structural Protocol (e.g., class VectorStoreProtocol(Protocol): def search(...) -> ...: ...) would preserve safety while handling both concrete types. The project already has VectorStoreProtocol in opencontractserver/types/protocols.py — consider using or extending that.

6. Duplicated __init__ embedder-resolution block

The sync and async variants of the conversation vector store class now both contain an identical ≈30-line if embedder_path is not None / else assert corpus_id is not None block. This was pre-existing duplication but this PR added significant content to both copies identically. A shared _resolve_embedder(corpus_id, embedder_path) helper would follow the DRY principle stated in CLAUDE.md and avoid the two copies drifting apart.

Missing Test Coverage

The following are behaviour changes (not just typing changes) with no accompanying tests:

llm_msg_id is not None guard in _stream_core — the "no persistence" code path was previously incorrect; a test for stream() with enable_message_persistence=False would confirm the guard works.
Permission checks in annotation tools (add_document_note_tool, duplicate_annotations_tool, add_exact_string_annotations_tool) — a simple test asserting PermissionError is raised when user_id=None would protect these.
The _resolve_framework helper — a TypeError is now raised for unrecognised framework values instead of propagating silently; this path is untested.

Minor Nits

The if False: yield cast(UnifiedStreamEvent, None) pattern in _stream_raw is valid but the comment in the if False: block says "Unreachable" which is slightly misleading — the real purpose is to syntactically turn the method into a generator function. A one-line comment at the method level (# unreachable yield makes this an async generator at the type level) would be clearer than inline documentation inside a dead branch.
effective_history: Optional[list[Any]] declaration before the if/else block is fine but the variable is never None by the time it leaves the conditional — the annotation slightly overstates what's possible downstream.

Summary

The typing work is thorough and correct. The real behaviour fixes (sync_to_async, permission checks, isinstance event dispatch, dataclasses guard) are genuinely valuable. The main asks before merging:

Confirm or fix the resume_with_approval union-type break for existing callers.
Replace the newly-added # type: ignore[assignment] with a cast + explanation, or fix the underlying type.
Consider VectorStoreProtocol instead of Optional[Any] for PydanticAIDependencies.vector_store.
Add tests for the behaviour changes listed above (or open a follow-up issue).

…-lgZT0 # Conflicts: # CHANGELOG.md

… tests Two follow-ups from the Claude review on PR #1543: - The async for event in tool_stream: # type: ignore[assignment] in pydantic_ai_agents.py was the only new ignore introduced by the typing graduation. Removing it would either require renaming the loop variable across ~30 reference sites in the same branch (every event.part.* / event.result.* access) or a cast(Any, ...) that surfaces unrelated mypy errors downstream (the casted event flows into serializable_args whose four assignment branches give it different types). Add a multi-line comment explaining why the narrow [assignment]-only suppression is the right tradeoff and what the runtime contract is, so a future reader can either accept the ignore or rename with full context. - Add test_llms_typing_behavior_guards.py covering the runtime guards the PR introduced that the review flagged as untested: - _resolve_framework: TypeError on unrecognised non-string objects, ValueError on unknown strings, pass-through for str / enum / None inputs. - PermissionError contract on add_document_note_tool / duplicate_annotations_tool / add_exact_string_annotations_tool — pinned via the message-string contract (3+ guards in source) so a refactor that drops one surfaces here. - Anti-regression guards for the sync_to_async(list)(queryset) bug, the dataclasses.is_dataclass + isinstance(type) guard, the isinstance(ev, ThoughtEvent) dispatch (vs. string-compare), and the llm_msg_id is not None persistence guard in _stream_core. - CorpusAgentContext.documents default_factory=list (vs. Optional) contract, including a runtime check that documents == [] when the caller doesn't pre-populate. These also restore the PR's codecov backend coverage by exercising the new branches in api.py and pydantic_ai_agents.py.

claude · 2026-05-06T03:43:58Z

Code Review — PR #1543: Type-check `opencontractserver.llms` modules

Overall: This is solid, well-scoped work. Graduating 9 modules from the mypy baseline clears ~125 real type errors and includes several genuine runtime correctness fixes alongside the annotation work. A few things worth addressing before merge.

What the PR Does Well

sync_to_async(list)(queryset) → sync_to_async(lambda: list(queryset))() — This is a real bug fix, not just a typing change. The original form evaluates the queryset in the calling async context before the thread pool sees it, which can raise SynchronousOnlyOperation. The lambda defers evaluation correctly.
_resolve_framework() helper — Good DRY refactoring. Three identical framework-resolution blocks collapsed into one with a TypeError escape valve for unrecognized types.
CorpusAgentContext.documents change — Changing from Optional[list[Document]] to list[Document] = field(default_factory=list) is the right semantic call; None was never a meaningful steady state.
isinstance(ev, SourceEvent/ThoughtEvent/FinalEvent) over getattr(ev, "type", "") — Correct improvement; string-based type dispatch is fragile.
dataclasses.is_dataclass(usage) and not isinstance(usage, type) — Correctly guards against asdict() receiving a class object rather than an instance.
User-ID guards on annotation tools — PermissionError raised before any DB write when config.user_id is None is a good defensive addition.

Issues Worth Addressing

1. `assert` used for runtime security invariants (medium)

Several new assert statements guard security/correctness invariants:

assert self.user_id is not None  # nosec B101 — factory invariant

assert is stripped by Python's -O optimisation flag. If this code ever runs in an optimised context (some container images enable it), the guard silently vanishes and creator_id=None is passed to the ORM, producing a confusing DB error rather than a clean early failure.

Suggestion: Replace with explicit if/raise:

if self.user_id is None:
    raise RuntimeError("factory invariant violated: user_id must be set for persistent messages")

The three nosec B101 sites in core_agents.py and the two in core_conversation_vector_stores.py / core_vector_stores.py all fall into this category.

2. `vector_store: Optional[Any]` widens type to `Any` (low-medium)

In pydantic_ai_tools.py:

# Before
vector_store: Optional[CoreAnnotationVectorStore] = Field(...)
# After
vector_store: Optional[Any] = Field(...)

The comment explains why (CoreAnnotationVectorStore vs. its PydanticAI wrapper), but Any silently disables all type checking on this field for callers. A Protocol or Union[CoreAnnotationVectorStore, PydanticAIAnnotationVectorStore] would preserve meaningful coverage. If a Protocol is too much work now, at least a # type: ignore[assignment] at the injection site keeps the field typed and confines the suppression.

3. `cast("list[TimelineEntry]", self._timeline)` is unchecked (low)

_timeline is accumulated as list[dict[str, Any]] throughout TimelineBuilder.add(). Casting it to list[TimelineEntry] satisfies mypy but does not guarantee the dictionaries actually conform to the TypedDict. If a caller later adds a stricter consumer that expects real TimelineEntry keys, the cast silently hides the mismatch. Consider either narrowing the accumulation type or keeping it as list[dict[str, Any]] and updating the property's return type to match.

4. Semantic change in `CorpusAgentContext.initialize()` (low)

# Before
if self.documents is None:
# After
if not self.documents:

An explicitly-passed empty list now triggers a DB reload. This could surprise callers who pass documents=[] to signal "no documents; skip loading". If the intent is truly "reload when empty", the condition is fine — but worth a note in the docstring or constructor signature.

5. `_resolve_framework` default setting name for vector stores (low)

In api.py, VectorStoreAPI.create() calls:

resolved_framework = _resolve_framework(framework)  # default: LLMS_DOCUMENT_AGENT_FRAMEWORK

A vector store factory reading LLMS_DOCUMENT_AGENT_FRAMEWORK looks like an unintentional leak from the document-agent path. Should this be a LLMS_VECTOR_STORE_FRAMEWORK setting, or at least documented as intentional?

Test Coverage

The new test_llms_typing_behavior_guards.py is a good addition. A few observations:

test_aadd_document_note_is_not_called_when_user_id_is_none and the two sibling tests only assert callable(paa.aadd_document_note) — they don't test the guard itself. The comment acknowledges this limitation ("can't construct it without a full Agent"), which is fair, but the tests read as guard tests and aren't. Consider either removing them and relying on the source-inspection test, or renaming them to make the scope clear (e.g., test_aadd_document_note_is_importable).
test_permission_error_message_format uses inspect.getsource() to count occurrences of a string in source code. This is explicitly acknowledged as brittle and intentional — that's fine as long as the team understands it's a lint-via-test rather than a functional test.
TestSyncToAsyncListEvaluation is clever: it greps source files to prevent the bug regressing. No objection, just note that it reads files by relative path from the CWD, which works in the Docker test environment but would fail if tests are ever run from a different working directory. Using Path(__file__).parent.parent.parent / "opencontractserver/..." makes it path-safe.

Minor Style Notes

A few new comments in pydantic_ai_agents.py describe what the code does rather than why (CLAUDE.md asks to omit those). The _stream_core(**kwargs) block is borderline — the explanation of the mixin constraint is genuinely non-obvious.
The # ``[assignment]`` ignore multi-line comment block at the async for event in tool_stream site is a good write-up; no issue there.

Summary

The core work is correct and the runtime fixes (sync_to_async lambda, is_dataclass guard, isinstance dispatch) are worth merging on their own. The main asks before landing:

Convert assert self.user_id is not None guards to explicit if/raise (the -O risk is real).
Reconsider Optional[Any] for vector_store — at minimum document the decision clearly.
Clarify or fix the _resolve_framework default setting name for vector stores.

…-lgZT0 # Conflicts: # CHANGELOG.md # docs/typing/mypy_baseline.txt

- Convert 8 ``assert ... # nosec B101`` security/invariant guards to explicit ``if/raise RuntimeError`` so the checks survive ``python -O`` optimisation (asserts are stripped under ``-O``). Sites: * ``CoreConversationManager.create_for_document`` / ``create_for_corpus`` (cid resolution) * ``CoreConversationManager`` ``store_*_message`` x3 (factory user_id invariant) * ``ConversationVectorStore`` / ``CoreAnnotationVectorStore`` corpus_id auto-detect branches. - Replace ``vector_store: Optional[Any]`` on ``PydanticAIDependencies`` with a typed ``Optional[_AgentVectorStoreProto]``. The new Protocol declares the only attribute callers actually reach for (``similarity_search``) and avoids the circular import that would otherwise force ``Union[CoreAnnotationVectorStore, PydanticAIAnnotationVectorStore]``. - Document ``CorpusAgentContext.initialize`` reload semantics: the switch from ``is None`` to ``not self.documents`` was deliberate to match the post-typing default (``[]`` instead of ``None``); the docstring now spells out that callers wanting "no docs — skip load" should not invoke ``initialize()`` at all. - ``VectorStoreAPI.create`` no longer reads ``LLMS_DOCUMENT_AGENT_FRAMEWORK`` — the leak from the document-agent path was unintentional. New dedicated setting ``LLMS_VECTOR_STORE_FRAMEWORK`` with the same fallthrough default (``AgentFramework.PYDANTIC_AI``), so existing deployments keep working without a settings change. - ``TimelineBuilder._timeline`` accumulation comment now explains why the storage stays as ``list[dict[str, Any]]`` (mutating ``entry`` dicts post-construction would be rejected by mypy if typed as the closed ``TimelineEntry`` TypedDict) and where the schema contract is pinned (the property-level ``cast``). - ``test_llms_typing_behavior_guards``: * Rename three importability-only tests (``test_aadd_*_is_not_called_when_user_id_is_none``) to ``..._is_importable`` so the scope matches what they actually assert; behaviour pin remains in ``test_permission_error_message_format``. * All ``Path("opencontractserver/...")`` source-grep call sites now resolve via ``Path(__file__).resolve().parents[2]`` so the tests work regardless of the runner's CWD.

CI fix: - ``test_shared_managers.py``: re-format the multi-line ``order_by + values_list + first`` chain that black wanted as a single line. The earlier formatting tripped the linter job; running black locally yields the canonical layout. Review feedback (PR #1533, second pass): - ``test_async_task_decorators::test_async_function_has_access_to_txt_text`` lost the final ``span_annotation.annotation_label.text == "TEXT_SPAN_ASYNC"`` assertion when the new ``test_async_doc_analyzer_task_txt_no_extract_file_raises`` test was appended in the previous review-fix commit. Restored — the end-to-end label-text check is the only place that pins the analyzer's annotation_label_text propagation through the async post-processing path. - ``shared/Managers.py``: condense the 7-line explanatory block above the ``model_name is not None`` guard to one short paragraph (per CLAUDE.md comment style) and convert the ``assert`` to an explicit ``if model_name is None: raise RuntimeError(...)`` so the invariant survives ``python -O`` (where asserts are stripped). Same fix pattern as PR #1543; the reviewer didn't flag it on this PR but the security rationale is identical. The reviewer's other action item — restoring ``list[TokenIdPythonType]`` annotations in ``llamaparse_parser.py`` that were demoted to comments — is out of scope: the comment-style demotions were introduced by PR #1544 (pipeline review fixes) which has already merged into ``main``; this PR does not touch ``llamaparse_parser.py``. A follow-up tracker should be opened against the pipeline graduation if those annotations need restoring.

+                raise RuntimeError(
+                    "internal invariant violated: corpus_id is None in "
+                    "auto-detect branch (the `else` should only be reached "
+                    "when corpus_id is set)"
+                )


+                raise RuntimeError(
+                    "internal invariant violated: corpus_id is None in "
+                    "auto-detect branch (the `else` should only be reached "
+                    "when corpus_id is set)"
+                )


+                raise RuntimeError(
+                    "internal invariant violated: corpus_id is None in the "
+                    "embedder-path-absent branch (constructor requires at "
+                    "least one of corpus_id / embedder_path)"
+                )


+
+    async def similarity_search(
+        self, query: str, *, k: int = ..., **kwargs: Any
+    ) -> Any: ...


claude · 2026-05-06T22:57:53Z

Code Review: Type-check `opencontractserver.llms` modules (Issue #1484)

Good work graduating 9 modules out of the mypy ignore_errors = True baseline. Several changes fix real runtime bugs alongside the type annotations. Below is a prioritized breakdown.

Real Bug Fixes (Positive)

These changes correct actual runtime issues, not just type-checker noise:

sync_to_async(lambda: list(qs))() — the old sync_to_async(list)(qs) evaluated the queryset on the calling thread before handing work to the pool. The lambda-wrap fixes the actual async-safety issue across core_vector_stores.py, core_conversation_vector_stores.py, and core_agents.py.
dataclasses.asdict guard — dataclasses.is_dataclass returns True for both instances and class objects. Adding and not isinstance(usage, type) in _usage_to_dict prevents a TypeError when a dataclass class (not instance) is accidentally passed.
isinstance(ev, SourceEvent/ThoughtEvent/FinalEvent) over getattr(ev, "type", "") == "..." — the old form silently matched any event carrying a .type attribute with the matching string. The new form is correct and less fragile.
doc_pk = self.document_id lambda closure capture — avoids stale-capture bugs in the sync_to_async lambda in _build_base_queryset.
getattr(agent, "agent_deps", None) — the original getattr(agent.agent_deps, ...) would raise AttributeError if agent_deps didn't exist; the two-step approach is safer.
user_id is None pre-checks before DB writes — good defensive addition in create_placeholder_message, store_user_message, store_llm_message, and the tool closures. Using raise RuntimeError instead of assert (to survive python -O) is the right call.

Concerns

1. `_stream_core(**kwargs)` erases the public contract (medium)

# Before
async def _stream_core(self, message: str, *, force_llm_id: int | None = None, ...) -> ...:

# After
async def _stream_core(self, message: str, **kwargs: Any) -> ...:
    force_llm_id: Optional[int] = kwargs.pop("force_llm_id", None)

This moves from a typed, IDE-navigable signature to a stringly-typed interface. A misspelled kwarg (e.g. force_llm_idd=123) will be silently ignored — kwargs.pop returns None, the message ID is lost, and there's no error. The PR description explains why (mixin contract compatibility), but the approach creates a maintenance trap.

A safer alternative is a typed TypedDict or dataclass for the kwargs, or a **Unpack[_StreamCoreKwargs] annotation (PEP 692, available via typing_extensions). Worth a follow-up issue.

2. Unreachable `yield cast(UnifiedStreamEvent, None)` in Protocol stub (low)

async def _stream_core(self, ...) -> AsyncGenerator[UnifiedStreamEvent, None]:
    raise NotImplementedError("_stream_core() must be implemented by adapter")
    yield cast(UnifiedStreamEvent, None)  # unreachable

cast(UnifiedStreamEvent, None) would yield None at runtime if the raise were ever removed — a silent misbehavior. A bare yield # type: ignore[misc] is the conventional idiom for this pattern, or verify whether the current mypy version accepts raise NotImplementedError alone in an AsyncGenerator-returning method (recent versions do).

3. `cast()` in `TimelineBuilder.timeline` is an unsound contract (low)

@property
def timeline(self) -> list[TimelineEntry]:
    return cast("list[TimelineEntry]", self._timeline)

The comment acknowledges _timeline is list[dict[str, Any]] — the cast is unsound. Callers that trust the narrower return type will get KeyError at runtime for malformed entries. Consider keeping the return type as list[dict[str, Any]] and using TypeAlias at call sites that need the narrower hint — that would be an honest contract rather than silencing the type checker at the boundary.

4. `int(self.corpus_id)` cast in queryset filter (low)

queryset = queryset.filter(conversation__chat_with_corpus_id=int(self.corpus_id))

If corpus_id requires an int() cast at filter time, its type annotation should reflect Union[int, str] or the constructor should coerce it on entry. A silent runtime cast at point of use suggests the declared type doesn't match reality.

5. Logically dead `cid is None` guards (low)

elif conversation_id or config.conversation_id:
    cid = conversation_id or config.conversation_id
    if cid is None:  # can never be True here
        raise RuntimeError("internal invariant violated: ...")

The guard exists purely for mypy type narrowing — raise RuntimeError implies a recoverable runtime failure, which this isn't. assert cid is not None, "..." (with # nosec B101) communicates invariant intent more idiomatically.

6. Duplicate embedder-resolution logic (low)

The __init__ methods of the two concrete conversation vector store classes in core_conversation_vector_stores.py contain nearly identical ~40-line if embedder_path is not None / else blocks after this PR. A shared _resolve_embedder_path(corpus_id, embedder_path) helper would eliminate the duplication.

Test File Observations

The new test_llms_typing_behavior_guards.py pins _resolve_framework and CorpusAgentContext.documents behavior cleanly. A few notes:

Source-grep tests (TestSyncToAsyncListEvaluation, TestDataclassesAsdictUsageGuard, TestIsinstanceEventDispatchKeepsTypeSafety, TestStreamCorePersistenceGuard) are brittle: they break on formatting changes, pass if the pattern appears only in a comment, and don't test runtime behavior. These are better as CI grep shell checks or mypy rules than as TestCase assertions.
TestPermissionGuardsOnAnnotationTools — the three _is_importable tests only verify callable(fn), which is trivially true for any imported function. The guard itself is never exercised. Consider replacing them with a test that constructs a minimal AgentConfig(user_id=None) and asserts the PermissionError is raised from the inner closure.
Multi-paragraph docstrings throughout the test file conflict with CLAUDE.md ("Never write multi-paragraph docstrings or multi-line comment blocks — one short line max"). The rationale is valuable but belongs in the PR description, not inline docstrings.

Nits

messages = loaded_messages or config.loaded_messages or [] — the or [] is unreachable since the branch is only entered when the preceding if loaded_messages or config.loaded_messages is truthy. Harmless but misleading.
The _resolve_framework extraction is clean and all three call sites benefit. Good DRY improvement.
_AgentVectorStoreProto Protocol with @runtime_checkable is the right tool for breaking the circular import without widening to Any.

Summary

The real bug fixes (sync_to_async lambda, dataclasses.asdict guard, isinstance event dispatch) make this PR worthwhile independent of the typing graduation. The main concern worth a follow-up issue is the **kwargs change to _stream_core — it trades type safety for mixin compatibility in a way that silently swallows misspelled arguments. Everything else is low-priority cleanup.

…-lgZT0 # Conflicts: # CHANGELOG.md # mypy.ini

claude · 2026-05-07T04:19:31Z

Code Review — PR #1543 (Type-check opencontractserver.llms modules)

Overall this is a well-structured typing graduation. The changes are clearly scoped to the llms.* modules, the mypy.ini removals map 1:1 to the modules touched, and the changelog entry is detailed enough to be useful. Several changes also fix latent runtime bugs, which is a nice bonus.

Genuine bug fixes worth calling out explicitly

dataclasses.is_dataclass(usage) and not isinstance(usage, type) (pydantic_ai_agents.py)

This is a real correctness fix. dataclasses.is_dataclass() returns True for both dataclass classes and instances, but dataclasses.asdict() only accepts instances and raises TypeError on classes. The added isinstance(usage, type) guard is the idiomatic fix.

if llm_msg_id is not None: guards before persistence calls

Adding explicit null-guards before _finalise_llm_message() and complete_message() is correct. Ephemeral (non-persisted) agents legitimately produce llm_msg_id = None, and the previous code would TypeError at the function boundary.

isinstance(ev, SourceEvent/ThoughtEvent/FinalEvent) replacing getattr(ev, "type", "") == "..."

Switching to isinstance checks is strictly better — no magic string, not susceptible to objects that happen to have a .type attribute set to the right string for unrelated reasons.

getattr(agent, "agent_deps", None) in api.py

Guards against AttributeError if agent_deps is absent on the agent object. Correct defensive change.

Concerns

1. Semantic change in CorpusAgentContext.initialize() (medium)

# Before
documents: Optional[list[Document]] = None
if self.documents is None:
    self.documents = await sync_to_async(list)(self.corpus.get_documents())

# After
documents: list[Document] = field(default_factory=list)
if not self.documents:
    self.documents = await sync_to_async(lambda: list(self.corpus.get_documents()))()

The intent is to replace None as the "not loaded" sentinel with []. However if not self.documents: conflates two distinct states: never loaded ([]) and loaded but empty ([]). If a corpus genuinely has zero active documents, calling initialize() again will re-query the database unnecessarily and overwrite the already-correct empty list.

The PR comment in the docstring acknowledges this ("Callers that want to explicitly state 'no documents — skip loading' should not invoke initialize()"), but that's a new caller-responsibility that didn't exist before. Consider using a sentinel — either keep Optional[list[Document]] and check is None, or use a private _initialized: bool flag. The typing improvement doesn't require conflating these two states.

2. _stream_core(**kwargs: Any) erases caller-side type safety (minor)

# Before
async def _stream_core(
    self,
    message: str,
    *,
    force_llm_id: int | None = None,
    force_user_msg_id: int | None = None,
    ...

# After
async def _stream_core(self, message: str, **kwargs: Any) -> ...:
    force_llm_id: Optional[int] = kwargs.pop("force_llm_id", None)
    ...

The PR correctly identifies that the mixin contract requires a compatible signature. However, the tradeoff is that callers now pass these parameters as untyped kwargs — a typo like forcellm_id=123 becomes a silent no-op instead of a type error. Since these are internal methods, not public API, it's an acceptable workaround, but the comment in _stream_core's docstring should explicitly name all accepted kwargs so future readers know the implicit contract.

3. Permission guard tests are mostly importability checks, not behavioral tests

def test_aadd_document_note_is_importable(self) -> None:
    from opencontractserver.llms.agents import pydantic_ai_agents as paa
    self.assertTrue(callable(paa.aadd_document_note))

The test file acknowledges this limitation ("We can't easily invoke the inner closure from outside"). These tests verify the function names are importable but don't verify that PermissionError is actually raised when user_id is None. The source-inspection test (source.count("requires an authenticated user")) is brittle by the PR's own admission.

Two untested guards worth adding behavioral tests for:

The context.corpus is None → ValueError path in add_document_note_tool and add_exact_string_annotations_tool (neither is covered by the count test).
The resume_with_approval path when tool_name is empty after stripping.

4. Unreachable yield cast(UnifiedStreamEvent, None) in timeline_stream_mixin.py

async def _stream_core(self, ...) -> AsyncGenerator[UnifiedStreamEvent]:
    raise NotImplementedError("_stream_core() must be implemented by adapter")
    # The unreachable yield turns this into an async generator...
    yield cast(UnifiedStreamEvent, None)

This is a known Python trick to make a function an async generator. It works, but cast(UnifiedStreamEvent, None) yields None at runtime if somehow the raise is bypassed. A safer alternative is yield # type: ignore[misc] or better: declare the base as an abstract class with @abc.abstractmethod rather than a Protocol stub. Minor, but the cast(X, None) pattern is a lie to the type checker that could bite a future consumer of the yielded value.

5. _resolve_framework uses same default setting_name doc but different for corpus

The default setting_name="LLMS_DOCUMENT_AGENT_FRAMEWORK" is fine since the call-sites pass the correct setting for corpus agents. Just make sure the default isn't accidentally used in a corpus context without the override — the test covers None resolution but doesn't exercise the corpus-specific setting name path.

Nits / style

The inspected: Any = part alias in _event_to_text_and_meta is functional but adds visual noise. An alternative is # type: ignore[union-attr] on the offending lines, which is narrower and self-documenting.
Several new RuntimeError messages are verbose multi-sentence strings. The message should be one short clause; the explanation belongs in the surrounding comment (which is already there).
The new _AgentVectorStoreProto Protocol in pydantic_ai_tools.py is a clean solution to the circular import. @runtime_checkable is correctly applied since callers likely won't isinstance-check against it, but it's harmless.

Summary

The PR achieves its goal — 9 modules graduate from ignore_errors = True and ~125 errors are resolved. The runtime changes (null guards, permission checks, isinstance dispatch) are improvements. The main concern is the CorpusAgentContext.initialize() semantic change that conflates "not yet loaded" with "legitimately empty"; that deserves a follow-up or a clarification comment in the code. The test file is a good starting point but the behavioral coverage for the permission guards is thin.

…-lgZT0 # Conflicts: # CHANGELOG.md

Targets the lines added by the llms.* mypy graduation in PR #1543 that codecov flagged as uncovered (75.62% patch coverage). New unit tests cover: - core_agents: CorpusAgentContext.initialize() both branches (load / skip), CoreAgentBase._normalise_source content fallback chain, get_default_config defaults dict + None-override filter. - pydantic_ai_agents: _extract_tool_result_summary (dict/string/None/ truncation/error), _event_to_text_and_meta (all isinstance branches: TextPart, ToolCallPart, TextPartDelta, ToolCallPartDelta, unsupported), _usage_to_dict (None / model_dump / dataclass instance / dataclass *class*-object guard / unknown). - client.SimpleLLMClient._chat_openai with and without max_tokens. - core_vector_stores.CoreAnnotationVectorStore ctor: explicit-path, corpus-only, ValueError on unresolved embedder, validation error. - core_conversation_vector_stores.{CoreConversationVectorStore, CoreChatMessageVectorStore} ctors: explicit-path with/without corpus, resolution-failure swallowed, auto-detect ValueError, auto-detect success. - timeline_stream_mixin.TimelineStreamMixin.stream: timeline injection, _finalise_llm_message hook invocation, default _stream_core NotImplementedError. Tests are mock-heavy and narrow — coverage tests, not behavioural contracts (those live in test_llms_typing_behavior_guards.py).

claude · 2026-05-07T23:37:38Z

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

Overall this is solid, well-scoped work. Graduating 9 modules out of mypy's ignore_errors baseline with ~125 fixes is meaningful progress. The sync_to_async(lambda: list(qs))() correction is a genuine runtime improvement, and the cast(UnifiedStreamEvent, None) unreachable-yield trick for abstract async generators is the right pattern. The review below focuses on a functional regression, a few behavioral changes that need attention, and some type-safety trade-offs introduced during the refactor.

🔴 Bug — `add_document_note_tool` breaks standalone document agents

File: opencontractserver/llms/agents/pydantic_ai_agents.py ~line 2368

if context.corpus is None:
    raise ValueError(
        "add_document_note requires the agent to be scoped to a corpus"
    )

The underlying aadd_document_note accepts corpus_id: int | None = None — corpus is optional. This guard breaks agents that are scoped to a document without a corpus. The guard was added purely to satisfy the type checker (context.corpus.id fails when corpus is None). The fix is to pass corpus_id=context.corpus.id if context.corpus else None and remove the guard.

Contrast with the guard in add_exact_string_annotations_tool — that one is correct because aadd_annotations_from_exact_strings takes corpus_id: int (non-optional). Both guards look parallel in the source, but only one is semantically justified. No test covers the corpus is None branch here.

🟡 Behavioral Change — Undocumented `LLMS_VECTOR_STORE_FRAMEWORK` setting

File: opencontractserver/llms/api.py ~line 659

The VectorStoreAPI.create() method now reads from a new LLMS_VECTOR_STORE_FRAMEWORK setting instead of falling through to LLMS_DOCUMENT_AGENT_FRAMEWORK. The default is still PYDANTIC_AI, so green-field deployments are fine. But operators who explicitly set LLMS_DOCUMENT_AGENT_FRAMEWORK expecting it to govern both agents and vector stores will find vector store framework selection silently decoupled. This should be called out in the settings docs / release notes.

🟡 Behavioral Change — Double-fetch for empty corpuses after `documents` field change

File: opencontractserver/llms/agents/core_agents.py ~line 317 and 361

# Before
documents: Optional[list[Document]] = None
# ...
if self.documents is None:   # short-circuits after first load

# After
documents: list[Document] = field(default_factory=list)
# ...
if not self.documents:       # always True for empty corpus → always re-fetches

When CoreCorpusAgentFactory.create_context pre-populates documents=[] (empty corpus) then calls context.initialize(), the new guard triggers a redundant DB fetch. Not a crash, but wasteful and semantically surprising — the previous guard correctly distinguished "not yet loaded" from "loaded and empty". One way to preserve the intent without Optional is a separate _documents_loaded: bool = False sentinel field.

🟡 Type Safety Regression — `**kwargs: Any` in `_stream_core`

File: opencontractserver/llms/agents/pydantic_ai_agents.py ~line 619

Replacing explicit typed parameters with **kwargs: Any + kwargs.pop(...) makes callers lose static type checking on force_llm_id, force_user_msg_id, etc. After each pop, unknown keys are silently ignored — a misspelled kwarg name produces no error at the call site or at runtime. Consider at least adding assert not kwargs, f"Unexpected kwargs: {list(kwargs)}" after all pops to catch caller mistakes at dev time (with # nosec B101 since it's intentional).

🟡 Type Asymmetry — `corpus_id` int-cast missing in `CoreConversationVectorStore`

File: opencontractserver/llms/vector_stores/core_conversation_vector_stores.py ~lines 210, 280

CoreChatMessageVectorStore.search applies int(self.corpus_id) in its ORM filter. CoreConversationVectorStore.search (same file) does not, despite having the same corpus_id: Union[str, int, None] signature. This asymmetry means the latter may still generate a mypy warning or behave unexpectedly when corpus_id is passed as a string.

🔵 Minor — Redundant `list()` wrap in `_resolve_tools`

File: opencontractserver/llms/api.py ~line 118

resolved_tools: Optional[list[Union[CoreTool, Callable[..., Any], str]]] = (
    list(_resolve_tools(tools)) if tools else None
)

_resolve_tools already returns a list[CoreTool], so the list(...) call is a no-op copy. The Callable and str union members in the type annotation are also spurious leftovers from the old ToolType union — the resolver normalises everything to CoreTool. Simplify to Optional[list[CoreTool]] and drop the redundant wrap.

🔵 Pre-existing Issue — `messages` variable assigned but never consumed

File: opencontractserver/llms/agents/core_agents.py ~lines 1240, 1314

messages = loaded_messages or config.loaded_messages or []
logger.debug(f"Loaded {len(messages)} existing messages ...")
# `messages` never used again

The or [] addition (to fix a type error) highlights a variable that's computed, logged, and discarded — a pre-existing dead-code pattern. CLAUDE.md calls for removing dead code. Out of scope for a typing PR but worth a follow-up ticket.

✅ Things done well

sync_to_async(lambda: list(queryset))() is the correct fix vs. the old sync_to_async(list)(queryset) — the lambda correctly defers queryset evaluation into the thread pool.
if False: yield cast(UnifiedStreamEvent, None) unreachable-yield pattern correctly makes abstract _stream_raw compile as an async generator. The explanatory comment is justified ("explains the WHY — a mypy limitation workaround"), consistent with CLAUDE.md.
All cast() usages are safe: cast(UnifiedStreamEvent, None) in unreachable branches, cast(AnnotationQuerySet/ChatMessageQuerySet, qs) on correctly-typed managers.
mypy.ini cleanup is clean and well-commented.
CHANGELOG entry is thorough.
New test file test_llms_typing_behavior_guards.py is a good addition, though the add_document_note_tool guard test should be extended to cover the corpus is None path.

Summary

Severity	Issue
🔴 Bug	`add_document_note_tool` raises for valid standalone-document agents
🟡 Behavioral	Undocumented `LLMS_VECTOR_STORE_FRAMEWORK` setting split from `LLMS_DOCUMENT_AGENT_FRAMEWORK`
🟡 Behavioral	Empty-corpus double-fetch after `Optional[list] → list` field change
🟡 Type safety	`**kwargs: Any` in `_stream_core` silently swallows unknown keys
🟡 Type asymmetry	`int(corpus_id)` cast missing in `CoreConversationVectorStore.search`
🔵 Minor	Redundant `list()` wrap + spurious union members in `_resolve_tools` caller
🔵 Pre-existing	Dead `messages` variable — follow-up ticket recommended

The functional regression in add_document_note_tool should be fixed before merge. The remaining items are lower-priority but worth addressing.

…-lgZT0

agent.stream() is an async def with yield — it returns an AsyncGenerator directly, not an awaitable. async for event in await agent.stream(...) was a left-over from a pre-typing iteration; mypy flags it now that the surrounding tasks module type-checks.

claude · 2026-05-08T00:14:26Z

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

Overview

This PR graduates 9 opencontractserver.llms.* modules out of the mypy baseline by fixing ~125 type errors across agents, API/client, and vector stores. Importantly, the type fixes also surface and fix several genuine runtime bugs — this is a high-quality typing PR that goes beyond mechanical annotation. Two new test files are added to pin the behavioral contracts.

Strengths

Substantive fixes alongside type annotations. The sync_to_async(lambda: list(...))() change (vs. the old sync_to_async(list)(...)) is a real bug fix: the queryset was being evaluated outside the thread-pool worker. The test in TestSyncToAsyncListEvaluation is a smart regression pin.
RuntimeError instead of assert for invariant guards. The user_id is None checks in create_placeholder_message, store_user_message, and store_llm_message correctly use explicit raise RuntimeError so the guards survive python -O. The inline comment explaining the rationale is exactly the right kind of non-obvious comment.
isinstance(ev, SourceEvent) over getattr(ev, "type", "") == "sources". Safer dispatch with real type-narrowing benefit.
_resolve_framework() extraction cleanly deduplicates framework resolution across for_document and for_corpus, and the new TypeError on unrecognised values is a meaningful contract tightening.
AnnotationItem TypedDict replaces raw dict for aadd_annotations_from_exact_strings — correct use of the typed layer.
dataclasses.asdict guard. Adding not isinstance(usage, type) before calling asdict(usage) is the correct fix for the class-vs-instance confusion.
Pruned # type: ignore backlog. The single remaining [assignment] on the tool-stream loop is well-explained and justified.

Issues and Concerns

1. Protocol signature break: `stream()` and `resume_with_approval()` changed from `async def` to `def`

The CoreAgent Protocol methods stream() and resume_with_approval() were async def and are now def. This is semantically correct (both return an AsyncGenerator — callers iterate with async for, not await), but any existing call-site that does await agent.stream(...) will break silently at runtime (returns a generator, not a coroutine). The corresponding implementations in CoreAgentBase.stream() and the PydanticAI agents are still async def, which is fine since those are real async def methods that yield.

Recommendation: Grep across all callers of .stream( and .resume_with_approval( — especially in config/websocket/consumers/unified_agent_conversation.py — to verify none are preceded by await. That consumer file is still in the mypy baseline (ignore_errors = True), so mypy would not have caught a mismatch there.

2. Empty-list semantics for `CorpusAgentContext.documents` — potential repeated-reload edge case

Before: Optional[list[Document]] = None → reload trigger was is None
After: list[Document] = field(default_factory=list) → trigger is not self.documents

The docstring explains the intent, but there is a subtle edge case: a corpus with zero documents will trigger a reload on every initialize() call because an empty list is falsy. In the current create_context path this is harmless (documents are loaded before construction, so the list is non-empty and initialize() is a no-op), but if any caller constructs a CorpusAgentContext with the default empty list and then calls initialize() on a genuinely empty corpus, it will hit the DB on every invocation.

The factory also now loads documents before construction AND calls initialize() after — the no-op is correct but the double-intent is subtle. A brief comment at the create_context call site (# initialize() is a no-op here since documents were pre-loaded) would clarify.

3. `timeline_stream_mixin.py`: `event.llm_message_id or 0` passes 0 as a message ID

await finalise(
    event.llm_message_id or 0,   # pk=0 almost certainly doesn't exist
    ...
)

This predates the PR, but the surrounding code is being refactored here and it's a good opportunity to fix it. The correct guard is:

if event.llm_message_id is not None:
    await finalise(event.llm_message_id, ...)

This is consistent with the pattern correctly applied in _stream_core (also in this PR) where if llm_msg_id is not None: guards the same call.

4. Test quality: import-callability checks and source-text inspection are fragile

In TestPermissionGuardsOnAnnotationTools, three tests only verify that a function is importable and callable. The docstring honestly acknowledges this. The test_permission_error_message_format test reads the source file and counts occurrences of a string literal — brittle to refactoring (rename the message → test fails; extract the guard into a helper → count is wrong).

A more robust approach: extract the user_id is None guard logic into a shared helper function that can be unit-tested directly, rather than coupling tests to source text.

Similarly, TestSyncToAsyncListEvaluation.test_no_naked_sync_to_async_list_callsites reads module files directly to grep for a pattern. This is fragile to whitespace/reformatting and scans only the four listed modules (a new module with the same bug wouldn't be caught). A linting rule (e.g., a custom flake8 check) would be more reliable.

5. Unreachable `RuntimeError` for conversation-id narrowing

elif conversation_id or config.conversation_id:
    cid = conversation_id or config.conversation_id
    if cid is None:          # truly unreachable given the elif condition above
        raise RuntimeError(...)

This branch cannot fire — the elif guarantees at least one is truthy. The RuntimeError message ("factory invariant violated") is misleading since no invariant is actually at risk. Consider assert cid is not None # nosec B101 with a short comment, consistent with the type-narrowing-only assertion pattern used elsewhere in this codebase.

Minor Nits

resolved: Any = framework in _resolve_framework widens immediately to Any, which prevents mypy from following the isinstance chain. Non-blocking since the return annotation is correct, but keeping the typed form through the chain would be cleaner.
_resolve_framework is correctly prefixed _ (private), but is imported directly in tests. Fine for now.

Summary

This is a well-executed typing graduation PR that fixes real bugs alongside mechanical annotation. The sync_to_async(lambda: ...) fix, the user_id invariant guards, and the isinstance event dispatch are all substantive improvements worth having.

Main actionable items before merge:

Verify no await agent.stream(...) callers exist, especially in unified_agent_conversation.py (still mypy-ignored). This is the only potential silent breakage introduced.
Fix event.llm_message_id or 0 in timeline_stream_mixin.py to use if event.llm_message_id is not None: instead.
Add a comment in create_context explaining why initialize() is a no-op there (pre-loaded documents).

Overall: approvable with the stream/await call-site check done.

- Drop the corpus-required guard on add_document_note_tool: the underlying aadd_document_note accepts corpus_id=None, so standalone- document agents must keep working. Forward context.corpus.id when present and None otherwise. - _stream_core now raises TypeError on unknown kwargs instead of silently swallowing typos. - Mirror the int(self.corpus_id) cast on the sync Conversation vector store search paths so the ORM honours string-typed corpus IDs as CoreChatMessageVectorStore.search already does. - Drop redundant list() wrap in api.py around _resolve_tools, cast through the wider factory parameter type instead. - Document LLMS_VECTOR_STORE_FRAMEWORK in the LLM architecture README so operators understand the new setting is intentionally independent of LLMS_DOCUMENT_AGENT_FRAMEWORK. - Add behaviour-guard tests pinning all three contracts.

claude · 2026-05-08T00:17:54Z

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

Overall: This is a solid typing graduation PR. The core goal (removing 9 ignore_errors = True blocks from mypy.ini) is achieved cleanly, and several legitimate runtime bugs were fixed in the process. The new test files are a welcome addition. Comments below are organized from most to least important.

Issues

1. `CorpusAgentContext.documents` — empty-list ambiguity (`core_agents.py:338`)

The switch from Optional[list[Document]] = None to list[Document] = field(default_factory=list) conflates two distinct states:

Empty list = "not yet loaded" (triggers corpus fetch in initialize())
Empty list = "corpus has no documents" (after loading, should be a no-op)

A corpus can legitimately have zero documents. After initialize() runs on such a corpus it populates documents with [], but a subsequent initialize() call (or any other caller using not self.documents as the load-trigger) will incorrectly re-fetch. The docstring acknowledges this but labels it a "hypothetical" concern — it isn't, it's a real case.

The minimal fix is a sentinel field:

_initialized: bool = field(default=False, init=False)

async def initialize(self):
    if not self._initialized:
        self.documents = await sync_to_async(lambda: list(self.corpus.get_documents()))()
        self._initialized = True

Alternatively, keep Optional with None as the unambiguous sentinel. The typing issue that drove this change (no_implicit_optional) is trivially satisfied by just adding Optional[list[Document]] = None explicitly.

2. `create_context()` calls `initialize()` after pre-loading (`core_agents.py:1101–1137`)

documents: list[Document] = await sync_to_async(lambda: list(corpus_obj.get_documents()))()
context = CorpusAgentContext(corpus=corpus_obj, config=config, documents=documents)
await context.initialize()  # No-op: documents is already non-empty

The initialize() call is now dead code (it will short-circuit every time). If the empty-list ambiguity above is fixed, this becomes the only caller that reliably passes documents — but it still calls initialize() unnecessarily. Either remove the trailing await context.initialize() here, or drop the pre-load and let initialize() own the fetch.

3. Permission guards on tool closures are untestable as written (`test_llms_typing_behavior_guards.py:130–200`)

The tests for add_document_note_tool, duplicate_annotations_tool, and add_exact_string_annotations_tool verify importability and source-code string counts, not runtime behavior:

# Source-grep as a test — brittle and not a behaviour assertion
guard_count = source.count("requires an authenticated user")
self.assertGreaterEqual(guard_count, 3, ...)

These guards are security-relevant checks (preventing writes without an authenticated user). A source scan won't catch a refactor that moves or renames the guard. The test comment acknowledges the closures can't be constructed without a full agent, but that constraint is worth pushing back on: the guard logic (if config.user_id is None: raise PermissionError) could trivially be extracted into a helper or tested by constructing a minimal mock context and calling the tool factory. That would give a real behaviour assertion.

4. `_stream_core(**kwargs)` swallows the explicit signature (`pydantic_ai_agents.py:618`)

async def _stream_core(self, message: str, **kwargs: Any) -> AsyncGenerator[...]:
    force_llm_id: Optional[int] = kwargs.pop("force_llm_id", None)
    force_user_msg_id: Optional[int] = kwargs.pop("force_user_msg_id", None)
    ...
    if kwargs:
        raise TypeError(f"_stream_core got unexpected keyword arguments: {sorted(kwargs)}")

The raise TypeError guard is good, but this is essentially a hand-rolled TypedDict without the type safety. The mixin contract that forced this signature relaxation (TimelineStreamMixin._stream_core) could accept a TypedDict or Protocol typed kwargs instead of **Any. The current approach pushes type errors from call-time to runtime for internal callers.

Not a blocker, but worth a follow-up ticket.

Correctness Improvements (positive)

These changes fix real bugs and are good to land:

sync_to_async(lambda: list(queryset))() — The original sync_to_async(list)(queryset) evaluated the queryset in the calling context before handing the result to the worker thread. The lambda form correctly defers evaluation. This appears in core_vector_stores.py:103, core_conversation_vector_stores.py:57, and core_agents.py:342 — all correct.
Lambda closure capture (core_vector_stores.py:387): doc_pk = self.document_id before the lambda correctly captures the value at construction time rather than late-binding self.
dataclasses.is_dataclass(usage) and not isinstance(usage, type) (pydantic_ai_agents.py:3101): Correctly restricts dataclasses.asdict() to instances only — is_dataclass() returns True for both the class and its instances.
isinstance(ev, SourceEvent/ThoughtEvent/FinalEvent) replacing getattr(ev, "type", "") == "...": More robust and type-safe.
agent_tasks.py:202: async for event in agent.stream(...) dropping the redundant await correctly matches the new non-coroutine Protocol definition.
Null guards on llm_msg_id: The if llm_msg_id is not None: guards before _finalise_llm_message and complete_message prevent silent DB errors when persistence is disabled.

Minor Notes

_resolve_framework() new setting LLMS_VECTOR_STORE_FRAMEWORK: Clean extraction of the repeated framework-resolution pattern. Worth documenting the new setting in the README or settings reference so operators know it exists.
Unreachable yield for async generator typing (core_agents.py:673–679): The if False: yield cast(...) pattern is the standard idiom for this mypy constraint. The comment is clear; no change needed.
_AgentVectorStoreProto Protocol (pydantic_ai_tools.py): Clean solution to the circular import. The k: int = ... default in the Protocol method is valid Python typing syntax.
docs/typing/mypy_baseline.txt diff not reviewed here — if it shows a net reduction in suppressed errors across the graduated modules, that's exactly what's expected.

Summary

Approve with requests: The CorpusAgentContext.documents empty-list ambiguity (#1) and the redundant initialize() call in create_context() (#2) are the most consequential issues — they interact and could cause silent double-fetches on corpuses with zero documents. The permission guard tests (#3) are a follow-up-worthy gap. Everything else is clean and the thread-safety and null-safety fixes are genuinely valuable.

…ranches

claude · 2026-05-08T00:34:08Z

Code Review: Type-check `opencontractserver.llms` modules (#1543)

This is a solid typing graduation PR. Removing 9 ignore_errors = True blocks is real progress, and several changes here are genuine bug fixes rather than pure annotation work. Overall the approach is sound — a few items worth addressing before merge.

Real Bug Fixes (worth calling out explicitly)

sync_to_async(list)(queryset) → sync_to_async(lambda: list(queryset))()
The original form evaluates the queryset iterator before entering the worker thread — the queryset is built in the async event loop where Django ORM calls raise SynchronousOnlyOperation. The lambda form defers evaluation to inside the thread pool. This is a genuine correctness fix in core_vector_stores.py, core_conversation_vector_stores.py, and core_agents.py. The regression test in test_no_naked_sync_to_async_list_callsites is a good belt-and-suspenders guard.

await agent.stream() → agent.stream() in agent_tasks.py
Since stream() is now a sync function returning AsyncGenerator (not a coroutine returning one), the await was consuming the generator object itself and then iterating over None. Correct fix.

isinstance(ev, SourceEvent/ThoughtEvent/FinalEvent) over getattr(ev, "type", "")
More robust — getattr would silently pass if an event object happened to lack a type attribute or renamed it.

doc_pk = self.document_id before lambda in _build_base_queryset
Correct closure capture fix. Without the local binding, the lambda captures self and mypy (and linters) can struggle to narrow self.document_id. The local also avoids late-binding surprises if self.document_id were mutable (it's not here, but good habit).

Issues

1. `CorpusAgentContext.initialize()` — semantic ambiguity with `documents=[]` (Medium)

# Before: if self.documents is None:
# After:  if not self.documents:

A caller that constructs CorpusAgentContext(corpus=..., config=..., documents=[]) intending to say "this agent needs no documents" will now silently trigger a corpus DB fetch. The docstring calls this out but marks it as a future concern ("callers that want to explicitly state 'no documents — skip loading' should not invoke initialize()"). This is a latent footgun — the pre-typing API made the intent explicit (None = unloaded, [] = intentionally empty); the new API collapses both into the same falsy branch.

Consider a sentinel or an explicit _documents_loaded: bool flag if the "intentionally empty" use case ever materialises. At minimum, a comment in the dataclass docstring would help future callers.

2. `_stream_core(**kwargs)` weakens the call-site API (Medium)

# Before:
async def _stream_core(self, message: str, *, force_llm_id: int | None = None, ...)

# After:
async def _stream_core(self, message: str, **kwargs: Any)

The explicit keyword parameters were IDE-discoverable and caught typos at type-check time. The **kwargs approach with runtime extraction means _stream_core(msg, force_llm_idd=123) fails at runtime with TypeError rather than at mypy time. The if kwargs: raise TypeError(...) guard is a good safety net but it only fires at runtime.

The PR comment explains this is needed for mixin compatibility. If the mixin's _stream_core signature must be **kwargs, consider at least documenting the accepted keys in the docstring (they are listed in the body, but not in the signature).

3. New `LLMS_VECTOR_STORE_FRAMEWORK` setting is undocumented (Low)

_resolve_framework(..., setting_name="LLMS_VECTOR_STORE_FRAMEWORK") introduces a new Django setting but neither the CHANGELOG entry, README, nor any settings documentation mentions it. An operator who reads the code can find it, but discovery is poor. The existing LLMS_DOCUMENT_AGENT_FRAMEWORK and LLMS_CORPUS_AGENT_FRAMEWORK settings also appear undocumented, but this PR adds a third one.

4. `cast` in `TimelineBuilder.timeline` is a runtime lie (Low)

self._timeline: list[dict[str, Any]] = []

@property
def timeline(self) -> list[TimelineEntry]:
    return cast("list[TimelineEntry]", self._timeline)

add() accepts Any and appends raw dicts — there is no validation that the dict actually matches the TimelineEntry TypedDict shape. The cast silences mypy at the call site but doesn't prevent e.g. a FinalEvent with a missing type key from producing a malformed TimelineEntry at runtime. The comment says "add() is the gatekeeper" but add() doesn't validate structure. This is fine as a pragmatic tradeoff, but consider a comment clarifying that TimelineEntry is a documentation hint rather than a runtime contract.

5. Permission guard tests don't exercise the guard path (Low)

TestPermissionGuardsOnAnnotationTools verifies the functions are importable and that the error string appears in source. It does not call the tool closure with user_id=None to verify the PermissionError actually raises. The PR author acknowledges the closures are hard to reach without a full agent.

A narrower test would be possible: extract the user_id is None guard into a small helper (e.g. _assert_authenticated(config)) and test that helper directly. This would decouple the permission contract from the full agent construction. Not required, but would make the guards feel properly pinned rather than linted.

Similarly, the context.corpus is None guard added to add_exact_string_annotations_tool is not counted in the assertGreaterEqual(guard_count, 3) check (that check is for "requires an authenticated user", not the corpus guard message). Consider adding a matching assertion for "requires the agent to be scoped to a corpus".

6. Constructor invariant not enforced at the entry point (Low)

In core_vector_stores.py:

# The constructor invariant above guarantees that at least one of
# ``corpus_id`` / ``embedder_path`` is provided

But the actual validation happens downstream (inside the else: if corpus_id is None: raise RuntimeError(...) branch). If a future refactor changes the constructor guard (wherever "above" refers to), the error from the downstream check will be confusing. A single guard at the top of __init__ (e.g., if corpus_id is None and embedder_path is None: raise ValueError(...)) would be clearer.

Correct design decisions worth keeping

if False: yield cast(UnifiedStreamEvent, None) — the right idiomatic way to declare an abstract async generator without marking it @abstractmethod (which would complicate multiple inheritance with mixins). The placement after raise NotImplementedError means the if False block is unreachable at runtime but visible to mypy's control-flow analysis.
_resolve_framework() helper — eliminates three identical framework-resolution blocks. The TypeError on unrecognised object types is a good fail-closed addition.
getattr(agent, "agent_deps", None) double-getattr — safer than the previous getattr(agent.agent_deps, ...) which would raise AttributeError if the attribute didn't exist.
resume_with_approval return type narrowed to AsyncGenerator — makes the API consistent; callers always async for regardless of approval outcome.
AnnotationQuerySet return type on _build_base_queryset — correct; QuerySet[Annotation] doesn't expose .search_by_embedding() so mypy was right to reject the old annotation.

Minor nits

test_llms_typing_coverage.py imports dataclasses but doesn't appear to use it directly (it's used transitively via CorpusAgentContext which is a @dataclass). Harmless but can be removed.
The effective_history: Optional[list[Any]] annotation in _stream_core before the if block is correct, but the variable is assigned in both branches, so mypy would infer the type without the annotation. Worth keeping for readability, just noting it's not strictly needed.
The long inline comment on the type: ignore[assignment] for async for event in tool_stream is well-justified. Keeping the narrow ignore is the right call.

claude · 2026-05-08T00:34:20Z

Code Review: PR #1543 — Type-check opencontractserver.llms modules

Overall: This is a well-executed typing graduation. The ~125 type errors are fixed correctly, the behavioral side-effects are documented, and the test file covers the key new runtime contracts. A few things worth discussing before merge.

What's Done Well

_resolve_framework() refactor — Extracting the duplicated framework-resolution logic from three call sites into a single helper is a genuine DRY improvement. The new LLMS_VECTOR_STORE_FRAMEWORK setting is a clean addition that doesn't break existing deployments.
isinstance() over getattr(ev, "type", "") == "..." string comparisons — The switch in pydantic_ai_agents.py for dispatching SourceEvent, ThoughtEvent, and FinalEvent is more robust and the right approach.
dataclasses.is_dataclass(usage) and not isinstance(usage, type) — This is a real correctness fix, not just a typing fix. dataclasses.is_dataclass returns True for the class itself, so the original code would have called dataclasses.asdict(SomeClass) if a class object were passed, which would raise. The guard is correct.
_AgentVectorStoreProto Protocol — Breaking the circular import between pydantic_ai_tools and pydantic_ai_vector_stores via a structural Protocol is the right architectural call. It keeps callers type-safe without forcing the concrete import.
PermissionError guards before annotation mutations — The if config.user_id is None: raise PermissionError(...) guards before aadd_document_note, aduplicate_annotations_with_label, and aadd_annotations_from_exact_strings are a real security improvement per the CLAUDE.md IDOR prevention patterns.
if self.user_id is None: raise RuntimeError(...) over assert — The comment explaining why assert is avoided (survives python -O) is precisely the right reasoning, and using an explicit raise is the correct approach for invariant enforcement.
sync_to_async(lambda: list(queryset))() — The change from sync_to_async(list)(queryset) is important. sync_to_async(list) calls list() in a thread, but the queryset has already been partially set up in the async context. The lambda form closes over the queryset reference correctly.

Issues Worth Discussing

1. Semantic change in `CorpusAgentContext.initialize()` — empty-list edge case

# Before
if self.documents is None:
# After
if not self.documents:

The docstring correctly documents that callers "that want to explicitly state 'no documents — skip loading' should not invoke initialize()". But in create_context(), pre-loaded documents are passed via the constructor AND initialize() is still called afterward. This means the new behavior is: if documents=[] is passed explicitly (e.g., a corpus with no documents), initialize() will trigger a DB fetch that will also return [] — a wasted async DB round-trip. Not a bug, but if this pattern is expected, a docstring note or early-return guard in create_context() would clarify intent.

2. Test coverage gap for the annotation tool closures

TestPermissionGuardsOnAnnotationTools acknowledges that the guards live inside closures that can't be invoked without a full PydanticAI agent stack, so it uses source-grep to verify the error strings are present. That's a reasonable tradeoff, but the three "is importable" tests read as placeholders for real behavior tests. If a full agent mock is ever added to the test suite, these should be upgraded to actually call the closures with user_id=None and assert PermissionError.

3. `int(self.corpus_id)` coercions

In core_conversation_vector_stores.py:

queryset = queryset.filter(chat_with_corpus_id=int(self.corpus_id))

This correctly narrows str | int for the ORM lookup, but will throw ValueError if corpus_id is a non-integer string (e.g., a UUID). The existing code paths only reach this branch with numeric PKs, so this is fine in practice — but a type alias or earlier coercion to int in the constructor would be cleaner.

4. Minor: `resolved_tools` cast comment is verbose

# ``_resolve_tools`` returns ``list[CoreTool]``; the factory parameter
# is the wider ``list[Union[CoreTool, Callable, str]]`` so it can also
# accept un-resolved tool spec lists from other callers. ``list`` is
# invariant in the type system, so we cast through the wider alias.
resolved_tools: Optional[list[Union[CoreTool, Callable[..., Any], str]]] = (
    cast(...)
    if tools
    else None
)

This appears twice identically in api.py (for_document and for_corpus). The comment and cast could be extracted into a helper (e.g., _resolve_and_cast_tools(tools)) to keep it DRY, consistent with the project's DRY guideline and the _resolve_framework refactor pattern already in this PR.

5. `CoreAgent.stream` Protocol signature change

# Before
async def stream(self, message: str, **kwargs) -> AsyncGenerator[UnifiedStreamEvent, None]:
# After
def stream(self, message: str, **kwargs: Any) -> AsyncGenerator[UnifiedStreamEvent, None]:

Changing the Protocol method from async def to def is semantically correct — a method that returns an AsyncGenerator shouldn't itself be a coroutine — but this is a breaking protocol change for any external implementations of CoreAgent. The change is correct and the downstream fix in agent_tasks.py (removing the spurious await) confirms it was wrong before. Just worth noting for any third-party integrators.

No Security Issues Found

The new guards (PermissionError before annotation mutations, explicit RuntimeError for user_id invariants) improve the security posture. No new attack surfaces introduced.

Summary

Approve with minor suggestions. The type errors are fixed correctly, the behavioral changes are documented, and the runtime guards are an improvement. The main actionable items are:

Consider whether the for_document/for_corpus cast duplication in api.py warrants a helper (low priority).
Track the annotation tool closure tests as candidates for upgrade when a lighter agent mock becomes available.

The if not self.documents: semantic change and the int() corpus_id coercions are acceptable as-is given the documented invariants.

After merging main (which renumbered the conflicting 0069 grounding migration to 0071 in #1581) the only remaining failing tests on this branch are the same two typing-test assertion drifts that were failing on main since #1543. Apply the same fixes already shipped via PR #1575: - ``test_callable_resolves_via_from_function``: patch ``CoreTool.from_function`` (the bound name actually invoked at api.py:713), not the unused ``ToolAPI.from_function`` indirection. - ``test_add_document_note_tool_passes_none_when_corpus_absent``: scope the assertNotIn to the ``add_document_note``-specific message so it doesn't accidentally match ``add_exact_string_annotations``'s legitimate corpus-required guard at line 2497. Also address Claude bot review on this PR: - ``leaderboardAvatar.ts``: export ``AVATAR_VIOLET`` / ``AVATAR_PINK`` so theme audits can import the literals directly. - ``user_types.py``: hoist ``_stripped`` from inside ``resolve_display_name`` to a module-level private helper — the resolver runs per leaderboard query, no need to rebuild a function object each call. - ``useTabVisibilityRefresh``: in development, log refetch errors (sync throws + rejected promises) instead of silently swallowing them. Production behaviour is unchanged so caller query state remains the source of truth there.

…ts (#1582) Two tests added in PR #1543 (Type-check opencontractserver.llms modules) have been failing on main, blocking pytest (and therefore codecov) on every PR. Both tests assert against the wrong target: 1. test_callable_resolves_via_from_function patches opencontractserver.llms.api.ToolAPI.from_function, but _resolve_tools calls CoreTool.from_function directly. The mock never intercepts the actual call site, so _resolve_tools returns real CoreTool instances and the assertEqual against the sentinel fails. Patch CoreTool.from_function instead. 2. test_add_document_note_tool_passes_none_when_corpus_absent does a global file scan with assertNotIn("requires the agent to be scoped to a corpus"), but that string legitimately exists for add_exact_string_annotations (a different tool that does require a corpus). Scope the assertion to just the body of add_document_note_tool by slicing the file from its def to the next async def at the same indent level.

…1567) * Leaderboard: redact OAuth provider IDs + drop aggressive polling Closes #1557. A. The USER column was rendering raw Auth0 ``sub`` values like ``google-oauth2|114688257717759010643`` because the leaderboard resolvers selected ``user.username`` directly, and Django's ``User.username`` is set to the OAuth ``sub`` claim for social-login users (see ``jwt_get_username_from_payload_handler``). Added a ``displayName`` resolver to ``UserType`` with the priority ``name`` → ``given_name`` + ``family_name`` → ``first_name`` + ``last_name`` → ``username`` (when not a ``provider|sub``) → redacted ``user_<last 6>`` fallback. ``GET_LEADERBOARD`` and ``GET_DISCOVERY_DATA`` (plus the matching TS types and components) now select / render ``displayName``. Backend regression coverage in ``test_user_display_name.py`` pins the priority chain and the no-leak guarantee. B. The Community Leaderboard previously polled every 60s (``GET_LEADERBOARD``) and 120s (``GET_COMMUNITY_STATS``) regardless of tab visibility, which re-rendered rows and reset in-flight UI state. Dropped both ``pollInterval``s; queries now use ``cache-and-network`` + ``cache-first`` next-fetch + silent network-status updates so the user sees instant cached data while a single background refresh runs on mount or filter change. A ``visibilitychange`` listener performs one refetch when the tab returns to the foreground; hidden tabs no longer hit the network. * Address review: fix short-sub pipe leak in display name redaction The redacted fallback in UserType.resolve_display_name took the last 6 characters of the raw username, which for sub strings shorter than 7 characters (e.g. auth0|abcde, length 11) included the | separator — returning user_|abcde. test_redacts_short_oauth_sub asserts the result must not contain |, so the test failed against the original code. Fix: split on the last | and slice the suffix from the sub part only, so the redaction never carries the provider prefix or the separator. auth0|abcde → user_abcde, google-oauth2|114688257717759010643 → user_010643 (unchanged), a|b → user_b. Also extracted the magic 6 into OAUTH_SUB_DISPLAY_SUFFIX_LENGTH in opencontractserver/constants/auth.py per CLAUDE.md ("No magic numbers"), and added a comment on the bare "user" fallback documenting that it is intentionally non-unique because Django enforces non-empty username so the branch is effectively unreachable. Verified: docker compose -f test.yml -p opencontracts run --rm django pytest opencontractserver/tests/test_user_display_name.py — 9 passed. Refs PR #1567 review. * Extract leaderboard utilities to shared modules + add unit tests Extracts the local getInitials and getAvatarColor helpers from CompactLeaderboard.tsx into frontend/src/utils/leaderboardAvatar.ts so they can be unit-tested independently of the rendering layer. Extracts the visibility-driven refresh logic from Leaderboard.tsx into a small useTabVisibilityRefresh hook (frontend/src/hooks/) with its own unit tests, replacing the inline useEffect. Net: same runtime behaviour, but the OAuth/multi-token initial branches and the visibility-change refetch path are now covered by direct unit tests instead of indirect rendering coverage. * Address review: remove issue refs, harden hook, tighten tests - Drop Issue #1557 inline references from source files per CLAUDE.md - Make useTabVisibilityRefresh stable across renders via useRef so consumers no longer need useMemo on the refresh fn array - Pin the redacted suffix value in test_redacts_short_oauth_sub - Add coverage for whitespace-only given_name/family_name - Move loose hex literals in leaderboardAvatar to named constants - Drop the SSR-only typeof document guard (Vite SPA, no SSR) - Migrate the new hook test to the project's React-18-native renderHook * tests: disable note post_save signal in pytest fixtures NoteVisibilityTest.setUpTestData was failing in CI with EmbeddingGenerationError because Note.objects.create(...) triggers process_note_on_create_atomic, which schedules calculate_embedding_for_note_text via eager celery. With CELERY_TASK_EAGER_PROPAGATES=True, the embedder lookup failure (empty default_embedder in PipelineSettings, likely from cross-worker Redis cache contamination in pytest-xdist) propagates up through the signal and aborts the class fixture. The Document and Annotation post_save signals are already disabled in conftest.py for exactly this reason. Note creation has the same hazard (it schedules an embedding task) and should be handled the same way. Mirror the existing disconnect/reconnect pattern for the Note signal in both `disable_document_processing_signals` (autouse, session-scoped) and the opt-in `enable_doc_processing_signals` fixture so integration tests can still exercise the full pipeline when they need to. * PR #1567: address review + unblock pytest after main rebase After merging main (which renumbered the conflicting 0069 grounding migration to 0071 in #1581) the only remaining failing tests on this branch are the same two typing-test assertion drifts that were failing on main since #1543. Apply the same fixes already shipped via PR #1575: - ``test_callable_resolves_via_from_function``: patch ``CoreTool.from_function`` (the bound name actually invoked at api.py:713), not the unused ``ToolAPI.from_function`` indirection. - ``test_add_document_note_tool_passes_none_when_corpus_absent``: scope the assertNotIn to the ``add_document_note``-specific message so it doesn't accidentally match ``add_exact_string_annotations``'s legitimate corpus-required guard at line 2497. Also address Claude bot review on this PR: - ``leaderboardAvatar.ts``: export ``AVATAR_VIOLET`` / ``AVATAR_PINK`` so theme audits can import the literals directly. - ``user_types.py``: hoist ``_stripped`` from inside ``resolve_display_name`` to a module-level private helper — the resolver runs per leaderboard query, no need to rebuild a function object each call. - ``useTabVisibilityRefresh``: in development, log refetch errors (sync throws + rejected promises) instead of silently swallowing them. Production behaviour is unchanged so caller query state remains the source of truth there. * Address review: gate email at resolver, fix initials whitespace bug, log refresh errors unconditionally Addresses comments on PR #1567: - Gate UserType.email at the resolver (self / superuser only) so the leak is closed regardless of which client selects the field; drop email from leaderboard query/type/mocks. Coverage in EmailResolverTestCase pins self / superuser / cross-user / anonymous / blank-email / context-missing matrix. - Fix getLeaderboardInitials: trimmed-token fallback so leading-whitespace single-token names render the correct initials instead of whitespace; whitespace-only names return "?". - useTabVisibilityRefresh logs callback errors via console.error unconditionally so production failures surface in the browser console / Sentry. - Tighten test_redacts_short_oauth_sub to make the OAUTH_SUB_DISPLAY_SUFFIX_LENGTH invariant explicit (assertLess on the test premise). * Address review: gate OAuth-sub redaction on is_social_user, doc clarifications Addresses Claude review feedback on PR #1567: - config/graphql/user_types.py: gate the OAuth-sub redaction in resolve_display_name on is_social_user rather than the presence of "|" in the username. The project's UserUnicodeUsernameValidator (allows | * \ in addition to the Django default set) means a local user named e.g. "alice|admin" is legitimate; the previous "|"-only check would have falsely redacted them. Redaction now fires only for social-login users where username == Auth0 sub. - opencontractserver/tests/test_user_display_name.py: existing redact-OAuth-sub tests now set is_social_user=True; added a new test_does_not_redact_local_username_with_pipe pinning the local-user pass-through behavior. - config/graphql/user_types.py: small inline comment on the email graphene field declaration explaining why the override is needed (so resolve_email's gate is not bypassed by the auto-exposed DjangoObjectType field). - frontend/src/hooks/useTabVisibilityRefresh.ts: clarifying notes-for- callers comment block — initial-mount fetch is intentionally not fired here (callers rely on Apollo's cache-and-network), and rapid visibility flips are absorbed by Apollo's query deduplication rather than an internal debounce. CHANGELOG.md updated to reflect the is_social_user gate. --------- Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>

#1477) (#1533) * typing: graduate opencontractserver.shared.{Managers,decorators} (closes #1477) Drains the last two `[mypy-opencontractserver.shared.*]` ignore-blocks from `mypy.ini`. Continues #1470 (which graduated QuerySets/fields/mixins) and finishes the shared package's mypy migration. Removed from mypy.ini - opencontractserver.shared.Managers - opencontractserver.shared.decorators Pruned from docs/typing/mypy_baseline.txt - 37 lines (6813 -> 6776) Per-file fixes shared/Managers.py - Drop the `if TYPE_CHECKING / else AbstractUser as User` aliasing block (mypy treated `User` as a value, not a type — root cause of seven `valid-type` errors in the baseline). Replace User-as-type with AbstractBaseUser on EmbeddingManager.store_embedding(creator=...) and UserFeedbackManager.by_creator(creator=...), and Any for _apply_document_prefetches(user=...) which duck-types is_anonymous / is_superuser. - Cast self.model to Any once in BaseVisibilityManager.visible_to_user so the seven `model_cls.objects.*` sites typecheck. Same cast on the one self.model.DoesNotExist site in UserFeedbackManager.get_or_none. _default_manager would change call semantics for models that override objects, so the cast is preferred. - Coalesce self.model._meta.model_name or "" so the two .upper() calls don't union-attr against None. - Align PermissionManager.visible_to_user and UserFeedbackManager.visible_to_user with BaseVisibilityManager.visible_to_user(user, lightweight, with_doc_label_annotations) — kwargs accepted for parity, no behaviour change. Matches the pattern adopted in CorpusActionExecutionManager (PR #1473 review). - Silence `[misc]` on the PermissionManager.from_queryset(...) dynamic base for AnnotationManager and NoteManager. Documented inline. - Remove three unused `for_user(...)` delegation methods on PermissionManager / AnnotationManager / NoteManager. They called self.get_queryset().for_user(...) against querysets that never implement `for_user`, so any caller would have raised AttributeError at runtime. Verified no callers in opencontractserver/, config/, or tests. shared/decorators.py - Annotate pdf_text_extract as Optional[str] in both wrappers and assert it is not None inside the TXT branch (only reachable when txt_extract_file is non-empty, so the assert is a no-op at runtime). - Annotate pdf_data_layer as Any in both wrappers to absorb the `[] | PdfDataLayer` union without forcing plasmapdf stubs. - Add `-> Any` return annotations to the three remaining inner helpers (get_analysis_creator, get_analysis_analyzer, celery_task_with_async_to_sync.wrapper). Coverage - Managers.py: 100% return-annotation coverage (24/24) - decorators.py: 100% return-annotation coverage (16/16) Verification - pre-commit run --all-files -> all hooks pass (mypy: Passed) - No regressions in mypy across opencontractserver + config (1032 source files checked). * Fix model_cls in get_or_none, replace assert with guard, add manager coverage tests - UserFeedbackManager.get_or_none: hoist model_cls local so model_cls.DoesNotExist is evaluated once, not inside except clause (cleaner and avoids repeated cast on the hot path) - opencontractserver/shared/decorators.py: replace bare assert with an explicit if/raise guard so the invariant check survives python -O; update comment to reflect it is a fast-fail runtime guard, not a type-narrowing hint - opencontractserver/tests/test_shared_managers.py: new test module covering PermissionManager, UserFeedbackManager, and UserFeedbackManager.get_or_none for the user=None coercion path and hit/miss paths (boosts patch coverage for lines graduated from the mypy baseline in #1477) - mypy.ini: remove stray duplicate ignore_errors line left by merge-conflict resolution - opencontractserver/utils/embeddings.py: cast get_embedder return to type[BaseEmbedder] | None (get_component_by_name returns the broader PipelineComponentBase); fixes new mypy error surfaced after merging the utils.embeddings graduation from main * Add coverage for decorators.py guard paths and manager superuser branch - test_task_decorators.py: two new tests for the ``pdf_text_extract is None`` guard (decorators.py lines 273-276) covering both ``text/plain`` and ``application/txt`` MIME types - test_shared_managers.py: two new test classes covering BaseVisibilityManager.visible_to_user(superuser) (model_cls.objects.all() branch) and PermissionManager.visible_to_user(superuser); also imports clean-up (unused Annotation import removed) * Address review: assert non-None model_name, drop magic pk in get_or_none miss test - shared/Managers.py: replace silent 'or ""' fallback on Options.model_name with an explicit assert so an abstract-model invariant violation surfaces as a clear AssertionError instead of cascading through the permission table lookup as an empty string. - tests/test_shared_managers.py: compute a guaranteed-missing pk by taking max(pk) + 1 instead of hard-coding 999999999 (CLAUDE.md no- magic-numbers rule). * Fix mypy: remove redundant cast in get_embedder return (#1545) The cast(Optional[type[BaseEmbedder]], embedder_class) on the return statement was redundant because mypy infers the correct type from the assignments throughout the function body. The function's return type annotation already declares the same type, so mypy reports [redundant-cast]. This is the only mypy error on main and has been blocking the linter step of Backend CI for multiple commits, which in turn cascade-skips the pytest job and prevents Codecov from receiving fresh coverage. * typing: address review issues from pipeline.* mypy graduation (PR #1540) (#1544) Follow-up to #1540 (merged) addressing issues raised in the Claude code review: - Issue 1: Mark `dependencies` and `input_schema` as `ClassVar` on `PipelineComponentBase` to prevent shared mutable state across subclasses. Remove now-redundant declarations from `BaseParser`, `BaseEmbedder`, `BaseThumbnailGenerator`, `BasePostProcessor`, and `BaseReranker`. Mark `supported_file_types` and `supported_modalities` as `ClassVar` too. - Issue 2: Update `find_image_tokens_in_bounds` signature to accept `list[PawlsTokenPythonType]` (matches callers) instead of `list[dict[str, Any]]`, eliminating 3 `cast()` calls in `llamaparse_parser`. - Issue 3: Remove internal `cast(dict[str, Any], source)` from `LlamaParseParser._build_image_token` — `PawlsTokenPythonType` already declares all accessed fields as `NotRequired`, so `source.get(...)` is type-safe without widening. - Issue 5: Remove redundant `description`, `author`, `dependencies`, `input_schema` declarations from `BasePostProcessor` (inherited from base). Remove type annotations from concrete subclass overrides of `ClassVar` fields in `PDFRedactor`, `TxtParser`, and `NoopReranker`. - Issue 6: Restore `image_token_refs` type annotation on first declaration per scope in `llamaparse_parser`; bare assignments in subsequent scopes avoid `no-redef` errors while preserving intent via inline comment. - Fix `get_embedder` return in `utils/embeddings.py`: cast the full tuple to `tuple[Optional[type[BaseEmbedder]], Optional[str]]` to satisfy mypy, which could not infer the narrowed type from a first-element cast alone. All pre-commit hooks (black, isort, flake8, mypy) pass. * Demote chatty permissioning INFO logs to DEBUG (#1525) * Demote chatty permissioning INFO logs to DEBUG The get_users_permissions_for_obj/App name pair fired on essentially every authenticated request, dominating Django pod logs and inflating storage costs. Collapses the two lines into one logger.debug() call using lazy %-formatting so the message is not built unless DEBUG is enabled. Permission denial logs and WARNING/ERROR paths are unchanged. Closes #1436 * Merge main and apply embeddings.py mypy fix The latest main brought a mypy regression in opencontractserver/utils/embeddings.py where get_component_by_name returns type[PipelineComponentBase], but the function signature requires type[BaseEmbedder]. Cast the result and annotate the local embedder_class to satisfy the stricter return type. --------- Co-authored-by: Claude <noreply@anthropic.com> * Fix mypy: remove redundant cast on get_embedder return tuple PR #1544 reintroduced a redundant tuple cast on the get_embedder return statement, which mypy flags as [redundant-cast]. The variables embedder_class (annotated Optional[type[BaseEmbedder]]) and embedder_path (Optional[str]) already carry the correct types via type narrowing, so no cast is needed. This restores the fix from #1545 and unblocks the linter step on all currently open PRs. * Address review: add async TXT guard + retarget BaseVisibilityManager test Two follow-ups from the Claude review on PR #1533: 1. async_doc_analyzer_task was missing the runtime guard that the sync wrapper added: if pdf_text_extract is None: raise ValueError(...) inside the text/plain branch. The async post-processing path uses span['text'] (not a slice into pdf_text_extract) so the guard is defensive rather than required by the immediate code, but the contract is the same: TXT processing requires a non-empty txt_extract_file. Mirroring the sync invariant fails fast instead of silently saving annotations from a doc the analyzer couldn't compute spans for. Two new regression tests pin the contract for text/plain and application/txt. 2. BaseVisibilityManagerSuperuserTest claimed to exercise BaseVisibilityManager.visible_to_user but the test routed through Corpus.objects, which is a PermissionManager whose visible_to_user is fully overridden — the base manager's superuser / anonymous / guardian-fallback branches were never reached. Retarget the test to the Embedding model (annotations.Embedding uses EmbeddingManager(BaseVisibilityManager) without an override), so the call now lands directly in BaseVisibilityManager. Three subtests pin the superuser-sees-all branch, the anonymous-only- public branch (user=None coercion), and the unrelated-user guardian-fallback branch. Also fixes the embeddings.py redundant-cast regression that resurfaced during the main merge (post-#1545 cast is genuinely redundant, mypy was right; restore the no-cast form). * Fix CI lint, restore dropped assertion, harden Managers invariant CI fix: - ``test_shared_managers.py``: re-format the multi-line ``order_by + values_list + first`` chain that black wanted as a single line. The earlier formatting tripped the linter job; running black locally yields the canonical layout. Review feedback (PR #1533, second pass): - ``test_async_task_decorators::test_async_function_has_access_to_txt_text`` lost the final ``span_annotation.annotation_label.text == "TEXT_SPAN_ASYNC"`` assertion when the new ``test_async_doc_analyzer_task_txt_no_extract_file_raises`` test was appended in the previous review-fix commit. Restored — the end-to-end label-text check is the only place that pins the analyzer's annotation_label_text propagation through the async post-processing path. - ``shared/Managers.py``: condense the 7-line explanatory block above the ``model_name is not None`` guard to one short paragraph (per CLAUDE.md comment style) and convert the ``assert`` to an explicit ``if model_name is None: raise RuntimeError(...)`` so the invariant survives ``python -O`` (where asserts are stripped). Same fix pattern as PR #1543; the reviewer didn't flag it on this PR but the security rationale is identical. The reviewer's other action item — restoring ``list[TokenIdPythonType]`` annotations in ``llamaparse_parser.py`` that were demoted to comments — is out of scope: the comment-style demotions were introduced by PR #1544 (pipeline review fixes) which has already merged into ``main``; this PR does not touch ``llamaparse_parser.py``. A follow-up tracker should be opened against the pipeline graduation if those annotations need restoring. * Address review: improve TXT error message, trim test polish - TXT-extract guard now reports the actual file_type so the message is accurate for both application/txt and text/plain. - Drop the duplicated assertIsNotNone before the mypy-narrowing assert is not None in test_get_or_none_returns_object_on_hit. - Document the rationale for in-setUp imports in BaseVisibilityManagerSuperuserTest (AppConfig.ready() ordering). - Trim the verbose async TXT-guard comment in decorators.py to one short line, matching the sync path. * Lift backend patch coverage above 90% (close codecov gap on #1533) The patch coverage report flagged 5 missed lines in Managers.py (84.84% / target 87.98%): - line 180 raise RuntimeError on abstract-model guard - lines 203/206/209 unreachable branches inside the if/elif chain in BaseVisibilityManager.visible_to_user (None / superuser / anonymous), all dead because the method already returns early at lines 154-164 for those user states - line 302 user = AnonymousUser() in PermissionManager.visible_to_user (existing test used Corpus which has its own PermissionedTreeQuerySet.as_manager() and never reaches PermissionManager) Changes: - Managers.py: drop the dead None/superuser/anonymous branches that follow the early-return chain (CLAUDE.md "no dead code"). Keep the authenticated-non-superuser path with a clear comment about the invariant. The model_cls.objects.none() initializer stays so the outer except still has a defined fallback queryset. - test_shared_managers.py: - PermissionManagerVisibleToUserViaNoteTest: exercises Note.objects.visible_to_user(user=None), which actually goes through PermissionManager.visible_to_user (NoteManager is built via PermissionManager.from_queryset(NoteQuerySet)) and covers the AnonymousUser coercion path. - BaseVisibilityManagerAbstractModelGuardTest: patches Embedding._meta.model_name to None for the duration of one call to trip the explicit raise, verifying the outer except handler catches the RuntimeError and falls through to the creator/public fallback. This exercises the explicit guard introduced in review feedback (assert → raise) so it survives python -O. * Fix mypy after main merge: cast self.model for blob_field_names() After merging main (which added Document.blob_field_names()), the DocumentManager.unique_blob_paths_for_many() loop calls self.model.blob_field_names() — but self.model is type[_T] which mypy cannot resolve to Document. Cast it to type[Document] inside the method to make the attribute access type-safe without weakening the surrounding function signature. * Fix CI on PR #1533: scope source-greps and surface abstract-model guard - test_callable_resolves_via_from_function: patch the actual call site (CoreTool.from_function), not the ToolAPI wrapper that _resolve_tools no longer routes through. - test_add_document_note_tool_passes_none_when_corpus_absent: scope the source-grep to the add_document_note_tool body so the corpus-required diagnostic on add_exact_string_annotations no longer trips it. - Address review #1: hoist the abstract-model RuntimeError raise above the broad except in BaseVisibilityManager.visible_to_user so the bug surfaces instead of falling back to creator/public filtering. Update the corresponding test to assert the raise. - Address review #2: drop unnecessary string forward reference in cast(type[Document], self.model) — Document is already imported at runtime above the call site. * Strip trailing whitespace; address review on PermissionManager / NoteManager - Pre-commit's trailing-whitespace hook fixes a stray blank line inside a comment block in test_llms_typing_behavior_guards.py (CI was failing on this in the linter job). - PermissionManager.visible_to_user gains a docstring note that the superuser branch lives in PermissionQuerySet, not in the base manager shortcut bypassed by this override. - NoteManager picks up a one-line comment cross-referencing the AnnotationManager type-ignore rationale so the silenced dynamic-base-class warning is self-documenting.

github-code-quality Bot found potential problems May 5, 2026

View reviewed changes

Comment thread opencontractserver/llms/agents/core_agents.py Fixed

Merge branch 'main' into claude/fix-issue-1484-lgZT0

1e86a20

Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com>

Merge main; reconcile embeddings.py mypy fix variants

8da8f9b

JSv4 added 2 commits May 5, 2026 22:35

Merge remote-tracking branch 'origin/main' into claude/fix-issue-1484…

76d2622

…-lgZT0 # Conflicts: # CHANGELOG.md

JSv4 added 2 commits May 6, 2026 07:46

Merge remote-tracking branch 'origin/main' into claude/fix-issue-1484…

e196983

…-lgZT0 # Conflicts: # CHANGELOG.md # docs/typing/mypy_baseline.txt

github-code-quality Bot found potential problems May 6, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into claude/fix-issue-1484…

a55fb74

…-lgZT0 # Conflicts: # CHANGELOG.md # mypy.ini

JSv4 added 2 commits May 7, 2026 18:06

Merge remote-tracking branch 'origin/main' into claude/fix-issue-1484…

4062cb1

…-lgZT0 # Conflicts: # CHANGELOG.md

JSv4 added 2 commits May 7, 2026 19:08

Merge remote-tracking branch 'origin/main' into claude/fix-issue-1484…

4843ec3

…-lgZT0

JSv4 added 2 commits May 7, 2026 19:30

Add coverage for _is_public/_assert_access helpers + _resolve_tools b…

48e18c7

…ranches

Cover VectorStoreAPI.create framework setting + kwarg passthrough

22fc19a

JSv4 merged commit acb3358 into main May 8, 2026
8 checks passed

JSv4 deleted the claude/fix-issue-1484-lgZT0 branch May 8, 2026 01:03

JSv4 mentioned this pull request May 8, 2026

Fix two PR #1543 tests with wrong assertion targets blocking main CI #1582

Merged

3 tasks

Uh oh!

Conversation

JSv4 commented May 5, 2026

Summary

Key Changes

Core Agent Framework (core_agents.py)

PydanticAI Agent Implementation (pydantic_ai_agents.py)

Agent Factory (agent_factory.py)

API Module (api.py)

Vector Store Modules

Timeline & Mixin Support

Configuration

Implementation Details

Uh oh!

Uh oh!

claude Bot commented May 5, 2026

Code Review: PR #1543 — Type-check opencontractserver.llms modules

Overview

Real bugs fixed — worth highlighting

Positive patterns used well

Issues and suggestions

Medium — CorpusAgentContext.documents empty-vs-uninitialized ambiguity

Minor — import inside unreachable dead code

Minor — vector_store: Optional[Any] is a type regression

Minor — _stream_core(**kwargs) loses compile-time parameter checking

Minor — silent empty tool_name on malformed approval data

Test coverage gap

Summary

Uh oh!

codecov Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

claude Bot commented May 5, 2026

Code Review: Type-check opencontractserver.llms modules (#1484)

Behavioral Changes Hidden in a Typing PR

Protocol Contract Changes

**kwargs Erosion in _stream_core()

Optional[Any] in PydanticAIDependencies

Positive Changes Worth Noting

Test Coverage

Minor

Uh oh!

claude Bot commented May 5, 2026

Code Review: Type-check opencontractserver.llms modules (#1543)

Positive Changes Worth Calling Out

Issues / Concerns

Missing Test Coverage

Minor Nits

Summary

Uh oh!

claude Bot commented May 6, 2026

Code Review — PR #1543: Type-check opencontractserver.llms modules

What the PR Does Well

Issues Worth Addressing

1. assert used for runtime security invariants (medium)

2. vector_store: Optional[Any] widens type to Any (low-medium)

3. cast("list[TimelineEntry]", self._timeline) is unchecked (low)

4. Semantic change in CorpusAgentContext.initialize() (low)

5. _resolve_framework default setting name for vector stores (low)

Test Coverage

Minor Style Notes

Summary

Uh oh!

claude Bot commented May 6, 2026

Code Review: Type-check opencontractserver.llms modules (Issue #1484)

Real Bug Fixes (Positive)

Concerns

1. _stream_core(**kwargs) erases the public contract (medium)

2. Unreachable yield cast(UnifiedStreamEvent, None) in Protocol stub (low)

3. cast() in TimelineBuilder.timeline is an unsound contract (low)

4. int(self.corpus_id) cast in queryset filter (low)

5. Logically dead cid is None guards (low)

6. Duplicate embedder-resolution logic (low)

Test File Observations

Nits

Summary

Uh oh!

claude Bot commented May 7, 2026

Code Review — PR #1543 (Type-check opencontractserver.llms modules)

Core Agent Framework (`core_agents.py`)

PydanticAI Agent Implementation (`pydantic_ai_agents.py`)

Agent Factory (`agent_factory.py`)

API Module (`api.py`)

Medium — `CorpusAgentContext.documents` empty-vs-uninitialized ambiguity

Minor — `vector_store: Optional[Any]` is a type regression

Minor — `_stream_core(**kwargs)` loses compile-time parameter checking

Minor — silent empty `tool_name` on malformed approval data

codecov Bot commented May 5, 2026 •

edited

Loading

`**kwargs` Erosion in `_stream_core()`

`Optional[Any]` in `PydanticAIDependencies`

Code Review — PR #1543: Type-check `opencontractserver.llms` modules

1. `assert` used for runtime security invariants (medium)

2. `vector_store: Optional[Any]` widens type to `Any` (low-medium)

3. `cast("list[TimelineEntry]", self._timeline)` is unchecked (low)

4. Semantic change in `CorpusAgentContext.initialize()` (low)

5. `_resolve_framework` default setting name for vector stores (low)

Code Review: Type-check `opencontractserver.llms` modules (Issue #1484)

1. `_stream_core(**kwargs)` erases the public contract (medium)

2. Unreachable `yield cast(UnifiedStreamEvent, None)` in Protocol stub (low)

3. `cast()` in `TimelineBuilder.timeline` is an unsound contract (low)

4. `int(self.corpus_id)` cast in queryset filter (low)

5. Logically dead `cid is None` guards (low)

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

🔴 Bug — `add_document_note_tool` breaks standalone document agents

🟡 Behavioral Change — Undocumented `LLMS_VECTOR_STORE_FRAMEWORK` setting

🟡 Behavioral Change — Double-fetch for empty corpuses after `documents` field change

🟡 Type Safety Regression — `**kwargs: Any` in `_stream_core`

🟡 Type Asymmetry — `corpus_id` int-cast missing in `CoreConversationVectorStore`

🔵 Minor — Redundant `list()` wrap in `_resolve_tools`

🔵 Pre-existing Issue — `messages` variable assigned but never consumed

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

1. Protocol signature break: `stream()` and `resume_with_approval()` changed from `async def` to `def`

2. Empty-list semantics for `CorpusAgentContext.documents` — potential repeated-reload edge case

3. `timeline_stream_mixin.py`: `event.llm_message_id or 0` passes 0 as a message ID

5. Unreachable `RuntimeError` for conversation-id narrowing

Code Review: PR #1543 — Type-check `opencontractserver.llms` modules

1. `CorpusAgentContext.documents` — empty-list ambiguity (`core_agents.py:338`)

2. `create_context()` calls `initialize()` after pre-loading (`core_agents.py:1101–1137`)

3. Permission guards on tool closures are untestable as written (`test_llms_typing_behavior_guards.py:130–200`)

4. `_stream_core(**kwargs)` swallows the explicit signature (`pydantic_ai_agents.py:618`)

Code Review: Type-check `opencontractserver.llms` modules (#1543)

1. `CorpusAgentContext.initialize()` — semantic ambiguity with `documents=[]` (Medium)

2. `_stream_core(**kwargs)` weakens the call-site API (Medium)

3. New `LLMS_VECTOR_STORE_FRAMEWORK` setting is undocumented (Low)

4. `cast` in `TimelineBuilder.timeline` is a runtime lie (Low)

1. Semantic change in `CorpusAgentContext.initialize()` — empty-list edge case

3. `int(self.corpus_id)` coercions

4. Minor: `resolved_tools` cast comment is verbose

5. `CoreAgent.stream` Protocol signature change