map v1 reasoning effort by dialect by xeophon · Pull Request #1690 · PrimeIntellect-ai/verifiers

xeophon · 2026-06-15T12:33:31Z

Overview

Maps the provider-agnostic sampling.reasoning_effort setting into each v1 dialect's native request shape.

Details

Responses requests write the configured value to reasoning.effort while preserving adjacent reasoning settings such as summaries.
Anthropic Messages requests write the configured value to output_config.effort while preserving other output configuration.
Anthropic thinking configuration remains explicit and is preserved from the intercepted request; the dialect does not infer model capabilities or enable a thinking mode.
Existing provider-native reasoning and thinking settings remain unchanged when the eval does not set reasoning_effort.
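The mapping described above can be sketched as two small functions. This is a minimal illustration, not the PR's actual code: the function names and the plain-dict `sampling` argument are assumptions, and only the merge pattern (spread the existing object, then set `effort`) is taken from the diff quoted later in this thread.

```python
from typing import Any


def responses_overrides(body: dict[str, Any], sampling: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: overrides for an OpenAI Responses request body."""
    overrides: dict[str, Any] = {}
    effort = sampling.get("reasoning_effort")
    if effort is not None:
        # Merge with the intercepted `reasoning` object so adjacent
        # settings such as summaries are preserved.
        overrides["reasoning"] = {**dict(body.get("reasoning") or {}), "effort": effort}
    return overrides


def anthropic_overrides(body: dict[str, Any], sampling: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: overrides for an Anthropic Messages request body."""
    overrides: dict[str, Any] = {}
    effort = sampling.get("reasoning_effort")
    if effort is not None:
        # Merge with any existing `output_config`; `thinking` is left alone.
        overrides["output_config"] = {**dict(body.get("output_config") or {}), "effort": effort}
    return overrides
```

When `reasoning_effort` is absent both functions return an empty dict, which matches the stated behavior that existing provider-native settings stay unchanged.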

Note

Low Risk
Small, opt-in request shaping in the interception layer; merges with existing provider fields and only applies when reasoning_effort is configured.

Overview
Adds provider-neutral sampling.reasoning_effort (CLI/TOML) on SamplingConfig so evals can set reasoning effort in one place.

When set, dialect apply_overrides maps it onto outgoing requests: Responses → reasoning.effort (merged with existing reasoning), Anthropic Messages → output_config.effort (merged with existing output_config). If unset, behavior is unchanged.

GUIDE and README document the knob and per-dialect wire shapes (including chat-completions reasoning_effort). Fixes a missing newline at the end of GUIDE.md.

Reviewed by Cursor Bugbot for commit b0897d5.

Note

Map `reasoning_effort` to dialect-specific request fields for Anthropic and Responses APIs

anthropic.py: Maps sampling.reasoning_effort to output_config.effort in outgoing Anthropic Messages requests, merging with any existing output_config.
responses.py: Maps sampling.reasoning_effort to reasoning.effort in outgoing OpenAI /responses requests, merging with any existing reasoning object.

Changes since #1690 opened

Added 'effort' field to SamplingConfig and updated dialect-specific mappings [f806269]
Replaced reasoning_effort with a provider-neutral effort field in the sampling configuration section [37fa310]
Removed [sampling] section from gsm8k configuration [47aebe5]
Updated dialect mappers to read reasoning_effort from sampling configuration instead of effort [2b174e0]
Renamed the sampling configuration field from effort to reasoning_effort [2b174e0]
Updated documentation to reference sampling.reasoning_effort instead of sampling.effort [2b174e0]
Updated tests to validate the reasoning_effort field name [2b174e0]
Removed test test_sampling_reasoning_effort_is_typed from tests.v1.test_configs [b0897d5]

Macroscope summarized 456685c.

macroscopeapp · 2026-06-15T12:40:49Z

Approvability

Verdict: Needs human review

An unresolved review comment identifies that the new reasoning_effort parameter is not mapped correctly in the train client and legacy bridge code paths, which could cause the feature to fail or be ignored in those contexts.

mikasenghaas

ya, let's put effort into our SamplingConfig and map from there

mikasenghaas · 2026-06-15T16:25:24Z

unfortunately i don't think we can type this as a literal since it has to work across providers, so prob just str

Dismissing prior approval to re-evaluate f806269

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit f806269.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f806269a53

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47aebe58e2

chatgpt-codex-connector · 2026-06-15T17:17:34Z

    model_config = ConfigDict(extra="allow")
    temperature: float | None = None
    top_p: float | None = None
+    effort: str | None = None


Map effort before generic sampling dumps

Adding effort to the shared SamplingConfig makes it appear in every model_dump(), but only the proxy dialects translate it. The v1 train client passes sampling_args.model_dump(exclude_none=True) directly to renderers.client.generate (verifiers/v1/clients/train.py:193), and the legacy bridge passes the same dump into v0 clients (verifiers/v1/legacy.py:349/:435), whose normalizers look for reasoning_effort, not effort. In runs such as uv run eval <taskset> --client.type train --sampling.effort medium or legacy --id evals, the new documented knob is therefore sent as an unmapped effort key instead of the provider/engine-native shape, so the request can fail or the requested reasoning budget is not applied.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b174e074a

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b0897d55da

chatgpt-codex-connector · 2026-06-16T07:30:08Z

+            overrides["output_config"] = {
+                **dict(body.get("output_config") or {}),
+                "effort": s["reasoning_effort"],
+            }


Enable adaptive thinking when applying Anthropic effort

When --sampling.reasoning-effort is used for adaptive-thinking Claude models such as claude-opus-4-7 or claude-sonnet-4-6 and the intercepted request body does not already include thinking, this override only sends output_config.effort. Anthropic's extended-thinking docs require thinking: {type: "adaptive"} to enable effort-controlled thinking on those models, and the existing v0 Anthropic client adds that field in the same situation (verifiers/clients/anthropic_messages_client.py:335-348), so v1 proxy evals can silently run without the requested adaptive thinking instead of honoring the configured reasoning budget.
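For comparison, the behavior this comment asks for could look roughly like the sketch below. It is not what the PR merged: the PR description states that the dialect preserves `thinking` from the intercepted request and does not enable a thinking mode. The function name is hypothetical, the `{"type": "adaptive"}` value is taken from the comment itself, and the sketch does not check which model is being called.

```python
from typing import Any


def anthropic_overrides_with_thinking(
    body: dict[str, Any], sampling: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical variant that also turns on adaptive thinking."""
    overrides: dict[str, Any] = {}
    effort = sampling.get("reasoning_effort")
    if effort is None:
        # Nothing configured: leave the intercepted request untouched.
        return overrides
    overrides["output_config"] = {
        **dict(body.get("output_config") or {}),
        "effort": effort,
    }
    # Only add `thinking` when the caller did not set it explicitly.
    if "thinking" not in body:
        overrides["thinking"] = {"type": "adaptive"}
    return overrides
```

The trade-off is that the proxy would then need to know which models accept adaptive thinking, which is the capability inference the PR chose to avoid.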

xeophon added 3 commits June 15, 2026 14:33

map v1 reasoning effort by dialect

d65b814

leave Anthropic thinking explicit

de86957

remove dialect override tests

456685c

xeophon marked this pull request as ready for review June 15, 2026 12:36

xeophon changed the title ~~[codex] map v1 reasoning effort by dialect~~ map v1 reasoning effort by dialect Jun 15, 2026

macroscopeapp Bot previously approved these changes Jun 15, 2026

mikasenghaas reviewed Jun 15, 2026

rename v1 sampling effort

f806269

cursor Bot reviewed Jun 15, 2026

Comment thread verifiers/v1/types.py Outdated

chatgpt-codex-connector Bot reviewed Jun 15, 2026

Comment thread verifiers/v1/types.py Outdated

xeophon added 2 commits June 15, 2026 19:10

document v1 sampling effort

37fa310

remove effort from gsm8k config

47aebe5

chatgpt-codex-connector Bot reviewed Jun 15, 2026

restore v1 reasoning effort name

2b174e0

chatgpt-codex-connector Bot reviewed Jun 15, 2026

Comment thread verifiers/v1/types.py

remove reasoning effort test

b0897d5

chatgpt-codex-connector Bot reviewed Jun 16, 2026

xeophon merged commit 2822e23 into feat/nano-as-v1 Jun 16, 2026
5 checks passed

xeophon merged 8 commits into feat/nano-as-v1 from codex/v1-reasoning-effort-overrides

2 participants
