Agenta-AI · mmabrouk · Jul 4, 2026
diff --git a/docs/design/agent-workflows/documentation/ground-truth.md b/docs/design/agent-workflows/documentation/ground-truth.md
@@ -54,6 +54,13 @@ this page and the referenced code as the source of truth.
   channel, served over a loopback HTTP MCP endpoint the runner stands up (no runner-host child
   process). User-declared MCP resolution is feature-gated (`AGENTA_AGENT_ENABLE_MCP`, off by
   default).
+- `client` tools (browser-fulfilled, e.g. `request_connection`) are delivered to Claude too on
+  the local path: advertised over the same internal MCP channel and PAUSED in the `tools/call`
+  handler (no JSON-RPC result + abort the request), then resumed from the browser result next
+  turn — the same cross-turn pause Pi gets via the file relay, through one shared seam
+  (`services/runner/src/engines/sandbox_agent/client-tools.ts`). On a remote sandbox the
+  loopback channel is unreachable, so a non-Pi run carrying ANY custom tool — client kind
+  included — is refused up front (`REMOTE_TOOLS_UNSUPPORTED_MESSAGE`), never dropped silently.
 
 ## Not Implemented
 

diff --git a/docs/design/agent-workflows/documentation/tools.md b/docs/design/agent-workflows/documentation/tools.md
@@ -197,8 +197,10 @@ natively. Today that splits cleanly into two paths.
   directory. It never receives the `call_ref`, the code, the scoped secrets, or the callback
   auth. When the model calls a tool, the bridge relays the request back to the runner, and the
   runner runs the private spec from memory. This `agenta-tools` server is a tool DELIVERY
-  vehicle, not a user MCP server: it carries gateway and code tools, and it exists only on the
-  Claude path.
+  vehicle, not a user MCP server: it carries gateway and code tools AND `client` tools (which it
+  pauses in `tools/call` rather than executing — see "Client tools" below), and it exists only on
+  the local Claude path (it is skipped on a remote sandbox, where its loopback URL is
+  unreachable).
 
 Both paths funnel execution through one function, `runResolvedTool` in
 `services/agent/src/tools/dispatch.ts`. It is the single place that branches on `kind`, so how
@@ -296,13 +298,32 @@ not, until a provisioning story exists.
 ### Client tools: the browser fulfils them across a turn boundary
 
 Execution happens in the browser, not in the runner at all. A client tool is never run
-in-sandbox; `runResolvedTool` throws if one is ever dispatched there, and the MCP bridge filters
-client tools out of its advertised list. Instead, when the harness calls a client tool, the
-runner emits an `interaction_request` event of kind `client_tool`. The `/messages` egress
-projects it to a browser component, the browser runs it, and the result returns in the next
-`/messages` turn, matched back by id. This is the cross-turn human-in-the-loop path, the same
-mechanism approvals use. A client tool is the right type whenever only the user's environment
-can answer: their location, a file on their machine, a confirmation only they can give.
+in-sandbox; `runResolvedTool` throws if one is ever dispatched there. The model still SEES the
+tool and calls it; the runner then PAUSES the call and emits an `interaction_request` event of
+kind `client_tool`. The `/messages` egress projects it to a browser component, the browser runs
+it, and the result returns in the next `/messages` turn, matched back by tool name + args. This
+is the cross-turn human-in-the-loop path, the same mechanism approvals use. A client tool is the
+right type whenever only the user's environment can answer: their location, a file on their
+machine, a confirmation only they can give.
+
+The pause itself is shared by both delivery paths through one seam
+(`services/runner/src/engines/sandbox_agent/client-tools.ts`, `buildClientToolRelay` +
+`emitClientToolInteraction`):
+
+- **Pi** calls the tool through its extension; the runner's file relay pauses it (writes no
+  response file) and the seam emits the interaction.
+- **Claude** calls the tool over the internal `agenta-tools` MCP server, and the runner pauses it
+  inside the `tools/call` handler: it emits NO JSON-RPC result and aborts that in-flight request,
+  so Claude cannot settle the call before the turn ends `paused`. The browser result resumes it
+  next turn (the MCP handler returns the stored output if the model re-calls). This is
+  local-only: on a remote sandbox the loopback MCP channel is unreachable, so a non-Pi run
+  carrying ANY custom tool — client kind included — is rejected up front
+  (`REMOTE_TOOLS_UNSUPPORTED_MESSAGE`), never dropped silently. (The ACP permission gate in
+  `acp-interactions.ts` keeps its own `kind: "client"` pause branch as a live fallback for a
+  harness that raises a permission gate carrying a resolved client spec.)
+
+A client tool's `render` hint can be `{ kind: "connect" }` (e.g. `request_connection`), the typed
+member of `RenderHint` that asks the frontend to draw the connect widget.
 
 ### Built-in tools: the harness runs them natively
 

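Review note: the pause-then-resume behavior described for the Claude path in `tools.md` can be modeled as below. This is a minimal sketch, not the runner's code. `ClientToolRelay`, `call_key`, and `handle_client_tool_call` are hypothetical names, the event shape is illustrative, and failing fast on missing required args is an assumption. Only the "no result on pause, stored output on re-call" behavior and the match by tool name + args come from the text above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ClientToolRelay:
    """Hypothetical stand-in for the shared client-tool seam; not the runner's real API."""

    # Browser results carried into this turn, keyed by tool name + args.
    stored_outputs: Dict[str, Any] = field(default_factory=dict)
    # Interaction requests emitted this turn (what the browser would render).
    emitted: List[Dict[str, Any]] = field(default_factory=list)


def call_key(name: str, args: Dict[str, Any]) -> str:
    """Key a call by tool name + args; sort_keys makes the key independent of arg order."""
    return f"{name}:{json.dumps(args, sort_keys=True)}"


def handle_client_tool_call(
    relay: ClientToolRelay, name: str, args: Dict[str, Any], required: List[str]
) -> Optional[Dict[str, Any]]:
    """Return an MCP-style tool result, or None meaning "write no result, abort the request"."""
    missing = [key for key in required if key not in args]
    if missing:
        # Assumed behavior: a call with missing required args fails fast instead of pausing.
        text = f"missing required args: {', '.join(missing)}"
        return {"isError": True, "content": [{"type": "text", "text": text}]}

    key = call_key(name, args)
    if key in relay.stored_outputs:
        # Resume: the browser answered in a previous turn, so the re-call settles normally.
        output = json.dumps(relay.stored_outputs[key])
        return {"content": [{"type": "text", "text": output}]}

    # Pause: surface the call to the browser and give the harness nothing to settle on.
    relay.emitted.append(
        {"type": "interaction_request", "kind": "client_tool", "name": name, "args": args}
    )
    return None
```

Returning `None` stands in for "emit no JSON-RPC result and destroy the socket". A real handler would abort the HTTP request at that point rather than return a value.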
diff --git a/docs/design/agent-workflows/interfaces/cross-service/runner-to-mcp-server.md b/docs/design/agent-workflows/interfaces/cross-service/runner-to-mcp-server.md
@@ -32,24 +32,35 @@ server on `127.0.0.1:<ephemeral>` and returns one ACP `type: "http"` entry
 (stateless JSON mode) and answers three methods:
 
 - `initialize`: returns protocol version and `capabilities.tools`.
-- `tools/list`: returns the resolved tool specs as MCP tools. Client-kind tools are filtered
-  out here, because the browser fulfills those.
-- `tools/call`: runs the named tool through `runResolvedTool(..., { relayDir })` (the same relay
-  the Pi path uses) and returns `content`, or an error.
+- `tools/list`: returns the resolved tool specs as MCP tools, reading each tool's input schema
+  through the shared `specInputSchema` accessor (camelCase `inputSchema` OR snake_case
+  `input_schema` — reading `inputSchema` alone advertised an EMPTY schema for every
+  platform-catalog tool). `client` tools ARE advertised here (when a `clientToolRelay` is wired,
+  i.e. local Claude): the model must see them to call them; the runner pauses the call in
+  `tools/call`.
+- `tools/call`: for an executable (`code`/`callback`) tool, runs it through
+  `runResolvedTool(..., { relayDir })` (the same relay the Pi path uses) and returns `content`,
+  or an error. For a `client` tool it validates required args, then pauses through the shared
+  client-tool seam: on `pendingApproval` it emits NO JSON-RPC result and the request listener
+  aborts the in-flight request (socket destroyed, no body) so the harness cannot settle the call
+  before the turn ends `paused`; an engine `AbortSignal` cancels any other in-flight request on
+  pause/teardown. On resume it returns the browser's stored output.
 
 It carries NO credential: the entry has empty `headers`, the server holds only public metadata +
 the relay dir, and it is bound to loopback. It launches no child process — it is served by the
 already-running runner — so it does not reintroduce the runner-host execution hole that #4831
 closed for user stdio MCP. The run end closes it (releases the port).
 
-**On Daytona the internal channel is NOT advertised — the file relay delivers the tools.** The
-loopback URL is a runner-host address; on Daytona the harness runs IN the sandbox, where
-`127.0.0.1` is the sandbox's own loopback, not the runner's, so the URL is unreachable.
-`buildSessionMcpServers` therefore skips the internal channel when `isDaytona` is true and the
-already-running file relay (below) delivers the gateway tools instead — the runner's relay loop
-polls the sandbox filesystem. This honors the design decision "HTTP advertisement for local, file
-relay for Daytona." A user http MCP server (a remote URL the harness dials directly) is NOT
-loopback-bound and stays delivered on Daytona unchanged.
+**On Daytona the internal channel is NOT advertised — only Pi gets tools there.** The loopback
+URL is a runner-host address; on Daytona the harness runs IN the sandbox, where `127.0.0.1` is
+the sandbox's own loopback, not the runner's, so the URL is unreachable. `buildSessionMcpServers`
+skips the internal channel when `isDaytona` is true; only Pi's in-sandbox extension consumes the
+file relay there. A non-Pi (MCP-delivered) harness has no in-sandbox tool reader, so a non-Pi
+remote-sandbox run carrying ANY custom tool (gateway/callback OR client) is refused up front in
+`run-plan.ts` with `REMOTE_TOOLS_UNSUPPORTED_MESSAGE` — fail loud, not a silent empty delivery
+(the capability gate keys on `mcpTools`, which Claude reports `true`). The gate keys on "not
+local", so an unknown remote provider fails closed too. A user http MCP server (a remote URL the
+harness dials directly) is NOT loopback-bound and stays delivered on Daytona unchanged.
 
 **The file relay.** A resolved tool may need to run privately rather than inside the harness
 process. The relay moves the call across that boundary: the child writes a `<id>.req.json`
@@ -83,24 +94,41 @@ allowlist, and permission. Two transports, opposite states:
 ## Owned by
 
 - `sdks/python/agenta/sdk/agents/mcp/`: the Python models and resolver.
-- `services/agent/src/engines/sandbox_agent/mcp.ts`: builds the session's MCP servers (the two
-  layers; the `isDaytona` guard on the internal channel; `validateUserMcpUrl` SSRF guard).
-- `services/agent/src/tools/mcp-bridge.ts`: the internal gateway-tool channel builder; the
-  `USER_MCP_UNSUPPORTED_MESSAGE` and `PI_USER_MCP_UNSUPPORTED_MESSAGE` refusal constants.
-- `services/agent/src/tools/tool-mcp-http.ts`: the internal loopback HTTP MCP server.
-- `services/agent/src/tools/mcp-server.ts`: the removed stdio JSON-RPC server (refusing stub).
-- `services/agent/src/tools/relay.ts`: the file relay loop and hosts.
+- `services/runner/src/engines/sandbox_agent/mcp.ts`: builds the session's MCP servers (the two
+  layers; the `isDaytona` skip on the internal channel; threads `clientToolRelay` + abort signal;
+  `validateUserMcpUrl` SSRF guard).
+- `services/runner/src/engines/sandbox_agent/run-plan.ts`: the `REMOTE_TOOLS_UNSUPPORTED_MESSAGE`
+  gate (a non-Pi remote-sandbox run carrying ANY custom tool fails up front).
+- `services/runner/src/engines/sandbox_agent/client-tools.ts`: the shared client-tool seam
+  (`buildClientToolRelay`, `emitClientToolInteraction`, the ACP tool-call correlation index).
+- `services/runner/src/tools/mcp-bridge.ts`: the internal channel builder (advertises `client`
+  tools when a relay is wired); the `USER_MCP_UNSUPPORTED_MESSAGE` /
+  `PI_USER_MCP_UNSUPPORTED_MESSAGE` refusal constants.
+- `services/runner/src/tools/tool-mcp-http.ts`: the internal loopback HTTP MCP server (the
+  `client` pause: no JSON-RPC result + abort-the-request).
+- `services/runner/src/tools/spec-schema.ts`: the shared `specInputSchema` accessor + arg
+  validation.
+- `services/runner/src/tools/mcp-server.ts`: the removed stdio JSON-RPC server (refusing stub).
+- `services/runner/src/tools/relay.ts`: the file relay loop and hosts (idle-poll backoff).
 
 ## Watch for when changing
 
 - **The gate.** MCP delivery depends on harness type and the `mcpTools` capability, not on a
   single env flag. Changing either changes which tools reach the harness.
 - **The MCP server config shape.** It is part of the `/run` contract and the wire serializer.
-- **The internal channel's MCP methods.** `initialize`, `tools/list`, `tools/call`, and the
-  client-tool filter, served over loopback HTTP. The framing (stateless JSON Streamable-HTTP) is
-  pinned to the MCP client the installed Claude harness uses; re-verify it if that version moves.
-- **The relay.** Polling interval, timeout, and the local-versus-Daytona host. A slow tool
-  must fail cleanly.
+- **The internal channel's MCP methods.** `initialize`, `tools/list` (now advertises `client`
+  tools and reads schemas via `specInputSchema`), and `tools/call` (the `client` pause: emit NO
+  result + abort the request, so the paused widget is the last word before the turn ends).
+  Served over loopback HTTP; the framing (stateless JSON Streamable-HTTP) is pinned to the MCP
+  client the installed Claude harness uses; re-verify it if that version moves.
+- **The client-tool pause is no-result-before-finish.** A paused `tools/call` must never write a
+  JSON-RPC result (a result lets the harness settle and clobber the pending widget); the handler
+  aborts its own request and the engine fires an `AbortSignal` on pause/teardown.
+- **The remote-tools gate.** A non-Pi remote-sandbox run carrying ANY custom tool (client kind
+  included) is refused in `run-plan.ts`. Swap it for a real in-sandbox delivery path when one
+  exists; do not widen it.
+- **The relay.** Polling interval, idle backoff, timeout, and the local-versus-Daytona host. A
+  slow tool must fail cleanly.
 - **HTTP MCP delivery.** `toAcpMcpServers` routes the resolved secret from `env` into a
   request header and builds the ACP `type: "http"` entry. Changing the env-to-header mapping or
   the ACP variant shape changes which auth reaches the remote server.

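Review note: the schema bug fixed in `tools/list` (reading only `inputSchema` and so advertising an empty schema for platform-catalog tools) is easy to see in a small model. The sketch below is a Python illustration of the idea, not the TypeScript accessor in `spec-schema.ts`. The function names, the camelCase-first precedence, and the empty-object fallback are all assumptions.

```python
from typing import Any, Dict


def empty_schema() -> Dict[str, Any]:
    """A fresh permissive object schema, used when a spec declares none (assumed fallback)."""
    return {"type": "object", "properties": {}}


def spec_input_schema(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Read a tool's input schema from either spelling of the key."""
    # Assumed precedence: camelCase first, then snake_case. An empty dict counts as unset.
    for key in ("inputSchema", "input_schema"):
        schema = spec.get(key)
        if isinstance(schema, dict) and schema:
            return schema
    return empty_schema()


def to_mcp_tool(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Project a resolved tool spec to an MCP tools/list entry."""
    return {
        "name": spec["name"],
        "description": spec.get("description", ""),
        "inputSchema": spec_input_schema(spec),
    }
```

With this shape, a spec that carries only `input_schema` still advertises its real `required` list instead of an empty schema.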
diff --git a/docs/design/agent-workflows/interfaces/in-service/tool-models-and-resolution.md b/docs/design/agent-workflows/interfaces/in-service/tool-models-and-resolution.md
@@ -63,7 +63,8 @@ config (not markers); `resolve_tools` owns the tool-specific mapping.
 // code: sandboxed code with its named secrets injected into env
 { "kind": "code", "name": "...", "runtime": "python", "code": "...", "env": { "API_KEY": "..." } }
 
-// client: browser-fulfilled; filtered out of the runner's MCP tools/list
+// client: browser-fulfilled; advertised to the model (incl. over the local Claude MCP channel),
+// then PAUSED on call — never executed in the runner
 { "kind": "client", "name": "..." }
 ```
 

diff --git a/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md b/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md
@@ -133,7 +133,7 @@ Either form is valid:
   "input_schema": {}, "secrets": ["API_KEY"],
   "permission": null, "render": null }
 
-// client: fulfilled by the browser; filtered out of the runner's MCP tools/list
+// client: fulfilled by the browser; advertised to the model, then paused on call (not executed)
 { "type": "client", "name": "pick_file", "description": "...", "input_schema": {},
   "permission": null, "render": null }
 

diff --git a/docs/design/agent-workflows/projects/remote-tools-delivery/research.md b/docs/design/agent-workflows/projects/remote-tools-delivery/research.md
@@ -87,15 +87,16 @@ has an equivalent.
 ## 4. The interim fix implemented alongside these docs
 
 `engines/sandbox_agent/run-plan.ts` `buildRunPlan` now refuses, before any cwd or sandbox is
-created, any run where `!isPi && isDaytona && executableToolSpecsForRun.length > 0` —
+created, any run where `!isPi && isRemoteSandbox && toolSpecs.length > 0` —
 `REMOTE_TOOLS_UNSUPPORTED_MESSAGE`. This mirrors the existing not-implemented gates in the same
 file (`CODE_TOOL_UNSUPPORTED_MESSAGE`, `USER_MCP_UNSUPPORTED_MESSAGE`,
 `PI_USER_MCP_UNSUPPORTED_MESSAGE`, `FILESYSTEM_UNSUPPORTED_MESSAGE`,
 `LOCAL_NETWORK_UNSUPPORTED_MESSAGE`): fail loud with a single named message instead of silently
-dropping a declared capability. `executableToolSpecsForRun` (already computed earlier in the
-function via `executableToolSpecs(toolSpecs)`) excludes `client`-kind tools, which are
-browser-fulfilled and were never advertised over the internal channel in the first place
-(`tool-mcp-http.ts` `tools/list` filters them out too), so a client-only tool run is unaffected.
+dropping a declared capability. The gate counts ALL custom tools, `client` kind included: since
+the #4985 recut, client tools ride the same internal MCP channel on local Claude (advertised in
+`tools/list`, paused in `tools/call`), so on a remote sandbox they are exactly as undeliverable
+as gateway tools — the model would never see them. (The original #5047 gate exempted client
+tools because, pre-#4985, they were never routed through the channel at all.)
 The `mcp.ts` "delivered via the file relay" log is now conditioned on `isPi` so it can never again
 claim a delivery that isn't happening, as defense-in-depth against a future gate bypass (the
 run-plan gate should make the branch it guards dead code, but the log no longer trusts that).

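Review note: the widened gate condition (`!isPi && isRemoteSandbox && toolSpecs.length > 0`) can be checked against the cases the docs call out. The sketch below restates it in Python for illustration only. `RunRefused`, `check_remote_tools_gate`, the `"pi"` / `"local"` string values, and the message text are hypothetical, because the real constant's wording and the real predicate inputs are not shown in this diff.

```python
from typing import Any, Dict, List

# Placeholder text: the real constant's wording is not shown in this diff.
REMOTE_TOOLS_UNSUPPORTED_MESSAGE = (
    "custom tools are not supported for this harness on a remote sandbox"
)


class RunRefused(Exception):
    """Hypothetical error type for a run refused before any cwd or sandbox is created."""


def check_remote_tools_gate(
    harness: str, sandbox_provider: str, tool_specs: List[Dict[str, Any]]
) -> None:
    """Raise if a non-Pi harness on a remote sandbox carries any custom tool."""
    is_pi = harness == "pi"
    # Keyed on "not local" rather than "is Daytona", so an unknown provider fails closed.
    is_remote_sandbox = sandbox_provider != "local"
    # Counts every custom tool, client kind included (no executable-only filter).
    if not is_pi and is_remote_sandbox and len(tool_specs) > 0:
        raise RunRefused(REMOTE_TOOLS_UNSUPPORTED_MESSAGE)
```

Under this model a client-only tool list on Claude plus Daytona is refused, while the same list on Pi plus Daytona or on Claude plus local passes.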
diff --git a/sdks/python/agenta/sdk/agents/adapters/claude_settings.py b/sdks/python/agenta/sdk/agents/adapters/claude_settings.py
@@ -26,8 +26,10 @@
 that honors ``allow``. Emitting an ``allow`` rule here is the only way an ``allow`` tool actually
 runs on Claude instead of always parking. ``ask``/unset emits no allow rule (the gate stays raised
 -> HITL park preserved); ``deny`` emits a deny rule (which also closes a local-Claude execution
-gap). ``client`` tools are browser-fulfilled, never delivered over this channel, so they are
-excluded. The runner policy supplies the default permission when a tool has no explicit value.
+gap). ``client`` tools are browser-fulfilled but ARE delivered over this same channel (the runner
+advertises them on ``agenta-tools`` and pauses the ``tools/call``), so they get a rule too —
+allow unless denied; see :func:`_rules_from_tool_specs`. The runner policy supplies the default
+permission when a tool has no explicit value.
 """
 
 from __future__ import annotations
@@ -131,18 +133,28 @@ def _rules_from_mcp_permissions(mcp_servers: Any) -> Dict[str, List[str]]:
 def _rules_from_tool_specs(
     tool_specs: Any, permission_default: PermissionMode
 ) -> Dict[str, List[str]]:
-    """Derive per-tool Claude rules from each resolved EXECUTABLE tool's Layer-3 ``permission`` (F-046).
+    """Derive per-tool Claude rules from each resolved tool's Layer-3 ``permission`` (F-046).
 
     Mirrors :func:`_rules_from_mcp_permissions`, but per-tool against the fixed internal server name
-    ``agenta-tools``: a callback/code tool is delivered to Claude as a tool of that MCP server, so
+    ``agenta-tools``: a resolved tool is delivered to Claude as a tool of that MCP server, so
     its rule is ``mcp__agenta-tools__<name>``. The standalone
     :func:`~agenta.sdk.agents.tools.models.effective_permission` ladder (explicit permission,
-    else read-only under ``allow_reads``, else the runner mode) routes it to the matching list. Unset tools
-    only render a rule when the runner mode needs an explicit Claude allow/deny rule. ``client``
-    tools are browser-fulfilled and never delivered over this channel, so they are excluded (this
-    mirrors the runner's ``mcp-bridge`` filter). Accepts a list
-    of :class:`~agenta.sdk.agents.tools.models.ToolSpec` or plain dicts (coerced to a spec so the
-    same permission ladder applies).
+    else read-only under ``allow_reads``, else the runner mode) routes an EXECUTABLE
+    (callback/code) tool to the matching list. Unset executable tools only render a rule when the
+    runner mode needs an explicit Claude allow/deny rule.
+
+    ``client`` tools (browser-fulfilled, e.g. ``request_connection``) ride this SAME channel:
+    the runner advertises them on ``agenta-tools`` and pauses their ``tools/call`` for the
+    browser. Their rule is **deny when the effective permission is deny, otherwise allow** —
+    including for an explicit ``ask`` and for unset. The runner-side pause seam is the
+    authoritative gate for a client tool: pausing for the browser IS the ask flow, so a
+    Claude-side ask rule would only duplicate that gate in a worse place (Claude's own prompt
+    fires before the runner ever sees the call, bypassing the pause path). Without an allow rule
+    the same thing happens: Claude's permission gate fires first and the call falls to the ACP
+    path instead of pausing over MCP.
+
+    Accepts a list of :class:`~agenta.sdk.agents.tools.models.ToolSpec` or plain dicts (coerced
+    to a spec so the same permission ladder applies).
     """
     # Lazy import: ``tools.models`` does not import this adapter, but keeping the import local
     # avoids loading the tool models when the claude adapter is used without resolved tools.
@@ -157,14 +169,20 @@ def _rules_from_tool_specs(
         except Exception:
             # A malformed/nameless spec contributes nothing (mirrors the MCP helper's name guard).
             continue
-        if spec.kind == "client":
-            continue
         permission = effective_permission(
             spec.permission, spec.read_only, permission_default
         )
+        rule = f"mcp__{INTERNAL_TOOL_MCP_SERVER}__{spec.name}"
+        if spec.kind == "client":
+            # Deny stays deny; everything else (allow, explicit ask, unset) renders allow: the
+            # runner pause seam is the authoritative ask for a client tool (see the docstring).
+            if permission == "deny":
+                deny.append(rule)
+            else:
+                allow.append(rule)
+            continue
         if spec.permission is None and permission == "ask":
             continue
-        rule = f"mcp__{INTERNAL_TOOL_MCP_SERVER}__{spec.name}"
         if permission == "allow":
             allow.append(rule)
         elif permission == "ask":
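Review note: the new `client` branch in `_rules_from_tool_specs` reduces to "deny stays deny, everything else renders allow". The standalone sketch below isolates that mapping so it can be checked by hand. It takes already-resolved effective permissions as input instead of calling `effective_permission`, and the `client_tool_rules` name and the `allow` / `deny` dictionary keys are assumptions made for illustration.

```python
from typing import Dict, List

# The fixed internal server name, per the docstring's `mcp__agenta-tools__<name>` rule.
INTERNAL_TOOL_MCP_SERVER = "agenta-tools"


def client_tool_rules(effective: Dict[str, str]) -> Dict[str, List[str]]:
    """Map {client tool name: effective permission} to Claude allow/deny rule lists."""
    rules: Dict[str, List[str]] = {"allow": [], "deny": []}
    for name, permission in effective.items():
        rule = f"mcp__{INTERNAL_TOOL_MCP_SERVER}__{name}"
        # Deny stays deny; allow and ask both render allow, because the runner-side
        # pause is the authoritative ask for a client tool.
        rules["deny" if permission == "deny" else "allow"].append(rule)
    return rules
```

For example, `request_connection` with an effective `ask` lands in the allow list, so Claude's own prompt does not fire ahead of the runner's pause.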