## Agent Diagnostic
- Loaded `nemoclaw-reference` skill (architecture.md, commands.md)
- Loaded `nemoclaw-configure-inference` skill (inference-options.md)
- Ran `python3 -c "import json; print(json.load(open('/sandbox/.openclaw/openclaw.json'))['models']['providers']['inference']['api'])"` inside sandbox → confirmed `openai-responses`
- Ran `curl -s https://inference.local/v1/responses ...` inside sandbox → endpoint returns valid text every time
- Ran `openclaw agent --agent main -m "hello" --session-id test --verbose on` multiple times → intermittent empty replies
- Parsed the session transcript (JSONL) — tokens always consumed, text intermittently empty (~6/9 calls):
17:15 text='' output_tokens=108
17:22 text='Hey...' output_tokens=597 ← works
17:23 text='' output_tokens=531
17:24 text='' output_tokens=199
17:27 text='' output_tokens=252
17:29 text='Hey...' output_tokens=249 ← works
17:29 text='Hey...' output_tokens=363 ← works
17:30 text='' output_tokens=253
- Traced `probeOpenAiLikeEndpoint` in `bin/lib/onboard.js:966-1016` — the probe tries `/responses` first and selects it on HTTP 200 alone, without validating the response body
- Overriding `ENV NEMOCLAW_INFERENCE_API=openai-completions` in a `--from` Dockerfile resolves the issue — all calls return text via `/v1/chat/completions`
- The agent could not resolve this itself — the inference API type is baked into `openclaw.json` at build time (the file is Landlock read-only) and cannot be changed at runtime
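A probe that validated the response body, not just the status code, would have rejected this endpoint. A minimal sketch in Python of such a check (the helper name and structure are hypothetical; the real probe lives in `bin/lib/onboard.js`):

```python
import json

def responses_body_has_text(body: dict) -> bool:
    """True only if a /v1/responses body carries non-empty output text.

    Selecting on HTTP 200 alone is what let the mis-detection through;
    a probe should also confirm the body parses into at least one
    non-empty output_text part.
    """
    for item in body.get("output", []):
        for part in item.get("content", []):
            if part.get("type") == "output_text" and part.get("text", "").strip():
                return True
    return False

# The two body shapes observed in this report:
good = json.loads(
    '{"output":[{"content":[{"type":"output_text",'
    '"text":"Hello! How can I help you today?"}]}]}'
)
empty = {"output": [{"content": [{"type": "output_text", "text": ""}]}]}
print(responses_body_has_text(good), responses_body_has_text(empty))
```

With a check like this, the onboard probe would fall back to `openai-completions` for proxies that answer `/responses` with 200 but cannot reliably fill `output_text`.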
## Description
When onboarding with a LiteLLM proxy as "Other OpenAI-compatible endpoint", the onboard probe selects `openai-responses` because LiteLLM returns HTTP 200 on `/responses`. After sandbox creation, agent calls intermittently return empty text — tokens are consumed (`stopReason: "stop"`, `isError: false`) but the session transcript records `"text":""`.

The inference endpoint itself works — curling `/v1/responses` directly inside the sandbox always returns valid text. The issue is in OpenClaw's response parsing when using the `openai-responses` API with a LiteLLM proxy.
**Expected:** the agent returns text on every call.
**Actual:** the agent returns empty text on most calls (~6/9 in testing).
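The ~6/9 figure comes from tallying the session transcript. A sketch of that tally, assuming the JSONL line shape shown in the Logs section (file handling is omitted; the function just takes an iterable of lines):

```python
import json

def tally_empty_replies(lines):
    """Count assistant turns whose text is empty despite consumed output tokens."""
    empty = total = 0
    for line in lines:
        msg = json.loads(line)
        if msg.get("role") != "assistant":
            continue
        total += 1
        text = "".join(
            part.get("text", "")
            for part in msg.get("content", [])
            if part.get("type") == "text"
        )
        if not text.strip() and msg.get("usage", {}).get("output", 0) > 0:
            empty += 1
    return empty, total

# One failing turn from this report's transcript:
sample = ('{"role":"assistant","content":[{"type":"text","text":""}],'
          '"usage":{"input":12020,"output":540},"stopReason":"stop"}')
print(tally_empty_replies([sample]))
```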
## Reproduction Steps

- Run `nemoclaw onboard`
- Choose `3` (Other OpenAI-compatible endpoint)
- Enter the LiteLLM proxy URL (e.g. `http://<host>:4000/v1`) plus API key and model (e.g. `gemini-3.1-flash-lite-preview`)
- Onboard prints: `Responses API available — OpenClaw will use openai-responses.`
- Build output shows: `Step 24/40 : ARG NEMOCLAW_INFERENCE_API=openai-responses`
- Run `nemoclaw <name> connect`
- Run `openclaw agent --agent main -m "hello" --session-id test --verbose on`
- Observe: `"completed"` with no text reply (intermittent — may work on the first call, fails on most subsequent calls)
Workaround: use `--from` with a custom Dockerfile containing `ENV NEMOCLAW_INFERENCE_API=openai-completions`. There is no workaround without `--from`.
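The workaround can be captured in a minimal `--from` Dockerfile; the base image name below is a placeholder, so substitute whatever image your build derives from:

```dockerfile
# Placeholder base image; substitute the image your nemoclaw build uses.
FROM nemoclaw-base:latest

# Pin the inference API so the build-time probe result is overridden and
# OpenClaw talks to /v1/chat/completions instead of /v1/responses.
ENV NEMOCLAW_INFERENCE_API=openai-completions
```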
## Environment
- OS: macOS 15 (Apple M3)
- Docker: Docker Desktop 4.x
- NemoClaw: v0.0.7
- OpenShell: 0.0.23
- OpenClaw: 2026.3.11
- LiteLLM proxy serving `gemini-3.1-flash-lite-preview`
## Logs

Session transcript showing empty text with consumed tokens:

    {"role":"assistant","content":[{"type":"text","text":""}],"usage":{"input":12020,"output":540},"stopReason":"stop"}

Direct curl inside sandbox returns valid text:

    {"output":[{"content":[{"type":"output_text","text":"Hello! How can I help you today?"}]}],"usage":{"input_tokens":1,"output_tokens":135}}
Related code:
- `bin/lib/onboard.js:966-1016` — `probeOpenAiLikeEndpoint` tries `/responses` first
- `bin/lib/onboard.js:1081` — `"Responses API available — OpenClaw will use openai-responses."`
- `bin/lib/onboard.js:926-928` — `patchStagedDockerfile` writes probe result into `ARG NEMOCLAW_INFERENCE_API`
## Agent-First Checklist
- Skills: `debug-openshell-cluster`, `debug-inference`, `openshell-cli`