feat: add Anthropic OAuth integration for user-specific API access #34

Closed
mgpai22 wants to merge 11 commits into ColeMurray:main from mgpai22:main

@mgpai22 commented Jan 31, 2026

Summary

  • Anthropic OAuth flow: Adds a code-paste OAuth flow with settings page for users to connect their Anthropic accounts, enabling user-specific API access in sandboxes
  • Proactive token refresh: Control plane automatically refreshes OAuth tokens before expiry, pushing fresh tokens to active sandboxes via WebSocket
  • Auth method indicator: Session header displays a badge showing whether the agent is using OAuth or API key authentication
  • Infrastructure: Terraform configuration for Anthropic OAuth client credentials, sandbox bridge support for receiving and applying token refreshes

Changes across packages

  • control-plane: OAuth token storage/management in Durable Objects, token refresh in lifecycle manager, new API routes
  • modal-infra: Bridge SSE handler for token refresh events, sandbox entrypoint passes OAuth tokens
  • web: OAuth callback route, settings page with connect/disconnect, auth badge in session header
  • terraform: Anthropic OAuth environment variables for Modal and control plane

- Base64 encode secrets JSON to safely pass multiline PEM keys
- Use uv run for modal deploy commands
- Add fastapi, PyJWT, and cryptography to modal-infra dependencies
- Update .gitignore for .env*.local files
- Add terraform production .gitignore
- Update terraform lock file hashes
- Add /internal/anthropic-token CRUD endpoints for encrypted token storage
- Add token refresh logic with correct Anthropic v1 OAuth endpoint
- Add ANTHROPIC_CLIENT_ID to Env interface for refresh flow
- Add Anthropic token columns to participants schema
- Export token management utilities from auth module
- Remove redundant X-Internal-Secret check from store handler
  (HMAC middleware already handles auth)
…fic API access

- Retrieve and auto-refresh OAuth token when spawning sandboxes
- Pass token through control plane -> Modal -> sandbox environment
- Sandbox supervisor sets ANTHROPIC_API_KEY to user's OAuth token
- Write auth.json for OpenCode compatibility
- Fall back to shared API key when no OAuth token is available
- Add OAuth initiation endpoint with PKCE (S256) code challenge
- Add token exchange callback using code-paste flow
- Add settings page with AnthropicConnection component
- Add status and disconnect API routes using controlPlaneFetch
- Use Anthropic's code-display callback URI to avoid redirect
  registration issues
- Include userId, githubLogin, githubName, githubEmail when creating
  sessions so the control plane sets the correct owner for OAuth
  token lookup
- Add settings link to session sidebar
- Add anthropic_client_id and anthropic_client_secret variables
- Pass ANTHROPIC_CLIENT_ID to control plane worker for token refresh
- Add Anthropic OAuth env vars to Vercel web app
- Pass anthropic_oauth_token through web_api.py to SessionConfig
- Write OAuth tokens to auth.json instead of overriding ANTHROPIC_API_KEY,
  since OAuth tokens (sk-ant-oat01-*) cannot be used as x-api-key headers
- Filter user message parts in bridge SSE to prevent echo on silent failure
- Detect empty assistant message sets and report errors instead of success
- Add robust error message extraction for OpenCode's various error formats
- Add diagnostic logging for API key availability at sandbox startup
Resolve merge conflicts:
- client.ts: take upstream's structured logging
- durable-object.ts: take upstream's lifecycle manager delegation
- bridge.py: keep error extraction + add structured logging
- manager.py: keep both GITHUB_APP_TOKEN and ANTHROPIC_OAUTH_TOKEN

Add OAuth token support through the lifecycle manager chain:
- Add anthropicOAuthToken to CreateSandboxConfig and RestoreConfig
- Add getAnthropicOAuthToken resolver to SandboxLifecycleConfig
- Pass token through ModalSandboxProvider to ModalClient
- Create OAuth token resolver in SessionDO for lifecycle manager
- Add User-Agent header to token refresh request to avoid Cloudflare
  blocking non-browser requests (error 1010)
- Add proactive token refresh in DO alarm handler so tokens stay fresh
  between sandbox spawns
- Push refreshed tokens to running sandboxes via WebSocket update_token
  command so long-running sessions don't lose auth mid-session
- Add update_token command handler in bridge to write refreshed auth.json
Display a badge in the session UI showing whether the agent is using
the user's Claude Code Subscription (OAuth) or the shared API key.
Persists auth method in sandbox table and broadcasts via WebSocket.

greptile-apps bot commented Jan 31, 2026

Greptile Overview

Greptile Summary

This PR adds comprehensive Anthropic OAuth integration enabling user-specific API access with proactive token refresh. The implementation includes a code-paste OAuth flow, automatic token refresh via lifecycle manager alarms, WebSocket-based token push to sandboxes, and UI indicators showing authentication method.

Key Achievements:

  • Complete OAuth flow from authorization to token storage with PKCE
  • Proactive token refresh 5 minutes before expiry
  • Real-time token push to running sandboxes via bridge
  • Auth method badge showing OAuth vs API key usage
  • Terraform configuration for OAuth client credentials
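
For readers unfamiliar with the PKCE step mentioned above, here is a minimal sketch of how an S256 code challenge is derived from a code verifier, following RFC 7636. It is not taken from this PR; the function names `make_code_verifier` and `make_code_challenge` are hypothetical, and the PR's actual implementation lives in the web package and may differ.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Generate a random, URL-safe code verifier (43 chars for 32 random bytes)."""
    raw = secrets.token_bytes(n_bytes)
    # base64url without '=' padding, as RFC 7636 requires
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(ASCII(verifier))), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and keeps the verifier to present at token exchange, so an intercepted authorization code alone is not enough to obtain tokens.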

Architecture:

  • Control plane stores encrypted tokens in KV and manages refresh lifecycle
  • Lifecycle manager resolves owner's OAuth token during sandbox spawn/restore
  • Bridge handles update_token commands to refresh OpenCode's auth.json
  • Web UI implements code-paste flow matching OpenCode's pattern

@greptile-apps bot left a comment

5 files reviewed, 6 comments

Comment on lines 686 to 763
private async proactiveTokenRefresh(): Promise<void> {
  if (!this.env.SESSION_INDEX || !this.env.TOKEN_ENCRYPTION_KEY) return;

  try {
    const ownerResult = this.sql.exec(
      `SELECT user_id FROM participants WHERE role = 'owner' LIMIT 1`
    );
    const owners = ownerResult.toArray() as { user_id: string }[];
    const ownerUserId = owners[0]?.user_id;
    if (!ownerUserId) return;

    const tokenData = (await this.env.SESSION_INDEX.get(
      `anthropic:token:${ownerUserId}`,
      "json"
    )) as {
      accessTokenEncrypted: string;
      refreshTokenEncrypted?: string;
      expiresAt: number;
    } | null;

    if (!tokenData || !tokenData.refreshTokenEncrypted) return;

    // Import inline to avoid circular deps
    const { tokenNeedsRefresh, refreshAnthropicToken } = await import("../auth/anthropic");
    const { decryptToken } = await import("../auth/crypto");

    // Check if token needs refresh (within 5 min of expiry)
    if (!tokenNeedsRefresh(tokenData.expiresAt)) return;

    this.log.info("Proactive token refresh starting", { user_id: ownerUserId });

    const clientId = this.env.ANTHROPIC_CLIENT_ID || "";
    const encKey = this.env.TOKEN_ENCRYPTION_KEY;

    const refreshResult = await refreshAnthropicToken(
      tokenData.refreshTokenEncrypted,
      clientId,
      encKey
    );

    if (!refreshResult.success || !refreshResult.accessToken || !refreshResult.expiresAt) {
      this.log.warn("Proactive token refresh failed", { error: refreshResult.error });
      return;
    }

    // Persist refreshed tokens to KV
    await this.env.SESSION_INDEX.put(
      `anthropic:token:${ownerUserId}`,
      JSON.stringify({
        accessTokenEncrypted: refreshResult.accessToken,
        refreshTokenEncrypted: refreshResult.refreshToken || tokenData.refreshTokenEncrypted,
        expiresAt: refreshResult.expiresAt,
        storedAt: Date.now(),
      })
    );

    this.log.info("Proactive token refresh succeeded", {
      user_id: ownerUserId,
      new_expiry: new Date(refreshResult.expiresAt).toISOString(),
    });

    // If sandbox is running, push the new token to it
    const sandboxWs = this.getSandboxWebSocket();
    if (sandboxWs) {
      const decryptedToken = await decryptToken(refreshResult.accessToken, encKey);
      this.safeSend(sandboxWs, {
        type: "update_token",
        token: decryptedToken,
        expiresAt: refreshResult.expiresAt,
      });
      this.log.info("Pushed refreshed token to sandbox");
    }
  } catch (e) {
    this.log.error("Proactive token refresh error", {
      error: e instanceof Error ? e : String(e),
    });
  }
}

If token refresh fails or owner has no OAuth token, the alarm will still reschedule in 30 seconds via scheduleInactivityCheck(). This could cause repeated failed refresh attempts. Consider tracking refresh failures and backing off or skipping refresh attempts for sessions without OAuth tokens.
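
One way to picture the suggested backoff is the sketch below. It is illustrative only and written in Python rather than the control plane's TypeScript; the function name `next_refresh_delay_s` and the failure counter are hypothetical. The 1-minute floor and 1-hour cap are borrowed from the follow-up commit message in this PR.

```python
BASE_DELAY_S = 60        # 1 minute: first retry delay after a failure
MAX_DELAY_S = 60 * 60    # 1 hour: upper bound on the retry delay


def next_refresh_delay_s(consecutive_failures: int) -> int:
    """Return how long to wait before the next refresh attempt.

    0 failures means no extra delay; each further failure doubles the
    delay, starting at BASE_DELAY_S and capped at MAX_DELAY_S.
    """
    if consecutive_failures <= 0:
        return 0
    # Clamp the exponent so the intermediate value stays small.
    exponent = min(consecutive_failures - 1, 16)
    return min(BASE_DELAY_S * (2 ** exponent), MAX_DELAY_S)
```

The caller would reset the failure counter to zero after a successful refresh, and could skip the refresh path entirely for sessions whose owner has no stored OAuth token.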

Path: packages/control-plane/src/session/durable-object.ts
Line: 686:763

Comment on lines 238 to 257
# Check for user's Anthropic OAuth token (for user-specific API access)
# OAuth tokens (sk-ant-oat01-...) cannot be used as x-api-key headers.
# Instead, write auth.json so OpenCode uses its native OAuth auth flow.
# The shared ANTHROPIC_API_KEY remains as a fallback.
anthropic_oauth_token = os.environ.get("ANTHROPIC_OAUTH_TOKEN")
if anthropic_oauth_token:
    print("[supervisor] Using user's Anthropic OAuth token for API access")

    # Write the auth.json file for OpenCode's native OAuth auth
    opencode_data_dir = Path.home() / ".local" / "share" / "opencode"
    opencode_data_dir.mkdir(parents=True, exist_ok=True)
    auth_json_path = opencode_data_dir / "auth.json"
    auth_data = {
        "accessToken": anthropic_oauth_token,
        "expiresAt": int(time.time() * 1000) + 3600000,  # 1 hour from now
    }
    auth_json_path.write_text(json.dumps(auth_data))
    print(f"[supervisor] Wrote OAuth token to {auth_json_path}")
else:
    print("[supervisor] No OAuth token, using shared ANTHROPIC_API_KEY")

The hardcoded expiresAt of 1 hour from now doesn't match the actual token expiry from control plane. When tokens are refreshed proactively (5 min before expiry), OpenCode may see the old expiry timestamp. Consider passing expiresAt as an environment variable from the control plane or accepting it in the refresh command payload.
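
A rough sketch of the environment-variable approach follows. The helper `build_auth_data` and the variable name `ANTHROPIC_OAUTH_EXPIRES_AT` are assumptions made for illustration and do not appear in the PR; the 1-hour fallback mirrors the snippet under review.

```python
import time
from typing import Optional

DEFAULT_TTL_MS = 3_600_000  # 1 hour fallback, as in the reviewed snippet


def build_auth_data(
    token: str,
    expires_at_raw: Optional[str],
    now_ms: Optional[int] = None,
) -> dict:
    """Build the auth.json payload, preferring the real expiry when provided.

    expires_at_raw is the raw string value of a (hypothetical) env var such as
    ANTHROPIC_OAUTH_EXPIRES_AT, holding epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    expires_at: Optional[int] = None
    if expires_at_raw:
        try:
            expires_at = int(expires_at_raw)
        except ValueError:
            expires_at = None  # unparseable value: fall through to the default
    if expires_at is None:
        expires_at = now_ms + DEFAULT_TTL_MS
    return {"accessToken": token, "expiresAt": expires_at}
```

In the entrypoint this might be called as `build_auth_data(anthropic_oauth_token, os.environ.get("ANTHROPIC_OAUTH_EXPIRES_AT"))`, with the result written to auth.json as before.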

Path: packages/modal-infra/src/sandbox/entrypoint.py
Line: 238:257

Comment on lines 1153 to 1173
    async def _handle_update_token(self, cmd: dict[str, Any]) -> None:
        """Handle update_token command - write refreshed OAuth token to auth.json."""
        token = cmd.get("token")
        expires_at = cmd.get("expiresAt")
        if not token:
            self.log.warn("bridge.update_token_missing")
            return

        try:
            opencode_data_dir = Path.home() / ".local" / "share" / "opencode"
            opencode_data_dir.mkdir(parents=True, exist_ok=True)
            auth_json_path = opencode_data_dir / "auth.json"
            auth_data = {
                "accessToken": token,
                "expiresAt": expires_at or int(time.time() * 1000) + 3600000,
            }
            auth_json_path.write_text(json.dumps(auth_data))
            self.log.info("bridge.token_updated")
        except Exception as e:
            self.log.error("bridge.update_token_error", exc=e)


The update_token handler receives expiresAt from the control plane but defaults to 1 hour if missing. Using the actual expiresAt from the refresh response ensures OpenCode's auth.json matches the token's real expiry time.

Suggested change
        try:
            opencode_data_dir = Path.home() / ".local" / "share" / "opencode"
            opencode_data_dir.mkdir(parents=True, exist_ok=True)
            auth_json_path = opencode_data_dir / "auth.json"
            # Use expiresAt from control plane, or fallback to current time + 1 hour
            auth_data = {
                "accessToken": token,
                "expiresAt": expires_at if expires_at else int(time.time() * 1000) + 3600000,
            }
            auth_json_path.write_text(json.dumps(auth_data))
            self.log.info("bridge.token_updated", expires_at=expires_at)
        except Exception as e:
            self.log.error("bridge.update_token_error", exc=e)
Path: packages/modal-infra/src/sandbox/bridge.py
Line: 1153:1173

Comment on lines +42 to +44
export function tokenNeedsRefresh(expiresAt: number): boolean {
  return Date.now() >= expiresAt - REFRESH_BUFFER_MS;
}

Consider using a more generous refresh buffer (e.g., 15-30 minutes) to account for clock skew, network delays, and token refresh latency. The 5-minute buffer might be tight for distributed systems where the control plane alarm, token refresh API call, and sandbox update all need to complete before expiry.
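
To make the effect of the buffer concrete, here is a small Python sketch of the same check with the buffer as a parameter. The names are hypothetical, and the current time is passed in explicitly so the behaviour is easy to test. The 15-minute default matches the value the follow-up commit in this PR settles on.

```python
REFRESH_BUFFER_MS = 15 * 60 * 1000  # 15 minutes before expiry


def token_needs_refresh(
    expires_at_ms: int,
    now_ms: int,
    buffer_ms: int = REFRESH_BUFFER_MS,
) -> bool:
    """True once the current time is within buffer_ms of the token's expiry."""
    return now_ms >= expires_at_ms - buffer_ms
```

With a 30-second alarm interval, a 15-minute buffer leaves many alarm ticks in which a refresh can be retried before the token actually expires, whereas a 5-minute buffer leaves only about a third as many.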

Path: packages/control-plane/src/auth/anthropic.ts
Line: 42:44

Comment on lines 60 to 64
// The pasted code is "<authorization_code>#<state>" — split on "#"
const splits = rawCode.split("#");
const code = splits[0];
const state = splits[1] || "";


The code splits on # assuming the format is always <code>#<state>. If Anthropic's OAuth response format changes or users paste malformed codes, splits[1] could be undefined. Consider validating the split result and providing clearer error messages for malformed codes.
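
The kind of validation described here could look like the sketch below, written in Python for illustration (the route itself is TypeScript). The function name `parse_pasted_code` is hypothetical. Requiring a non-empty state is an assumption that is stricter than the reviewed code, which falls back to an empty string.

```python
from typing import Tuple


def parse_pasted_code(raw_code: str) -> Tuple[str, str]:
    """Split a pasted '<authorization_code>#<state>' string and validate both parts.

    Raises ValueError with a specific message when the input is malformed.
    """
    parts = raw_code.strip().split("#")
    if len(parts) != 2:
        raise ValueError("expected format '<authorization_code>#<state>'")
    code, state = parts[0].strip(), parts[1].strip()
    if not code:
        raise ValueError("authorization code is empty")
    if not state:
        raise ValueError("state is empty")
    return code, state
```

The error messages can then be surfaced to the user directly, so a truncated paste is distinguishable from a failed token exchange.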

Path: packages/web/src/app/api/auth/anthropic/callback/route.ts
Line: 60:64

Comment on lines 281 to 340
private createAnthropicOAuthTokenResolver(): (() => Promise<string | undefined>) | undefined {
  if (!this.env.SESSION_INDEX || !this.env.TOKEN_ENCRYPTION_KEY) {
    return undefined;
  }

  const kvRef = this.env.SESSION_INDEX;
  const encKey = this.env.TOKEN_ENCRYPTION_KEY;
  const clientId = this.env.ANTHROPIC_CLIENT_ID || "";
  const sql = this.sql;
  const log = this.log;

  return async (): Promise<string | undefined> => {
    const ownerResult = sql.exec(`SELECT user_id FROM participants WHERE role = 'owner' LIMIT 1`);
    const owners = ownerResult.toArray() as { user_id: string }[];
    const ownerUserId = owners[0]?.user_id;
    if (!ownerUserId) return undefined;

    const tokenData = (await kvRef.get(`anthropic:token:${ownerUserId}`, "json")) as {
      accessTokenEncrypted: string;
      refreshTokenEncrypted?: string;
      expiresAt: number;
    } | null;

    if (!tokenData) return undefined;

    const token =
      (await getValidAnthropicToken(
        tokenData.accessTokenEncrypted,
        tokenData.refreshTokenEncrypted,
        tokenData.expiresAt,
        clientId,
        encKey,
        async (result) => {
          if (result.success && result.accessToken && result.expiresAt) {
            await kvRef.put(
              `anthropic:token:${ownerUserId}`,
              JSON.stringify({
                accessTokenEncrypted: result.accessToken,
                refreshTokenEncrypted: result.refreshToken || tokenData.refreshTokenEncrypted,
                expiresAt: result.expiresAt,
                storedAt: Date.now(),
              })
            );
            log.info("Refreshed Anthropic token", {
              user_id: ownerUserId,
              new_expiry: new Date(result.expiresAt).toISOString(),
            });
          }
        }
      )) || undefined;

    if (token) {
      log.info("Using Anthropic OAuth token", { user_id: ownerUserId });
    } else {
      log.info("Anthropic token expired/refresh failed", { user_id: ownerUserId });
    }

    return token;
  };
}

The OAuth token resolver queries the database and KV on every call during sandbox lifecycle operations. Consider caching the resolved token for a short duration (e.g., 1-5 minutes) to reduce database queries, especially since tokens are valid for extended periods and refreshed proactively.
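
As an illustration of the caching idea, here is a minimal time-based cache around a resolver function. It is a synchronous Python sketch under stated assumptions, not the PR's code: the real resolver is an async TypeScript closure, and the class name `CachedTokenResolver` is hypothetical. The 2-minute TTL is the value named in the follow-up commit in this PR.

```python
import time
from typing import Callable, Optional

CACHE_TTL_S = 120.0  # 2 minutes


class CachedTokenResolver:
    """Wrap a token resolver and reuse its last successful result for ttl_s seconds."""

    def __init__(
        self,
        resolve: Callable[[], Optional[str]],
        ttl_s: float = CACHE_TTL_S,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl_s = ttl_s
        self._clock = clock
        self._cached: Optional[str] = None
        self._cached_at: Optional[float] = None

    def get(self) -> Optional[str]:
        now = self._clock()
        if self._cached_at is not None and now - self._cached_at < self._ttl_s:
            return self._cached
        token = self._resolve()
        if token is not None:
            # Cache successes only, so a failed lookup is retried on the next call.
            self._cached, self._cached_at = token, now
        else:
            self.invalidate()
        return token

    def invalidate(self) -> None:
        """Drop the cached token, e.g. after a refresh or a disconnect."""
        self._cached = None
        self._cached_at = None
```

Any such cache should be invalidated whenever a refreshed token is written, otherwise a sandbox spawned shortly after a refresh could still receive the previous access token.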

Path: packages/control-plane/src/session/durable-object.ts
Line: 281:340

- Increase token refresh buffer from 5 to 15 minutes before expiry
- Validate authorization code is non-empty in OAuth callback
- Improve bridge token update logging with explicit falsy check and
  expires_at in log output
- Add exponential backoff to proactive token refresh (1m-1hr cap) with
  noOAuthTokenConfigured fast-path to skip DB/KV lookups entirely
- Pass real expiresAt through full stack (resolver → lifecycle manager →
  provider → client → Modal sandbox → entrypoint) instead of hardcoding
  1hr fallback
- Fix pre-existing bug: client.ts was not serializing anthropicOAuthToken
  in the createSandbox JSON body
- Cache OAuth token resolver results for 2 minutes to reduce redundant
  DB/KV queries during spawn
@ColeMurray (Owner)

To my knowledge, Anthropic's position on third-party usage of the Claude subscription has not changed.
https://x.com/trq212/status/2009689811616182404?s=20

Given this, I am not willing to add a feature that could lead to unsuspecting users having their accounts banned. Happy to revisit if and when the policy changes.

@ColeMurray ColeMurray closed this Feb 3, 2026

2 participants