Use claude-code for free with NVIDIA-NIM via the terminal CLI or telegram; pm2 start "uv run uvicorn server:app --host 0.0.0.0 --port 8082" --name "claude-proxy"
Free Claude Code

Use Claude Code CLI & VSCode — for free. No Anthropic API key required.

License: MIT · Python 3.14 · uv · Tested with Pytest · Type checking: Ty · Code style: Ruff · Logging: Loguru

A lightweight proxy server that translates Claude Code's Anthropic API calls into NVIDIA NIM, OpenRouter, or LM Studio format. Get 40 free requests/min on NVIDIA NIM, access hundreds of models on OpenRouter, or run fully local with LM Studio.

Features · Quick Start · How It Works · Discord Bot · Configuration


Free Claude Code in action

Claude Code running via NVIDIA NIM — completely free

Features

| Feature | Description |
| --- | --- |
| Zero Cost | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| Drop-in Replacement | Set 2 env vars — no modifications to Claude Code CLI or VSCode extension needed |
| 3 Providers | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local & offline) |
| Thinking Token Support | Parses `<think>` tags and reasoning_content into native Claude thinking blocks |
| Heuristic Tool Parser | Models outputting tool calls as text are auto-parsed into structured tool use |
| Request Optimization | 5 categories of trivial API calls intercepted locally — saves quota and latency |
| Discord Bot | Remote autonomous coding with tree-based threading, session persistence, and live progress (Telegram also supported) |
| Smart Rate Limiting | Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap across all providers |
| Subagent Control | Task tool interception forces run_in_background=False — no runaway subagents |
| Extensible | Clean BaseProvider and MessagingPlatform ABCs — add new providers or platforms easily |
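The heuristic tool parser can be pictured with a small sketch. This is a simplified, hypothetical version that scans model output for JSON objects shaped like `{"name": ..., "arguments": {...}}` and lifts them into structured tool-use blocks; the real parser in this repo handles more formats than this.

```python
import json

def parse_tool_calls(text: str):
    """Heuristically extract tool calls a model emitted as plain text.

    Scans for embedded JSON objects with "name" and "arguments" keys and
    returns (remaining_text, tool_use_blocks). Simplified sketch only.
    """
    decoder = json.JSONDecoder()
    tool_calls, kept, i = [], [], 0
    while i < len(text):
        if text[i] == "{":
            try:
                obj, end = decoder.raw_decode(text, i)
            except json.JSONDecodeError:
                obj = None
            if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
                tool_calls.append({"type": "tool_use",
                                   "name": obj["name"],
                                   "input": obj["arguments"]})
                i = end  # skip past the consumed JSON object
                continue
        kept.append(text[i])
        i += 1
    return "".join(kept).strip(), tool_calls
```

Using `json.JSONDecoder.raw_decode` rather than a regex means nested argument objects parse correctly.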

Quick Start

Prerequisites

  1. Get an API key (or use LM Studio locally)
  2. Install Claude Code
  3. Install uv

Clone & Configure

git clone https://github.com/rishiskhare/free-claude-code.git
cd free-claude-code
cp .env.example .env

Choose your provider and edit .env:

NVIDIA NIM (recommended — 40 req/min free)

PROVIDER_TYPE=nvidia_nim
NVIDIA_NIM_API_KEY=nvapi-your-key-here
MODEL=stepfun-ai/step-3.5-flash

OpenRouter (hundreds of models)

PROVIDER_TYPE=open_router
OPENROUTER_API_KEY=sk-or-your-key-here
MODEL=stepfun/step-3.5-flash:free

LM Studio (fully local, no API key)

PROVIDER_TYPE=lmstudio
MODEL=lmstudio-community/qwen2.5-7b-instruct

Run It

Terminal 1 — Start the proxy server:

uv run uvicorn server:app --host 0.0.0.0 --port 8082

Terminal 2 — Run Claude Code:

ANTHROPIC_AUTH_TOKEN=freecc ANTHROPIC_BASE_URL=http://localhost:8082 claude

That's it! Claude Code now uses your configured provider for free.

Multi-Model Support (Model Picker)

claude-pick is an interactive model selector that lets you choose any model from your active provider each time you launch Claude — no need to edit MODEL in .env every time you want to switch.


1. Install fzf (highly recommended for the interactive picker):

brew install fzf        # macOS/Linux

2. Add the alias to ~/.zshrc or ~/.bashrc:

# Use the absolute path to your cloned repo
alias claude-pick="/absolute/path/to/free-claude-code/claude-pick"

Then reload your shell (source ~/.zshrc or source ~/.bashrc) and run claude-pick to pick a model and launch Claude.

Skip the picker with a fixed model (no picker needed):

alias claude-kimi='ANTHROPIC_BASE_URL="http://localhost:8082" ANTHROPIC_AUTH_TOKEN="freecc:moonshotai/kimi-k2.5" claude'
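Judging by the alias above, the proxy accepts a per-launch model override embedded after a colon in the auth token. A hypothetical sketch of that parsing (the proxy's actual logic may differ; the default model shown is an assumption taken from the configuration table below):

```python
def parse_auth_token(token: str,
                     default_model: str = "stepfun-ai/step-3.5-flash"):
    """Split an ANTHROPIC_AUTH_TOKEN like 'freecc:moonshotai/kimi-k2.5'
    into (token, model). Without a colon, fall back to the .env MODEL."""
    base, sep, model = token.partition(":")
    return base, (model if sep else default_model)
```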

VSCode Extension Setup
  1. Start the proxy server (same as above).
  2. Open Settings (Ctrl + ,) and search for claude-code.environmentVariables.
  3. Click Edit in settings.json and add:
"claude-code.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
]
  4. Reload extensions.
  5. If you see the login screen ("How do you want to log in?"): click Anthropic Console, then authorize. The extension will start working. You may be redirected to buy credits in the browser — ignore that; the extension already works.

To switch back to Anthropic models, comment out the added block and reload extensions.


How It Works

┌─────────────────┐        ┌──────────────────────┐        ┌──────────────────┐
│  Claude Code    │───────>│  Free Claude Code    │───────>│  LLM Provider    │
│  CLI / VSCode   │<───────│  Proxy (:8082)       │<───────│  NIM / OR / LMS  │
└─────────────────┘        └──────────────────────┘        └──────────────────┘
   Anthropic API                     │                       OpenAI-compatible
   format (SSE)              ┌───────┴────────┐                format (SSE)
                             │ Optimizations  │
                             ├────────────────┤
                             │ Quota probes   │
                             │ Title gen skip │
                             │ Prefix detect  │
                             │ Suggestion skip│
                             │ Filepath mock  │
                             └────────────────┘
  • Transparent proxy — Claude Code sends standard Anthropic API requests to the proxy server
  • Request optimization — 5 categories of trivial requests (quota probes, title generation, prefix detection, suggestions, filepath extraction) are intercepted and responded to instantly without using API quota
  • Format translation — Real requests are translated from Anthropic format to the provider's OpenAI-compatible format and streamed back
  • Thinking tokens — <think> tags and reasoning_content fields are converted into native Claude thinking blocks so Claude Code renders them correctly
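The thinking-token conversion can be sketched roughly as follows. This is a simplified, non-streaming split of provider output on <think> tags into Anthropic-style thinking and text blocks; the block shapes are assumptions, and the proxy additionally handles reasoning_content fields and incremental streaming.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def to_claude_blocks(text: str) -> list[dict]:
    """Convert '<think>...</think>answer' into Claude-style content blocks."""
    blocks, pos = [], 0
    for m in THINK_RE.finditer(text):
        before = text[pos:m.start()].strip()
        if before:
            blocks.append({"type": "text", "text": before})
        blocks.append({"type": "thinking", "thinking": m.group(1).strip()})
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append({"type": "text", "text": tail})
    return blocks
```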

Providers

| Provider | Cost | Rate Limit | Models | Best For |
| --- | --- | --- | --- | --- |
| NVIDIA NIM | Free | 40 req/min | Kimi K2, GLM5, Devstral, MiniMax | Daily driver — generous free tier |
| OpenRouter | Free / Paid | Varies | 200+ (GPT-4o, Claude, Step, etc.) | Model variety, fallback options |
| LM Studio | Free (local) | Unlimited | Any GGUF model | Privacy, offline use, no rate limits |

Switch providers by changing PROVIDER_TYPE in .env:

| Provider | PROVIDER_TYPE | API Key Variable | Base URL |
| --- | --- | --- | --- |
| NVIDIA NIM | nvidia_nim | NVIDIA_NIM_API_KEY | integrate.api.nvidia.com/v1 |
| OpenRouter | open_router | OPENROUTER_API_KEY | openrouter.ai/api/v1 |
| LM Studio | lmstudio | (none) | localhost:1234/v1 |

OpenRouter gives access to hundreds of models (StepFun, OpenAI, Anthropic, etc.) through a single API. Set MODEL to any OpenRouter model ID.

LM Studio runs locally — start the server in LM Studio's Developer tab or via lms server start, load a model, and set MODEL to the model identifier.
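All three providers speak an OpenAI-compatible chat format, so the translation the proxy performs looks roughly like this: a minimal, text-only sketch of converting an Anthropic /v1/messages body into a /chat/completions body. The real converter also handles tool use, images, and system content blocks; the default model here is an assumption.

```python
def anthropic_to_openai(payload: dict) -> dict:
    """Translate a minimal Anthropic /v1/messages body into an
    OpenAI-compatible /chat/completions body (text content only)."""
    messages = []
    if payload.get("system"):
        messages.append({"role": "system", "content": payload["system"]})
    for msg in payload.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):  # Anthropic content-block form
            content = "".join(b["text"] for b in content
                              if b.get("type") == "text")
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": payload.get("model", "stepfun-ai/step-3.5-flash"),
        "messages": messages,
        "max_tokens": payload.get("max_tokens", 1024),
        "stream": True,  # responses are streamed back as SSE
    }
```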


Discord Bot

Control Claude Code remotely from Discord. Send tasks, watch live progress, and manage multiple concurrent sessions. Discord is the default messaging platform; Telegram is also supported.

Capabilities:

  • Tree-based message threading — reply to messages to fork conversations
  • Session persistence across server restarts
  • Live streaming of thinking tokens, tool calls, and results
  • Unlimited concurrent Claude CLI sessions (provider concurrency controlled by PROVIDER_MAX_CONCURRENCY)
  • Voice notes — send voice messages; they are transcribed to text and processed like regular prompts (see Voice Notes)
  • Commands: /stop (cancel tasks; reply to a message to stop only that task), /clear (standalone: reset all sessions; reply to a message to clear that branch downwards), /stats

Setup

  1. Create a Discord Bot — Go to Discord Developer Portal, create an application, add a bot, and copy the token. Enable Message Content Intent under Bot settings.

  2. Edit .env:

MESSAGING_PLATFORM=discord
DISCORD_BOT_TOKEN=your_discord_bot_token
ALLOWED_DISCORD_CHANNELS=123456789,987654321

Enable Developer Mode in Discord (Settings → Advanced), then right-click a channel and "Copy ID" to get channel IDs. Comma-separate multiple channels. If empty, no channels are allowed.

  3. Configure the workspace (where Claude will operate):

CLAUDE_WORKSPACE=./agent_workspace
ALLOWED_DIR=C:/Users/yourname/projects

  4. Start the server:

uv run uvicorn server:app --host 0.0.0.0 --port 8082

  5. Invite the bot to your server (OAuth2 → URL Generator, scopes: bot; permissions: Read Messages, Send Messages, Manage Messages, Read Message History). Send a message in an allowed channel with a task. Claude responds with thinking tokens, tool calls as they execute, and the final result. Reply to messages to cancel tasks or clear branches (see Commands above).

Telegram (Alternative)

To use Telegram instead, set MESSAGING_PLATFORM=telegram and configure:

TELEGRAM_BOT_TOKEN=123456789:ABCdefGHIjklMNOpqrSTUvwxYZ
ALLOWED_TELEGRAM_USER_ID=your_telegram_user_id

Get a token from @BotFather; find your user ID via @userinfobot.

Voice Notes

Send voice messages on Telegram or Discord; they are transcribed to text and processed as regular prompts. Uses Hugging Face transformers Whisper — free, no API key, works offline, CUDA 13 compatible. No ffmpeg required (audio loaded via librosa).

Install the optional voice extra:

uv sync --extra voice

Configuration:

| Variable | Description | Default |
| --- | --- | --- |
| VOICE_NOTE_ENABLED | Enable voice note handling | true |
| WHISPER_MODEL | Hugging Face model ID or short name (tiny, base, small, medium, large-v2, large-v3, large-v3-turbo) | base |
| WHISPER_DEVICE | cpu or cuda | cpu |
| HF_TOKEN | Hugging Face token for faster model downloads (optional) | |
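The WHISPER_MODEL short names presumably expand to Hugging Face model IDs along these lines; the mapping below is an assumption (check the repo's config for the authoritative one):

```python
SHORT_NAMES = {"tiny", "base", "small", "medium",
               "large-v2", "large-v3", "large-v3-turbo"}

def resolve_whisper_model(name: str) -> str:
    """Expand a short name like 'base' to 'openai/whisper-base';
    pass full Hugging Face model IDs through unchanged."""
    return f"openai/whisper-{name}" if name in SHORT_NAMES else name
```

Transcription itself would then use the transformers ASR pipeline, along the lines of `pipeline("automatic-speech-recognition", model=resolve_whisper_model("base"))`, with audio decoded via librosa instead of ffmpeg.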

Models

NVIDIA NIM

Full list in nvidia_nim_models.json.

Popular models:

  • qwen/qwen3.5-397b-a17b
  • z-ai/glm5
  • stepfun-ai/step-3.5-flash
  • moonshotai/kimi-k2.5
  • minimaxai/minimax-m2.1

Browse: build.nvidia.com

Update model list:

curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json

OpenRouter

Hundreds of models from StepFun, OpenAI, Anthropic, Google, and more.

Popular models:

  • stepfun/step-3.5-flash:free
  • deepseek/deepseek-r1-0528:free
  • openai/gpt-oss-120b:free

Browse: openrouter.ai/models

Browse free models: https://openrouter.ai/collections/free-models

LM Studio

Run models locally with LM Studio. Load a model in the Chat or Developer tab, then set MODEL to its identifier.

Examples (native tool-use support):

  • lmstudio-community/qwen2.5-7b-instruct
  • lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
  • bartowski/Ministral-8B-Instruct-2410-GGUF

Browse: model.lmstudio.ai


Configuration

| Variable | Description | Default |
| --- | --- | --- |
| PROVIDER_TYPE | Provider: nvidia_nim, open_router, or lmstudio | nvidia_nim |
| MODEL | Model to use for all requests | stepfun-ai/step-3.5-flash |
| NVIDIA_NIM_API_KEY | NVIDIA API key (NIM provider) | required |
| OPENROUTER_API_KEY | OpenRouter API key (OpenRouter provider) | required |
| LM_STUDIO_BASE_URL | LM Studio server URL | http://localhost:1234/v1 |
| PROVIDER_RATE_LIMIT | LLM API requests per window | 40 |
| PROVIDER_RATE_WINDOW | Rate limit window (seconds) | 60 |
| PROVIDER_MAX_CONCURRENCY | Max simultaneous open provider streams | 5 |
| HTTP_READ_TIMEOUT | Read timeout for provider API requests (seconds) | 300 |
| HTTP_WRITE_TIMEOUT | Write timeout for provider API requests (seconds) | 10 |
| HTTP_CONNECT_TIMEOUT | Connect timeout for provider API requests (seconds) | 2 |
| FAST_PREFIX_DETECTION | Enable fast prefix detection | true |
| ENABLE_NETWORK_PROBE_MOCK | Enable network probe mock | true |
| ENABLE_TITLE_GENERATION_SKIP | Skip title generation | true |
| ENABLE_SUGGESTION_MODE_SKIP | Skip suggestion mode | true |
| ENABLE_FILEPATH_EXTRACTION_MOCK | Enable filepath extraction mock | true |
| MESSAGING_PLATFORM | Messaging platform: discord or telegram | discord |
| DISCORD_BOT_TOKEN | Discord bot token | "" |
| ALLOWED_DISCORD_CHANNELS | Comma-separated channel IDs (empty = none allowed) | "" |
| TELEGRAM_BOT_TOKEN | Telegram bot token | "" |
| ALLOWED_TELEGRAM_USER_ID | Allowed Telegram user ID | "" |
| VOICE_NOTE_ENABLED | Enable voice note handling | true |
| WHISPER_MODEL | Local Whisper model size | base |
| WHISPER_DEVICE | cpu or cuda | cpu |
| MESSAGING_RATE_LIMIT | Messaging messages per window | 1 |
| MESSAGING_RATE_WINDOW | Messaging window (seconds) | 1 |
| CLAUDE_WORKSPACE | Directory for agent workspace | ./agent_workspace |
| ALLOWED_DIR | Allowed directories for agent | "" |

See .env.example for all supported parameters.


Development

Project Structure

free-claude-code/
├── server.py              # Entry point
├── api/                   # FastAPI routes, request detection, optimization handlers
├── providers/             # BaseProvider, OpenAICompatibleProvider, NIM, OpenRouter, LM Studio
│   └── common/            # Shared utils (SSE builder, message converter, parsers, error mapping)
├── messaging/             # MessagingPlatform ABC + Discord/Telegram bots, session management
├── config/                # Settings, NIM config, logging
├── cli/                   # CLI session and process management
├── utils/                 # Text utilities
└── tests/                 # Pytest test suite

Commands

uv run ruff format     # Format code
uv run ruff check      # Code style checking
uv run ty check        # Type checking
uv run pytest          # Run tests

Extending

Adding a Provider

For OpenAI-compatible APIs (Groq, Together AI, etc.), extend OpenAICompatibleProvider:

from providers.openai_compat import OpenAICompatibleProvider
from providers.base import ProviderConfig

class MyProvider(OpenAICompatibleProvider):
    def __init__(self, config: ProviderConfig):
        super().__init__(config, provider_name="MYPROVIDER",
                         base_url="https://api.example.com/v1", api_key=config.api_key)

    def _build_request_body(self, request):
        return build_request_body(request)  # Your request builder

For fully custom APIs, extend BaseProvider directly:

from providers.base import BaseProvider, ProviderConfig

class MyProvider(BaseProvider):
    async def stream_response(self, request, input_tokens=0, *, request_id=None):
        # Yield Anthropic SSE format events
        ...

Adding a Messaging Platform

Extend MessagingPlatform in messaging/ to add Slack or other platforms:

from messaging.base import MessagingPlatform

class MyPlatform(MessagingPlatform):
    async def start(self):
        # Initialize connection
        ...

    async def stop(self):
        # Cleanup
        ...

    async def send_message(self, chat_id, text, reply_to=None, parse_mode=None, message_thread_id=None):
        # Send a message
        ...

    async def edit_message(self, chat_id, message_id, text, parse_mode=None):
        # Edit an existing message
        ...

    def on_message(self, handler):
        # Register callback for incoming messages
        ...

Contributing

Contributions are welcome! Here are some ways to help:

  • Report bugs or suggest features via Issues
  • Add new LLM providers (Groq, Together AI, etc.)
  • Add new messaging platforms (Slack, etc.)
  • Improve test coverage
# Fork the repo, then:
git checkout -b my-feature
# Make your changes
uv run ruff format && uv run ruff check && uv run ty check && uv run pytest
# Open a pull request

License

This project is licensed under the MIT License — see the LICENSE file for details.

Built with FastAPI, OpenAI Python SDK, discord.py, and python-telegram-bot.
