pronounced "chuff" /tΚΚf/ β a fast, memory-efficient ASR CLI using Parakeet TDT 0.6b V3 via sherpa-onnx with chunked processing.
- β‘ Fast: 7-20x realtime transcription
- π§ Memory-efficient: Processes audio in chunks
- π¦ Any format: If
ffmpegsupports it,choughsupports it - π― No setup: Auto-downloads models on first run
- π Multiple formats: text, json, vtt
- π» CPU only: No GPU required
- π Server mode: HTTP API for batch processing
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, Ukrainian
ffmpeg- for audio/video support
paru -S chough-binbrew install --cask hyperpuncher/tap/choughwinget install choughDownload from GitHub Releases.
go install github.com/hyperpuncher/chough/cmd/chough@latestnpx skills add hyperpuncher/dotagents --skill chough# Basic transcription (text to stdout)
chough audio.mp3
# Pipe audio from stdin
cat audio.mp3 | chough
# Video files work too - extracts audio automatically
chough -f vtt -o subtitles.vtt lecture.mp4
# JSON with timestamps
chough -f json podcast.mp3 > transcript.json
# Smaller chunks for lower memory usage
chough -c 30 long-interview.wav
# Use remote server mode (requires CHOUGH_URL)
CHOUGH_URL=http://localhost:8080 chough --remote audio.mp3| Flag | Description | Default |
|---|---|---|
-c, --chunk-size |
Chunk size in seconds | 60 |
-f, --format |
Output format: text, json, vtt | text |
-o, --output |
Output file | stdout |
-r, --remote |
Transcribe via CHOUGH_URL server | - |
--version |
Show version | - |
-h, --help |
Show help | - |
Run chough as an HTTP server for API access. The server keeps the model loaded in memory (~1.6GB), eliminating the ~1.5s startup time per request.
# Start server
chough --server --port 8080
# With custom settings
chough --server --host 0.0.0.0 --port 8080 --workers 2| Method | Endpoint | Description |
|---|---|---|
| POST | /transcribe |
Transcribe audio (file upload, URL, or base64) |
| GET | /health |
Health check with queue status |
# Upload file
curl -X POST http://localhost:8080/transcribe \
-F "file=@audio.mp3" \
-F "format=json" \
-F "chunk_size=60"
# Transcribe from URL
curl -X POST http://localhost:8080/transcribe \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/audio.mp3", "format": "vtt"}'
# Base64 audio
curl -X POST http://localhost:8080/transcribe \
-H "Content-Type: application/json" \
-d '{"base64": "...", "format": "text"}'
# Health check
curl http://localhost:8080/health| Flag | Description | Default |
|---|---|---|
--server |
Run in server mode | - |
--host |
Server host | 0.0.0.0 |
--port |
Server port | 8080 |
--workers |
Concurrent workers | 2 |
--max-upload |
Max upload size (MB) | 1024 |
# Build and run
docker build -t chough .
docker run -d -p 8080:8080 chough
# Or use docker-compose
docker-compose up -dCHOUGH_MODEL: Path to model directory (optional, auto-downloaded if not set)CHOUGH_URL: Remote server URL for--remotemode (must start withhttp://orhttps://)
Default: Parakeet TDT 0.6b V3 (sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8.tar.bz2)
Models are automatically downloaded to $XDG_CACHE_HOME/chough/models (~650MB).
- Splits audio into 60s chunks (configurable)
- Loads ONNX model once (~1.5s)
- Processes chunks sequentially
- Outputs results
Benchmark on 1-minute audio file (AMD Ryzen 5 5600X, 6 cores):
| Tool | Model | Time | Relative | Realtime Factor | Memory |
|---|---|---|---|---|---|
| chough | Parakeet TDT 0.6b V3 | 4.3s | 13.2x | 14.1x | 1.6GB |
| whisper-ctranslate2 | medium | 27.8s | 2.0x | 2.2x | 1.7GB |
| whisper | turbo | 56.6s | 1.0x | 1.1x | 5.3GB |
chough is ~6-13x faster than other tools.
| Duration | Time | Speed |
|---|---|---|
| 15s | 2.0s | 7.4x realtime |
| 1min | 4.3s | 14.1x realtime |
| 5min | 16.2s | 18.5x realtime |
| 30min | 90.2s | 19.9x realtime |
Run your own benchmarks: just benchmark <audio-file>
MIT