Converse with large language models using speech.
- Open: Powered by state-of-the-art open-source speech processing models.
- Efficient: Light enough to run on consumer hardware, with low latency.
- Self-hosted: Entire pipeline runs offline, limited only by compute power.
- Modular: Switching LLM providers is as simple as changing an environment variable.

- Run `setup-unix.sh` or `setup-win.bat`, depending on your platform. This will download the required model weights and compile the binaries needed for Sage.
- For text generation, you can either self-host an LLM using Ollama, or opt for a third-party provider.
- If you're using Ollama, add the `OLLAMA_MODEL` variable to the `.env` file to specify the model you'd like to use. (Example: `OLLAMA_MODEL=deepseek-r1:7b`)
- Among third-party providers, Sage supports the following out of the box:
  - Deepseek
  - OpenAI
  - Anthropic
  - Together.ai
- To use a provider, add a `<PROVIDER>_API_KEY` variable to the `.env` file. (Example: `OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxxxx`)
- To choose which model should be used for a given provider, set the `<PROVIDER>_MODEL` variable. (Example: `DEEPSEEK_MODEL=deepseek-chat`)
- Start the project with `bun start`. The first run on macOS is slow (~20 minutes on an M1 Pro), since the ANE service compiles the Whisper CoreML model to a device-specific format. Subsequent runs are faster.
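Putting the steps above together, a `.env` file might look like the sketch below. The values are placeholders following the `<PROVIDER>_API_KEY` / `<PROVIDER>_MODEL` convention described above; set only the variables for the backend you actually use.

```env
# Option A: self-hosted via Ollama
OLLAMA_MODEL=deepseek-r1:7b

# Option B: a third-party provider (API key, plus the model to use)
DEEPSEEK_API_KEY=xxxxxxxxxxxxxxxxxxxxxxx
DEEPSEEK_MODEL=deepseek-chat
```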
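For illustration, the env-driven provider switching described above could be resolved along these lines. This is a hypothetical TypeScript sketch, not Sage's actual code; the function name `resolveProvider` and the precedence order (Ollama first, then the first third-party key found) are assumptions.

```typescript
// Hypothetical sketch of env-driven provider selection (not Sage's actual code).
// Follows the OLLAMA_MODEL and <PROVIDER>_API_KEY / <PROVIDER>_MODEL convention
// described in the setup steps above.
type Resolved = { provider: string; model?: string };

const THIRD_PARTY = ["DEEPSEEK", "OPENAI", "ANTHROPIC", "TOGETHER"];

function resolveProvider(env: Record<string, string | undefined>): Resolved | null {
  // Prefer a self-hosted Ollama model when one is configured.
  if (env.OLLAMA_MODEL) {
    return { provider: "ollama", model: env.OLLAMA_MODEL };
  }
  // Otherwise pick the first third-party provider whose API key is set.
  for (const p of THIRD_PARTY) {
    if (env[`${p}_API_KEY`]) {
      return { provider: p.toLowerCase(), model: env[`${p}_MODEL`] };
    }
  }
  return null;
}
```

Under this scheme, switching providers really is just a matter of changing which `_API_KEY` variable is present in the `.env` file.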
- Optimize the pipeline.
- Make it easier to run. (Dockerize?)
- Release as a library?