open-webui · jqueguiner · Feb 26, 2026
diff --git a/docs/features/media-generation/audio/speech-to-text/env-variables.md b/docs/features/media-generation/audio/speech-to-text/env-variables.md
@@ -44,7 +44,7 @@ If using the `:cuda` Docker image with an older GPU, set `WHISPER_COMPUTE_TYPE=f
 
 | Variable | Description | Default |
 |----------|-------------|---------|
-| `AUDIO_STT_ENGINE` | STT engine: empty (local Whisper), `openai`, `azure`, `deepgram`, `mistral` | empty |
+| `AUDIO_STT_ENGINE` | STT engine: empty (local Whisper), `openai`, `azure`, `deepgram`, `mistral`, `gladia` | empty |
 | `AUDIO_STT_MODEL` | STT model for external providers | empty |
 | `AUDIO_STT_OPENAI_API_BASE_URL` | OpenAI-compatible API base URL | `https://api.openai.com/v1` |
 | `AUDIO_STT_OPENAI_API_KEY` | OpenAI API key | empty |
@@ -66,6 +66,12 @@ If using the `:cuda` Docker image with an older GPU, set `WHISPER_COMPUTE_TYPE=f
 |----------|-------------|---------|
 | `DEEPGRAM_API_KEY` | Deepgram API key | empty |
 
+### Gladia STT
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `AUDIO_STT_GLADIA_API_KEY` | Gladia API key | empty |
+
 ### Mistral STT
 
 | Variable | Description | Default |

diff --git a/docs/features/media-generation/audio/speech-to-text/gladia-stt-integration.md b/docs/features/media-generation/audio/speech-to-text/gladia-stt-integration.md
@@ -0,0 +1,111 @@
+---
+sidebar_position: 3
+title: "Gladia STT Integration"
+---
+
+# Using Gladia for Speech-to-Text
+
+This guide covers how to use [Gladia.io](https://www.gladia.io/) for Speech-to-Text with Open WebUI. Gladia provides a pre-recorded transcription API with support for multiple languages and automatic language detection. Gladia offers 10 hours of free transcription per month.
+
+## Requirements
+
+- A Gladia API key (get one at [gladia.io](https://www.gladia.io/))
+- Open WebUI installed and running
+
+## Quick Setup (UI)
+
+1. Click your **profile icon** (bottom-left corner)
+2. Select **Admin Panel**
+3. Click **Settings** → **Audio** tab
+4. Configure the following:
+
+| Setting | Value |
+|---------|-------|
+| **Speech-to-Text Engine** | `Gladia` |
+| **API Key** | Your Gladia API key |
+
+5. Click **Save**
+
+## How It Works
+
+Gladia uses an asynchronous 3-step transcription workflow:
+
+1. **Upload** — Your audio file is uploaded to Gladia's API
+2. **Transcribe** — A transcription job is initiated on the uploaded audio
+3. **Poll** — Results are polled until the transcription is complete
+
+This process is handled automatically by Open WebUI — you simply speak and receive the transcription.
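The three steps above can be sketched in Python as follows. This is a minimal illustration, not Open WebUI's actual implementation: the function name `transcribe`, the injected callables (`upload`, `start_job`, `fetch_result`), and the response shape (`status` and `transcript` keys) are all assumptions made for the example. The 120-second default mirrors the polling limit mentioned in the troubleshooting section.

```python
import time
from typing import Callable


def transcribe(
    upload: Callable[[bytes], str],
    start_job: Callable[[str], str],
    fetch_result: Callable[[str], dict],
    audio: bytes,
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run upload -> transcribe -> poll and return the transcript text.

    The three callables are hypothetical wrappers around the provider's
    HTTP API; they are injected so the control flow can be shown (and
    tested) without any network access.
    """
    # Step 1: upload the audio and get back a reference to it.
    audio_url = upload(audio)

    # Step 2: start a transcription job on the uploaded audio.
    result_url = start_job(audio_url)

    # Step 3: poll until the job finishes, fails, or the deadline passes.
    deadline = clock() + timeout_s
    while True:
        result = fetch_result(result_url)
        status = result.get("status")  # assumed field name
        if status == "done":
            return result.get("transcript", "")  # assumed field name
        if status == "error":
            raise RuntimeError(f"transcription failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(f"no result after {timeout_s} seconds")
        sleep(interval_s)
```

In a real integration, each callable would wrap one HTTP request to Gladia; the endpoints and payloads are left out here and should be taken from Gladia's API reference.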
+
+## Environment Variables Setup
+
+If you prefer to configure via environment variables:
+
+```yaml
+services:
+  open-webui:
+    image: ghcr.io/open-webui/open-webui:main
+    environment:
+      - AUDIO_STT_ENGINE=gladia
+      - AUDIO_STT_GLADIA_API_KEY=your-gladia-api-key
+    # ... other configuration
+```
+
+### All Gladia STT Environment Variables
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `AUDIO_STT_ENGINE` | Set to `gladia` | empty (uses local Whisper) |
+| `AUDIO_STT_GLADIA_API_KEY` | Your Gladia API key | empty |
+
+## Language Support
+
+Gladia supports automatic language detection by default. You can also specify a language to improve accuracy when you know the spoken language in advance. The language is passed via the recording metadata when available.
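As a rough sketch of "passed when available", the helper below adds a language to a job request only if the recording metadata carries one, and otherwise leaves the provider to auto-detect. The function name `build_job_request` and the field names `audio_url` and `language` are hypothetical and chosen for illustration; the real request schema should be checked against Gladia's API reference.

```python
from typing import Any, Dict, Mapping, Optional


def build_job_request(
    audio_url: str,
    metadata: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a transcription job payload (hypothetical field names)."""
    payload: Dict[str, Any] = {"audio_url": audio_url}

    # Only set a language when the metadata provides a non-empty value;
    # omitting it leaves language detection to the provider.
    language = (metadata or {}).get("language")
    if language:
        payload["language"] = language
    return payload
```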
+
+## Using STT
+
+1. Click the **microphone icon** in the chat input
+2. Speak your message
+3. Click the microphone again or wait for silence detection
+4. Your speech will be transcribed and appear in the input box
+
+## Troubleshooting
+
+### API Key Errors
+
+If you see "Gladia API key is required for Gladia STT":
+1. Verify your API key is entered correctly
+2. Check the API key hasn't expired
+3. Ensure your Gladia account has API access
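When configuring through environment variables, a quick sanity check along these lines can rule out a missing or empty key. This is a minimal sketch: the helper name `check_gladia_env` is hypothetical, and it only inspects the two variables documented above. A key entered through the Admin Panel is not visible to it.

```python
import os
from typing import List, Mapping


def check_gladia_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means both are set."""
    problems: List[str] = []

    engine = env.get("AUDIO_STT_ENGINE", "")
    if engine != "gladia":
        problems.append(f"AUDIO_STT_ENGINE is {engine!r}, expected 'gladia'")

    # Treat a whitespace-only key the same as a missing one.
    if not env.get("AUDIO_STT_GLADIA_API_KEY", "").strip():
        problems.append("AUDIO_STT_GLADIA_API_KEY is empty or unset")

    return problems


if __name__ == "__main__":
    for problem in check_gladia_env(os.environ):
        print(problem)
```

Running it inside the container shows what the process actually sees, which can differ from what is written in a compose file.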
+
+### Transcription Timeout
+
+Open WebUI polls Gladia for the transcription result for up to 120 seconds. If the request times out:
+1. Check your network connectivity
+2. Verify the audio file isn't too large
+3. Check container logs: `docker logs open-webui -f`
+
+### Empty Transcription
+
+If you get an empty transcript:
+- Ensure the audio contains clear speech
+- Try speaking louder or reducing background noise
+- Check that the correct language is detected
+
+For more troubleshooting, see the [Audio Troubleshooting Guide](/troubleshooting/audio).
+
+## Comparison with Other STT Options
+
+| Feature | Gladia | OpenAI Whisper | Mistral Voxtral | Local Whisper |
+|---------|--------|----------------|-----------------|---------------|
+| **Cost** | Per-minute pricing | Per-minute pricing | Per-minute pricing | Free |
+| **Privacy** | Audio sent to Gladia | Audio sent to OpenAI | Audio sent to Mistral | Audio stays local |
+| **Language Detection** | Automatic | Automatic | Automatic | Manual or auto |
+| **GPU Required** | No | No | No | Recommended |
+
+## Cost Considerations
+
+Gladia bills audio transcription per minute and offers 10 free hours per month. Check [Gladia's pricing page](https://www.gladia.io/pricing) for current rates.
+
+:::tip
+For free STT, use **Local Whisper** (the default) or the browser's **Web API** for basic transcription.
+:::
diff --git a/docs/features/media-generation/audio/speech-to-text/stt-config.md b/docs/features/media-generation/audio/speech-to-text/stt-config.md
@@ -17,6 +17,7 @@ The following speech-to-text providers are supported:
 |---------|------------------|-------|
 | Local Whisper (default) | ❌ | Built-in, see [Environment Variables](/features/media-generation/audio/speech-to-text/env-variables) |
 | OpenAI (Whisper API) | ✅ | [OpenAI STT Guide](/features/media-generation/audio/speech-to-text/openai-stt-integration) |
+| Gladia | ✅ | [Gladia STT Guide](/features/media-generation/audio/speech-to-text/gladia-stt-integration) |
 | Mistral (Voxtral) | ✅ | [Mistral Voxtral Guide](/features/media-generation/audio/speech-to-text/mistral-voxtral-integration) |
 | Deepgram | ✅ | — |
 | Azure | ✅ | — |