Feature Description
I would like Flowise to support AWS as a first-class provider for both Speech-to-Text and Text-to-Speech:
- STT: Amazon Transcribe
- TTS: Amazon Polly
Flowise already supports audio upload and speech workflows, but AWS is not currently available as a provider for these capabilities. Adding AWS support would make Flowise easier to adopt in organizations already using AWS for infrastructure, IAM, compliance, logging, and data residency.
Feature Category
Integration
Problem Statement
Many teams run Flowise in AWS environments and prefer to keep voice processing inside the same cloud provider for:
- centralized IAM and credential management
- regional/data residency requirements
- easier compliance review
- consolidated billing and observability
- use of existing AWS services such as S3, KMS, CloudWatch, IAM roles, and private networking
A common setup would be:
- User records audio in the Flowise chat widget.
- Flowise sends the audio to an STT provider.
- Amazon Transcribe converts speech to text.
- The text is processed by the chatflow.
- The assistant response can optionally be converted to audio.
- Amazon Polly generates the final speech output.
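As a rough sketch of the flow above, one voice turn could be wired up like this. The interface and function names (`SpeechToTextProvider`, `TextToSpeechProvider`, `runVoiceTurn`) are hypothetical and only illustrate the shape; they are not Flowise's actual internals, and the Transcribe and Polly calls would sit behind the two provider interfaces.

```typescript
// Hypothetical provider contracts; Flowise's real internal interfaces may differ.
interface SpeechToTextProvider {
    transcribe(audio: Uint8Array, mimeType: string): Promise<string>
}

interface TextToSpeechProvider {
    synthesize(text: string): Promise<Uint8Array>
}

interface VoiceTurnResult {
    transcript: string
    replyText: string
    replyAudio: Uint8Array | undefined
}

// Runs one voice turn: audio -> text -> chatflow -> optional audio.
async function runVoiceTurn(
    audio: Uint8Array,
    mimeType: string,
    stt: SpeechToTextProvider,
    runChatflow: (question: string) => Promise<string>,
    tts?: TextToSpeechProvider
): Promise<VoiceTurnResult> {
    // STT step (Amazon Transcribe would live behind this interface).
    const transcript = await stt.transcribe(audio, mimeType)
    // The transcript is handled like any typed chat message.
    const replyText = await runChatflow(transcript)
    // TTS is optional, so audio is only produced when a provider is configured.
    const replyAudio = tts ? await tts.synthesize(replyText) : undefined
    return { transcript, replyText, replyAudio }
}
```

Keeping both providers behind small interfaces would let AWS sit next to the existing STT and TTS providers without changing the chatflow itself.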
Proposed Solution
Add a new AWS provider option for both STT and TTS configuration.
Speech-to-Text: Amazon Transcribe
Suggested configuration fields:
- AWS Region
- AWS Credentials or existing Flowise credential reference
- Language code, for example en-US, es-ES (Transcribe accepts several candidate languages when automatic identification is enabled)
- Optional automatic language identification
- Optional custom vocabulary
- Optional vocabulary filter
- Optional content redaction / PII redaction where supported
- Optional S3 bucket configuration for temporary object storage, since Transcribe batch jobs read their input media from an S3 URI
Initial implementation could support uploaded audio files first. Streaming transcription could be added later as a separate enhancement.
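To make the field list more concrete, here is a minimal sketch of how such a configuration might map onto a Transcribe batch job request. `AwsSttConfig` and `buildTranscriptionJob` are hypothetical names for illustration. The request keys follow the StartTranscriptionJob API as I understand it, but only a subset is modeled and they should be checked against the AWS documentation.

```typescript
// Hypothetical Flowise-side settings; these field names are illustrative only.
interface AwsSttConfig {
    region: string // used when creating the client, not part of the job request
    s3Bucket: string
    languageCode?: string
    identifyLanguage?: boolean
    languageOptions?: string[]
    vocabularyName?: string
    vocabularyFilterName?: string
    vocabularyFilterMethod?: 'remove' | 'mask' | 'tag'
    redactPii?: boolean
}

interface TranscribeSettings {
    VocabularyName?: string
    VocabularyFilterName?: string
    VocabularyFilterMethod?: 'remove' | 'mask' | 'tag'
}

// Subset of the Amazon Transcribe StartTranscriptionJob request fields.
interface TranscriptionJobRequest {
    TranscriptionJobName: string
    Media: { MediaFileUri: string }
    LanguageCode?: string
    IdentifyLanguage?: boolean
    LanguageOptions?: string[]
    Settings?: TranscribeSettings
    ContentRedaction?: { RedactionType: 'PII'; RedactionOutput: 'redacted' }
}

function buildTranscriptionJob(config: AwsSttConfig, jobName: string, objectKey: string): TranscriptionJobRequest {
    const hasFixedLanguage = Boolean(config.languageCode)
    const identify = config.identifyLanguage === true
    // A job uses either a fixed language or automatic identification.
    if (hasFixedLanguage === identify) {
        throw new Error('Set exactly one of languageCode or identifyLanguage')
    }
    const wantsVocabulary = Boolean(config.vocabularyName || config.vocabularyFilterName)
    const request: TranscriptionJobRequest = {
        TranscriptionJobName: jobName,
        // Batch jobs read the uploaded audio from S3.
        Media: { MediaFileUri: `s3://${config.s3Bucket}/${objectKey}` }
    }
    if (identify) {
        if (wantsVocabulary) {
            // With identification, vocabularies are configured per language
            // (LanguageIdSettings); that mapping is left out of this sketch.
            throw new Error('Vocabulary options with language identification are not handled here')
        }
        request.IdentifyLanguage = true
        if (config.languageOptions && config.languageOptions.length > 0) {
            request.LanguageOptions = config.languageOptions
        }
    } else {
        request.LanguageCode = config.languageCode
        if (wantsVocabulary) {
            const settings: TranscribeSettings = {}
            if (config.vocabularyName) settings.VocabularyName = config.vocabularyName
            if (config.vocabularyFilterName) {
                settings.VocabularyFilterName = config.vocabularyFilterName
                // A filter name needs a filter method; 'mask' is an assumed default.
                settings.VocabularyFilterMethod = config.vocabularyFilterMethod || 'mask'
            }
            request.Settings = settings
        }
    }
    if (config.redactPii) {
        // Redaction is only available for some languages, so this stays opt-in.
        request.ContentRedaction = { RedactionType: 'PII', RedactionOutput: 'redacted' }
    }
    return request
}
```

The resulting object could then be passed to the Transcribe client's start-job call, with the uploaded audio first written to the configured bucket under `objectKey`.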
Text-to-Speech: Amazon Polly
Suggested configuration fields:
- AWS Region
- AWS Credentials or existing Flowise credential reference
- Voice ID, for example Joanna, Matthew, Lucia
- Engine, where supported: standard, neural, long-form, generative
- Output format: mp3, ogg_vorbis, pcm
- Sample rate
- Text type: plain text or SSML
- Optional lexicons
The generated audio should integrate with the existing Flowise TTS response flow so the chat embed can keep using the current TTS playback behavior.
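For illustration, the Polly fields above could be turned into a synthesis request roughly as follows. `AwsTtsConfig`, `buildPollyRequest` and `contentTypeFor` are hypothetical names. The request keys mirror Polly's SynthesizeSpeech parameters as I understand them (note that the sample rate is sent as a string), and the content types are the ones Polly reports for each output format.

```typescript
type PollyEngine = 'standard' | 'neural' | 'long-form' | 'generative'
type PollyOutputFormat = 'mp3' | 'ogg_vorbis' | 'pcm'
type PollyTextType = 'text' | 'ssml'

// Hypothetical Flowise-side settings; these field names are illustrative only.
interface AwsTtsConfig {
    region: string // used when creating the client, not part of the request
    voiceId: string
    outputFormat: PollyOutputFormat
    engine?: PollyEngine
    sampleRate?: number
    textType?: PollyTextType
    lexiconNames?: string[]
}

// Subset of the Amazon Polly SynthesizeSpeech request fields.
interface SynthesizeSpeechRequest {
    Text: string
    VoiceId: string
    OutputFormat: PollyOutputFormat
    TextType: PollyTextType
    Engine?: PollyEngine
    SampleRate?: string
    LexiconNames?: string[]
}

function buildPollyRequest(config: AwsTtsConfig, text: string): SynthesizeSpeechRequest {
    const request: SynthesizeSpeechRequest = {
        Text: text,
        VoiceId: config.voiceId,
        OutputFormat: config.outputFormat,
        // Plain text unless the chatflow explicitly produces SSML.
        TextType: config.textType || 'text'
    }
    // Not every voice supports every engine, so the engine is only sent when chosen.
    if (config.engine) request.Engine = config.engine
    // Polly takes the sample rate as a string, for example "24000".
    if (config.sampleRate !== undefined) request.SampleRate = String(config.sampleRate)
    if (config.lexiconNames && config.lexiconNames.length > 0) {
        request.LexiconNames = config.lexiconNames
    }
    return request
}

// Content type the chat embed would need in order to play the returned audio.
function contentTypeFor(format: PollyOutputFormat): string {
    switch (format) {
        case 'mp3':
            return 'audio/mpeg'
        case 'ogg_vorbis':
            return 'audio/ogg'
        default:
            // Raw 16-bit mono PCM; a browser would need a wrapper to play it directly.
            return 'audio/pcm'
    }
}
```

Defaulting to mp3 would probably be the least surprising choice for the existing playback path, since pcm output has no container and would need extra handling in the embed.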
Mockups or References
No response
Additional Context
Acceptance criteria
- AWS appears as a selectable provider for STT.
- AWS appears as a selectable provider for TTS.
- Amazon Transcribe can process an audio message uploaded through the chat.
- Amazon Polly can synthesize a chat response into playable audio.
- Credentials are handled only server-side.
- Configuration is documented.
- Errors from AWS are surfaced clearly in Flowise logs and API responses.
- The implementation works with the public chat embed without exposing AWS credentials.
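For the error-handling and credential criteria, one possible approach is to normalize AWS errors on the server before they reach the API response. The sketch below assumes errors shaped like those thrown by the AWS SDK for JavaScript v3 (`name`, `message`, `$metadata.httpStatusCode`, `$metadata.requestId`). `SafeApiError` and `toSafeApiError` are hypothetical names, and the access key pattern is a defensive extra rather than a guarantee.

```typescript
// Shape of the fields this sketch reads from an AWS SDK v3 style error.
interface AwsLikeError {
    name?: unknown
    message?: unknown
    $metadata?: { httpStatusCode?: unknown; requestId?: unknown }
}

// Hypothetical response body returned to the Flowise API caller.
interface SafeApiError {
    provider: 'aws'
    statusCode: number
    code: string
    message: string
    requestId: string | undefined
}

function toSafeApiError(err: unknown): SafeApiError {
    const e = (typeof err === 'object' && err !== null ? err : {}) as AwsLikeError
    const meta = e.$metadata || {}
    const rawMessage = typeof e.message === 'string' ? e.message : 'Unexpected error from AWS'
    return {
        provider: 'aws',
        // Fall back to 500 when the error did not come from an HTTP response.
        statusCode: typeof meta.httpStatusCode === 'number' ? meta.httpStatusCode : 500,
        code: typeof e.name === 'string' ? e.name : 'UnknownError',
        // Mask anything that looks like an AWS access key ID before it leaves the server.
        message: rawMessage.replace(/\b(?:AKIA|ASIA)[A-Z0-9]{16}\b/g, '[redacted]'),
        requestId: typeof meta.requestId === 'string' ? meta.requestId : undefined
    }
}
```

The full original error would still be written to the server logs; only this reduced form would be returned to the client or the public chat embed.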