[FEAT]: Specify whisper transcription language #2928
Comments
needs appropriate supported language here:
I'm using OpenAI Whisper's API and the target language is Arabic. At first I just added a `language` variable but still got the same issue, so I've updated some lines in the file that defines `class OpenAiWhisper` (it begins with `const fs = require("fs");`, contains `#log(text, ...args)` and `async processFile(fullFilePath)`, and ends with `module.exports = {`).
In that case: https://platform.openai.com/docs/guides/speech-to-text#prompting
Googling this shows the issue is pretty common among Whisper model users. Most wind up post-processing the output with an LLM for translation. So that is the current state of Whisper 🤷
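For anyone who wants to try the prompting route from the linked guide, here is a minimal sketch. It assumes the transcription request accepts a `prompt` field alongside `language` (as the OpenAI speech-to-text guide describes), and that a short prompt written in Arabic script may nudge the model toward that script. `withArabicPrompt` and the default prompt text are hypothetical, not part of AnythingLLM, and this is not guaranteed to fix the transliteration.

```javascript
// Hypothetical helper: returns a copy of the transcription parameters with
// the language pinned to Arabic and an Arabic-script prompt attached.
// The default prompt is only an illustration ("Hello, this is an audio
// recording in Arabic."); any short Arabic-script text could be used.
function withArabicPrompt(
  params,
  prompt = "مرحبا، هذا تسجيل صوتي باللغة العربية."
) {
  // Spread into a new object so the caller's params are not mutated.
  return { ...params, language: "ar", prompt };
}

// Usage (assumes the official `openai` Node SDK, not loaded here):
// const params = withArabicPrompt({ file, model: "whisper-1", temperature: 0 });
// const result = await client.audio.transcriptions.create(params);
```

If the output is still in Latin script with the prompt in place, the LLM post-processing step mentioned above remains the fallback.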
How are you running AnythingLLM?
Docker (local)
What happened?
I'm encountering an issue with the Whisper integration in AnythingLLM. Despite setting the language parameter to "ar" in the OpenAI Whisper API, the transcription often returns transliterated Arabic (Arabic words in Latin script) instead of Arabic script. I've tried various methods to address this, but none have worked so far.
Expected Behavior: The transcription should return Arabic text in Arabic script (e.g., "مرحبا" for "hello").
Actual Behavior: The transcription returns transliterated Arabic in Latin script (e.g., "marhaban" for "hello").
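To tell the expected and actual outputs apart automatically, a small check like the following could be run on the returned text. It is only a sketch: `containsArabicScript` and `looksTransliterated` are hypothetical helper names, and the 50% threshold is an arbitrary assumption.

```javascript
// Returns true when the text contains at least one Arabic-script character.
function containsArabicScript(text) {
  return /\p{Script=Arabic}/u.test(text);
}

// Returns true when more than half of the letters in the text are Latin,
// which suggests a transliterated transcription. The 0.5 cutoff is an
// assumption chosen for illustration.
function looksTransliterated(text) {
  const letters = text.match(/\p{L}/gu) || [];
  if (letters.length === 0) return false;
  const latin = letters.filter((ch) => /\p{Script=Latin}/u.test(ch)).length;
  return latin / letters.length > 0.5;
}

console.log(containsArabicScript("مرحبا"));   // expected output case
console.log(looksTransliterated("marhaban")); // actual output case
```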
Environment:
AnythingLLM Version: docker latest
Operating System: debian
Additional Context: I've followed the Whisper documentation and confirmed that the language parameter is set correctly. This issue might be related to how the API processes Arabic audio or interprets the transcription language.
Request for Resolution: Please provide guidance or a workaround to force Whisper to transcribe Arabic speech into Arabic script. If this is a limitation of the current implementation, a feature to enforce script-based output would be appreciated.
Are there known steps to reproduce?
Steps to Reproduce:
1. Provide an Arabic audio file.
2. Configure the Whisper transcription with the following parameters:
   - model: "whisper-1"
   - language: "ar"
   - temperature: 0
3. Check the transcription output.
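The steps above can be sketched as a standalone script outside AnythingLLM. This assumes the official `openai` Node SDK and its `audio.transcriptions.create` method. `buildReproParams`, `runRepro` and the file name in the usage comment are hypothetical, and the client is passed in so the snippet itself has no dependencies.

```javascript
// Builds the request parameters listed in the reproduction steps.
// `file` is whatever the client accepts, e.g. fs.createReadStream(path).
function buildReproParams(file) {
  return { file, model: "whisper-1", language: "ar", temperature: 0 };
}

// Sends the request through the supplied client and returns the transcript.
// `client` is assumed to expose audio.transcriptions.create(params) and to
// resolve to an object with a `text` field, as the `openai` SDK does.
async function runRepro(client, file) {
  const result = await client.audio.transcriptions.create(buildReproParams(file));
  return result.text;
}

// Usage (assumes OPENAI_API_KEY is set; "arabic-sample.mp3" is a placeholder):
// const fs = require("fs");
// const { OpenAI } = require("openai");
// runRepro(new OpenAI(), fs.createReadStream("arabic-sample.mp3")).then(console.log);
```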