New Services to Provide Speech-To-Text and Text-To-Speech Functionality from Aristech #35
Merged: pragmatrix merged 26 commits into pragmatrix:master from ajgolledge:aristech-stt-tts-client on Jun 11, 2025
…from example code.
…erride the `sample_rate` in `AudioFormat` with the value given in the selected `voice`.
pragmatrix requested changes on Jun 4, 2025
…rvices after PR review suggestion.
Just minor changes; in `transcribe.rs` I've removed the empty string "" as the default for model / prompt and adjusted the test cases. I like the deserialization of the different credentials options; I'll adopt this for Azure.
As discussed, merging even though some open issues remain.
This PR provides two new services from Aristech:
Speech-To-Text
This service is called "aristech-transcribe" and can be called from the Call-API "startConversation" with this name alongside the following JSON parameter:
Note that the value is in locale format, not BCP 47. Simply using "de" also works, and I have not noticed any difference when specifying a region; the same holds for English ("en").
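To illustrate the difference between the two tag styles, here is a minimal Rust sketch that converts a BCP 47 style tag into an underscore-separated locale string. The helper name `to_locale_format` is hypothetical and not part of this PR, and it assumes that "locale format" means the `de_DE` style suggested by the voice name `tom_de_DE`. Script subtags and other BCP 47 extensions are deliberately ignored.

```rust
/// Hypothetical helper (not from this PR): turns a BCP 47 style tag such as
/// "de-DE" into an underscore-separated locale string such as "de_DE".
/// Assumption: only a language and an optional region subtag are present.
fn to_locale_format(tag: &str) -> String {
    // Accept both '-' and '_' as separators so already-converted input is stable.
    let mut parts = tag.split(|c: char| c == '-' || c == '_');

    // The first subtag is the language; lowercase it by convention.
    let language = parts.next().unwrap_or("").to_ascii_lowercase();

    // The second subtag, if any, is treated as the region and uppercased.
    match parts.next() {
        Some(region) if !region.is_empty() => {
            format!("{}_{}", language, region.to_ascii_uppercase())
        }
        _ => language,
    }
}
```

With this helper, `to_locale_format("de-DE")` yields `de_DE`, while a bare `de` passes through unchanged, which matches the observation that the short form also works.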
An entry like this in the `ivr.toml` file ensures that authentication is taken care of.
The following are still open issues:
…`apiKey`
…`silence_timeout` field in `EndpointSpec` have any effect?
…`audio::into_i16`) i.e. does not using it improve the performance of the example?
Text-To-Speech
This service is called "aristech-synthesize" and can be called from the Call-API "startConversation" with this name alongside the following JSON parameter:
Currently the only alternative voice available to us is "tom_de_DE".
An entry like this in the `ivr.toml` file ensures that authentication is taken care of.
Both voices available to us currently work at a sample rate of 22050 Hz. Not specifying this can lead to amusing results 😄
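The sample rate handling can be sketched roughly as follows. This is only an illustration under stated assumptions: the `AudioFormat` and `Voice` structs and both function names below are hypothetical stand-ins, not the types or API used in this PR. The sketch shows the idea of taking the sample rate from the selected voice, and why a mismatch sounds odd: audio produced at one rate but played back at another is sped up or slowed down by the ratio of the two.

```rust
/// Hypothetical stand-in for the audio format requested by the caller.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AudioFormat {
    channels: u16,
    sample_rate: u32,
}

/// Hypothetical description of a voice and the rate it produces audio at.
#[derive(Debug, Clone, PartialEq)]
struct Voice {
    name: String,
    sample_rate: u32,
}

/// Returns the requested format with its sample rate replaced by the
/// rate of the selected voice; all other fields are kept as requested.
fn with_voice_sample_rate(requested: AudioFormat, voice: &Voice) -> AudioFormat {
    AudioFormat {
        sample_rate: voice.sample_rate,
        ..requested
    }
}

/// Speed factor heard when audio produced at `actual_rate` is played back
/// as if it had `assumed_rate`. Values below 1.0 mean slower and lower pitched.
fn playback_speed_factor(actual_rate: u32, assumed_rate: u32) -> f64 {
    f64::from(assumed_rate) / f64::from(actual_rate)
}
```

For example, with a voice such as "tom_de_DE" at 22050 Hz, a caller asking for 16000 Hz would get a format at 22050 Hz back from `with_voice_sample_rate`. Without the override, `playback_speed_factor(22050, 16000)` is below 1.0, so the speech would come out noticeably slower and deeper.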
Open Issues