@PeganovAnton could you please review this? We need to merge this ASAP so that QA can test S2S.
riva/client/nmt.py
Outdated
The docstring needs to be updated.
| nchannels = 1 | ||
| if args.list_input_devices: | ||
| riva.client.audio_io.list_input_devices() | ||
| return |
Suggested change:
| return | |
| return | |
| if args.list_output_devices: | |
| riva.client.audio_io.list_output_devices() | |
| return |
| sound_stream = riva.client.audio_io.SoundCallBack( | ||
| args.output_device, nchannels=nchannels, sampwidth=sampwidth, framerate=44100 | ||
| ) | ||
| print(sound_stream) |
Why do we need this print?
| if args.output_device is not None or args.play_audio: | ||
| print("playing audio") | ||
| sound_stream = riva.client.audio_io.SoundCallBack( | ||
| args.output_device, nchannels=nchannels, sampwidth=sampwidth, framerate=44100 |
Maybe we should make framerate a parameter of the script, like --sample-rate-hz in the script tts/talk.py?
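A minimal sketch of that suggestion, assuming the flag is named `--sample-rate-hz` as in tts/talk.py. `build_parser` is a hypothetical helper, and the default of 44100 only mirrors the value hardcoded in the diff:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that shows only the flag discussed here."""
    parser = argparse.ArgumentParser(description="Speech-to-speech CLI sketch")
    parser.add_argument(
        "--sample-rate-hz",
        type=int,
        default=44100,  # assumption: keep the currently hardcoded value as the default
        help="Sample rate in Hz of the audio to play.",
    )
    return parser


args = build_parser().parse_args(["--sample-rate-hz", "22050"])
# The parsed value would then replace the literal, e.g. framerate=args.sample_rate_hz.
```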
| sampwidth = 2 | ||
| nchannels = 1 |
sampwidth and nchannels are set in two places: here and in the play_responses() function. Could you make them global variables?
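One way this could look, as a sketch only: the constant names and the `sound_stream_kwargs` helper are assumptions, while the values are the ones set in the diff.

```python
# Module-level constants shared by main() and play_responses().
SAMPWIDTH = 2  # bytes per sample
NCHANNELS = 1  # number of audio channels


def sound_stream_kwargs(framerate: int) -> dict:
    """Hypothetical helper: keyword arguments for the output sound stream."""
    return {"nchannels": NCHANNELS, "sampwidth": SAMPWIDTH, "framerate": framerate}


stream_kwargs = sound_stream_kwargs(framerate=44100)
```

Both call sites would then read the same two names instead of redefining them locally.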
| "then the default output audio device will be used.", | ||
| ) | ||
|
|
||
| parser = add_asr_config_argparse_parameters(parser, profanity_filter=True) |
You'll probably need to set max_alternatives=False and word_time_offsets=False, because these parameters are pointless for this script. Do you think we also need to add a speaker_diarization=False flag?
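To illustrate the idea of switching optional flags off per script, here is a stand-in sketch. `add_asr_flags_sketch` is hypothetical and is not the real helper from argparse_utils.py; the flag names and defaults are assumptions chosen for the example.

```python
import argparse


def add_asr_flags_sketch(
    parser: argparse.ArgumentParser,
    max_alternatives: bool,
    profanity_filter: bool,
    word_time_offsets: bool,
) -> argparse.ArgumentParser:
    """Illustration only: register each optional flag only when it is enabled."""
    if max_alternatives:
        parser.add_argument("--max-alternatives", type=int, default=1)
    if profanity_filter:
        parser.add_argument("--profanity-filter", action="store_true")
    if word_time_offsets:
        parser.add_argument("--word-time-offsets", action="store_true")
    return parser


parser = add_asr_flags_sketch(
    argparse.ArgumentParser(),
    max_alternatives=False,  # not meaningful for speech-to-speech output
    profanity_filter=True,
    word_time_offsets=False,  # not meaningful for speech-to-speech output
)
args = parser.parse_args(["--profanity-filter"])
```

With the two switches off, the corresponding options never appear in `--help`, so users cannot pass flags that the script would ignore.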
| parser.add_argument("--output-device", type=int, help="Output device to use.") | ||
| parser.add_argument("--target-language-code", default="en-US", help="Language code of the output language.") | ||
| parser.add_argument( | ||
| "--play-audio", |
If --play-audio is not set, the script produces no output at all. We should probably add an --output parameter, as in tts/talk.py, so that the script can still produce some output when run on a server.
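A rough sketch of what writing the audio to a file could look like, using only the standard `wave` module. `write_wav`, the constant names, and the file name are assumptions for illustration; the byte chunks stand in for the audio carried by the streamed responses.

```python
import tempfile
import wave
from pathlib import Path
from typing import Iterable

SAMPWIDTH = 2  # bytes per sample, as in the diff
NCHANNELS = 1  # number of channels, as in the diff


def write_wav(path: Path, chunks: Iterable[bytes], framerate: int) -> None:
    """Hypothetical helper: write raw PCM chunks to a .wav file."""
    with wave.open(str(path), "wb") as out_f:
        out_f.setnchannels(NCHANNELS)
        out_f.setsampwidth(SAMPWIDTH)
        out_f.setframerate(framerate)
        for chunk in chunks:
            out_f.writeframes(chunk)


# Usage with placeholder silence instead of real server responses.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "s2s_output.wav"
    write_wav(out_path, [b"\x00\x00" * 100, b"\x00\x00" * 50], framerate=44100)
    with wave.open(str(out_path), "rb") as in_f:
        n_frames = in_f.getnframes()
```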
| play_responses(responses=nmt_service.streaming_s2s_response_generator( | ||
| audio_chunks=audio_chunk_iterator, | ||
| streaming_config=s2s_config), sound_stream=sound_stream) |
Suggested change:
| play_responses(responses=nmt_service.streaming_s2s_response_generator( | |
| audio_chunks=audio_chunk_iterator, | |
| streaming_config=s2s_config), sound_stream=sound_stream) | |
| play_responses( | |
| responses=nmt_service.streaming_s2s_response_generator( | |
| audio_chunks=audio_chunk_iterator, | |
| streaming_config=s2s_config, | |
| ), | |
| sound_stream=sound_stream | |
| ) |
| interim_results=True, | ||
| ), | ||
| translation_config = riva.client.TranslationConfig( | ||
| target_language_code=args.target_language_code, |
This should also include source_language_code and, probably, model_name, as in the config.
| first = True # first tts output chunk received | ||
| auth = riva.client.Auth(args.ssl_cert, args.use_ssl, args.server) | ||
| nmt_service = riva.client.NeuralMachineTranslationClient(auth) | ||
| s2s_config = riva.client.StreamingTranslateSpeechToSpeechConfig( |
Do we need a tts_config, as in the proto? If so, we could add an add_tts_config_argparse_parameters() function to argparse_utils.py and refactor tts/talk.py to use it.
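The proposed helper might look roughly like the sketch below. The function does not exist yet, so everything here is hypothetical: the flag names, defaults, and help strings are assumptions loosely modeled on tts/talk.py.

```python
import argparse


def add_tts_config_argparse_parameters(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Hypothetical helper: register TTS options shared by several scripts."""
    parser.add_argument("--voice", help="Name of the TTS voice to use (assumed flag).")
    parser.add_argument(
        "--sample-rate-hz",
        type=int,
        default=44100,  # assumption: mirrors the framerate hardcoded in the diff
        help="Sample rate in Hz of the synthesized audio (assumed flag).",
    )
    parser.add_argument(
        "--language-code",
        default="en-US",  # assumption: illustrative default only
        help="Language code of the synthesized speech (assumed flag).",
    )
    return parser


parser = add_tts_config_argparse_parameters(argparse.ArgumentParser())
args = parser.parse_args(["--voice", "example-voice", "--sample-rate-hz", "22050"])
```

Both the new speech-to-speech script and tts/talk.py could then call the same helper, keeping their TTS options in sync.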
ba394ef to d2213b6
d2213b6 to db64efc
db64efc to b665b2f
Adding speech to speech basic cli