feat: s2s client by junkin · Pull Request #34 · nvidia-riva/python-clients

junkin · 2023-01-12T03:29:35Z

Adding a basic speech-to-speech CLI.

rmittal-github · 2023-01-18T04:16:11Z

@PeganovAnton could you please review this? We need to merge this ASAP so that QA can test S2S.

PeganovAnton · 2023-01-18T10:12:38Z

riva/client/nmt.py

The docstring needs to be updated.

resolved in #43

PeganovAnton · 2023-01-18T13:42:31Z

scripts/nmt/s2s_mic.py

+    nchannels = 1
+    if args.list_input_devices:
+        riva.client.audio_io.list_input_devices()
+        return


Suggested change

-        return
+        return
+    if args.list_output_devices:
+        riva.client.audio_io.list_output_devices()
+        return

PeganovAnton · 2023-01-18T16:04:08Z

scripts/nmt/s2s_mic.py

+        sound_stream = riva.client.audio_io.SoundCallBack(
+            args.output_device, nchannels=nchannels, sampwidth=sampwidth, framerate=44100
+        )
+        print(sound_stream)


Why do we need this print?

PeganovAnton · 2023-01-18T16:10:34Z

scripts/nmt/s2s_mic.py

+    if args.output_device is not None or args.play_audio:
+        print("playing audio")
+        sound_stream = riva.client.audio_io.SoundCallBack(
+            args.output_device, nchannels=nchannels, sampwidth=sampwidth, framerate=44100


Maybe we should make framerate a parameter of the script, like --sample-rate-hz in the script tts/talk.py?
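
A minimal sketch of what that could look like, assuming the flag keeps the `--sample-rate-hz` name mentioned above and that 44100 stays the default. `build_parser` is a hypothetical helper used only for illustration; it is not part of the script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that shows only the proposed sample-rate flag."""
    parser = argparse.ArgumentParser(description="Speech-to-speech demo (sketch)")
    parser.add_argument(
        "--sample-rate-hz",
        type=int,
        default=44100,  # assumption: keep the value currently hardcoded as framerate
        help="Sample rate in Hz of the audio that is played back.",
    )
    return parser


# argparse exposes the flag as args.sample_rate_hz, which would replace framerate=44100.
args = build_parser().parse_args(["--sample-rate-hz", "16000"])
print(args.sample_rate_hz)
```

The parsed value would then be passed as `framerate=args.sample_rate_hz` where the diff currently passes the literal.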

PeganovAnton · 2023-01-18T16:13:51Z

scripts/nmt/s2s_mic.py

+    sampwidth = 2
+    nchannels = 1


sampwidth and nchannels are set in two places: here and in the play_responses() function. Could you make them global variables?
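
One possible shape for this, as a sketch only. The constant names and the `bytes_per_second` helper are hypothetical; the values 2 and 1 are the ones shown in the diff.

```python
# Module-level constants shared by main() and play_responses() (sketch).
SAMPLE_WIDTH_BYTES = 2  # from the diff: sampwidth = 2
NUM_CHANNELS = 1  # from the diff: nchannels = 1


def bytes_per_second(framerate: int) -> int:
    """Hypothetical helper showing both constants being read from one place."""
    return framerate * SAMPLE_WIDTH_BYTES * NUM_CHANNELS


print(bytes_per_second(44100))
```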

PeganovAnton · 2023-01-18T16:19:07Z

scripts/nmt/s2s_mic.py

+        "then the default output audio device will be used.",
+    )
+
+    parser = add_asr_config_argparse_parameters(parser, profanity_filter=True)


You'll probably need to set max_alternatives=False and word_time_offsets=False because these parameters are pointless for this script. Do you think we also need to add a speaker_diarization=False flag?

PeganovAnton · 2023-01-18T16:24:56Z

scripts/nmt/s2s_mic.py

+    parser.add_argument("--output-device", type=int, help="Output device to use.")
+    parser.add_argument("--target-language-code", default="en-US", help="Language code of the output language.")
+    parser.add_argument(
+        "--play-audio",


If --play-audio is not set, the script doesn't produce any output. We should probably add an --output parameter, as in tts/talk.py, so that the script can produce some output when run on a server.
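
A rough sketch of the file-output path, using only the standard-library `wave` module. The `write_wav` helper and the exact `--output` flag shape are assumptions for illustration; the real script would feed it the audio bytes taken from the streamed responses.

```python
import argparse
import wave
from pathlib import Path
from typing import Iterable


def write_wav(path: Path, chunks: Iterable[bytes], nchannels: int, sampwidth: int, framerate: int) -> None:
    """Write raw PCM chunks to a WAV file (hypothetical helper)."""
    with wave.open(str(path), "wb") as out_f:
        out_f.setnchannels(nchannels)
        out_f.setsampwidth(sampwidth)
        out_f.setframerate(framerate)
        for chunk in chunks:
            # Each chunk is assumed to hold whole frames of raw PCM audio.
            out_f.writeframes(chunk)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that shows only the proposed --output flag."""
    parser = argparse.ArgumentParser(description="Speech-to-speech demo (sketch)")
    parser.add_argument("--output", type=Path, help="Path of the .wav file to write the synthesized audio to.")
    return parser
```

With this in place, the script could write a file whenever `args.output` is set, whether or not audio is also played.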

PeganovAnton · 2023-01-18T16:26:55Z

scripts/nmt/s2s_mic.py

+        play_responses(responses=nmt_service.streaming_s2s_response_generator(
+            audio_chunks=audio_chunk_iterator,
+            streaming_config=s2s_config), sound_stream=sound_stream)


Suggested change

-        play_responses(responses=nmt_service.streaming_s2s_response_generator(
-            audio_chunks=audio_chunk_iterator,
-            streaming_config=s2s_config), sound_stream=sound_stream)
+        play_responses(
+            responses=nmt_service.streaming_s2s_response_generator(
+                audio_chunks=audio_chunk_iterator,
+                streaming_config=s2s_config,
+            ),
+            sound_stream=sound_stream
+        )

PeganovAnton · 2023-01-18T16:58:48Z

scripts/nmt/s2s_mic.py

+            interim_results=True,
+        ),
+        translation_config = riva.client.TranslationConfig(
+            target_language_code=args.target_language_code,


This should also include source_language_code and, probably, model_name, as in the config.

PeganovAnton · 2023-01-18T17:01:44Z

scripts/nmt/s2s_mic.py

+    first = True # first tts output chunk received
+    auth = riva.client.Auth(args.ssl_cert, args.use_ssl, args.server)
+    nmt_service = riva.client.NeuralMachineTranslationClient(auth)
+    s2s_config = riva.client.StreamingTranslateSpeechToSpeechConfig(


Do we need a tts_config as in the proto? If so, we could add an add_tts_config_argparse_parameters() function to argparse_utils.py and refactor tts/talk.py to use it.
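
To make the idea concrete, here is a sketch of such a helper. It does not exist in the library as written here: the flag names, defaults, and help texts are all assumptions, and the real set of fields would have to follow the proto.

```python
import argparse


def add_tts_config_argparse_parameters(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Hypothetical helper: flag names and defaults below are assumptions."""
    parser.add_argument("--tts-language-code", default="en-US", help="Language code of the synthesized speech.")
    parser.add_argument("--tts-voice-name", default=None, help="Voice to use; if omitted, the server picks one.")
    parser.add_argument("--tts-sample-rate-hz", type=int, default=44100, help="Sample rate in Hz of the output audio.")
    return parser


# Both scripts could then share the same flags by calling the helper on their parser.
parser = add_tts_config_argparse_parameters(argparse.ArgumentParser())
args = parser.parse_args(["--tts-voice-name", "example-voice"])
print(args.tts_language_code, args.tts_voice_name, args.tts_sample_rate_hz)
```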

rmittal-github requested a review from PeganovAnton January 12, 2023 03:54

rmittal-github changed the base branch from main to release/2.9.0 January 16, 2023 13:59

PeganovAnton suggested changes Jan 18, 2023

rmittal-github changed the base branch from release/2.9.0 to main January 30, 2023 04:56

rmittal-github changed the base branch from main to release/2.11.0 April 19, 2023 12:15

rmittal-github force-pushed the sdj_s2s_client branch from ba394ef to d2213b6 Compare April 19, 2023 12:15

rmittal-github force-pushed the sdj_s2s_client branch from d2213b6 to db64efc Compare April 27, 2023 10:55

rmittal-github reviewed Apr 27, 2023

rmittal-github changed the base branch from release/2.11.0 to main May 19, 2023 11:05

junkin and others added 4 commits May 19, 2023 16:36

feat: s2s streaming demo app

e1f2e85

fix: s2s app with latest proto

3404100

fix: add tts config as per latest proto

d952cc9

fix: remove tts language code and voice name hardcoding

b665b2f

rmittal-github force-pushed the sdj_s2s_client branch from db64efc to b665b2f Compare May 19, 2023 11:06

