File changed: PyTorch/SpeechSynthesis/README.md (+2 −2)
@@ -15,7 +15,7 @@ In this collection, we will cover:
TTS synthesis is a 2-step process described as follows:
1. Text to Spectrogram Model:
- This model transforms the text into time-aligned features such as spectrogram, mel spectrogram, or F0 frequencies and other linguistic features. We use architectures like Tacotron
+ This model transforms the text into time-aligned features such as spectrogram, mel spectrogram, or F0 frequencies and other acoustic features. We use architectures like Tacotron
2. Spectrogram to Audio Model:
Converts the generated time-aligned spectrogram representation into continuous, human-like audio, using a vocoder such as WaveGlow.
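The two-step pipeline described above can be sketched as follows. This is an illustrative sketch only: the functions, constants, and dummy zero outputs are stand-ins for illustration, not the actual Tacotron or WaveGlow implementations; only the shapes of the intermediate mel spectrogram and the output waveform are meant to convey the structure.

```python
import numpy as np

# Illustrative stand-ins for the two TTS stages; a real system uses
# trained networks such as Tacotron 2 (stage 1) and WaveGlow (stage 2).

N_MELS = 80          # mel-spectrogram channels (typical for Tacotron 2)
FRAMES_PER_CHAR = 8  # hypothetical text-to-frame expansion factor
HOP_LENGTH = 256     # waveform samples produced per spectrogram frame

def text_to_mel(text: str) -> np.ndarray:
    """Stage 1: map text to a time-aligned mel spectrogram (dummy)."""
    n_frames = len(text) * FRAMES_PER_CHAR
    return np.zeros((N_MELS, n_frames), dtype=np.float32)

def mel_to_audio(mel: np.ndarray) -> np.ndarray:
    """Stage 2 (vocoder): map the mel spectrogram to a waveform (dummy)."""
    n_samples = mel.shape[1] * HOP_LENGTH
    return np.zeros(n_samples, dtype=np.float32)

mel = text_to_mel("Hello world")   # shape: (80, 88)
audio = mel_to_audio(mel)          # shape: (22528,)
print(mel.shape, audio.shape)
```

The key design point the sketch captures is that the two stages are decoupled by the spectrogram interface, so either stage can be swapped independently (e.g. FastPitch in place of Tacotron 2 for stage 1).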
@@ -51,4 +51,4 @@ Here are the examples relevant for speech synthesis, directly from [Deep Learn
2. FastPitch for text to mel-spectrogram generation using PyTorch