<!DOCTYPE html>
<html lang="en-US"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- Begin Jekyll SEO tag v2.7.1 -->
<title>Neural TTS based Online Data Augmentation for Improved Speech Separation</title>
<meta name="generator" content="Jekyll v3.9.0">
<meta property="og:title" content="Neural TTS based Online Data Augmentation for Improved Speech Separation">
<meta property="og:locale" content="en_US">
<link rel="canonical" href="https://leiyi420.github.io/CSEmoTransfer">
<meta property="og:url" content="https://leiyi420.github.io/CSEmoTransfer">
<meta name="twitter:card" content="summary">
<!-- End Jekyll SEO tag -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#157878">
<link rel="stylesheet" href="style.css">
</head>
<body>
<section class="page-header">
<h2 class="project-name">Neural TTS based Online Data Augmentation for Improved Speech Separation</h2>
<br>
<h2 class="project-tagline">
<center> Kai Wang<sup>1,3</sup>, Shijie Lai<sup>1,3</sup>, Lili Yin<sup>1,3</sup>, Hao Huang<sup>1,3</sup> and Sheng Li<sup>2</sup> </center>
<br>
<center> <sup>1</sup>School of Computer Science and Technology, Engineering, Xinjiang University, Urumqi, China </center>
<center> <sup>2</sup>National Institute of Information and Communications Technology (NICT), Kyoto, Japan</center>
<center> <sup>3</sup>Xinjiang Provincial Key Laboratory of Multi-lingual Information Technology, Urumqi, China </center>
</h2>
</section>
<section class="main-content">
<h2>0. Contents</h2>
<ul>
<li><a href="#abstract">Abstract</a></li>
<li><a href="#Harmonic_and_Inharmonic_Speech">Inharmonic and Harmonic Speech</a></li>
<li><a href='#Speaker_Generation'>Speaker Generation</a></li>
<li><a href="#Parameter_Manipulation">Parameter Manipulation</a></li>
<ul>
<li><a href="#Pitch_Manipulation">Pitch Manipulation</a></li>
<li><a href="#Duration_Manipulation">Duration Manipulation</a></li>
<li><a href="#Energy_Manipulation">Energy Manipulation</a></li>
</ul>
<li><a href="#Utterance_Generation">Utterance Generation</a></li>
</ul>
<br>
<h2 id="abstract">1. Abstract<a name="abstract"></a></h2>
<p>
Text-to-speech (TTS) synthetic data augmentation has been widely used in speech processing tasks, but its effectiveness in speech separation remains understudied. In our previous work, we proposed SpeakerAugment (SA), which enhances speaker diversity for generalizable speech separation by using a traditional glottal vocoder to manipulate speaker parameters. In this paper, we present an evolved approach, SpeakerAugment+ (SA+), which incorporates neural TTS for online data augmentation. SA+ consists of three modules: a speaker module, an acoustic module, and a vocoder. The speaker module learns a GMM to model the distribution over speaker embeddings. The acoustic module conditions mel-spectrogram synthesis on speaker embeddings. The vocoder then converts mel-spectrograms into waveform signals. SA+ employs three augmentation techniques: speaker generation, which generates speaker embeddings by sampling from the GMM; parameter manipulation, which randomly modifies the control factors of pitch, energy, and duration in the acoustic module; and utterance generation, which employs additional plain text for synthesis. Our empirical findings suggest that strict harmonicity is not a prerequisite for speech separation, as inharmonic speech synthesized by neural vocoders can serve as augmented data to improve separation performance. SA+ yields an average SI-SNRi improvement of 1.3 dB on the WSJ0-2mix dataset. In inter-corpus tests, it improves average SI-SNRi by 3.3 dB and 7.1 dB on the Libri-2mix and TIMIT-2mix test sets, respectively. Notably, on the more extensive LibriSpeech-based dataset, which features a broader range of speakers and utterances, SA+ still significantly improves the average SI-SNRi and the model's generalization capabilities.
</p>
<center><img src='myfig/model.jpg'></center>
<p style="text-align: justify; font-size:16px;font-color:#D5CFCF;margin-left: 20px;margin-right:20px ;margin-top: 10px;">Fig. 1: The architecture of SpeakerAugment+ (SA+). The speaker module learns the GMM distribution over speaker embeddings from the training data, and unseen speaker embeddings are sampled from it during inference. The acoustic module uses speaker embeddings to condition the synthesis of mel-spectrograms. This allows pitch, energy and duration to be manipulated via the variance adaptor, thereby increasing speaker diversity. The vocoder estimates waveform signals from the corresponding mel-spectrograms and can generate either harmonic or inharmonic speech: HiFi-GAN is used for inharmonic speech, while NHV-GAN is used for harmonic speech.</p>
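<p>For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how SI-SNR and its improvement (SI-SNRi) are commonly computed. It assumes the usual zero-mean, projection-based definition; the function names and the toy signals are illustrative and are not taken from the paper's code.</p>

```python
import math

def si_snr(estimate, reference):
    """Scale-invariant signal-to-noise ratio in dB."""
    # Remove the mean so the measure ignores DC offsets.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    scale = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10(target_energy / noise_energy)

def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)

# Toy signals (hypothetical): the estimate carries less interference than the mixture.
reference = [math.sin(0.1 * n) for n in range(200)]
interference = [math.sin(0.37 * n + 1.0) for n in range(200)]
mixture = [r + i for r, i in zip(reference, interference)]
estimate = [r + 0.1 * i for r, i in zip(reference, interference)]
improvement_db = si_snr_improvement(estimate, mixture, reference)
```

<p>Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant.</p>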
<!--<br><br>-->
<h2> 2. Inharmonic and Harmonic Speech<a name="Harmonic_and_Inharmonic_Speech"></a></h2>
<p>
Compare the perceptual quality of inharmonic and harmonic speech in SA+. The inharmonic speech is generated with HiFi-GAN, while the harmonic speech is generated with NHV-GAN.
</p>
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=3>1. Macy's smaller size could leave Federated's officers with operating power in a combined concern. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Inharmonic</strong></th>
<th style="text-align: center"><strong>Harmonic</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row1_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row1_HiFi-GAN.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row1_NHV-GAN.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>2. Cray is interested in right now selling its stripped down high speed big memory racing machines</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Inharmonic</strong></th>
<th style="text-align: center"><strong>Harmonic</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row2_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row2_HiFi-GAN.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row2_NHV-GAN.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>3. And most of the four thousand to five thousand oilmen here for the so -HYPHEN called I. P. Week headed home in a somber mood. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Inharmonic</strong></th>
<th style="text-align: center"><strong>Harmonic</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row3_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row3_HiFi-GAN.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row3_NHV-GAN.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>4. In the past decade, COMMA U. S. air travel has nearly doubled, COMMA to more than four hundred fifty million passengers a year. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Inharmonic</strong></th>
<th style="text-align: center"><strong>Harmonic</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row4_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row4_HiFi-GAN.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/1 Harmonic and Inharmonic Speech/Row4_NHV-GAN.wav" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<h2> 3. Speaker Generation<a name="Speaker_Generation"></a></h2>
<p>
Compare different speaker generation techniques in SA+. GMM: the speaker embedding is sampled from the learned GMM. Unit Hypersphere: the speaker embedding is sampled on the unit hypersphere.
</p>
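<p>The two sampling strategies compared here can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the GMM is assumed to have diagonal covariances, its parameters are toy values rather than a fitted model, the hypersphere sampling is assumed to be uniform, and all function names are hypothetical.</p>

```python
import math
import random

def sample_gmm(weights, means, stds, rng):
    """Draw one vector from a diagonal-covariance GMM (assumed form)."""
    # Pick a mixture component in proportion to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Sample each dimension independently around that component's mean.
    return [rng.gauss(m, s) for m, s in zip(means[k], stds[k])]

def sample_unit_hypersphere(dim, rng):
    """Draw one vector uniformly from the surface of the unit hypersphere."""
    # Normalizing an isotropic Gaussian vector gives a uniform direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

rng = random.Random(0)
# Toy 2-component, 4-dimensional mixture (hypothetical values, not fitted).
weights = [0.6, 0.4]
means = [[0.0, 1.0, -1.0, 0.5], [2.0, -0.5, 0.0, 1.5]]
stds = [[0.1, 0.2, 0.1, 0.3], [0.2, 0.1, 0.3, 0.1]]

gmm_embedding = sample_gmm(weights, means, stds, rng)
sphere_embedding = sample_unit_hypersphere(4, rng)
```

<p>A GMM sample stays close to regions where training speakers were observed, whereas a hypersphere sample ignores that distribution.</p>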
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=3>1. Macy's smaller size could leave Federated's officers with operating power in a combined concern. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>GMM</strong></th>
<th style="text-align: center"><strong>Unit Hypersphere</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row1_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row1_GMM.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row1_Unit_Hypersphere.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>2. Cray is interested in right now selling its stripped down high speed big memory racing machines</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>GMM</strong></th>
<th style="text-align: center"><strong>Unit Hypersphere</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row2_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row2_GMM.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row2_Unit_Hypersphere.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>3. And most of the four thousand to five thousand oilmen here for the so -HYPHEN called I. P. Week headed home in a somber mood. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>GMM</strong></th>
<th style="text-align: center"><strong>Unit Hypersphere</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row3_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row3_GMM.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row3_Unit_Hypersphere.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=3>4. In the past decade, COMMA U. S. air travel has nearly doubled, COMMA to more than four hundred fifty million passengers a year. PERIOD</td> </tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>GMM</strong></th>
<th style="text-align: center"><strong>Unit Hypersphere</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row4_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row4_GMM.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/2 Speech Generation/Row4_Unit_Hypersphere.wav" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<h2> 4. Parameter Manipulation<a name="Parameter_Manipulation"></a></h2>
<p>
Compare different control factors of pitch, energy, and duration in the acoustic module of SA+.
</p>
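<p>A minimal sketch of random parameter manipulation is given below. It assumes that each control factor is drawn uniformly from the range spanned by the demo values on this page (0.7 to 1.3) and that it multiplies the corresponding per-phoneme prediction; the sampling range, the function names and the example values are assumptions made for illustration, not the paper's configuration.</p>

```python
import random

def sample_control_factors(rng, low=0.7, high=1.3):
    """Draw one random control factor per prosodic attribute."""
    return {name: rng.uniform(low, high) for name in ("pitch", "energy", "duration")}

def apply_control_factors(pitch, energy, durations, factors):
    """Scale predicted per-phoneme values by the sampled factors."""
    scaled_pitch = [p * factors["pitch"] for p in pitch]
    scaled_energy = [e * factors["energy"] for e in energy]
    # Durations are frame counts, so round and keep at least one frame.
    scaled_durations = [max(1, round(d * factors["duration"])) for d in durations]
    return scaled_pitch, scaled_energy, scaled_durations

rng = random.Random(0)
factors = sample_control_factors(rng)
# Hypothetical per-phoneme predictions for a three-phoneme utterance.
pitch, energy, durations = apply_control_factors(
    [120.0, 135.0, 110.0], [0.8, 1.0, 0.6], [5, 8, 4], factors
)
```

<p>Drawing fresh factors for every training utterance is what would make such an augmentation "online": the same text and speaker embedding yield a different rendition each time.</p>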
<h3>4.1 Pitch Manipulation<a name="Pitch_Manipulation"></a></h3>
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=4>1. Macy's smaller size could leave Federated's officers with operating power in a combined concern. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Pitch Factor=0.7</strong></th>
<th style="text-align: center"><strong>Pitch Factor=1.0</strong></th>
<th style="text-align: center"><strong>Pitch Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row1_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row1_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row1_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row1_1.3.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=4>2. And most of the four thousand to five thousand oilmen here for the so -HYPHEN called I. P. Week headed home in a somber mood. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Pitch Factor=0.7</strong></th>
<th style="text-align: center"><strong>Pitch Factor=1.0</strong></th>
<th style="text-align: center"><strong>Pitch Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row2_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row2_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row2_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/1_Pitch_Row2_1.3.wav" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<h3>4.2 Duration Manipulation<a name="Duration_Manipulation"></a></h3>
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=4>1. Macy's smaller size could leave Federated's officers with operating power in a combined concern. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Duration Factor=0.7</strong></th>
<th style="text-align: center"><strong>Duration Factor=1.0</strong></th>
<th style="text-align: center"><strong>Duration Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row1_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row1_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row1_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row1_1.3.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=4>2. And most of the four thousand to five thousand oilmen here for the so -HYPHEN called I. P. Week headed home in a somber mood. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Duration Factor=0.7</strong></th>
<th style="text-align: center"><strong>Duration Factor=1.0</strong></th>
<th style="text-align: center"><strong>Duration Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row2_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row2_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row2_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/2_Duration_Row2_1.3.wav" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<h3>4.3 Energy Manipulation<a name="Energy_Manipulation"></a></h3>
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=4>1. Macy's smaller size could leave Federated's officers with operating power in a combined concern. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Energy Factor=0.7</strong></th>
<th style="text-align: center"><strong>Energy Factor=1.0</strong></th>
<th style="text-align: center"><strong>Energy Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row1_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row1_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row1_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row1_1.3.wav" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=4>2. And most of the four thousand to five thousand oilmen here for the so -HYPHEN called I. P. Week headed home in a somber mood. PERIOD</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Energy Factor=0.7</strong></th>
<th style="text-align: center"><strong>Energy Factor=1.0</strong></th>
<th style="text-align: center"><strong>Energy Factor=1.3</strong></th>
</tr>
<tr>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row2_Ground_Truth.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row2_0.7.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row2_1.0.wav" controls="" preload=""></audio></td>
<td style="text-align: left"><audio src="mysamples/3 Parameter Manipulation/3_Energy_Row2_1.3.wav" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<h2> 5. Utterance Generation<a name="Utterance_Generation"></a></h2>
<p>
The TTS model is trained on WSJ0, while the text is from LibriSpeech.
</p>
<table>
<tbody>
<tr>
<td style="text-align: left" colspan=2>1. The tradition of carriage-loads of maskers runs back to the most ancient days of the monarchy.</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Synthesized</strong></th>
</tr>
<tr>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row1_Ground_Truth.wav" style="width: 70%;" controls="" preload=""></audio></td>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row1_Synthesized.wav" style="width: 70%;" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=2>2. She sang, she laughed, she was unspeakably happy.</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Synthesized</strong></th>
</tr>
<tr>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row2_Ground_Truth.wav" style="width: 70%;" controls="" preload=""></audio></td>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row2_Synthesized.wav" style="width: 70%;" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=2>3. Fetnah would have proceeded, but the syndic of the jewellers coming in interrupted her: "Madam," said he to her, "I come from seeing a very moving object, it is a young man, whom a camel- driver had just carried to an hospital: he was bound with cords on a camel, because he had not strength enough to sit.</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Synthesized</strong></th>
</tr>
<tr>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row3_Ground_Truth.wav" style="width: 70%;" controls="" preload=""></audio></td>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row3_Synthesized.wav" style="width: 70%;" controls="" preload=""></audio></td>
</tr>
<tr>
<td style="text-align: left" colspan=2>4. Then she took him by the hand, and went into the temple and prayed, and came down again with Theseus to her home.</td>
</tr>
<tr>
<th style="text-align: center"><strong>Ground Truth</strong></th>
<th style="text-align: center"><strong>Synthesized</strong></th>
</tr>
<tr>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row4_Ground_Truth.wav" style="width: 70%;" controls="" preload=""></audio></td>
<td style="text-align: center"><audio src="mysamples/4 Utterance Generation/Row4_Synthesized.wav" style="width: 70%;" controls="" preload=""></audio></td>
</tr>
</tbody>
</table>
<br>
<hr>
<br>
<footer class="site-footer">
<span class="site-footer-credits">This page was generated by <a href="https://pages.github.com/">GitHub Pages</a>.</span>
</footer>
</section>
</body></html>