Has anyone made it to train a working FineTransformer? #116

cyanbx · 2023-03-01T13:45:40Z

cyanbx
Mar 1, 2023

I have successfully trained a CoraseTransformer which can generate intelligable speech from semantic tokens. But although the FineTransformer seems to have a similar architecture and training pipeline, I can't reconstruct high quality audio with it. It's output audio doesn't seem to be better than those from only coarse tokens, and the spectrogram graphs seem to be more blurred. Is there any advice on improving the training or inference process?

My FineTransformer hparams:

fine_transformer = FineTransformer(
    num_coarse_quantizers = 3,
    num_fine_quantizers = 5,
    codebook_size = 1024,
    dim = 512,
    depth = 6
)

Here are the comparison of spectrograms.
Reconstructed from COARSE TOKEN:

Reconstructed from COARSE TOKEN and Inferenced FINE TOKEN:

Reconstructed from Ground Truth TOKEN:

lucidrains · 2023-03-01T15:47:07Z

lucidrains
Mar 1, 2023
Maintainer

@cyanbx thanks for this! i'll review the fine transformer code today

i'm sure it is something minor

31 replies

lucidrains May 18, 2023
Maintainer

@eonglints that could have been your paper!

lucidrains May 18, 2023
Maintainer

@eonglints are you still trying to get into a phd program atm? what are you up to?

eonglints May 18, 2023

I just started at Stability AI a few weeks ago :)

Also still doing my PhD at the University of Edinburgh.

When VALL-E came out I was really pleased to to see TTS with audio language models working but yeah, tiny bit frustrated that we hadn't quite got there first on this repo! Just got very busy with other things. Back in the thick of it now though.

lucidrains May 18, 2023
Maintainer

congrats! they are lucky to have you!

eonglints May 18, 2023

Aw, thanks. It's going to be fun.

Liujingxiu23 · 2023-03-22T10:42:41Z

Liujingxiu23
Mar 22, 2023

@cyanbx Hi， I am now also using codecs of Encoder(facebook pretrained version) to train a coasre2fine model. My problem is training in single-gpu works，but when multi-gpus, I can not start the train. The details is in #128
Could you please help me figure the problem? or Could you share your training code? I mean the dataset and dataloader part.
Thank you !

9 replies

lzl1456 Mar 24, 2023

oh,yes,,,I seem to have this problem too，encodec doesn't seem to load into accelerate，if you have solved this part, hope to see it together！

LWprogramming Mar 24, 2023

Can you add your Encodec code (i.e. how are you using the API)? Would be interesting if your approach is similar to mine in #135 (PR not reviewed/ merged at the moment but I think it's probably at least close to usable)

I haven't done anything with accelerate though, is this something very critical to getting things to work?

lzl1456 Mar 27, 2023

@LWprogramming sorry ，i tried merge code , I don't have permission to upload code，My code is like below，I imported directly

I read your code ,：
Is the shape here needs to be modified?

lzl1456 Mar 27, 2023

This does not need to be used here, because the dataload has already resample, and this will change the batch to 1

LWprogramming Mar 30, 2023

Ah yes @lzl1456 I think you are correct

re: shape, I think this is the same issue as in #155

re: convert_audio, good call, I'll make a pr #157

huangf79 · 2023-08-20T07:15:24Z

huangf79
Aug 20, 2023

@cyanbx Hi, could you share your training details of CoraseTransformer and some audio samples generated by it? I'm trying to train a CoraseTransformer on the 'dev-clean' set of LibriSpeech and I only get noise even i use a training sample as input.

1 reply

amitaie Aug 21, 2023

I'm trying for some time to train CoraseTransformer, but it doesn't work properly for me. try to train CoraseTransformer that uses only 1 quantizer, it might works and will help you understand what is going on.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Has anyone made it to train a working FineTransformer? #116

{{title}}

Replies: 3 comments 41 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Has anyone made it to train a working FineTransformer? #116

Replies: 3 comments · 41 replies

lucidrains Mar 1, 2023 Maintainer

lucidrains May 18, 2023 Maintainer

lucidrains May 18, 2023 Maintainer

lucidrains May 18, 2023 Maintainer

Replies: 3 comments 41 replies

lucidrains
Mar 1, 2023
Maintainer

lucidrains May 18, 2023
Maintainer

lucidrains May 18, 2023
Maintainer

lucidrains May 18, 2023
Maintainer