When I decode using the `nbest-oracle` method in `ctc_decode.py` with my dataset, I consistently get empty hypotheses, resulting in a 100% WER.

I trained a zipformer model on my own dataset, and the checkpoint already works fine with streaming decoding. I also created HLG.pt following the steps in prepare_lm.sh (3-gram model).
Here is the script I used for decoding. I added two arguments (`--test-set-cut-path`, `--test-set-name`) so that it decodes my own cuts instead of the LibriSpeech data.
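Roughly, the invocation looked like the following sketch (a hypothetical reconstruction: the epoch/avg values and paths are placeholders, the other flags follow the standard icefall `ctc_decode.py` interface, and the two custom arguments are the ones I added):

```bash
# Hypothetical invocation -- flag values and paths are placeholders;
# --test-set-cut-path and --test-set-name are the two custom arguments.
./zipformer/ctc_decode.py \
  --epoch 30 \
  --avg 10 \
  --exp-dir ./zipformer/exp \
  --decoding-method nbest-oracle \
  --num-paths 100 \
  --nbest-scale 0.5 \
  --lang-dir ./data/lang_bpe_500 \
  --test-set-cut-path ./data/fbank/my_test_cuts.jsonl.gz \
  --test-set-name my-test-set
```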
Here is a snapshot of the output in `errs.txt`:

```
%WER = 100.00
Errors: 0 insertions, 30997 deletions, 0 substitutions, over 30997 reference words (0 correct)
Search below for sections starting with PER-UTT DETAILS:, SUBSTITUTIONS:, DELETIONS:, INSERTIONS:, PER-WORD STATS:

PER-UTT DETAILS: corr or (ref->hyp)
comedy_75_first_12min_566.284_574.964-58: (أما بييجي يقول اليوم بيقول اليوم وكان هذا هي نشرة أخبار اليوم->*)
comedy_75_first_12min_133.783_142.442-4: (أ قرود زي بعض قروض زي بعض آه لكن يبقى أرنب أخو قرد ما هو كارتون بقى وكده آه أطفال أطفال مش فاهمين أطفال بقى آه آه->*)
comedy_75_first_12min_218.077_226.666-12: (عامل نفسه كوميدي قال يعني يبص إزاي فتقوم التانية أول ما تشوفه->*)
```
I think you should add some printouts in the function `nbest_oracle` in `icefall/decode.py`.

Try printing out `ref_texts`, the lattice, and the `word_table`. See whether the texts in `ref_texts` can be successfully split into words with `.split()`, and whether the words are successfully looked up in the symbol table.

Edit: it looks like it's the hyp that's empty, so you might focus on the lattice. Is the lattice nonempty? Try looking up some of the word-ids in `aux_labels`, and see whether they point to correct-looking words.

You can also check whether doing this on CPU vs. GPU makes a difference: calling `nbest_oracle` with `lattice.to('cpu')` in place of `lattice` may help. (I'm not sure what the lattice is called in the calling code, possibly `nbest`.)
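To make those checks concrete, here is a minimal debugging sketch, assuming the k2 `lattice` and `word_table` objects that get passed to `nbest_oracle`; the function name and slicing limits are illustrative, not part of icefall:

```python
# Illustrative debugging helper, not part of icefall.
import k2

def debug_nbest_oracle_inputs(lattice: k2.Fsa, ref_texts, word_table: k2.SymbolTable):
    # 1) Is the lattice nonempty? An empty lattice forces empty hypotheses.
    print("num_arcs:", lattice.num_arcs)

    # 2) Do the reference texts split into words, and does the symbol table know them?
    for text in ref_texts[:3]:
        words = text.split()
        print(words, "->", [w in word_table for w in words])

    # 3) Do the word-ids on aux_labels map back to correct-looking words?
    aux = lattice.aux_labels
    if isinstance(aux, k2.RaggedTensor):
        aux = aux.values  # flatten ragged aux_labels to a 1-D tensor
    word_ids = [i.item() for i in aux[:50] if i.item() > 0]  # skip 0 (eps) and -1 (final)
    print("aux_labels words:", [word_table[i] for i in word_ids])
```

For the CPU-vs-GPU comparison, the idea is simply to pass `lattice.to('cpu')` into `nbest_oracle` at the call site and see whether the hypotheses are still empty.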